A Sequence_Map is a collection of related sequences of T objects. Sequences supported by a Sequence_Map can not contain optional members. The template argument, H is the name of a class object with an operator()(T) that produces and integer value which is a 'hash' of T. More...
#include <sequence_map.h>
Classes | |
struct | H1 |
struct | node |
class | search |
A 'search' is a context for comparing an arbitrary sequence with those stored in the Sequence_Map. The idea of course is to determine whether or not a given sequence exactly matches one found in the map -- and also what the ID of that sequence is. A search is a kind of iterator that scans through the structure of the Sequence_Map. You have to call operator++ to get to the next location in the sequence. If you want to know if you are sitting at the end of the sequence, call finished. A search does not exactly model an iterator so it has no comparison functions, and the Sequence_Map has no begin or end methods. More... | |
class | sequence |
A sequence is a handle object that lets you create patterns in the tree You construct a sequence object given a Sequence_Map object. You then add members to the sequence. When you are done, you call the complete() function to see if this is a valid (complete) sequence or if you have mistakenly create a subsequence of an existing sequence. More... | |
Public Types | |
typedef hash_list< node *, ID, H1 > | id_map_type |
storage type for complete sequence ID's. | |
typedef hash_list< T, node *, H > | rest_type |
description of the rest of a given sequence assuming you are already in the sequence somewhere | |
typedef Sequence_Map< T, ID, H > | self |
helper typedef that refers to the sequence map | |
Public Member Functions | |
template<class Iterator > | |
bool | find (Iterator first, Iterator last, ID *id) |
Compare a range to see if is an exact match for an named pattern. If so, return true and the id of the matched pattern. If not return false, and don't change the id. | |
bool | find (T const *first, size_t count, ID *id) |
Compare an array to see if it is complete sequence and get the sequence id. | |
Sequence_Map () | |
Construct an empty sequence map. | |
~Sequence_Map () | |
The _destructor_! What will they think of next! | |
Private Member Functions | |
void | destroy_tree () |
private method that recursively frees up all memory used by the tree. | |
Private Attributes | |
id_map_type | ids_ |
list of all sequences in this sequence map | |
rest_type | root_ |
the list of T's that begin sequences | |
Friends | |
class | search |
class | sequence |
A Sequence_Map is a collection of related sequences of T objects. Sequences supported by a Sequence_Map can not contain optional members. The template argument, H is the name of a class object with an operator()(T) that produces and integer value which is a 'hash' of T.
Each Sequence has a given name, specified by the ID class. One purpose of a sequence map is to provide you with a mechanism for analyzing sequences of T's to determine which named sequence it maches, if any. That is, you define a bunch of sequence of T's. You identify each sequence by giving it a unique ID. You can then compare any random sequence to see if it has a match in the Sequence_Map and if so, what is its ID.
You can also use the Sequence_Map to aid in parsing random sequences and bomb out early when you realize that you don't have a valid sequence. A Sequence_Map can help you implement syntax directed parsing. For example, suppose you are parsing output from a computer program and you want to detect and act upon certain strings but not all. The emacs gud-mode does this. A sequence map can help with this. Another use is in parsing function keys from terminals. Sometimes unix buffering causes function key sequences to be split across multiple 'read()' calls. You can't guarantee that you'll get all 5 characters in the f12 sequence, for example, in a single read() operation. Yet you don't want to hard read for 5 characters all the time because the key sequences are often shorter than 5 characters -- typically they are 1, 3, or 5 characters long, but not always.
The design for Sequence_Map is roughly based on a Huffman coding scheme -- although Huffman would probably spin in his grave if he saw how big and complex this mechanism is -- he was interested in compressing bit patterns for com networks. Here we are interested in a simple, universally applicable interface that provides often need tools for simple pattern recognition.
The algorithmic complexity of a Sequence_Map is as follows: Once the sequences are inserted into the map, searching the map for a given item in the sequence is O(1). That is, if have N items in your test sequence, it will take roughly O(N) steps to validate whether or not it is a good sequence -- and get its ID value. Inserting a new sequence into the map is approximately O(N) where N is the length of the sequence.
Here's how you use a sequence map:
To perform a search, you ask the Sequence_Map for an iterator like object which contains the context of a search. You can then iterate over your test sequence, one item at a time, and ask the context object if this is a member of any valid sequence in the sequence map. It will give you the following information:
Here is an example:
struct H { int operator() (int t) { return t; } }; // int hasher typedef Sequence_Map<char,int,H> SEQMAP; // or accept the default hasher which is // Hasher<char> typedef SM::sequence SEQUENCE; typedef SM::search SEARCH; SEQMAP tree; SEQUENCE s1(tree, 1); // fred s1 += 'f'; s1 += 'r'; s1 += 'e' ; s1 += 'd'; if(! s1.complete() ) error("subsequence of longer sequence"); SEQUENCE s2(tree, 2); // frat s2 += 'f'; s2 += 'r'; s2 += 'a' ; s2 += 't'; if(! s2.complete() ) error("subsequence of longer sequence"); // tree now has 2 sequences in it. int id; bool found = tree.find("frat", 4, &id); // id is 2, found is true char const *search="fred"; char const *scan = search; SEARCH c(tree); while(*scan != 0 && c(*scan)) { ++scan; ++c; } if(*scan == 0 && c.finished(&id) ) { // id will have 1 in it because the context, c, indicates // that the subsequence has been completed. }
Notes:
Definition at line 48 of file sequence_map.h.
typedef hash_list<node*,ID,H1> id_map_type |
storage type for complete sequence ID's.
Definition at line 216 of file sequence_map.h.
description of the rest of a given sequence assuming you are already in the sequence somewhere
data structure representing the rest of a all possible sequences after the current sub-sequence of T's.
Definition at line 185 of file sequence_map.h.
typedef Sequence_Map<T,ID,H> self |
helper typedef that refers to the sequence map
Definition at line 183 of file sequence_map.h.
Sequence_Map | ( | ) |
Construct an empty sequence map.
Definition at line 218 of file sequence_map.h.
~Sequence_Map | ( | ) |
The _destructor_! What will they think of next!
Definition at line 222 of file sequence_map.h.
void destroy_tree | ( | ) | [private] |
private method that recursively frees up all memory used by the tree.
Definition at line 511 of file sequence_map.h.
bool find | ( | Iterator | first, | |
Iterator | last, | |||
ID * | id | |||
) |
Compare a range to see if is an exact match for an named pattern. If so, return true and the id of the matched pattern. If not return false, and don't change the id.
Definition at line 255 of file sequence_map.h.
bool find | ( | T const * | first, | |
size_t | count, | |||
ID * | id | |||
) |
Compare an array to see if it is complete sequence and get the sequence id.
Definition at line 226 of file sequence_map.h.
friend class search [friend] |
Definition at line 506 of file sequence_map.h.
friend class sequence [friend] |
Definition at line 505 of file sequence_map.h.
id_map_type ids_ [private] |
list of all sequences in this sequence map
Definition at line 509 of file sequence_map.h.
the list of T's that begin sequences
Definition at line 508 of file sequence_map.h.