Extending Yelt

The Yelt streaming text editor is based on SED's behavior but not on its implementation.

Yelt is a C++ program written entirely from scratch -- except for the use of the GNU regular expression package. It was written both to be useful and to be made publicly available. Please see its copyright notice.

Yelt's behavior is relatively straightforward:

  1. yelt executes a user-defined script on all of its input files
  2. the script can be defined in several ways, but all of them ultimately result in a single text string which is then parsed into the executable form of the script.
Most of yelt's source code is found in the file main.cpp in the yelt distribution. This file is an intermingling of the following basic types of C++ code:

    int main(int, char**) { ... } -- the main function of the yelt program, whose job is to interpret command line arguments
    parseXYZStatement() { ... } -- functions that parse script language statements and return pointers to objects derived from class Statement
    struct XYZStatement : public Statement { ... } -- structs which override the generic class Statement's virtual methods to actually implement the work of a specific kind of statement
    helper() { ... } -- helper functions used by the parsers or the statement implementations
    string stringRegisters[10]; -- various global variables

Adding a new Command

To add a new command, you need to do three things:
  1. derive a new statement type from struct Statement and implement its execute_v() and output() methods. You may need to add constructors or other helper functions.
  2. create a new parseXYZStatement() function that gobbles up the text of the statement out of the script string presented to it.
  3. modify the parseStatement() function to have it call your new parseXYZStatement() function and return your new Statement type.
Note that when implementing the execute_v() method of a statement, the stringRegisters[] array is a global variable -- so you just use the variables as you see fit.

The best approach for creating new statements is to find an existing statement which is very similar in syntax to the one you want to add and clone it.

When writing statements, do not demand that they end in a semicolon -- but if one does occur, treat that as the end of the statement. Also, check for open/close braces -- but don't demand that they exist, just stop if you find them.
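The termination rule above might be sketched as a small helper. The name finishStatement and the iterator-based interface are hypothetical; the real parse functions may handle this inline and in a different way.

```cpp
#include <cctype>
#include <string>

typedef std::string::const_iterator Iter;

// Hypothetical helper: given the position just past a statement's own text,
// return the position where the next statement (or brace) starts.
Iter finishStatement(Iter first, Iter last) {
    // Skip whitespace that follows the statement text.
    while (first != last && std::isspace(static_cast<unsigned char>(*first)))
        ++first;

    // A semicolon is optional, but if present it ends the statement.
    if (first != last && *first == ';')
        ++first;

    // A '{' or '}' is not consumed here: the parser just stops in front of it
    // and leaves it for whoever is parsing the enclosing block.
    return first;
}
```

With this shape, "  ; next" resumes just past the semicolon, while "} rest" resumes at the brace itself.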

Also be very picky about the syntax -- it will save everyone time and trouble later -- and explain in great detail what the user did wrong. See the many existing parse functions for examples of how to print the errors in the standard way.

When modifying the parseStatement() function, remember that this is a trivial language designed for terse statement syntax. Because of this, the parseStatement() function often makes decisions based only on the first letter of the statement name. When you add a new statement, you can use multi-character statement names, but you need to be careful about where you put your statement detection logic: if you name your new statement "tumble", for example, you will need to make sure the test for this keyword is done before the test for the 't' statement in parseStatement(); otherwise the 't' test will hide your new statement.

Also, it is slightly faster to compare the first character of a statement separately before comparing the whole text of a statement name, so do this:


    if( (firstWordChar == 't') && (token == tumbleStatementName) )
    {
        return parseTumbleStatement(token, first, last);
    }

where tumbleStatementName is a string defined near the top of the file, somewhere.
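To make the ordering issue concrete, here is a reduced sketch of the dispatch logic. The function classify() and the StatementKind enumeration are hypothetical stand-ins for parseStatement() and the Statement pointers it returns; only firstWordChar, token and tumbleStatementName come from the snippet above.

```cpp
#include <string>

const std::string tumbleStatementName = "tumble";

// Hypothetical result type standing in for the parsed Statement objects.
enum StatementKind { kindTumble, kindT, kindUnknown };

StatementKind classify(const std::string &token) {
    if (token.empty())
        return kindUnknown;
    const char firstWordChar = token[0];

    // The multi-character keyword is tested first. The cheap character
    // comparison short-circuits the full string comparison.
    if ((firstWordChar == 't') && (token == tumbleStatementName))
        return kindTumble;

    // The single-letter test matches anything starting with 't', so it
    // would hide "tumble" if it came before the test above.
    if (firstWordChar == 't')
        return kindT;

    return kindUnknown;
}
```

Swapping the two tests would make classify("tumble") return kindT, which is exactly the hiding problem described above.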

Implementing commands

The main task of implementing new yelt commands is implementing the execute_v() method of a Statement structure. It is, of course, also important that you parse the statement syntax and that you properly initialize the statement object. Most statements have one or more integers in their constructor invocations that define which registers the statement will manipulate.

The execute_v() method has the following parameters:

  1. istream &s
  2. WhileContext *wc
When implementing statements which have nested statements, make sure that you send these two parameters to the child statements when you execute them -- see WhileStatement, BlockStatement, FStatement, etc.

The input stream reference should just be passed through to child statements, if any. Normally it should not be manipulated at all by normal statements. It is currently modified only by the n command. If you read from this stream, you will mess up the current line number variable for the stream -- so if you MUST create a new command that reads from the stream, you will want to figure out how to keep the number correct. The line number is stored in a global variable named "inputLineNumber".

The WhileContext object might theoretically be usable. Right now it has only one use -- the two pattern version of the patternConditionalStatement keeps a data element associated with the while loop of which it is a part. The WhileContext is basically the scope of that data element. When the loop exits and is re-entered, a new initialization of the variable occurs.

If you want to invent a syntax that implements true curly brace local variables, you'll have to implement not a single WhileContext object but rather a search path for these variables with the closest nested curly brace block being the top of the stack for the search. This is really not consistent with yelt's overall syntax though.

Most of the time, the execute_v() method should return "ok". This enumeration value is defined in the Statement class. It basically means that this statement does not intend to terminate the script or the while loop in which the statement is executed. The only non-ok returning statements should be:

    Q -- terminates the function or script in which it is executed
    b -- breaks out of the current while loop
    d -- continues the current while loop without executing any more statements

Regular Expression Tools

Yelt is implemented in terms of the GNU regular expression library. However, all calls to it are completely encapsulated in the SimpleRegex class. If you want to get rid of the GPL dependence, and you have another toolkit available, you need only modify the implementation in SimpleRegex.cpp and SimpleRegex.h. You have my blessing!
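As a rough idea of what such a replacement could look like, here is a wrapper built on the C++ standard library's std::regex. The interface of the real SimpleRegex class is not shown in this document, so the class name SimpleRegexSketch and its search() method are purely illustrative. Note also that std::regex defaults to ECMAScript pattern syntax, which differs from the GNU library's syntax, so existing scripts could behave differently.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper: callers see only this class, so the underlying
// regular expression toolkit can be swapped without touching them.
class SimpleRegexSketch {
public:
    // Compiles the pattern once; throws std::regex_error if it is invalid.
    explicit SimpleRegexSketch(const std::string &pattern) : re_(pattern) {}

    // Returns true if the pattern matches anywhere in the text.
    bool search(const std::string &text) const {
        return std::regex_search(text, re_);
    }

private:
    std::regex re_;
};
```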