T-gen is a Translator Generator for Smalltalk, originally developed under the supervision of Justin Graver.

***WARNING***

This constitutes a quick and dirty first port, and all that that entails. The code passes the basic regression tests, but has by no means been looked at carefully. When in doubt, minimal changes were made to the original sources.

T-gen depends upon ANSI exception support, so it is expected to work only with Squeak 2.6 and beyond.

The original documentation is in PostScript, in the file usersGuide.ps. It is a good read, has references, and should be printed out by the interested reader.

Overview
---------------

T-gen is a tool for the automatic generation of string-to-object translators. It supports the generation of top-down parsers, LL(1), as well as bottom-up parsers, SLR(1), LALR(1) and LR(1).

T-gen is able to transform a context-free grammar (CFG) into a parser that recognizes input that conforms to that grammar. A grammar is specified with a lexical specification that describes the terminals of the language and a grammar specification that describes the language productions. The lexical specification (regular expressions) is translated into a scanner class that recognizes tokens. The grammar specification (productions) is translated into a parser class that recognizes the specified language.

The generated parser collaborates with a tree builder to create derivation trees, abstract syntax trees (ASTs) or any structured object the builder knows how to construct. T-gen comes with four tree builders. The first builds a direct derivation tree based upon the order in which grammatical constructs are recognized. The second does the same, but also outputs a trace of the parser's actions to the T-gen transcript. The last two build an AST, which provides a "simpler" syntax tree of the source.

Installation
------------------

1. File in tgen.14Nov1237pm.cs into a 2.6 or later image.
2. Copy the files exampleN.* to your Squeak directory.
3. To verify installation, open a Transcript window and, in a workspace, evaluate:

       TranslatorGenerator runAllTests

Quick Start
------------------

To begin exploring, in a Morphic world, execute:

    TGenUI open

A SystemWindow will be opened upon a translator generator. It will attempt to build the "simplest" compiler that handles the given context-free grammar. T-gen considers top-down parsers to be "simpler" than bottom-up parsers. T-gen will also attempt to refactor ambiguous grammars into unambiguous grammars.

The layout of the interface closely resembles that of the original Smalltalk implementation.

From the right button menu in the top right text morph, choose 'load specs'. When prompted for the name of a translator, enter 'example10' (without the quotes). This will load a token spec in the upper left, a grammar spec in the lower left and sample input in the lower right. The text in each pane will be accepted in turn, causing a new parser to be generated, and the input will then be parsed. Inspect the result and leave the inspector open.

Push the button in the upper right (currently labeled 'derivation') and select 'sham AST'. Add a space to the input text and accept the change. This will parse the text with a sham AST tree builder. The sham AST builder will use the compiler directives enclosed in curly braces. Inspect the result. Compare the raw derivation tree to the sham AST.

This port is being released early, and in an unfinished state, in order to get the bulk of the code into as many hands as possible. This is a great tool for students studying traditional compiler construction.

Tasks left to be done.
-------------------------

1. Build a modern UI. Maybe a book that lets the user flip through the relevant text morphs.
2. Much of this port was done in a hurry and currently "appears to work". All of the code needs a thorough going over. A few key sections could use a judicious rewrite.
3. Need to factor the release into a runtime and development change set.
   This will be more important after the port has been properly validated.

Last but not least, welcome to the world of language hacking.
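
For readers new to the pipeline described in the Overview, the following is a minimal, hand-written sketch of the same division of labor: a lexical specification of regular expressions drives a scanner, a small LL(1) grammar drives a top-down parser, and the parser hands every recognized construct to a pluggable tree builder. It is written in Python rather than Smalltalk, it is not generated by T-gen, and it does not use T-gen's API. Every name in it (TOKEN_SPEC, scan, Parser, DerivationBuilder, AstBuilder, translate) and the toy grammar itself are assumptions made up for illustration.

```python
import re

# Hypothetical lexical specification: token name -> regular expression.
TOKEN_SPEC = [("NUM", r"\d+"), ("PLUS", r"\+"),
              ("LPAR", r"\("), ("RPAR", r"\)"), ("WS", r"\s+")]
SCANNER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def scan(text):
    """Turn the input string into (kind, lexeme) tokens, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = SCANNER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    tokens.append(("EOF", ""))
    return tokens

class DerivationBuilder:
    """Keeps every production and every token, like a derivation tree."""
    def leaf(self, kind, lexeme):
        return lexeme
    def node(self, production, children):
        return (production, *children)

class AstBuilder:
    """Drops punctuation and single-child productions to get a simpler tree."""
    def leaf(self, kind, lexeme):
        return int(lexeme) if kind == "NUM" else None
    def node(self, production, children):
        kids = [c for c in children if c is not None]
        return kids[0] if len(kids) == 1 else ("+", *kids)

class Parser:
    """Predictive (one-token lookahead) parser for the toy grammar:
         Expr -> Term ('+' Term)*
         Term -> NUM | '(' Expr ')'
    """
    def __init__(self, tokens, builder):
        self.tokens, self.pos, self.builder = tokens, 0, builder

    def peek(self):
        return self.tokens[self.pos][0]

    def expect(self, kind):
        found, lexeme = self.tokens[self.pos]
        if found != kind:
            raise SyntaxError(f"expected {kind}, found {found}")
        self.pos += 1
        return self.builder.leaf(kind, lexeme)

    def parse(self):
        tree = self.expr()
        self.expect("EOF")
        return tree

    def expr(self):
        children = [self.term()]
        while self.peek() == "PLUS":
            children.append(self.expect("PLUS"))
            children.append(self.term())
        return self.builder.node("Expr", children)

    def term(self):
        if self.peek() == "NUM":
            return self.builder.node("Term", [self.expect("NUM")])
        children = [self.expect("LPAR"), self.expr(), self.expect("RPAR")]
        return self.builder.node("Term", children)

def translate(text, builder):
    """Scan, parse, and let the chosen builder decide what object comes out."""
    return Parser(scan(text), builder).parse()

print(translate("1 + 2", DerivationBuilder()))
# -> ('Expr', ('Term', '1'), '+', ('Term', '2'))
print(translate("1 + (2 + 3)", AstBuilder()))
# -> ('+', 1, ('+', 2, 3))
```

Swapping the builder changes the result without touching the scanner or the parser, which is roughly the effect of switching from 'derivation' to 'sham AST' in the Quick Start. The generated classes in T-gen will differ in structure and naming from this sketch.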