Gray is a new parsing algorithm and related tooling for exploring ambiguity, recursion, ordering and associativity based on Multi-Ordered Grammars. After succefully building a recognizer, you can use ASL (the Abstract Syntax Language) to easily define the executional and structural semantics of your language.
Here is the full list of MOG operators supported by Gray and their relationship with other formalisms:
|* , + , ()||MOG,RE,EBNF,PEG||Standard semantics inspired by reg-expressions for expressing zero/one-or-more relationships and grouping.|
||||MOG,BNF,EBNF||Unordered (not-deterministic) choice. In MOGs unordered choice can be tainted (see below).|
|||||MOG (scoped), PEG||Ord. choice. Appears as / in PEGs. In MOGs this operator also starts a new recursive scope for dealing with recursive ordering. All ordered rules in MOGs taint (ie order) their alternatives.|
|/ , \||MOG-only||Recursive Ord. choice (MOGs only) comes in two flavors: self-recursive (/) and simply-recursive (\). Has ordering semantics similar to || unless the rule is recursively invoked. In this case parsing continues from previously seen (self-recursive) or next (simply recursive) alternatives.|
|& , !||MOG,PEG||Recognize but do not consume operators. Eg. Determine if A is/not followed by B.|
Here is how you can parse a simple expression language using Gray, both in unordered and mixed parsing modes, to explore ambiguity:
parseview: will allow you to navigate all possible parsing trees, in order to experiment with different ordering operators and disambiguate:
For example this is the unambiguous MOG solution to the expression language (notice the recursive and scoped) ordering:
And the resultive parse-tree that you can check with
You are now ready to produce an intermediate executable representation for your expression language as follows:
Here is what the intermediate representation (AST) that the above ASL rules produce looks like (for the Pharo platform in this case):
Apart from exploring, visualizing (
parseview:) and generating ASTs with your parser (
parseval:), you can also go ahead and execute lively the results (
compileval:) and test them:
Of course you can express the same grammar and parser in a more organized way by subclassing the MOGRecognizer:
Then subclass the recognizer to define your parser through ASL actions:
Explore, Visualize, Generate, Live-Execute and Test your class-based parsers as you did previously:
Multiple sources of ambiguity
Your grammars may exhibit multiple sources of ambiguity that you need to handle. The MOG and Gray workflow allows you to incrementally disambiguate as before in the simpler expression case.
Here is an example from a simple procedural language (Calc), exhibiting both the expression and the dangling-else ambiguities:
Notice how easy it is to define the conditional semantics of Calc with ASL, and live-test the resulting byte-compiled language:
This is a variation of the Collatz sequence computation in Calc. It is specifically designed to test the correct executional semantics of the dangling-else cases.
The ASL Postcard
We will see more examples of ASL later on, but for now take a closer look at the classic postcard example of Smalltalk and its equivalent representation in ASL, facilitating compile-time meta-programming:
The combination of MOG rules with ASL is quite powerfull, allowing you to describe real-word language semantics with clarity. Here is how you can describe the syntax and semantics of msg-sends in Pharo using MOGs and ASL:
The same rule is cumbersome to express in other formalisms (CFGs, PEGs or hand-written top-down parsers), since the complicated ordering of msg-sends has to be manually hard-coded (this is the case for eg for other Smalltalk parsers build with SmaCC/LALR, PetitParser/PEG or RBParser/Hand-written recursive-descent).
Here is the Gray algorithm exploring, parsing, byte-compiling and syntax-highlighting Pharo itself:
A note on Gray’s implementation:
Internally Gray starts with a parsing base similar to that of a 3-op Earley parser (scan, predict, complete) extended with standard EBNF operators (+,*,()) and the empty rule. On top of this chart base we implement two additional operations (backtrack and fork) to handle the scoped ordered choice (||) and two lookahead operators (& , !). Then finally we override the standard predict operators to accommodate for recursive ordering (/ , \) and the mixing of order and unordered choice. To optimize scanning and memoization, we precompute all first,follow and predict sets to pre-filter unwanted alternatives.