Class OPParser
- java.lang.Object
-
- com.tangosol.coherence.dsltools.precedence.OPParser
-
public class OPParser extends Object
OPParser is the parser class for the Top Down Operator Presidence Parsing framework. Top-down Operator Precedence Parsing is a technique invented by Vaughan Pratt in the 1970's. The fundamental idea is to associate semantics with tokens embodied in objects rather than using grammar rules embodied in procedures. The token's job is to transform pieces of syntax into Abstract Syntax Trees (AST). Each token has a "binding power" that determines operator precedence and two functions, "nud" (null denotation) and "led" (left denotation). The "nud" function is typically used to process tokens that have a role of "literal" or "variable". The "led" function is used for tokens that are typically in the role of an operator such as "+" or "*". The parsing process is driven by a very simple algorithm embodied in a method named "expression" which takes a right binding power as argument. The expression method starts by processing the first token (usually a literal or a variable) by invoking its "nud" method. The expression method then loops fetching the next token and building an AST by dispatching to the current token's "led" methods so long as the passed right binding power is less than the next token's left binding power. These "led" methods may eat as many tokens as is warranted by their semantics and they may make recursive calls back to the parsers expression method using their binding power as the right binding power argument. For example, to parse the expression "1 + 2", the parser's expression method would call "nud" on the token for the literal "1". The "nud" of a literal simply returns an atomic term. Then the parser checks if the right binding power is less than the left binding power of the next token and if true, it calls "led" on that next token. In this case the right binding power is 0 and the binding power of the "+" operator token is 50. So the parser calls "led" on the "+" operator token passing itself and the left AST node term ("1"). The "led" implementation of the "+" operator recursively calls the parser's expression method with a right binding power of 50. The expression calls "nud" on the next token, the literal token "2" which returns an atomic term for "2". The loop in expression is avoided because the next token is a special token used to represent the end of stream. It has a left binding power of -1 which is lower than any possible right binding power. Therefore, the parser returns the atomic term "2". The caller ("+" operator "led") returns an AST term for the "+" operator that has left and right terms. In this case the left term is the atomic "1" and the right is the atomic "2". Had we been parsing a slightly more complicated example such as "1 + 2 * 3" when the "led" for "+" recursively called expression with its binding power of 50 as the right binding power then the three tokens "2", "*", and "3" will be consumed because "*" has a left binding power of 60 which lets expression's loop calls the "led" of "*" ultimately returning an AST that represents "*(2,3) "to be used as the right side term for "+".- Author:
- djl 2009.03.14
- See Also:
OPToken
-
-
Constructor Summary
Constructors Constructor Description OPParser(OPScanner scanner)
Construct a new OPParser initialized to gthe given ScannerOPParser(Reader reader, TokenTable table, Set<CharSequence> setOperators)
Construct a new OPParser that parses the given Reader in the language defined by the given TokenTable.OPParser(String sExper, TokenTable table, Set<CharSequence> setOperators)
Construct a new OPParser that parses the given expression in the language defined by the given TokenTable.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description Term
expression(int nRightBindingPower)
Parse the next expression into an Abstract Syntax Tree using the given right binding power.OPScanner
getScanner()
Obtain the OPScanner that backs this parserTerm[]
nodeList()
Parse a comma separated sequence of expressions to the of the tokens.Term[]
nodeList(String sEndMarker)
Parse a comma separated sequence of expressions upto the given end marker.Term[]
nodeList(String sEndMarker, boolean fEndStreamAllowed)
Parse a comma separated sequence of expressions upto the given end marker.<T extends Term>
Tparse()
Parse the next expression into an Abstract Syntax Tree.Term[]
readNestedCommaSeparatedList(NestedBaseTokens nest)
Build an array of ASTNodes by processing the this tokens nest as a comma separated list.
-
-
-
Constructor Detail
-
OPParser
public OPParser(String sExper, TokenTable table, Set<CharSequence> setOperators)
Construct a new OPParser that parses the given expression in the language defined by the given TokenTable.- Parameters:
sExper
- string expression to parsetable
- the TokenTable that defines the language to parsesetOperators
- the operators this parser will use
-
OPParser
public OPParser(Reader reader, TokenTable table, Set<CharSequence> setOperators)
Construct a new OPParser that parses the given Reader in the language defined by the given TokenTable.- Parameters:
reader
- Reader with the chars to parsetable
- the TokenTable that defines the language to parsesetOperators
- the operators this parser will use
-
OPParser
public OPParser(OPScanner scanner)
Construct a new OPParser initialized to gthe given Scanner- Parameters:
scanner
- OPScanner that is already initialized
-
-
Method Detail
-
getScanner
public OPScanner getScanner()
Obtain the OPScanner that backs this parser- Returns:
- the OPScanner that acts as a token source
-
parse
public <T extends Term> T parse()
Parse the next expression into an Abstract Syntax Tree.- Type Parameters:
T
- the term type- Returns:
- an ASTNode that represents the structured meaning of the expression being parsed
-
expression
public Term expression(int nRightBindingPower)
Parse the next expression into an Abstract Syntax Tree using the given right binding power. Parsing will continue so long as the right binding power is less than the current tokens left binding power. This is the central algorithm of Vaughan Pratt's Top Down Operator Precedence Parser.- Parameters:
nRightBindingPower
- defines how strong token must bind to the left before halting parsing- Returns:
- an ASTNode that represents the structured meaning of the expression being parsed
-
nodeList
public Term[] nodeList(String sEndMarker)
Parse a comma separated sequence of expressions upto the given end marker. Return an array of ASTNodes- Parameters:
sEndMarker
- defines the symbol that ends the sequence- Returns:
- an ASTNode array that represents the sequence
-
nodeList
public Term[] nodeList(String sEndMarker, boolean fEndStreamAllowed)
Parse a comma separated sequence of expressions upto the given end marker. Given flag controll if end of stream overides the end marker. Return an array of ASTNodes- Parameters:
sEndMarker
- defines the symbol that ends the sequencefEndStreamAllowed
- flag to overide testing for sEndMarker if at the end of stream- Returns:
- an ASTNode array that represents the sequence
-
nodeList
public Term[] nodeList()
Parse a comma separated sequence of expressions to the of the tokens. Return an array of ASTNodes- Returns:
- an ASTNode array that represents the sequence
-
readNestedCommaSeparatedList
public Term[] readNestedCommaSeparatedList(NestedBaseTokens nest)
Build an array of ASTNodes by processing the this tokens nest as a comma separated list. of BaseTokens.- Parameters:
nest
- the nest of BaseTokens to process- Returns:
- an array of AstNodes
-
-