com.tangosol.coherence.dsltools.precedence.OPParser

public class OPParser extends Object

OPParser is the parser class for the Top Down Operator Presidence Parsing framework. Top-down Operator Precedence Parsing is a technique invented by Vaughan Pratt in the 1970's. The fundamental idea is to associate semantics with tokens embodied in objects rather than using grammar rules embodied in procedures. The token's job is to transform pieces of syntax into Abstract Syntax Trees (AST). Each token has a "binding power" that determines operator precedence and two functions, "nud" (null denotation) and "led" (left denotation). The "nud" function is typically used to process tokens that have a role of "literal" or "variable". The "led" function is used for tokens that are typically in the role of an operator such as "+" or "*". The parsing process is driven by a very simple algorithm embodied in a method named "expression" which takes a right binding power as argument. The expression method starts by processing the first token (usually a literal or a variable) by invoking its "nud" method. The expression method then loops fetching the next token and building an AST by dispatching to the current token's "led" methods so long as the passed right binding power is less than the next token's left binding power. These "led" methods may eat as many tokens as is warranted by their semantics and they may make recursive calls back to the parsers expression method using their binding power as the right binding power argument. For example, to parse the expression "1 + 2", the parser's expression method would call "nud" on the token for the literal "1". The "nud" of a literal simply returns an atomic term. Then the parser checks if the right binding power is less than the left binding power of the next token and if true, it calls "led" on that next token. In this case the right binding power is 0 and the binding power of the "+" operator token is 50. So the parser calls "led" on the "+" operator token passing itself and the left AST node term ("1"). The "led" implementation of the "+" operator recursively calls the parser's expression method with a right binding power of 50. The expression calls "nud" on the next token, the literal token "2" which returns an atomic term for "2". The loop in expression is avoided because the next token is a special token used to represent the end of stream. It has a left binding power of -1 which is lower than any possible right binding power. Therefore, the parser returns the atomic term "2". The caller ("+" operator "led") returns an AST term for the "+" operator that has left and right terms. In this case the left term is the atomic "1" and the right is the atomic "2". Had we been parsing a slightly more complicated example such as "1 + 2 * 3" when the "led" for "+" recursively called expression with its binding power of 50 as the right binding power then the three tokens "2", "*", and "3" will be consumed because "*" has a left binding power of 60 which lets expression's loop calls the "led" of "*" ultimately returning an AST that represents "*(2,3) "to be used as the right side term for "+".

Author:

djl 2009.03.14

See Also:

Constructor Summary

Constructors

Constructor

Description

OPParser(OPScanner scanner)

Construct a new OPParser initialized to gthe given Scanner

OPParser(Reader reader, TokenTable table, Set<CharSequence> setOperators)

Construct a new OPParser that parses the given Reader in the language defined by the given TokenTable.

OPParser(String sExper, TokenTable table, Set<CharSequence> setOperators)

Construct a new OPParser that parses the given expression in the language defined by the given TokenTable.
Method Summary

Modifier and Type

Method

Description

Term

expression(int nRightBindingPower)

Parse the next expression into an Abstract Syntax Tree using the given right binding power.

OPScanner

getScanner()

Obtain the OPScanner that backs this parser

Term[]

nodeList()

Parse a comma separated sequence of expressions to the of the tokens.

Term[]

nodeList(String sEndMarker)

Parse a comma separated sequence of expressions upto the given end marker.

Term[]

nodeList(String sEndMarker, boolean fEndStreamAllowed)

Parse a comma separated sequence of expressions upto the given end marker.

<T extends Term> T

parse()

Parse the next expression into an Abstract Syntax Tree.

Term[]

readNestedCommaSeparatedList(NestedBaseTokens nest)

Build an array of ASTNodes by processing the this tokens nest as a comma separated list.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- OPParser
  
  public OPParser(String sExper, TokenTable table, Set<CharSequence> setOperators)
  
  Construct a new OPParser that parses the given expression in the language defined by the given TokenTable.
  
  Parameters:
  
  sExper - string expression to parse
  
  table - the TokenTable that defines the language to parse
  
  setOperators - the operators this parser will use
- OPParser
  
  public OPParser(Reader reader, TokenTable table, Set<CharSequence> setOperators)
  
  Construct a new OPParser that parses the given Reader in the language defined by the given TokenTable.
  
  Parameters:
  
  reader - Reader with the chars to parse
  
  table - the TokenTable that defines the language to parse
  
  setOperators - the operators this parser will use
- OPParser
  
  public OPParser(OPScanner scanner)
  
  Construct a new OPParser initialized to gthe given Scanner
  
  Parameters:
  
  scanner - OPScanner that is already initialized
Method Details
- getScanner
  
  public OPScanner getScanner()
  
  Obtain the OPScanner that backs this parser
  
  Returns:
  
  the OPScanner that acts as a token source
- parse
  
  public <T extends Term> T parse()
  
  Parse the next expression into an Abstract Syntax Tree.
  
  Type Parameters:
  
  T - the term type
  
  Returns:
  
  an ASTNode that represents the structured meaning of the expression being parsed
- expression
  
  public Term expression(int nRightBindingPower)
  
  Parse the next expression into an Abstract Syntax Tree using the given right binding power. Parsing will continue so long as the right binding power is less than the current tokens left binding power. This is the central algorithm of Vaughan Pratt's Top Down Operator Precedence Parser.
  
  Parameters:
  
  nRightBindingPower - defines how strong token must bind to the left before halting parsing
  
  Returns:
  
  an ASTNode that represents the structured meaning of the expression being parsed
- nodeList
  
  public Term[] nodeList(String sEndMarker)
  
  Parse a comma separated sequence of expressions upto the given end marker. Return an array of ASTNodes
  
  Parameters:
  
  sEndMarker - defines the symbol that ends the sequence
  
  Returns:
  
  an ASTNode array that represents the sequence
- nodeList
  
  public Term[] nodeList(String sEndMarker, boolean fEndStreamAllowed)
  
  Parse a comma separated sequence of expressions upto the given end marker. Given flag controll if end of stream overides the end marker. Return an array of ASTNodes
  
  Parameters:
  
  sEndMarker - defines the symbol that ends the sequence
  
  fEndStreamAllowed - flag to overide testing for sEndMarker if at the end of stream
  
  Returns:
  
  an ASTNode array that represents the sequence
- nodeList
  
  public Term[] nodeList()
  
  Parse a comma separated sequence of expressions to the of the tokens. Return an array of ASTNodes
  
  Returns:
  
  an ASTNode array that represents the sequence
- readNestedCommaSeparatedList
  
  public Term[] readNestedCommaSeparatedList(NestedBaseTokens nest)
  
  Build an array of ASTNodes by processing the this tokens nest as a comma separated list. of BaseTokens.
  
  Parameters:
  
  nest - the nest of BaseTokens to process
  
  Returns:
  
  an array of AstNodes

Class OPParser

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Constructor Details

OPParser

OPParser

OPParser

Method Details

getScanner

parse

expression

nodeList

nodeList

nodeList

readNestedCommaSeparatedList