Class BaseTokenScanner
- java.lang.Object
-
- com.tangosol.coherence.dsltools.base.BaseTokenScanner
-
public class BaseTokenScanner extends Object
BaseTokenScanner gives clients a streaming API that returns the next BaseToken by processing either a java.lang.String or a java.io.Reader. Clients may also process the underlying Reader or String all at once by calling scan(), which returns a BaseToken that is typically a SequenceBaseToken. This all-at-once conversion from chars to tokens is standard for very low-level tokenizers. BaseTokenScanner also processes nested tokens (things between (..), {..}, etc.) into a sequence of tokens represented by a composite token. Nested tokenizing relieves a client parser of bracketing concerns. This nest processing comes from the Dylan tokenizer.
- Author:
- djl 2009.03.02
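The nested tokenizing described above can be illustrated with a small self-contained sketch. It is not the Coherence implementation: the class NestedTokenSketch, its methods, the whitespace-only word splitting, and the use of java.util.List as the "composite token" are all assumptions made for illustration. It only shows how recursing on an opening character turns bracketed input into a nested structure, so that a parser never has to match brackets itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class NestedTokenSketch {
    private static final String NESTS = "([{";   // characters that open a nest
    private static final String UNNESTS = ")]}"; // assumed: closer at the same index

    private final String src;
    private int pos = 0;

    private NestedTokenSketch(String src) {
        this.src = src;
    }

    /** Tokenize the whole input at once into a (possibly nested) list. */
    static List<Object> scan(String src) {
        return new NestedTokenSketch(src).scanUntil('\0');
    }

    /** Collect tokens until the closer, or until end of input when closer is '\0'. */
    private List<Object> scanUntil(char closer) {
        List<Object> tokens = new ArrayList<>();
        while (pos < src.length()) {
            char ch = src.charAt(pos);
            if (Character.isWhitespace(ch)) {
                pos++;
            } else if (closer != '\0' && ch == closer) {
                pos++;                       // consume the closer, finish this nest
                return tokens;
            } else if (NESTS.indexOf(ch) >= 0) {
                pos++;                       // consume the opener, then recurse
                tokens.add(scanUntil(UNNESTS.charAt(NESTS.indexOf(ch))));
            } else if (UNNESTS.indexOf(ch) >= 0) {
                throw new IllegalStateException("unexpected '" + ch + "' at " + pos);
            } else {
                int start = pos;             // plain word: run to the next delimiter
                while (pos < src.length()
                        && !Character.isWhitespace(src.charAt(pos))
                        && NESTS.indexOf(src.charAt(pos)) < 0
                        && UNNESTS.indexOf(src.charAt(pos)) < 0) {
                    pos++;
                }
                tokens.add(src.substring(start, pos));
            }
        }
        if (closer != '\0') {
            throw new IllegalStateException("end of input before '" + closer + "'");
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("f (a {b c}) d")); // prints [f, [a, [b, c]], d]
    }
}
```

In this sketch an unterminated nest such as "(a b" raises an IllegalStateException, which plays the role that BaseTokenScannerException plays for scanNest.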
-
-
Field Summary
Fields
protected char m_chCurrent
The current working char.
protected int m_coffset
Offset of the current character.
protected boolean m_fIsEnd
The flag set when end of file is detected.
protected int m_iPos
The current position in the underlying Reader.
protected int m_iStartPos
The saved start position for a token.
protected int m_lineNumber
The current line number.
protected Reader m_reader
The underlying Reader holding the characters being tokenized.
protected String m_sNests
The characters used to begin nesting.
protected String m_sPunctuation
The characters used for punctuation.
protected String m_sUnnests
The characters used to end nesting.
protected StringBuffer m_tokenBuffer
The temporary buffer used to build the String representing a token.
-
Constructor Summary
Constructors
BaseTokenScanner(Reader reader)
Construct a new BaseTokenScanner with the given Reader.
BaseTokenScanner(String s)
Construct a new BaseTokenScanner with the given String.
-
Method Summary
Methods
protected void advance()
Advance to the next character.
protected RuntimeException floatingPointFormatError()
A problem has been detected in a floating point number.
protected char getCurrentChar()
Answer the current char of the underlying Reader.
boolean isEnd()
Test whether the receiver has reached the end of the underlying Reader.
protected boolean isNest(char ch)
Test if the given char is a character that starts nesting.
protected boolean isPunctuation(char ch)
Test if the given char is a punctuation character.
protected LiteralBaseToken literalFloat()
A floating point literal has been detected, so create a LiteralBaseToken for it.
BaseToken next()
Answer the next token from the underlying reader.
protected char nextChar()
Advance to the next character and return it.
protected void notePos()
Note the offset within the underlying Reader.
void reset()
Reset the receiver so that tokenizing may begin again.
protected void resetTokenString()
Reset the buffer used for holding the current token.
BaseToken scan()
Tokenize the entire expression at once.
protected BaseToken scanIdentifier()
Attempt to tokenize an Identifier.
protected BaseToken scanLiteral()
Attempt to tokenize a literal.
protected BaseToken scanNest(char ch)
Tokenize the characters between the beginning nest character and the character that ends the nest.
protected boolean scanOperator()
Attempt to tokenize an Operator.
void setNesting(String sNests, String sUnnests)
Set the Strings used as nesting characters.
void setPunctuation(String s)
Set the String used to determine punctuation characters.
void skipWhiteSpace()
Skip over any whitespace characters.
protected void takeCurrentChar()
Add the current character to the buffer that builds the current token.
protected void takeCurrentCharAndAdvance()
Add the current character to the buffer that builds the current token, and advance.
protected String tokenString()
Convert the buffered characters into a String representing a token.
-
-
-
Field Detail
-
m_iPos
protected int m_iPos
The current position in the underlying Reader.
-
m_iStartPos
protected int m_iStartPos
The saved start position for a token.
-
m_sPunctuation
protected String m_sPunctuation
The characters used for punctuation.
-
m_sNests
protected String m_sNests
The characters used to begin nesting.
-
m_sUnnests
protected String m_sUnnests
The characters used to end nesting.
-
m_tokenBuffer
protected StringBuffer m_tokenBuffer
The temporary buffer used to build the String representing a token.
-
m_chCurrent
protected char m_chCurrent
The current working char.
-
m_reader
protected Reader m_reader
The underlying Reader holding the characters being tokenized.
-
m_fIsEnd
protected boolean m_fIsEnd
The flag set when end of file is detected.
-
m_coffset
protected int m_coffset
Offset of the current character.
-
m_lineNumber
protected int m_lineNumber
The current line number.
-
-
Constructor Detail
-
BaseTokenScanner
public BaseTokenScanner(String s)
Construct a new BaseTokenScanner with the given String.
- Parameters:
s
- the string to be tokenized
-
BaseTokenScanner
public BaseTokenScanner(Reader reader)
Construct a new BaseTokenScanner with the given Reader.
- Parameters:
reader
- the Reader that is the source of chars to be tokenized
-
-
Method Detail
-
scan
public BaseToken scan()
Tokenize the entire expression at once.
- Returns:
- a BaseToken, typically a SequenceBaseToken
-
reset
public void reset()
Reset the receiver so that tokenizing may begin again.
-
next
public BaseToken next()
Answer the next token from the underlying reader.
- Returns:
- the next token
-
skipWhiteSpace
public void skipWhiteSpace()
Skip over any whitespace characters.
-
isEnd
public boolean isEnd()
Test whether the receiver has reached the end of the underlying Reader.
- Returns:
- the boolean result
-
setPunctuation
public void setPunctuation(String s)
Set the String used to determine punctuation characters.
- Parameters:
s
- the String of characters to use as punctuation
-
setNesting
public void setNesting(String sNests, String sUnnests)
Set the Strings used as nesting characters.
- Parameters:
sNests
- the chars that start nesting
sUnnests
- the chars that end nesting
-
isPunctuation
protected boolean isPunctuation(char ch)
Test if the given char is a punctuation character.
- Parameters:
ch
- the char to test
- Returns:
- the boolean result of punctuation testing
-
isNest
protected boolean isNest(char ch)
Test if the given char is a character that starts nesting.
- Parameters:
ch
- the char to test
- Returns:
- the boolean result of nest testing
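One plausible way to back setPunctuation, setNesting, isPunctuation, and isNest is simple membership testing against the configured Strings. The sketch below is an assumption, not the library's code: the class CharClassSketch, the closerFor helper, the equal-length check, and the idea that an opener and its closer share the same index in the two Strings are all hypothetical.

```java
/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class CharClassSketch {
    private String punctuation = "";
    private String nests = "";
    private String unnests = "";

    void setPunctuation(String s) {
        punctuation = s;
    }

    void setNesting(String sNests, String sUnnests) {
        // Assumption: every opener needs exactly one closer.
        if (sNests.length() != sUnnests.length()) {
            throw new IllegalArgumentException("nest and unnest strings must pair up");
        }
        nests = sNests;
        unnests = sUnnests;
    }

    /** A char is punctuation if it occurs in the configured punctuation String. */
    boolean isPunctuation(char ch) {
        return punctuation.indexOf(ch) >= 0;
    }

    /** A char starts nesting if it occurs in the configured nest String. */
    boolean isNest(char ch) {
        return nests.indexOf(ch) >= 0;
    }

    /** Assumed pairing: the closer sits at the same index as its opener. */
    char closerFor(char opener) {
        int i = nests.indexOf(opener);
        if (i < 0) {
            throw new IllegalArgumentException("not a nest character: " + opener);
        }
        return unnests.charAt(i);
    }

    public static void main(String[] args) {
        CharClassSketch c = new CharClassSketch();
        c.setPunctuation(",;");
        c.setNesting("([{", ")]}");
        System.out.println(c.isPunctuation(',')); // true
        System.out.println(c.isNest('{'));        // true
        System.out.println(c.closerFor('['));     // ]
    }
}
```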
-
scanNest
protected BaseToken scanNest(char ch)
Tokenize the characters between the beginning nest character and the character that ends the nest.
- Parameters:
ch
- the character that begins nesting
- Returns:
- a NestedBaseTokens that holds the nested tokens
- Throws:
BaseTokenScannerException
- if we reach the end of stream before the matching end of nest character is reached
-
scanLiteral
protected BaseToken scanLiteral()
Attempt to tokenize a literal.
- Returns:
- a LiteralBaseToken if one is found; otherwise null
-
literalFloat
protected LiteralBaseToken literalFloat()
A floating point literal has been detected, so create a LiteralBaseToken for it.
- Returns:
- the literal representation of a floating point number
-
scanIdentifier
protected BaseToken scanIdentifier()
Attempt to tokenize an Identifier.
- Returns:
- an IdentifierBaseToken if one is found; otherwise null
-
floatingPointFormatError
protected RuntimeException floatingPointFormatError()
A problem has been detected in a floating point number. Signal an error.
- Returns:
- the RuntimeException for float format errors
-
scanOperator
protected boolean scanOperator()
Attempt to tokenize an Operator.
- Returns:
- an OperatorBaseToken if one is found; otherwise null
-
getCurrentChar
protected char getCurrentChar()
Answer the current char of the underlying Reader.
- Returns:
- the current char
-
advance
protected void advance()
Advance to the next character.
-
nextChar
protected char nextChar()
Advance to the next character and return it.
- Returns:
- the next character
-
notePos
protected void notePos()
Note the offset within the underlying Reader.
-
tokenString
protected String tokenString()
Convert the buffered characters into a String representing a token.
- Returns:
- the String representing the buffered token
-
resetTokenString
protected void resetTokenString()
Reset the buffer used for holding the current token.
-
takeCurrentChar
protected void takeCurrentChar()
Add the current character to the buffer that builds the current token.
-
takeCurrentCharAndAdvance
protected void takeCurrentCharAndAdvance()
Add the current character to the buffer that builds the current token, and advance.
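The low-level helpers (advance, notePos, takeCurrentChar, takeCurrentCharAndAdvance, tokenString, resetTokenString) suggest a common scanning pattern: keep one current character, copy it into a buffer while it belongs to the token, then turn the buffer into a String. The sketch below shows that pattern under stated assumptions. The class CharBufferSketch, the scanWord method, the startPos accessor, and the use of Java identifier rules to decide what counts as a word are hypothetical and are not taken from the library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class CharBufferSketch {
    private final Reader reader;
    private final StringBuilder tokenBuffer = new StringBuilder();
    private char current;     // the current working char
    private boolean atEnd;    // set when the Reader is exhausted
    private int pos = -1;     // position of 'current' in the input
    private int startPos;     // saved start position of the last token

    CharBufferSketch(Reader reader) {
        this.reader = reader;
        advance();            // prime 'current' with the first character
    }

    void advance() {
        if (atEnd) {
            return;
        }
        try {
            int c = reader.read();
            pos++;
            if (c < 0) {
                atEnd = true;
                current = '\0';
            } else {
                current = (char) c;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void notePos() { startPos = pos; }
    void takeCurrentChar() { tokenBuffer.append(current); }
    void takeCurrentCharAndAdvance() { takeCurrentChar(); advance(); }
    String tokenString() { return tokenBuffer.toString(); }
    void resetTokenString() { tokenBuffer.setLength(0); }
    boolean isEnd() { return atEnd; }
    int startPos() { return startPos; }

    void skipWhiteSpace() {
        while (!atEnd && Character.isWhitespace(current)) {
            advance();
        }
    }

    /** Scan one identifier-like word, or return null if none starts here. */
    String scanWord() {
        skipWhiteSpace();
        if (atEnd || !Character.isJavaIdentifierStart(current)) {
            return null;
        }
        resetTokenString();
        notePos();
        while (!atEnd && Character.isJavaIdentifierPart(current)) {
            takeCurrentCharAndAdvance();
        }
        return tokenString();
    }

    public static void main(String[] args) {
        CharBufferSketch s = new CharBufferSketch(new StringReader("  alpha beta2"));
        String word;
        while ((word = s.scanWord()) != null) {
            System.out.println(word + " @ " + s.startPos()); // alpha @ 2, then beta2 @ 8
        }
    }
}
```

Resetting the buffer at the start of each token, rather than at the end, keeps tokenString() valid until the next scan begins.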
-
-