Class BaseTokenScanner
java.lang.Object
com.tangosol.coherence.dsltools.base.BaseTokenScanner
BaseTokenScanner gives clients a streaming API that returns the next
BaseToken by processing either a java.lang.String or a java.io.Reader.
Clients may process the underlying Reader or String all at once by
using scan(), which returns a BaseToken that is typically a
SequenceBaseToken. This all-at-once conversion from chars to tokens is
standard for very low-level tokenizers. BaseTokenScanner also processes
nested tokens (things between (..), {..}, etc.) into a sequence of tokens
represented by a composite token. Nested tokenizing relieves a client
parser of bracketing concerns. This nest processing comes from the Dylan
tokenizer.
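The two access styles described above, pulling tokens one at a time with next() or converting everything that remains with scan(), can be illustrated with a minimal, self-contained sketch. It does not use the Coherence classes: StreamingSketch is a hypothetical stand-in that yields plain whitespace-delimited Strings instead of BaseToken instances, and its method names only mirror next(), isEnd(), reset() and scan() for readability.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class StreamingSketch {
    private final String src;
    private int pos;

    StreamingSketch(String src) {
        this.src = src;
        skipWhiteSpace();
    }

    /** True once no more tokens can be produced. */
    boolean isEnd() {
        return pos >= src.length();
    }

    /** Move the position past any run of whitespace. */
    void skipWhiteSpace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    /** Start tokenizing again from the beginning of the input. */
    void reset() {
        pos = 0;
        skipWhiteSpace();
    }

    /** Streaming style: return the next token, or null at end of input. */
    String next() {
        if (isEnd()) {
            return null;
        }
        int start = pos;
        while (pos < src.length() && !Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
        String token = src.substring(start, pos);
        skipWhiteSpace();
        return token;
    }

    /** All-at-once style: collect every remaining token into one sequence. */
    List<String> scan() {
        List<String> all = new ArrayList<>();
        while (!isEnd()) {
            all.add(next());
        }
        return all;
    }
}
```

In this sketch scan() is just a loop over next(), which is one plausible way to relate the two styles; whether BaseTokenScanner is implemented that way is not stated here.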
- Author:
- djl 2009.03.02
-
Field Summary
Modifier and Type       Field           Description
protected char          m_chCurrent     The current working char.
protected int           m_coffset       Offset of the current character.
protected boolean       m_fIsEnd        The flag set when end of file is detected.
protected int           m_iPos          The current position in the underlying Reader.
protected int           m_iStartPos     The saved start position for a token.
protected int           m_lineNumber    The current line number.
protected Reader        m_reader        The underlying Reader holding the characters being tokenized.
protected String        m_sNests        The characters used to begin nesting.
protected String        m_sPunctuation  The characters used for punctuation.
protected String        m_sUnnests      The characters used to end nesting.
protected StringBuffer  m_tokenBuffer   The temporary buffer used to build the String representing a token.
-
Constructor Summary
Constructor                      Description
BaseTokenScanner(Reader reader)  Construct a new BaseTokenScanner with the given Reader.
BaseTokenScanner(String s)       Construct a new BaseTokenScanner with the given String.
-
Method Summary
Modifier and Type           Method                                      Description
protected void              advance()                                   Advance to the next character.
protected RuntimeException  floatingPointFormatError()                  A problem has been detected in a floating point number.
protected char              getCurrentChar()                            Answer the current char of the underlying Reader.
boolean                     isEnd()                                     Test whether the receiver has reached the end of the underlying Reader.
protected boolean           isNest(char ch)                             Test if the given char is a character that starts nesting.
protected boolean           isPunctuation(char ch)                      Test if the given char is a punctuation character.
protected LiteralBaseToken  literalFloat()                              A floating point literal has been detected, so create a token for it.
BaseToken                   next()                                      Answer the next token from the underlying reader.
protected char              nextChar()                                  Advance to the next character and return it.
protected void              notePos()                                   Note the offset within the underlying Reader.
void                        reset()                                     Reset the receiver so that tokenizing may begin again.
protected void              resetTokenString()                          Reset the buffer used for holding the current token.
BaseToken                   scan()                                      Tokenize the entire expression at once.
protected BaseToken         scanIdentifier()                            Attempt to tokenize an Identifier.
protected BaseToken         scanLiteral()                               Attempt to tokenize a literal.
protected BaseToken         scanNest(char ch)                           Tokenize the characters between the beginning nest character and the character that ends the nest.
protected boolean           scanOperator()                              Attempt to tokenize an Operator.
void                        setNesting(String sNests, String sUnnests)  Set the Strings used as nesting characters.
void                        setPunctuation(String s)                    Set the String used to determine punctuation characters.
void                        skipWhiteSpace()                            Skip over any whitespace characters.
protected void              takeCurrentChar()                           Add the current character to the buffer that builds the current token.
protected void              takeCurrentCharAndAdvance()                 Add the current character to the buffer that builds the current token and advance.
protected String            tokenString()                               Convert the buffered characters into a String representing a token.
-
Field Details
-
m_iPos
protected int m_iPos
The current position in the underlying Reader.
-
m_iStartPos
protected int m_iStartPos
The saved start position for a token.
-
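m_sPunctuation
protected String m_sPunctuation
The characters used for punctuation.
-
m_sNests
protected String m_sNests
The characters used to begin nesting.
-
m_sUnnests
protected String m_sUnnests
The characters used to end nesting.
-
m_tokenBuffer
protected StringBuffer m_tokenBuffer
The temporary buffer used to build the String representing a token.
-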
m_chCurrent
protected char m_chCurrent
The current working char.
-
m_reader
protected Reader m_reader
The underlying Reader holding the characters being tokenized.
-
m_fIsEnd
protected boolean m_fIsEnd
The flag set when end of file is detected.
-
m_coffset
protected int m_coffset
Offset of the current character.
-
m_lineNumber
protected int m_lineNumber
The current line number.
-
-
Constructor Details
-
BaseTokenScanner
public BaseTokenScanner(String s)
Construct a new BaseTokenScanner with the given String.
- Parameters:
- s - the string to be tokenized
-
BaseTokenScanner
public BaseTokenScanner(Reader reader)
Construct a new BaseTokenScanner with the given Reader.
- Parameters:
- reader - the Reader that is the source of chars to be tokenized
-
-
Method Details
-
scan
public BaseToken scan()
Tokenize the entire expression at once.
- Returns:
- a BaseToken, typically a SequenceBaseToken
-
reset
public void reset()
Reset the receiver so that tokenizing may begin again.
-
next
public BaseToken next()
Answer the next token from the underlying reader.
- Returns:
- the next token
-
skipWhiteSpace
public void skipWhiteSpace()
Skip over any whitespace characters.
-
isEnd
public boolean isEnd()
Test whether the receiver has reached the end of the underlying Reader.
- Returns:
- the boolean result
-
setPunctuation
public void setPunctuation(String s)
Set the String used to determine punctuation characters.
- Parameters:
- s - the String of characters to use as punctuation
-
setNesting
public void setNesting(String sNests, String sUnnests)
Set the Strings used as nesting characters.
- Parameters:
- sNests - the chars that start nesting
- sUnnests - the chars that end nesting
-
isPunctuation
protected boolean isPunctuation(char ch)
Test if the given char is a punctuation character.
- Parameters:
- ch - the char to test
- Returns:
- the boolean result of punctuation testing
-
isNest
protected boolean isNest(char ch)
Test if the given char is a character that starts nesting.
- Parameters:
- ch - the char to test
- Returns:
- the boolean result of nest testing
-
scanNest
protected BaseToken scanNest(char ch)
Tokenize the characters between the beginning nest character and the character that ends the nest.
- Parameters:
- ch - the character that begins nesting
- Returns:
- a NestedBaseTokens that holds the nested tokens
- Throws:
- BaseTokenScannerException - if the end of the stream is reached before the matching end-of-nest character
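As an illustration of nested tokenizing, the following self-contained sketch collects everything between a nesting character and its matching end character into a sub-list. It is a stand-in built on assumptions, not the Coherence implementation: NestSketch, its fixed nesting characters "([{" and ")]}", the positional pairing of those two strings, and the use of IllegalStateException in place of BaseTokenScannerException are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class NestSketch {
    // Assumption: the i-th opening character pairs with the i-th closing one.
    private static final String NESTS = "([{";
    private static final String UNNESTS = ")]}";

    private final String src;
    private int pos = 0;

    private NestSketch(String src) {
        this.src = src;
    }

    /** Tokenize the whole input; nested groups become nested lists. */
    static List<Object> scan(String s) {
        return new NestSketch(s).scanSequence(-1);
    }

    /** Scan until the given closer is consumed (-1 means top level). */
    private List<Object> scanSequence(int closer) {
        List<Object> tokens = new ArrayList<>();
        while (pos < src.length()) {
            char ch = src.charAt(pos);
            if (Character.isWhitespace(ch)) {
                pos++;
                continue;
            }
            if (ch == closer) {
                pos++;               // consume the end-of-nest character
                return tokens;
            }
            int nest = NESTS.indexOf(ch);
            if (nest >= 0) {
                pos++;               // consume the opening character
                tokens.add(scanSequence(UNNESTS.charAt(nest)));
            } else if (Character.isLetterOrDigit(ch)) {
                int start = pos;
                while (pos < src.length()
                        && Character.isLetterOrDigit(src.charAt(pos))) {
                    pos++;
                }
                tokens.add(src.substring(start, pos));
            } else {
                tokens.add(String.valueOf(ch)); // single-char punctuation
                pos++;
            }
        }
        if (closer >= 0) {
            // End of input reached before the matching end-of-nest character.
            throw new IllegalStateException(
                    "unterminated nest, expected '" + (char) closer + "'");
        }
        return tokens;
    }
}
```

Because the recursion returns a finished sub-list for each nest, a caller walking the result never has to match brackets itself, which is the benefit the class description attributes to nested tokenizing.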
-
scanLiteral
protected BaseToken scanLiteral()
Attempt to tokenize a literal.
- Returns:
- a LiteralBaseToken if one is found otherwise return null
-
literalFloat
protected LiteralBaseToken literalFloat()
A floating point literal has been detected, so create a token for it.
- Returns:
- the literal representation of a floating point number
-
scanIdentifier
protected BaseToken scanIdentifier()
Attempt to tokenize an Identifier.
- Returns:
- an IdentifierBaseToken if one is found otherwise return null
-
floatingPointFormatError
protected RuntimeException floatingPointFormatError()
A problem has been detected in a floating point number. Signal an error.
- Returns:
- the RuntimeException for float format errors
-
scanOperator
protected boolean scanOperator()
Attempt to tokenize an Operator.
- Returns:
- an OperatorBaseToken if one is found otherwise return null
-
getCurrentChar
protected char getCurrentChar()
Answer the current char of the underlying Reader.
- Returns:
- the current char
-
advance
protected void advance()
Advance to the next character.
-
nextChar
protected char nextChar()
Advance to the next character and return it.
- Returns:
- the next character
-
notePos
protected void notePos()
Note the offset within the underlying Reader.
-
tokenString
protected String tokenString()
Convert the buffered characters into a String representing a token.
- Returns:
- the String representing the buffered token
-
resetTokenString
protected void resetTokenString()
Reset the buffer used for holding the current token.
-
takeCurrentChar
protected void takeCurrentChar()
Add the current character to the buffer that builds the current token.
-
takeCurrentCharAndAdvance
protected void takeCurrentCharAndAdvance()
Add the current character to the buffer that builds the current token and advance.
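The protected character-level helpers suggest a common scanning pattern: keep one current character, append it to a buffer while it belongs to the token, then convert the buffer to a String. The sketch below shows that pattern under stated assumptions. CharCursorSketch is hypothetical, it uses StringBuilder rather than StringBuffer, and its identifier rule (a letter followed by letters or digits) is an illustrative choice, not the rule used by BaseTokenScanner.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; not the Coherence BaseTokenScanner. */
final class CharCursorSketch {
    private final Reader reader;
    private final StringBuilder buffer = new StringBuilder();
    private char current;   // the current working char
    private boolean end;    // set once the Reader is exhausted

    CharCursorSketch(Reader reader) {
        this.reader = reader;
        advance();          // prime the first character
    }

    boolean isEnd() {
        return end;
    }

    char getCurrentChar() {
        return current;
    }

    /** Read one more character, recording end of input when it occurs. */
    void advance() {
        try {
            int c = reader.read();
            if (c < 0) {
                end = true;
                current = '\0';
            } else {
                current = (char) c;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Append the current character to the token buffer, then advance. */
    void takeCurrentCharAndAdvance() {
        buffer.append(current);
        advance();
    }

    void resetTokenString() {
        buffer.setLength(0);
    }

    String tokenString() {
        return buffer.toString();
    }

    /** Return an identifier-like token, or null if none starts here. */
    String scanIdentifier() {
        if (end || !Character.isLetter(current)) {
            return null;
        }
        resetTokenString();
        while (!end && Character.isLetterOrDigit(current)) {
            takeCurrentCharAndAdvance();
        }
        return tokenString();
    }
}
```

Returning null when no token starts at the current character mirrors the "otherwise return null" wording of the scan methods above, and lets a caller try one kind of token after another without consuming input.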
-