Class BaseTokenScanner

java.lang.Object
com.tangosol.coherence.dsltools.base.BaseTokenScanner

public class BaseTokenScanner extends Object
BaseTokenScanner gives clients a streaming API that returns the next BaseToken by processing either a java.lang.String or a java.io.Reader. Clients may also process the underlying Reader or String all at once by calling scan(), which returns a BaseToken that is typically a SequenceBaseToken. This all-at-once conversion from chars to tokens is standard for very low-level tokenizers. BaseTokenScanner also processes nested tokens (things between (..), {..}, etc.) into a sequence of tokens represented by a composite token. Nested tokenizing relieves a client parser of bracketing concerns. This nest processing comes from the Dylan tokenizer.
Author:
djl 2009.03.02
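The nested tokenizing described above can be illustrated with a small, self-contained sketch. It is not the Coherence implementation: the class NestedTokenSketch, its use of plain Strings and Lists in place of BaseToken objects, the fixed nesting characters, and the IllegalStateException (standing in for BaseTokenScannerException) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names belong to the Coherence API.
final class NestedTokenSketch {
    // Assumed pairing: the opener at index i is closed by the closer at index i.
    static final String NESTS = "([{";
    static final String UNNESTS = ")]}";

    private final String src;
    private int pos;

    private NestedTokenSketch(String src) {
        this.src = src;
    }

    /** Tokenize the whole input at once; each nested group becomes a sub-list. */
    static List<Object> scan(String src) {
        return new NestedTokenSketch(src).scanUntil(-1);
    }

    /** Collect tokens until the given closer is seen (-1 means end of input). */
    private List<Object> scanUntil(int closer) {
        List<Object> out = new ArrayList<>();
        while (true) {
            while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
                pos++;
            }
            if (pos >= src.length()) {
                if (closer != -1) {
                    throw new IllegalStateException("missing '" + (char) closer + "'");
                }
                return out;
            }
            char ch = src.charAt(pos);
            int nest = NESTS.indexOf(ch);
            if (nest >= 0) {
                pos++; // consume the opener, then recurse for the nested sequence
                out.add(scanUntil(UNNESTS.charAt(nest)));
            } else if (ch == closer) {
                pos++; // consume the closer and finish this nesting level
                return out;
            } else if (UNNESTS.indexOf(ch) >= 0) {
                throw new IllegalStateException("unexpected '" + ch + "'");
            } else if (Character.isLetterOrDigit(ch)) {
                int start = pos;
                while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) {
                    pos++;
                }
                out.add(src.substring(start, pos));
            } else {
                out.add(String.valueOf(ch)); // any other char is a one-char token
                pos++;
            }
        }
    }

    public static void main(String[] args) {
        // The parenthesized and braced groups come back as nested lists.
        System.out.println(scan("f(a, {b c})"));
    }
}
```

Because bracket matching happens inside the scanner, a parser consuming the result only ever sees balanced groups, which is the point the class description makes about bracketing concerns.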
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected char
    m_chCurrent
    The current working char.
    protected int
    m_coffset
    Offset of the current character.
    protected boolean
    m_fIsEnd
    The flag set when end of file is detected.
    protected int
    m_iPos
    The current position in the underlying Reader.
    protected int
    m_iStartPos
    The saved start position for a token.
    protected int
    m_lineNumber
    The current line number.
    protected Reader
    m_reader
    The underlying Reader holding the characters being tokenized.
    protected String
    m_sNests
    The characters used to begin nesting.
    protected String
    m_sPunctuation
    The characters used for punctuation.
    protected String
    m_sUnnests
    The characters used to end nesting.
    protected StringBuffer
    m_tokenBuffer
    The temporary buffer used to build the String representing a token.
  • Constructor Summary

    Constructors
    Constructor
    Description
    BaseTokenScanner(Reader reader)
    Construct a new BaseTokenScanner with the given Reader.
    BaseTokenScanner(String s)
    Construct a new BaseTokenScanner with the given String.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected void
    advance()
    Advance to the next character.
    protected RuntimeException
    floatingPointFormatError()
    A problem has been detected in a floating point number.
    protected char
    getCurrentChar()
    Answer the current char of the underlying Reader.
    boolean
    isEnd()
    Test whether the receiver has reached the end of the underlying Reader.
    protected boolean
    isNest(char ch)
    Test if the given char is a character that starts nesting.
    protected boolean
    isPunctuation(char ch)
    Test if the given char is a punctuation character.
    protected LiteralBaseToken
    literalFloat()
    A floating point literal has been detected, so create it.
    BaseToken
    next()
    Answer the next token from the underlying reader.
    protected char
    nextChar()
    Advance to the next character and return it.
    protected void
    notePos()
    Note the offset within the underlying Reader.
    void
    reset()
    Reset the receiver so that tokenizing may begin again.
    protected void
    resetTokenString()
    Reset the buffer used for holding the current token.
    BaseToken
    scan()
    Tokenize the entire expression at once.
    protected BaseToken
    scanIdentifier()
    Attempt to tokenize an Identifier.
    protected BaseToken
    scanLiteral()
    Attempt to tokenize a literal.
    protected BaseToken
    scanNest(char ch)
    Tokenize the characters between the beginning nest character and the character that ends the nest.
    protected boolean
    scanOperator()
    Attempt to tokenize an Operator.
    void
    setNesting(String sNests, String sUnnests)
    Set the Strings used as nesting characters.
    void
    setPunctuation(String s)
    Set the String used to determine punctuation characters.
    void
    skipWhiteSpace()
    Skip over any whitespace characters.
    protected void
    takeCurrentChar()
    Add the current character to the buffer that builds the current token.
    protected void
    takeCurrentCharAndAdvance()
    Add the current character to the buffer that builds the current token and advance.
    protected String
    tokenString()
    Convert the buffered characters into a String representing a token.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • m_iPos

      protected int m_iPos
      The current position in the underlying Reader.
    • m_iStartPos

      protected int m_iStartPos
      The saved start position for a token.
    • m_sPunctuation

      protected String m_sPunctuation
      The characters used for punctuation.
    • m_sNests

      protected String m_sNests
      The characters used to begin nesting.
    • m_sUnnests

      protected String m_sUnnests
      The characters used to end nesting.
    • m_tokenBuffer

      protected StringBuffer m_tokenBuffer
      The temporary buffer used to build the String representing a token.
    • m_chCurrent

      protected char m_chCurrent
      The current working char.
    • m_reader

      protected Reader m_reader
      The underlying Reader holding the characters being tokenized.
    • m_fIsEnd

      protected boolean m_fIsEnd
      The flag set when end of file is detected.
    • m_coffset

      protected int m_coffset
      Offset of the current character.
    • m_lineNumber

      protected int m_lineNumber
      The current line number.
  • Constructor Details

    • BaseTokenScanner

      public BaseTokenScanner(String s)
      Construct a new BaseTokenScanner with the given String.
      Parameters:
      s - the string to be tokenized
    • BaseTokenScanner

      public BaseTokenScanner(Reader reader)
      Construct a new BaseTokenScanner with the given Reader.
      Parameters:
      reader - the Reader that is the source of chars to be tokenized
  • Method Details

    • scan

      public BaseToken scan()
      Tokenize the entire expression at once.
      Returns:
      a BaseToken, typically a SequenceBaseToken
    • reset

      public void reset()
      Reset the receiver so that tokenizing may begin again.
    • next

      public BaseToken next()
      Answer the next token from the underlying reader.
      Returns:
      the next token
    • skipWhiteSpace

      public void skipWhiteSpace()
      Skip over any whitespace characters.
    • isEnd

      public boolean isEnd()
      Test whether the receiver has reached the end of the underlying Reader.
      Returns:
      the boolean result
    • setPunctuation

      public void setPunctuation(String s)
      Set the String used to determine punctuation characters.
      Parameters:
      s - the String of characters to use as punctuation
    • setNesting

      public void setNesting(String sNests, String sUnnests)
      Set the Strings used as nesting characters.
      Parameters:
      sNests - the chars that start nesting
      sUnnests - the chars that end nesting
    • isPunctuation

      protected boolean isPunctuation(char ch)
      Test if the given char is a punctuation character.
      Parameters:
      ch - the char to test
      Returns:
      the boolean result of punctuation testing
    • isNest

      protected boolean isNest(char ch)
      Test if the given char is a character that starts nesting.
      Parameters:
      ch - the char to test
      Returns:
      the boolean result of nest testing
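A minimal sketch of how the punctuation and nesting configuration might be held and queried follows. The class ScannerConfigSketch is hypothetical, and two points are assumptions rather than documented behavior: that the opener at index i of sNests pairs with the closer at index i of sUnnests, and that the two Strings must therefore have equal length. The helper closerFor is not part of the documented API.

```java
// Hypothetical sketch: names and validation are illustrative, not the Coherence source.
final class ScannerConfigSketch {
    private String punctuation = "";
    private String nests = "";
    private String unnests = "";

    void setPunctuation(String s) {
        punctuation = s;
    }

    void setNesting(String sNests, String sUnnests) {
        // Assumption: opener i pairs with closer i, so the lengths must match.
        if (sNests.length() != sUnnests.length()) {
            throw new IllegalArgumentException("nest and unnest strings must have equal length");
        }
        nests = sNests;
        unnests = sUnnests;
    }

    boolean isPunctuation(char ch) {
        return punctuation.indexOf(ch) >= 0;
    }

    boolean isNest(char ch) {
        return nests.indexOf(ch) >= 0;
    }

    /** Return the closer paired with the given opener (helper invented for this sketch). */
    char closerFor(char opener) {
        int i = nests.indexOf(opener);
        if (i < 0) {
            throw new IllegalArgumentException("not a nesting character: " + opener);
        }
        return unnests.charAt(i);
    }

    public static void main(String[] args) {
        ScannerConfigSketch cfg = new ScannerConfigSketch();
        cfg.setPunctuation(",;");
        cfg.setNesting("([{", ")]}");
        System.out.println(cfg.isNest('[') + " " + cfg.closerFor('[') + " " + cfg.isPunctuation(';'));
    }
}
```

Keeping the character sets as plain Strings makes each test a single indexOf call, which is adequate for the handful of characters a tokenizer typically treats as punctuation or nesting.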
    • scanNest

      protected BaseToken scanNest(char ch)
      Tokenize the characters between the beginning nest character and the character that ends the nest.
      Parameters:
      ch - the character that begins nesting
      Returns:
      a NestedBaseTokens that holds the nested tokens
      Throws:
      BaseTokenScannerException - if we reach the end of stream before the matching end of nest character is reached
    • scanLiteral

      protected BaseToken scanLiteral()
      Attempt to tokenize a literal.
      Returns:
      a LiteralBaseToken if one is found; otherwise null
    • literalFloat

      protected LiteralBaseToken literalFloat()
      A floating point literal has been detected, so create it.
      Returns:
      the literal representation of a floating point number
    • scanIdentifier

      protected BaseToken scanIdentifier()
      Attempt to tokenize an Identifier.
      Returns:
      an IdentifierBaseToken if one is found; otherwise null
    • floatingPointFormatError

      protected RuntimeException floatingPointFormatError()
      A problem has been detected in a floating point number. Signal an error.
      Returns:
      the RuntimeException for float format errors
    • scanOperator

      protected boolean scanOperator()
      Attempt to tokenize an Operator.
      Returns:
      an OperatorBaseToken if one is found; otherwise null
    • getCurrentChar

      protected char getCurrentChar()
      Answer the current char of the underlying Reader.
      Returns:
      the current char
    • advance

      protected void advance()
      Advance to the next character.
    • nextChar

      protected char nextChar()
      Advance to the next character and return it.
      Returns:
      the next character
    • notePos

      protected void notePos()
      Note the offset within the underlying Reader.
    • tokenString

      protected String tokenString()
      Convert the buffered characters into a String representing a token.
      Returns:
      the String representing the buffered token
    • resetTokenString

      protected void resetTokenString()
      Reset the buffer used for holding the current token.
    • takeCurrentChar

      protected void takeCurrentChar()
      Add the current character to the buffer that builds the current token.
    • takeCurrentCharAndAdvance

      protected void takeCurrentCharAndAdvance()
      Add the current character to the buffer that builds the current token and advance.
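The character-level helpers above (advance, nextChar, notePos, takeCurrentChar, tokenString and so on) can be pictured as a cursor over a Reader plus a scratch buffer. The sketch below is a hedged illustration of that pattern only. The class CharCursorSketch, its fields, the scanWord method, the getStartPos accessor, and the choice to prime the cursor in the constructor are all assumptions; the real class may track positions and end of input differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of a char cursor with a token buffer; not the Coherence source.
final class CharCursorSketch {
    private final Reader reader;
    private final StringBuilder tokenBuffer = new StringBuilder();
    private char current;
    private boolean end;
    private int pos = -1;  // index of the current char within the input
    private int startPos;  // position saved by notePos()

    CharCursorSketch(Reader reader) {
        this.reader = reader;
        advance(); // assumption: load the first char so getCurrentChar() is valid
    }

    boolean isEnd() { return end; }
    char getCurrentChar() { return current; }
    int getStartPos() { return startPos; }

    /** Move to the next char, setting the end flag once the Reader is exhausted. */
    void advance() {
        if (end) {
            return;
        }
        try {
            int c = reader.read(); // Reader.read() returns -1 at end of stream
            if (c < 0) {
                end = true;
                current = '\0';
            } else {
                current = (char) c;
            }
            pos++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    char nextChar() { advance(); return current; }
    void notePos() { startPos = pos; }
    void takeCurrentChar() { tokenBuffer.append(current); }
    void takeCurrentCharAndAdvance() { takeCurrentChar(); advance(); }
    String tokenString() { return tokenBuffer.toString(); }
    void resetTokenString() { tokenBuffer.setLength(0); }

    /** Read one alphanumeric word using the helpers above, or return null if none starts here. */
    String scanWord() {
        while (!end && Character.isWhitespace(current)) {
            advance();
        }
        if (end || !Character.isLetter(current)) {
            return null;
        }
        notePos();
        resetTokenString();
        while (!end && Character.isLetterOrDigit(current)) {
            takeCurrentCharAndAdvance();
        }
        return tokenString();
    }

    public static void main(String[] args) {
        CharCursorSketch cursor = new CharCursorSketch(new StringReader("  foo bar"));
        for (String w = cursor.scanWord(); w != null; w = cursor.scanWord()) {
            System.out.println(w + " starts at " + cursor.getStartPos());
        }
    }
}
```

Separating "take" from "advance" lets a scanning routine inspect the current char, decide whether it belongs to the token, and only then consume it, which is why both a combined and a split form of the operation are useful.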