Class BaseTokenScanner


  • public class BaseTokenScanner
    extends Object
    BaseTokenScanner gives clients a streaming API that returns the next BaseToken by processing either a java.lang.String or a java.io.Reader. Clients may also process the underlying Reader or String all at once by using scan(), which returns a BaseToken that is typically a SequenceBaseToken. This all-at-once conversion from chars to tokens is standard for very low-level tokenizers. BaseTokenScanner also processes nested tokens (things between (..), {..}, etc.) into a sequence of tokens represented by a composite token. Nested tokenizing relieves a client parser of bracketing concerns. This nest processing comes from the Dylan tokenizer.
    Author:
    djl 2009.03.02
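
The two styles of use described above (streaming with next() and all at once with scan()) can be sketched with a small stand-in class. MiniScanner is hypothetical and is not the real BaseTokenScanner: it splits on whitespace only, returns plain Strings instead of BaseToken objects, and signals the end of input by returning null, all of which are assumptions made to keep the example self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the real BaseTokenScanner. */
final class MiniScanner {
    private final Reader reader;
    private int current;              // current char, or -1 at end of input

    MiniScanner(String s) {
        this(new StringReader(s));
    }

    MiniScanner(Reader reader) {
        this.reader = reader;
        advance();                    // prime the first character
    }

    /** Streaming style: answer the next whitespace-delimited token, or null at end. */
    String next() {
        skipWhiteSpace();
        if (isEnd()) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        while (!isEnd() && !Character.isWhitespace(current)) {
            sb.append((char) current);
            advance();
        }
        return sb.toString();
    }

    /** All-at-once style: drain the remaining input into a list of tokens. */
    List<String> scan() {
        List<String> tokens = new ArrayList<>();
        for (String t = next(); t != null; t = next()) {
            tokens.add(t);
        }
        return tokens;
    }

    boolean isEnd() {
        return current < 0;
    }

    private void skipWhiteSpace() {
        while (!isEnd() && Character.isWhitespace(current)) {
            advance();
        }
    }

    private void advance() {
        try {
            current = reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A client that wants one token at a time calls next() in a loop; a client that wants everything calls scan() once. The two can be mixed, because both read from the same underlying Reader.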
    • Field Detail

      • m_iPos

        protected int m_iPos
        The current position in the underlying Reader.
      • m_iStartPos

        protected int m_iStartPos
        The saved start position for a token.
      • m_sPunctuation

        protected String m_sPunctuation
        The characters used for punctuation.
      • m_sNests

        protected String m_sNests
        The characters used to begin nesting.
      • m_sUnnests

        protected String m_sUnnests
        The characters used to end nesting.
      • m_tokenBuffer

        protected StringBuffer m_tokenBuffer
        The temporary buffer used to build the String representing a token.
      • m_chCurrent

        protected char m_chCurrent
        The current working char.
      • m_reader

        protected Reader m_reader
        The underlying Reader holding the characters being tokenized.
      • m_fIsEnd

        protected boolean m_fIsEnd
        The flag set when end of file is detected.
      • m_coffset

        protected int m_coffset
        Offset of the current character.
      • m_lineNumber

        protected int m_lineNumber
        The current line number.
    • Constructor Detail

      • BaseTokenScanner

        public BaseTokenScanner​(String s)
        Construct a new BaseTokenScanner with the given String.
        Parameters:
        s - the string to be tokenized
      • BaseTokenScanner

        public BaseTokenScanner​(Reader reader)
        Construct a new BaseTokenScanner with the given Reader.
        Parameters:
        reader - the Reader that is the source of chars to be tokenized
    • Method Detail

      • scan

        public BaseToken scan()
        Tokenize the entire expression at once.
        Returns:
        a BaseToken, typically a SequenceBaseToken
      • reset

        public void reset()
        Reset the receiver so that tokenizing may begin again.
      • next

        public BaseToken next()
        Answer the next token from the underlying reader.
        Returns:
        the next token
      • skipWhiteSpace

        public void skipWhiteSpace()
        Skip over any whitespace characters.
      • isEnd

        public boolean isEnd()
        Test whether the receiver has reached the end of the underlying Reader.
        Returns:
        the boolean result
      • setPunctuation

        public void setPunctuation​(String s)
        Set the String used to determine punctuation characters.
        Parameters:
        s - the String of characters to use as punctuation
      • setNesting

        public void setNesting​(String sNests,
                               String sUnnests)
        Set the Strings used as nesting characters.
        Parameters:
        sNests - the chars that start nesting
        sUnnests - the chars that end nesting
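
One plausible way to read the two parameters is as parallel strings, where the character at index i of sNests is closed by the character at index i of sUnnests. That pairing is an assumption, not something this page states, and the NestPairs class below is a hypothetical helper written only to illustrate it.

```java
/** Hypothetical helper; assumes nests and unnests are matched by index. */
final class NestPairs {
    private final String nests;
    private final String unnests;

    NestPairs(String nests, String unnests) {
        if (nests.length() != unnests.length()) {
            throw new IllegalArgumentException("nests and unnests must have the same length");
        }
        this.nests = nests;
        this.unnests = unnests;
    }

    /** Test whether the given char begins a nest. */
    boolean isNest(char ch) {
        return nests.indexOf(ch) >= 0;
    }

    /** Answer the char that closes the nest opened by the given char. */
    char closerFor(char opener) {
        int i = nests.indexOf(opener);
        if (i < 0) {
            throw new IllegalArgumentException("not a nesting character: " + opener);
        }
        return unnests.charAt(i);
    }
}
```

Under this reading, a call such as setNesting("([{", ")]}") would pair '(' with ')', '[' with ']' and '{' with '}'.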
      • isPunctuation

        protected boolean isPunctuation​(char ch)
        Test if the given char is a punctuation character.
        Parameters:
        ch - the char to test
        Returns:
        the boolean result of punctuation testing
      • isNest

        protected boolean isNest​(char ch)
        Test if the given char is a character that starts nesting.
        Parameters:
        ch - the char to test
        Returns:
        the boolean result of nest testing
      • scanNest

        protected BaseToken scanNest​(char ch)
        Tokenize the characters between the beginning nest character and the character that ends the nest.
        Parameters:
        ch - the character that begins nesting
        Returns:
        a NestedBaseTokens that holds the nested tokens
        Throws:
        BaseTokenScannerException - if we reach the end of stream before the matching end of nest character is reached
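
The nesting behavior can be illustrated with a short recursive sketch. NestSketch is hypothetical: it represents a nested group as a java.util.List rather than a NestedBaseTokens, uses a fixed set of bracket pairs, and throws IllegalStateException where the real scanner is documented to throw BaseTokenScannerException.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of nested tokenizing; not the real scanNest. */
final class NestSketch {
    static final String NESTS = "([{";      // assumed openers
    static final String UNNESTS = ")]}";    // assumed closers, matched by index

    private final String src;
    private int pos;

    private NestSketch(String src) {
        this.src = src;
    }

    /** Tokenize the whole input; each nested group becomes a nested List. */
    static List<Object> scan(String src) {
        return new NestSketch(src).scanSequence(-1);
    }

    /** Collect tokens until the closer is consumed; -1 means top level. */
    private List<Object> scanSequence(int closer) {
        List<Object> tokens = new ArrayList<>();
        while (pos < src.length()) {
            char ch = src.charAt(pos);
            if (Character.isWhitespace(ch)) {
                pos++;
                continue;
            }
            if (ch == closer) {
                pos++;
                return tokens;              // matching end of nest reached
            }
            int i = NESTS.indexOf(ch);
            if (i >= 0) {
                pos++;                      // consume the opener, then recurse
                tokens.add(scanSequence(UNNESTS.charAt(i)));
            } else if (UNNESTS.indexOf(ch) >= 0) {
                throw new IllegalStateException("unexpected '" + ch + "' at offset " + pos);
            } else {
                int start = pos;
                while (pos < src.length() && !isDelimiter(src.charAt(pos))) {
                    pos++;
                }
                tokens.add(src.substring(start, pos));
            }
        }
        if (closer >= 0) {
            throw new IllegalStateException(
                "end of input before matching '" + (char) closer + "'");
        }
        return tokens;
    }

    private static boolean isDelimiter(char ch) {
        return Character.isWhitespace(ch)
            || NESTS.indexOf(ch) >= 0
            || UNNESTS.indexOf(ch) >= 0;
    }
}
```

Because each opener triggers a recursive call that returns only when its own closer is consumed, the caller receives a single composite element for the whole bracketed region and never has to match brackets itself.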
      • scanLiteral

        protected BaseToken scanLiteral()
        Attempt to tokenize a literal.
        Returns:
        a LiteralBaseToken if one is found otherwise return null
      • literalFloat

        protected LiteralBaseToken literalFloat()
        A floating point literal has been detected, so create it.
        Returns:
        the literal representation of a floating point number
      • scanIdentifier

        protected BaseToken scanIdentifier()
        Attempt to tokenize an Identifier.
        Returns:
        an IdentifierBaseToken if one is found otherwise return null
      • floatingPointFormatError

        protected RuntimeException floatingPointFormatError()
        A problem has been detected in a floating point number. Signal an error.
        Returns:
        the RuntimeException for float format errors
      • scanOperator

        protected BaseToken scanOperator()
        Attempt to tokenize an Operator.
        Returns:
        an OperatorBaseToken if one is found otherwise return null
      • getCurrentChar

        protected char getCurrentChar()
        Answer the current char of the underlying Reader.
        Returns:
        the current char
      • advance

        protected void advance()
        Advance to the next character.
      • nextChar

        protected char nextChar()
        Advance to the next character and return it.
        Returns:
        the next character
      • notePos

        protected void notePos()
        Note the offset within the underlying Reader.
      • tokenString

        protected String tokenString()
        Convert the buffered characters into a String representing a token.
        Returns:
        the String representing the buffered token
      • resetTokenString

        protected void resetTokenString()
        Reset the buffer used for holding the current token.
      • takeCurrentChar

        protected void takeCurrentChar()
        Add the current character to the buffer that builds the current token.
      • takeCurrentCharAndAdvance

        protected void takeCurrentCharAndAdvance()
        Add the current character to the buffer that builds the current token and advance.
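
The protected helpers above (notePos, takeCurrentChar, takeCurrentCharAndAdvance, tokenString, resetTokenString, advance) suggest a note, take, advance pattern for building one token. The sketch below shows how such helpers might cooperate to collect an identifier. BufferSketch is a hypothetical class that borrows the method names from this page; its bodies, its use of Java identifier rules, and its use of StringBuilder are assumptions, not the real implementation.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sketch of the token-buffer helpers; bodies are assumed. */
final class BufferSketch {
    private final Reader reader;
    private final StringBuilder tokenBuffer = new StringBuilder();
    private int pos = -1;             // offset of the current char
    private int startPos;             // saved start offset of the current token
    private char current;
    private boolean end;

    BufferSketch(String s) {
        reader = new StringReader(s);
        advance();                    // prime the first character
    }

    /** Advance to the next character, setting the end flag at end of input. */
    void advance() {
        try {
            int c = reader.read();
            if (c < 0) {
                end = true;
            } else {
                current = (char) c;
                pos++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void notePos()                   { startPos = pos; }
    void takeCurrentChar()           { tokenBuffer.append(current); }
    void takeCurrentCharAndAdvance() { takeCurrentChar(); advance(); }
    String tokenString()             { return tokenBuffer.toString(); }
    void resetTokenString()          { tokenBuffer.setLength(0); }
    int startPos()                   { return startPos; }

    /** Collect a run of identifier characters, or answer null if none starts here. */
    String scanIdentifier() {
        if (end || !Character.isJavaIdentifierStart(current)) {
            return null;
        }
        resetTokenString();
        notePos();
        while (!end && Character.isJavaIdentifierPart(current)) {
            takeCurrentCharAndAdvance();
        }
        return tokenString();
    }
}
```

Keeping the buffer and the saved start position as fields lets each scan method follow the same sequence: reset the buffer, note the position, take characters while they qualify, then convert the buffer to a String.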