Class StandardTokenizerImpl34

java.lang.Object
org.apache.lucene.analysis.standard.std34.StandardTokenizerImpl34
All Implemented Interfaces:
StandardTokenizerInterface

@Deprecated public final class StandardTokenizerImpl34 extends Object implements StandardTokenizerInterface
Deprecated.
This class is only for exact backwards compatibility.
This class implements StandardTokenizer using Unicode 6.0.0.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    HANGUL_TYPE
    Deprecated.
    static final int
    HIRAGANA_TYPE
    Deprecated.
    static final int
    IDEOGRAPHIC_TYPE
    Deprecated.
    static final int
    KATAKANA_TYPE
    Deprecated.
    static final int
    NUMERIC_TYPE
    Deprecated.
    Numbers
    static final int
    SOUTH_EAST_ASIAN_TYPE
    Deprecated.
    Chars in class \p{Line_Break = Complex_Context} are from South East Asian scripts (Thai, Lao, Myanmar, Khmer, etc.).
    static final int
    WORD_TYPE
    Deprecated.
    Alphanumeric sequences
    static final int
    YYEOF
    Deprecated.
    This character denotes the end of file.
    static final int
    YYINITIAL
    Deprecated.
    lexical states
  • Constructor Summary

    Constructors
    Constructor
    Description
    StandardTokenizerImpl34(Reader in)
    Deprecated.
    Creates a new scanner.
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getNextToken()
    Deprecated.
    Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
    final void
    getText(CharTermAttribute t)
    Deprecated.
    Fills CharTermAttribute with the current token text.
    final void
    yybegin(int newState)
    Deprecated.
    Enters a new lexical state.
    final int
    yychar()
    Deprecated.
    Returns the current position.
    final char
    yycharat(int pos)
    Deprecated.
    Returns the character at position pos from the matched text.
    final void
    yyclose()
    Deprecated.
    Closes the input stream.
    final int
    yylength()
    Deprecated.
    Returns the length of the matched text region.
    void
    yypushback(int number)
    Deprecated.
    Pushes the specified number of characters back into the input stream.
    final void
    yyreset(Reader reader)
    Deprecated.
    Resets the scanner to read from a new input stream.
    final int
    yystate()
    Deprecated.
    Returns the current lexical state.
    final String
    yytext()
    Deprecated.
    Returns the text matched by the current regular expression.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • YYEOF

      public static final int YYEOF
      Deprecated.
      This character denotes the end of file.
    • YYINITIAL

      public static final int YYINITIAL
      Deprecated.
      lexical states
    • WORD_TYPE

      public static final int WORD_TYPE
      Deprecated.
      Alphanumeric sequences
    • NUMERIC_TYPE

      public static final int NUMERIC_TYPE
      Deprecated.
      Numbers
    • SOUTH_EAST_ASIAN_TYPE

      public static final int SOUTH_EAST_ASIAN_TYPE
      Deprecated.
      Chars in class \p{Line_Break = Complex_Context} are from South East Asian scripts (Thai, Lao, Myanmar, Khmer, etc.). Sequences of these are kept together as a single token rather than broken up, because the logic required to break them at word boundaries is too complex for UAX#29.

      See Unicode Line Breaking Algorithm: http://www.unicode.org/reports/tr14/#SA
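
      The "kept together as a single token" behaviour can be illustrated with a minimal sketch. It is not the scanner's own logic: it assumes that testing script membership with the JDK's Character.UnicodeScript is a close enough stand-in for \p{Line_Break = Complex_Context}, which covers more scripts and excludes some characters of these four. The class and method names (SeaRunSketch, seaRuns) are hypothetical.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class SeaRunSketch {
    // Assumption: script membership approximates Line_Break=Complex_Context.
    private static final Set<UnicodeScript> SEA_SCRIPTS = EnumSet.of(
            UnicodeScript.THAI, UnicodeScript.LAO,
            UnicodeScript.MYANMAR, UnicodeScript.KHMER);

    static boolean isSeaScript(int codePoint) {
        return SEA_SCRIPTS.contains(UnicodeScript.of(codePoint));
    }

    /** Returns each maximal run of South East Asian script characters as one string. */
    static List<String> seaRuns(String text) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isSeaScript(cp)) {
                current.appendCodePoint(cp);      // extend the current run
            } else if (current.length() > 0) {
                runs.add(current.toString());     // a non-SEA character ends the run
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        // A Thai greeting and a Lao greeting, written with escapes to avoid
        // source-encoding issues. Each comes back as one unbroken run.
        String thai = "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35";
        String lao = "\u0EAA\u0EB0\u0E9A\u0EB2\u0E8D\u0E94\u0EB5";
        System.out.println(seaRuns("hello " + thai + " world " + lao).size()); // 2
    }
}
```

      No attempt is made to find word boundaries inside a run, which mirrors the reason given above for emitting the whole sequence as one token.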

    • IDEOGRAPHIC_TYPE

      public static final int IDEOGRAPHIC_TYPE
      Deprecated.
    • HIRAGANA_TYPE

      public static final int HIRAGANA_TYPE
      Deprecated.
    • KATAKANA_TYPE

      public static final int KATAKANA_TYPE
      Deprecated.
    • HANGUL_TYPE

      public static final int HANGUL_TYPE
      Deprecated.
  • Constructor Details

    • StandardTokenizerImpl34

      public StandardTokenizerImpl34(Reader in)
      Deprecated.
      Creates a new scanner.
      Parameters:
      in - the java.io.Reader to read input from.
  • Method Details

    • yychar

      public final int yychar()
      Deprecated.
      Description copied from interface: StandardTokenizerInterface
      Returns the current position.
      Specified by:
      yychar in interface StandardTokenizerInterface
    • getText

      public final void getText(CharTermAttribute t)
      Deprecated.
      Fills CharTermAttribute with the current token text.
      Specified by:
      getText in interface StandardTokenizerInterface
    • yyclose

      public final void yyclose() throws IOException
      Deprecated.
      Closes the input stream.
      Throws:
      IOException
    • yyreset

      public final void yyreset(Reader reader)
      Deprecated.
      Resets the scanner to read from a new input stream. Does not close the old reader. All internal variables are reset, and the old input stream cannot be reused because the internal buffer is discarded and lost. The lexical state is set to YYINITIAL. The internal scan buffer is resized down to its initial length if it has grown.
      Specified by:
      yyreset in interface StandardTokenizerInterface
      Parameters:
      reader - the new input stream
    • yystate

      public final int yystate()
      Deprecated.
      Returns the current lexical state.
    • yybegin

      public final void yybegin(int newState)
      Deprecated.
      Enters a new lexical state.
      Parameters:
      newState - the new lexical state
    • yytext

      public final String yytext()
      Deprecated.
      Returns the text matched by the current regular expression.
    • yycharat

      public final char yycharat(int pos)
      Deprecated.
      Returns the character at position pos from the matched text. It is equivalent to yytext().charAt(pos), but faster.
      Parameters:
      pos - the position of the character to fetch. A value from 0 to yylength()-1.
      Returns:
      the character at position pos
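
      Why direct access can be faster than yytext().charAt(pos) is easiest to see in a small model of a scan buffer. The sketch below is a hypothetical stand-in (MatchWindowSketch and its fields are not part of this class), assuming the matched text is a window into a char array: yytext() has to allocate a new String, while yycharat(pos) only indexes into the buffer.

```java
final class MatchWindowSketch {
    private final char[] buffer; // stand-in for the scanner's internal scan buffer
    private final int start;     // index of the first matched character
    private final int end;       // index one past the last matched character

    MatchWindowSketch(char[] buffer, int start, int end) {
        this.buffer = buffer;
        this.start = start;
        this.end = end;
    }

    int yylength() {
        return end - start;
    }

    String yytext() {
        // Copies the matched window into a freshly allocated String.
        return new String(buffer, start, end - start);
    }

    char yycharat(int pos) {
        // No allocation: reads straight from the buffer.
        // pos is expected to be in the range 0 .. yylength()-1.
        return buffer[start + pos];
    }

    public static void main(String[] args) {
        MatchWindowSketch m = new MatchWindowSketch("xxhelloyy".toCharArray(), 2, 7);
        System.out.println(m.yytext());              // hello
        System.out.println(m.yycharat(1));           // e
        System.out.println(m.yytext().charAt(1));    // e
    }
}
```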
    • yylength

      public final int yylength()
      Deprecated.
      Returns the length of the matched text region.
      Specified by:
      yylength in interface StandardTokenizerInterface
    • yypushback

      public void yypushback(int number)
      Deprecated.
      Pushes the specified number of characters back into the input stream. They will be read again by the next call of the scanning method.
      Parameters:
      number - the number of characters to be read again. This number must not be greater than yylength()!
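
      The effect of a pushback on the matched text and on the next scan can be sketched with a toy scanner. This is an illustration under stated assumptions, not this class's code: PushbackSketch, its next() method, and the choice to throw IllegalArgumentException when number exceeds yylength() are all hypothetical.

```java
final class PushbackSketch {
    private final String input;
    private int startRead = 0; // start of the current match
    private int markedPos = 0; // one past the end of the current match

    PushbackSketch(String input) {
        this.input = input;
    }

    /** Toy scanning method: matches a maximal run of letters, or one other character. */
    boolean next() {
        startRead = markedPos;
        if (startRead >= input.length()) {
            return false; // end of input
        }
        int i = startRead;
        while (i < input.length() && Character.isLetter(input.charAt(i))) {
            i++;
        }
        if (i == startRead) {
            i++; // a single non-letter character forms its own match
        }
        markedPos = i;
        return true;
    }

    int yylength() {
        return markedPos - startRead;
    }

    String yytext() {
        return input.substring(startRead, markedPos);
    }

    void yypushback(int number) {
        if (number < 0 || number > yylength()) {
            throw new IllegalArgumentException("cannot push back " + number + " characters");
        }
        markedPos -= number; // the tail of the match becomes unread input again
    }

    public static void main(String[] args) {
        PushbackSketch s = new PushbackSketch("abcdef");
        s.next();
        System.out.println(s.yytext()); // abcdef
        s.yypushback(2);
        System.out.println(s.yytext()); // abcd
        s.next();
        System.out.println(s.yytext()); // ef
    }
}
```

      After the pushback the matched text shrinks by the same amount, which is why the argument cannot exceed yylength().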
    • getNextToken

      public int getNextToken() throws IOException
      Deprecated.
      Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
      Specified by:
      getNextToken in interface StandardTokenizerInterface
      Returns:
      the next token
      Throws:
      IOException - if any I/O error occurs
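
      A caller typically drives getNextToken() in a loop until it returns YYEOF, reading the position, length, and text of each match in between. The sketch below shows that loop against a hand-written stand-in so that it runs without Lucene on the classpath. ToyScanner is hypothetical, splits only on non-alphanumeric characters rather than following UAX#29, and its constant values (including YYEOF = -1) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ToyScanner {
    static final int YYEOF = -1;       // assumed end-of-input marker
    static final int WORD_TYPE = 0;    // hypothetical value
    static final int NUMERIC_TYPE = 1; // hypothetical value
    private static final int NONE = -2;

    private final Reader in;
    private final StringBuilder matched = new StringBuilder();
    private int pending = NONE; // one character of lookahead
    private int consumed = 0;   // characters consumed so far
    private int tokenStart = 0; // offset of the current match

    ToyScanner(Reader in) {
        this.in = in;
    }

    private int read() throws IOException {
        if (pending != NONE) {
            int c = pending;
            pending = NONE;
            return c;
        }
        return in.read();
    }

    int getNextToken() throws IOException {
        int c = read();
        while (c != -1 && !Character.isLetterOrDigit(c)) { // skip separators
            consumed++;
            c = read();
        }
        if (c == -1) {
            return YYEOF;
        }
        matched.setLength(0);
        tokenStart = consumed;
        boolean allDigits = true;
        while (c != -1 && Character.isLetterOrDigit(c)) {
            matched.append((char) c);
            allDigits &= Character.isDigit(c);
            consumed++;
            c = read();
        }
        pending = c; // keep the separator (or end marker) for the next call
        return allDigits ? NUMERIC_TYPE : WORD_TYPE;
    }

    int yychar() { return tokenStart; }
    int yylength() { return matched.length(); }
    String yytext() { return matched.toString(); }

    /** Runs the scan loop and formats each token as TYPE@offset+length:text. */
    static List<String> describe(String text) {
        ToyScanner scanner = new ToyScanner(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = scanner.getNextToken()) != YYEOF) {
                String name = (type == NUMERIC_TYPE) ? "NUM" : "WORD";
                out.add(name + "@" + scanner.yychar() + "+" + scanner.yylength()
                        + ":" + scanner.yytext());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // [WORD@0+6:Lucene, NUM@7+2:34, WORD@10+5:rocks]
        System.out.println(describe("Lucene 34 rocks"));
    }
}
```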