Class PatternTokenizer

java.lang.Object
com.ibm.icu.impl.PatternTokenizer

public class PatternTokenizer extends Object
A simple parsing class for patterns and rules. Handles '...' quotations, \uxxxx and \Uxxxxxxxx escapes, and simple syntax. A '' (two single quotes) is treated as a single quote, whether inside or outside a quotation.
  • Any ignorable characters are ignored in parsing.
  • Any syntax characters are broken into separate tokens.
  • Quote characters can be specified: '...', "...", and \x.
  • Other characters are treated as literals.
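The rules above can be illustrated with a small standalone sketch. It is not the ICU implementation and does not use PatternTokenizer: the class name TokenizerSketch, the "S:"/"L:" token tags, and the use of plain strings in place of UnicodeSet are all hypothetical. It handles only '...' quotations (no "..." or backslash forms), and it assumes that an ignorable character ends the current literal run, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenizerSketch {
    // Returns tokens tagged "S:" (one syntax character) or "L:" (a literal run).
    static List<String> tokenize(String pattern, String syntaxChars, String ignorableChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\'') {
                if (i + 1 < pattern.length() && pattern.charAt(i + 1) == '\'') {
                    literal.append('\''); // two quotes: one literal quote, in or out of a quotation
                    i++;
                } else {
                    inQuote = !inQuote;   // a lone quote opens or closes a quotation
                }
            } else if (inQuote) {
                literal.append(c);        // everything inside a quotation is literal text
            } else if (ignorableChars.indexOf(c) >= 0) {
                flush(literal, tokens);   // assumption: an ignorable character ends the current literal
            } else if (syntaxChars.indexOf(c) >= 0) {
                flush(literal, tokens);
                tokens.add("S:" + c);     // each syntax character becomes its own token
            } else {
                literal.append(c);        // any other character is literal text
            }
        }
        if (inQuote) {
            throw new IllegalArgumentException("unterminated quotation in: " + pattern);
        }
        flush(literal, tokens);
        return tokens;
    }

    // Emits the pending literal run, if any, and resets the buffer.
    private static void flush(StringBuilder literal, List<String> tokens) {
        if (literal.length() > 0) {
            tokens.add("L:" + literal);
            literal.setLength(0);
        }
    }
}
```

With "=" as the only syntax character and a space as the only ignorable character, TokenizerSketch.tokenize("a = 'b c'", "=", " ") yields the tokens L:a, S:=, and L:b c: the quoted space survives as literal text while the unquoted spaces are dropped.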
  • Field Details

  • Constructor Details

    • PatternTokenizer

      public PatternTokenizer()
  • Method Details

    • getIgnorableCharacters

      public UnicodeSet getIgnorableCharacters()
    • setIgnorableCharacters

      public PatternTokenizer setIgnorableCharacters(UnicodeSet ignorableCharacters)
      Sets the characters to be ignored in parsing, e.g. new UnicodeSet("[:pattern_whitespace:]").
      Parameters:
      ignorableCharacters - Characters to be ignored.
      Returns:
      The PatternTokenizer, with the given characters set as its ignorable characters.
    • getSyntaxCharacters

      public UnicodeSet getSyntaxCharacters()
    • getExtraQuotingCharacters

      public UnicodeSet getExtraQuotingCharacters()
    • setSyntaxCharacters

      public PatternTokenizer setSyntaxCharacters(UnicodeSet syntaxCharacters)
      Sets the characters to be interpreted as syntax characters in parsing, e.g. new UnicodeSet("[:pattern_syntax:]").
      Parameters:
      syntaxCharacters - Characters to be set as syntax characters.
      Returns:
      The PatternTokenizer, with the given characters set as its syntax characters.
    • setExtraQuotingCharacters

      public PatternTokenizer setExtraQuotingCharacters(UnicodeSet syntaxCharacters)
      Sets the extra characters to be quoted in literals.
      Parameters:
      syntaxCharacters - Characters to be set as extra quoting characters.
      Returns:
      The PatternTokenizer, with the given characters set as its extra quoting characters.
    • getEscapeCharacters

      public UnicodeSet getEscapeCharacters()
    • setEscapeCharacters

      public PatternTokenizer setEscapeCharacters(UnicodeSet escapeCharacters)
      Sets the characters to be escaped in literals by quoteLiteral and normalize, e.g. new UnicodeSet("[^\\u0020-\\u007E]").
      Parameters:
      escapeCharacters - Characters to be set as escape characters.
      Returns:
      The PatternTokenizer, with the given characters set as its escape characters.
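To make the effect of an escape set such as [^\u0020-\u007E] concrete, here is a minimal standalone sketch of escaping every code point outside printable ASCII in the \uxxxx and \Uxxxxxxxx forms named in the class description. It is an illustration only, not the ICU code: the class EscapeSketch is hypothetical, the range is hard-coded instead of taken from a UnicodeSet, and the choice of uppercase hex digits is an assumption.

```java
import java.util.Locale;

final class EscapeSketch {
    // Escapes every code point outside the printable ASCII range 0x20..0x7E.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as one code point
            if (cp >= 0x20 && cp <= 0x7E) {
                out.appendCodePoint(cp);  // printable ASCII is kept as-is
            } else if (cp <= 0xFFFF) {
                // BMP code point: backslash, lowercase u, four hex digits
                out.append(String.format(Locale.ROOT, "\\u%04X", cp));
            } else {
                // supplementary code point: backslash, uppercase U, eight hex digits
                out.append(String.format(Locale.ROOT, "\\U%08X", cp));
            }
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by char is what lets a supplementary character come out as one eight-digit escape instead of two four-digit surrogate escapes.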
    • isUsingQuote

      public boolean isUsingQuote()
    • setUsingQuote

      public PatternTokenizer setUsingQuote(boolean usingQuote)
    • isUsingSlash

      public boolean isUsingSlash()
    • setUsingSlash

      public PatternTokenizer setUsingSlash(boolean usingSlash)
    • getLimit

      public int getLimit()
    • setLimit

      public PatternTokenizer setLimit(int limit)
    • getStart

      public int getStart()
    • setStart

      public PatternTokenizer setStart(int start)
    • setPattern

      public PatternTokenizer setPattern(CharSequence pattern)
    • setPattern

      public PatternTokenizer setPattern(String pattern)
    • quoteLiteral

      public String quoteLiteral(CharSequence string)
    • quoteLiteral

      public String quoteLiteral(String string)
      Quote a literal string, using the available settings. Thus syntax characters, quote characters, and ignorable characters will be put into quotes.
      Parameters:
      string - The literal string to quote.
      Returns:
      The quoted string, in which syntax, quote, and ignorable characters have been placed in quotes according to the current settings.
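The quoting described for quoteLiteral can be sketched in a few lines of standalone Java. This is a simplified illustration under stated assumptions, not the ICU implementation: QuoteLiteralSketch and its needsQuoting parameter (a plain string standing in for the combined syntax and ignorable sets) are hypothetical, only '...' quoting is modeled, and a literal single quote is written as '' per the class description instead of being wrapped in quotes.

```java
final class QuoteLiteralSketch {
    // Wraps runs of characters from needsQuoting in '...' and doubles literal quotes.
    static String quoteLiteral(String s, String needsQuoting) {
        StringBuilder out = new StringBuilder();
        boolean inQuote = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\'') {
                out.append("''");        // a doubled quote is valid inside or outside a quotation
            } else if (needsQuoting.indexOf(c) >= 0) {
                if (!inQuote) {
                    out.append('\'');    // open a quotation for this run
                    inQuote = true;
                }
                out.append(c);
            } else {
                if (inQuote) {
                    out.append('\'');    // close the quotation before ordinary text
                    inQuote = false;
                }
                out.append(c);
            }
        }
        if (inQuote) {
            out.append('\'');            // close a quotation left open at the end
        }
        return out.toString();
    }
}
```

For example, with "{}" as the characters needing quotes, QuoteLiteralSketch.quoteLiteral("a{b}", "{}") returns a'{'b'}' and quoteLiteral("it's", "{}") returns it''s. Adjacent special characters share one quotation, so "{}" becomes '{}'.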
    • appendEscaped

      private void appendEscaped(StringBuffer result, int cp)
    • normalize

      public String normalize()
    • next

      public int next(StringBuffer buffer)