java.lang.Object
com.optimaize.langdetect.cybozu.util.NGram

public class NGram extends Object
Users don't use this class directly. TODO: document this class. TODO: this class treats a word as "upper case" if its first 2 characters are upper case. That looks like a simplification and needs documentation.
  • Field Details

    • N_GRAM

      public static final int N_GRAM
      N-grams are created from 1-grams up to this length; currently 2-grams and 3-grams.
    • grams_

      private StringBuilder grams_
    • capitalword_

      private boolean capitalword_
  • Constructor Details

    • NGram

      public NGram()
  • Method Details

    • addChar

      public void addChar(char ch)
    • get

      @Nullable public String get(int n)
      Gets the n-gram of the given length. TODO: this method has some undocumented behavior that ignores n-grams with upper case.
      Parameters:
      n - length of n-gram
      Returns:
      the n-gram string, or null if it is invalid
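
Because the behavior of addChar and get is only loosely documented here, the following is a minimal, independent sketch of how a sliding character window with an upper-case filter could work. It is not the library's implementation. The class name NGramSketch and its internals are hypothetical, and it rests on several assumptions: the maximum length is 3, a space marks a word boundary, characters are not normalized, and the "capital word" flag is set whenever two consecutive characters are upper case (one possible reading of the class note above).

```java
// Hypothetical stand-in for illustration only; not the library's NGram class.
class NGramSketch {
    // Assumed maximum n-gram length.
    static final int N_GRAM = 3;

    // Sliding window of the most recent characters; starts with a word boundary.
    private StringBuilder grams = new StringBuilder(" ");
    // Assumed meaning: true while the last two characters were both upper case.
    private boolean capitalWord = false;

    void addChar(char ch) {
        char last = grams.charAt(grams.length() - 1);
        if (last == ' ') {
            // A new word starts: reset the window and the flag.
            grams = new StringBuilder(" ");
            capitalWord = false;
            if (ch == ' ') {
                return; // collapse repeated spaces
            }
        } else if (grams.length() >= N_GRAM) {
            // Keep at most N_GRAM characters in the window.
            grams.deleteCharAt(0);
        }
        grams.append(ch);

        if (Character.isUpperCase(ch)) {
            if (Character.isUpperCase(last)) {
                capitalWord = true;
            }
        } else {
            capitalWord = false;
        }
    }

    // Returns the last n characters of the window, or null if no valid n-gram exists.
    String get(int n) {
        if (capitalWord) {
            return null; // n-grams inside an upper-case run are skipped
        }
        int len = grams.length();
        if (n < 1 || n > N_GRAM || len < n) {
            return null;
        }
        if (n == 1) {
            char ch = grams.charAt(len - 1);
            return ch == ' ' ? null : String.valueOf(ch);
        }
        return grams.substring(len - n, len);
    }
}
```

With this sketch, adding 'a' and then 'b' makes get(1) return "b" and get(2) return "ab", while get(3) returns the same two letters preceded by the word-boundary space. After adding 'A' and then 'B', every call to get returns null until a non-upper-case character arrives.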