Class Edits

java.lang.Object
com.ibm.icu.text.Edits

public final class Edits extends Object
Records lengths of string edits but not replacement text. Supports replacements, insertions, deletions in linear progression. Does not support moving/reordering of text.

There are two types of edits: change edits and no-change edits. Add edits to instances of this class using addReplace(int, int) (for change edits) and addUnchanged(int) (for no-change edits). Change edits are retained with full granularity, whereas adjacent no-change edits are always merged together. In no-change edits, there is a one-to-one mapping between code points in the source and destination strings.

After all edits have been added, instances of this class should be considered immutable, and an Edits.Iterator can be used for queries.

There are four flavors of Edits.Iterator:

  • getFineIterator() retains full granularity of change edits.
  • getFineChangesIterator() retains full granularity of change edits, and when calling next() on the iterator, skips over no-change edits (unchanged regions).
  • getCoarseIterator() treats adjacent change edits as a single edit. (Adjacent no-change edits are automatically merged during the construction phase.)
  • getCoarseChangesIterator() treats adjacent change edits as a single edit, and when calling next() on the iterator, skips over no-change edits (unchanged regions).

For example, consider the string "abcßDeF", which case-folds to "abcssdef". This string has the following fine edits:

  • abc ⇨ abc (no-change)
  • ß ⇨ ss (change)
  • D ⇨ d (change)
  • e ⇨ e (no-change)
  • F ⇨ f (change)

and the following coarse edits (note how adjacent change edits get merged together):

  • abc ⇨ abc (no-change)
  • ßD ⇨ ssd (change)
  • e ⇨ e (no-change)
  • F ⇨ f (change)
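
The step from the fine list to the coarse list can be sketched in plain Java. This is an illustrative model only, not ICU code: the Span class and the coarsen and fineEditsForExample helpers are hypothetical names introduced for this sketch, and lengths are counted in Java char units for the example string.

```java
import java.util.ArrayList;
import java.util.List;

public class CoarseEditsSketch {
    /** Hypothetical stand-in for one edit: lengths only, no replacement text. */
    static final class Span {
        final boolean changed;
        final int oldLength;
        final int newLength;

        Span(boolean changed, int oldLength, int newLength) {
            this.changed = changed;
            this.oldLength = oldLength;
            this.newLength = newLength;
        }

        @Override
        public String toString() {
            return (changed ? "change " : "no-change ") + oldLength + " -> " + newLength;
        }
    }

    /** Fine edits for the case-folding example, in the order listed above. */
    static List<Span> fineEditsForExample() {
        List<Span> fine = new ArrayList<>();
        fine.add(new Span(false, 3, 3)); // abc -> abc
        fine.add(new Span(true, 1, 2));  // sharp s (U+00DF) -> ss
        fine.add(new Span(true, 1, 1));  // D -> d
        fine.add(new Span(false, 1, 1)); // e -> e
        fine.add(new Span(true, 1, 1));  // F -> f
        return fine;
    }

    /** Merges each run of adjacent change edits into one; no-change edits pass through. */
    static List<Span> coarsen(List<Span> fine) {
        List<Span> coarse = new ArrayList<>();
        for (Span s : fine) {
            int last = coarse.size() - 1;
            if (s.changed && last >= 0 && coarse.get(last).changed) {
                Span prev = coarse.get(last);
                coarse.set(last, new Span(true,
                        prev.oldLength + s.oldLength,
                        prev.newLength + s.newLength));
            } else {
                coarse.add(s);
            }
        }
        return coarse;
    }

    public static void main(String[] args) {
        for (Span s : coarsen(fineEditsForExample())) {
            System.out.println(s);
        }
    }
}
```

In this model the two change edits of lengths 1 -> 2 and 1 -> 1 collapse into a single change of lengths 2 -> 3, which matches the second coarse entry in the list above.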

The "fine changes" and "coarse changes" iterators step through only the change edits when their Edits.Iterator.next() methods are called. When Edits.Iterator.findSourceIndex(int) or Edits.Iterator.findDestinationIndex(int) is used to walk through the string instead, they behave identically to the corresponding iterators that also report no-change edits.
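
Because only lengths are recorded, a changes-only walk has to keep running indexes into both strings in order to recover the replaced and replacement text. The following is a minimal sketch of that idea in plain Java, not ICU code: the class name, the changes helper, and the int[] row layout {oldLength, newLength, changed flag} are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ChangeWalkSketch {
    /**
     * Walks edits given as rows of {oldLength, newLength, changed ? 1 : 0}
     * and returns "old->new" for every change edit. No-change rows only
     * advance the running indexes, mirroring a changes-only iteration.
     */
    static List<String> changes(String source, String dest, int[][] edits) {
        List<String> out = new ArrayList<>();
        int srcIndex = 0;
        int destIndex = 0;
        for (int[] e : edits) {
            int oldLen = e[0];
            int newLen = e[1];
            if (e[2] != 0) {
                out.add(source.substring(srcIndex, srcIndex + oldLen)
                        + "->"
                        + dest.substring(destIndex, destIndex + newLen));
            }
            srcIndex += oldLen;
            destIndex += newLen;
        }
        return out;
    }

    public static void main(String[] args) {
        // Coarse edits for the case-folding example; \u00DF is the sharp s.
        int[][] coarse = { {3, 3, 0}, {2, 3, 1}, {1, 1, 0}, {1, 1, 1} };
        System.out.println(changes("abc\u00DFDeF", "abcssdef", coarse));
    }
}
```

With the coarse rows the walk reports two changes; passing the fine rows instead would report three, one per change edit.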

For examples of how to use this class, see the test TestCaseMapEditsIteratorDocs in UCharacterCaseTest.java.

  • Field Details

    • MAX_UNCHANGED_LENGTH

      private static final int MAX_UNCHANGED_LENGTH
    • MAX_UNCHANGED

      private static final int MAX_UNCHANGED
    • MAX_SHORT_CHANGE_OLD_LENGTH

      private static final int MAX_SHORT_CHANGE_OLD_LENGTH
    • MAX_SHORT_CHANGE_NEW_LENGTH

      private static final int MAX_SHORT_CHANGE_NEW_LENGTH
    • SHORT_CHANGE_NUM_MASK

      private static final int SHORT_CHANGE_NUM_MASK
    • MAX_SHORT_CHANGE

      private static final int MAX_SHORT_CHANGE
    • LENGTH_IN_1TRAIL

      private static final int LENGTH_IN_1TRAIL
    • LENGTH_IN_2TRAIL

      private static final int LENGTH_IN_2TRAIL
    • STACK_CAPACITY

      private static final int STACK_CAPACITY
    • array

      private char[] array
    • length

      private int length
    • delta

      private int delta
    • numChanges

      private int numChanges
  • Constructor Details

    • Edits

      public Edits()
      Constructs an empty object.
  • Method Details

    • reset

      public void reset()
      Resets the data but may not release memory.
    • setLastUnit

      private void setLastUnit(int last)
    • lastUnit

      private int lastUnit()
    • addUnchanged

      public void addUnchanged(int unchangedLength)
      Adds a no-change edit: a record for an unchanged segment of text. Normally called from inside ICU string transformation functions, not user code.
    • addReplace

      public void addReplace(int oldLength, int newLength)
      Adds a change edit: a record for a text replacement/insertion/deletion. Normally called from inside ICU string transformation functions, not user code.
    • append

      private void append(int r)
    • growArray

      private boolean growArray()
    • lengthDelta

      public int lengthDelta()
      Returns how much longer the new text is than the old text. The value is negative if the new text is shorter.
      Returns:
      new length minus old length
    • hasChanges

      public boolean hasChanges()
      Returns:
      true if there are any change edits
    • numberOfChanges

      public int numberOfChanges()
      Returns:
      the number of change edits
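
The three queries above (lengthDelta, hasChanges, numberOfChanges) can be pictured as simple aggregates over the recorded edits. The following plain-Java sketch is a model under stated assumptions, not ICU's implementation: the class name and the int[] row layout {oldLength, newLength, changed flag} are hypothetical.

```java
public class EditStatsSketch {
    /** New length minus old length, summed over rows of {oldLength, newLength, changed}. */
    static int lengthDelta(int[][] edits) {
        int delta = 0;
        for (int[] e : edits) {
            delta += e[1] - e[0];
        }
        return delta;
    }

    /** Counts the rows flagged as change edits. */
    static int numberOfChanges(int[][] edits) {
        int count = 0;
        for (int[] e : edits) {
            if (e[2] != 0) {
                count++;
            }
        }
        return count;
    }

    /** True if at least one row is flagged as a change edit. */
    static boolean hasChanges(int[][] edits) {
        return numberOfChanges(edits) > 0;
    }

    public static void main(String[] args) {
        // Fine edits for the 7-unit source and 8-unit result of the class-level example.
        int[][] fine = { {3, 3, 0}, {1, 2, 1}, {1, 1, 1}, {1, 1, 0}, {1, 1, 1} };
        System.out.println(lengthDelta(fine));     // 1
        System.out.println(hasChanges(fine));      // true
        System.out.println(numberOfChanges(fine)); // 3
    }
}
```

Note that a change edit may leave the length untouched (for example a 1 -> 1 replacement), which is why the sketch carries an explicit changed flag rather than inferring changes from the lengths.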
    • getCoarseChangesIterator

      public Edits.Iterator getCoarseChangesIterator()
      Returns an Iterator for coarse-grained change edits (adjacent change edits are treated as one). Can be used to perform simple string updates. Skips no-change edits.
      Returns:
      an Iterator that merges adjacent changes.
    • getCoarseIterator

      public Edits.Iterator getCoarseIterator()
      Returns an Iterator for coarse-grained change and no-change edits (adjacent change edits are treated as one). Can be used to perform simple string updates.
      Returns:
      an Iterator that merges adjacent changes.
    • getFineChangesIterator

      public Edits.Iterator getFineChangesIterator()
      Returns an Iterator for fine-grained change edits (full granularity of change edits is retained). Can be used for modifying styled text. Skips no-change edits.
      Returns:
      an Iterator that separates adjacent changes.
    • getFineIterator

      public Edits.Iterator getFineIterator()
      Returns an Iterator for fine-grained change and no-change edits (full granularity of change edits is retained). Can be used for modifying styled text.
      Returns:
      an Iterator that separates adjacent changes.
    • mergeAndAppend

      public Edits mergeAndAppend(Edits ab, Edits bc)
      Merges the two input Edits and appends the result to this object.

      Consider two string transformations (for example, normalization and case mapping) where each records Edits in addition to writing an output string.
      Edits ab reflect how substrings of input string a map to substrings of intermediate string b.
      Edits bc reflect how substrings of intermediate string b map to substrings of output string c.
      This function merges ab and bc such that the additional edits recorded in this object reflect how substrings of input string a map to substrings of output string c.

      If unrelated Edits are passed in where the output string of the first has a different length than the input string of the second, then an IllegalArgumentException is thrown.

      Parameters:
      ab - reflects how substrings of input string a map to substrings of intermediate string b.
      bc - reflects how substrings of intermediate string b map to substrings of output string c.
      Returns:
      this, with the merged edits appended
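
The length condition for mergeAndAppend can be illustrated with a small plain-Java model. This sketch only checks the stated condition (the output length of ab must equal the input length of bc); it does not perform the merge, and it makes no claim about how ICU detects the mismatch internally. The class name, the helper names, and the int[] row layout {oldLength, newLength} are assumptions for illustration.

```java
public class MergeLengthCheckSketch {
    /** Total source length described by rows of {oldLength, newLength}. */
    static int sourceLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[0];
        }
        return total;
    }

    /** Total destination length described by rows of {oldLength, newLength}. */
    static int destinationLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[1];
        }
        return total;
    }

    /** Throws if the intermediate string b implied by ab and by bc differs in length. */
    static void requireChainable(int[][] ab, int[][] bc) {
        int bFromAb = destinationLength(ab);
        int bFromBc = sourceLength(bc);
        if (bFromAb != bFromBc) {
            throw new IllegalArgumentException(
                    "ab produces " + bFromAb + " units but bc consumes " + bFromBc);
        }
    }

    public static void main(String[] args) {
        int[][] ab = { {3, 3}, {1, 2}, {1, 1}, {1, 1}, {1, 1} }; // 7 -> 8 units
        int[][] bc = { {8, 8} };                                 // 8 -> 8 units
        requireChainable(ab, bc); // lengths agree, no exception

        int[][] unrelated = { {5, 5} }; // consumes 5 units, not 8
        try {
            requireChainable(ab, unrelated);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```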