com.ibm.icu.text
Class Bidi

java.lang.Object
  extended by com.ibm.icu.text.Bidi

public class Bidi
extends Object

Bidi algorithm for ICU

This is an implementation of the Unicode Bidirectional Algorithm. The algorithm is defined in the Unicode Standard Annex #9.

Note: Libraries that perform a bidirectional algorithm and reorder strings accordingly are sometimes called "Storage Layout Engines". ICU's Bidi and shaping (ArabicShaping) classes can be used at the core of such "Storage Layout Engines".

General remarks about the API:

The "limit" of a sequence of characters is the position just after their last character, i.e., one more than that position.

Some of the API methods provide access to "runs". Such a "run" is defined as a sequence of characters that are at the same embedding level after performing the Bidi algorithm.
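
The "limit" and "run" conventions above can be sketched with a few lines of code. To stay runnable without ICU4J on the classpath, this sketch uses the JDK's java.text.Bidi, which offers the same-named getRunCount(), getRunStart(), getRunLimit() and getRunLevel() methods listed in the Method Summary of this class; that the ICU class behaves identically for this input is an assumption. The class name LogicalRunsSketch and the sample text are made up for illustration.

```java
import java.text.Bidi;
import java.util.ArrayList;
import java.util.List;

final class LogicalRunsSketch {

    /** Describes each level run as "[start,limit) level=N". */
    static List<String> describeRuns(String text) {
        // java.text.Bidi stands in for the ICU class here (assumption, see lead-in).
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
        List<String> runs = new ArrayList<>();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            int start = bidi.getRunStart(i);
            // The limit is the position just after the last character of the run.
            int limit = bidi.getRunLimit(i);
            runs.add("[" + start + "," + limit + ") level=" + bidi.getRunLevel(i));
        }
        return runs;
    }

    public static void main(String[] args) {
        // Latin letters, three Hebrew letters (alef, bet, gimel), Latin letters.
        for (String run : describeRuns("abc \u05D0\u05D1\u05D2 def")) {
            System.out.println(run);
        }
    }
}
```

Because each run's limit equals the next run's start, the runs tile the text without gaps or overlaps.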

Basic concept: paragraph

A piece of text can be divided into several paragraphs by characters with the Bidi class Block Separator. For handling of paragraphs, see countParagraphs(), getParaLevel(), getParagraph(), and getParagraphByIndex().

Basic concept: text direction

The direction of a piece of text may be LTR, RTL, or MIXED.

Basic concept: levels

Levels in this API represent embedding levels according to the Unicode Bidirectional Algorithm. Their low-order bit (even/odd value) indicates the visual direction.
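
The parity rule can be illustrated with plain integer arithmetic. The sketch below is not part of the ICU API; the class and method names are hypothetical. It only shows how the low-order bit maps to a direction, and how the next deeper level of a given direction is the least odd (or even) level greater than the current one, as in the explicit embedding rules of the Unicode Bidirectional Algorithm.

```java
final class LevelParitySketch {

    /** Odd levels are right-to-left, even levels are left-to-right. */
    static boolean isRightToLeft(int level) {
        return (level & 1) != 0;
    }

    /** Smallest level greater than {@code level} that has the requested direction. */
    static int nextLevel(int level, boolean rightToLeft) {
        int next = level + 1;
        if (isRightToLeft(next) != rightToLeft) {
            next++; // skip one level to get the right parity
        }
        return next;
    }

    public static void main(String[] args) {
        for (int level = 0; level < 4; level++) {
            System.out.println(level + " -> " + (isRightToLeft(level) ? "RTL" : "LTR"));
        }
        System.out.println("RTL embedding inside level 1: " + nextLevel(1, true));
    }
}
```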

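Levels can be abstract values when used for the paraLevel and embeddingLevels arguments of setPara(). There, the high-order bit of an embeddingLevels[] value (LEVEL_OVERRIDE) indicates whether the application is specifying the level of a character so as to override whatever the Bidi implementation would resolve it to, and paraLevel can be set to the pseudo-level values LEVEL_DEFAULT_LTR and LEVEL_DEFAULT_RTL.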

The related constants are not real, valid level values. The LEVEL_DEFAULT_XXX constants can be used to specify a default paragraph level for the case where setPara() is asked to determine the level itself but the input contains no strong directional character.

Note that the value for LEVEL_DEFAULT_LTR is even and the one for LEVEL_DEFAULT_RTL is odd, just like with normal LTR and RTL level values - these special values are designed that way. Also, the implementation assumes that MAX_EXPLICIT_LEVEL is odd.

Basic concept: Reordering Mode

Reordering mode values indicate which variant of the Bidi algorithm to use.

Basic concept: Reordering Options

Reordering options can be applied during Bidi text transformations.

Author:
Simon Montagu, Matitiahu Allouche (ported from C code written by Markus W. Scherer)
Status:
Stable ICU 3.8

Sample code for the ICU Bidi API

Rendering a paragraph with the ICU Bidi API
This is (hypothetical) sample code that illustrates how the ICU Bidi API could be used to render a paragraph of text. Rendering code depends heavily on the graphics system, so this sample code has to make many assumptions, which may or may not match the properties of any existing graphics system.

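The basic assumptions are: rendering is done from left to right on a horizontal line; a run of single-style, unidirectional text can be rendered at once; such a run is passed to the graphics system with its characters (code units) in logical order; and the line-breaking algorithm is complicated and locale-dependent, so its implementation is omitted from this sample code (getLineBreak() is only a stub).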


  package com.ibm.icu.dev.test.bidi;

  import com.ibm.icu.text.Bidi;
  import com.ibm.icu.text.BidiRun;

  public class Sample {

      static final int styleNormal = 0;
      static final int styleSelected = 1;
      static final int styleBold = 2;
      static final int styleItalics = 4;
      static final int styleSuper = 8;
      static final int styleSub = 16;

      static class StyleRun {
          int limit;
          int style;

          public StyleRun(int limit, int style) {
              this.limit = limit;
              this.style = style;
          }
      }

      static class Bounds {
          int start;
          int limit;

          public Bounds(int start, int limit) {
              this.start = start;
              this.limit = limit;
          }
      }

      static int getTextWidth(String text, int start, int limit,
                              StyleRun[] styleRuns, int styleRunCount) {
          // simplistic way to compute the width
          return limit - start;
      }

      // set limit and StyleRun limit for a line
      // from text[start] and from styleRuns[styleRunStart]
      // using Bidi.getLogicalRun(...)
      // returns line width
      static int getLineBreak(String text, Bounds line, Bidi para,
                              StyleRun styleRuns[], Bounds styleRun) {
          // dummy return
          return 0;
      }

      // render runs on a line sequentially, always from left to right

      // prepare rendering a new line
      static void startLine(byte textDirection, int lineWidth) {
          System.out.println();
      }

      // render a run of text and advance to the right by the run width
      // the text[start..limit-1] is always in logical order
      static void renderRun(String text, int start, int limit,
                            byte textDirection, int style) {
      }

      // We could compute a cross-product
      // from the style runs with the directional runs
      // and then reorder it.
      // Instead, here we iterate over each run type
      // and render the intersections -
      // with shortcuts in simple (and common) cases.
      // renderParagraph() is the main function.

      // render a directional run with
      // (possibly) multiple style runs intersecting with it
      static void renderDirectionalRun(String text, int start, int limit,
                                       byte direction, StyleRun styleRuns[],
                                       int styleRunCount) {
          int i;

          // iterate over style runs
          if (direction == Bidi.LTR) {
              int styleLimit;
              for (i = 0; i < styleRunCount; ++i) {
                  styleLimit = styleRuns[i].limit;
                  if (start < styleLimit) {
                      if (styleLimit > limit) {
                          styleLimit = limit;
                      }
                      renderRun(text, start, styleLimit,
                                direction, styleRuns[i].style);
                      if (styleLimit == limit) {
                          break;
                      }
                      start = styleLimit;
                  }
              }
          } else {
              int styleStart;

              for (i = styleRunCount-1; i >= 0; --i) {
                  if (i > 0) {
                      styleStart = styleRuns[i-1].limit;
                  } else {
                      styleStart = 0;
                  }
                  if (limit >= styleStart) {
                      if (styleStart < start) {
                          styleStart = start;
                      }
                      renderRun(text, styleStart, limit, direction,
                                styleRuns[i].style);
                      if (styleStart == start) {
                          break;
                      }
                      limit = styleStart;
                  }
              }
          }
      }

      // the line object represents text[start..limit-1]
      static void renderLine(Bidi line, String text, int start, int limit,
                             StyleRun styleRuns[], int styleRunCount) {
          byte direction = line.getDirection();
          if (direction != Bidi.MIXED) {
              // unidirectional
              if (styleRunCount <= 1) {
                  renderRun(text, start, limit, direction, styleRuns[0].style);
              } else {
                  renderDirectionalRun(text, start, limit, direction,
                                       styleRuns, styleRunCount);
              }
          } else {
              // mixed-directional
              int count, i;
              BidiRun run;

              try {
                  count = line.countRuns();
              } catch (IllegalStateException e) {
                  e.printStackTrace();
                  return;
              }
              if (styleRunCount <= 1) {
                  int style = styleRuns[0].style;

                  // iterate over directional runs
                  for (i = 0; i < count; ++i) {
                      run = line.getVisualRun(i);
                      renderRun(text, run.getStart(), run.getLimit(),
                                run.getDirection(), style);
                  }
              } else {
                  // iterate over both directional and style runs
                  for (i = 0; i < count; ++i) {
                      run = line.getVisualRun(i);
                      renderDirectionalRun(text, run.getStart(),
                                           run.getLimit(), run.getDirection(),
                                           styleRuns, styleRunCount);
                  }
              }
          }
      }

      static void renderParagraph(String text, byte textDirection,
                                  StyleRun styleRuns[], int styleRunCount,
                                  int lineWidth) {
          int length = text.length();
          Bidi para = new Bidi();
          try {
              para.setPara(text,
                           textDirection != 0 ? Bidi.LEVEL_DEFAULT_RTL
                                              : Bidi.LEVEL_DEFAULT_LTR,
                           null);
          } catch (Exception e) {
              e.printStackTrace();
              return;
          }
          byte paraLevel = (byte)(1 & para.getParaLevel());
          StyleRun styleRun = new StyleRun(length, styleNormal);

          if (styleRuns == null || styleRunCount <= 0) {
              styleRuns = new StyleRun[1];
              styleRunCount = 1;
              styleRuns[0] = styleRun;
          }
          // assume styleRuns[styleRunCount-1].limit>=length

          int width = getTextWidth(text, 0, length, styleRuns, styleRunCount);
          if (width <= lineWidth) {
              // everything fits onto one line

              // prepare rendering a new line from either left or right
              startLine(paraLevel, width);

              renderLine(para, text, 0, length, styleRuns, styleRunCount);
          } else {
              // we need to render several lines
              Bidi line = new Bidi(length, 0);
              int start = 0, limit;
              int styleRunStart = 0, styleRunLimit;

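              for (;;) {
                  limit = length;
                  styleRunLimit = styleRunCount;
                  // getLineBreak() reports the chosen limits through the
                  // Bounds objects, so keep references and read them back
                  Bounds lineBounds = new Bounds(start, limit);
                  Bounds styleBounds = new Bounds(styleRunStart, styleRunLimit);
                  width = getLineBreak(text, lineBounds, para, styleRuns,
                                       styleBounds);
                  limit = lineBounds.limit;
                  styleRunLimit = styleBounds.limit;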
                  try {
                      line = para.setLine(start, limit);
                  } catch (Exception e) {
                      e.printStackTrace();
                      return;
                  }
                  // prepare rendering a new line
                  // from either left or right
                  startLine(paraLevel, width);

                  if (styleRunStart > 0) {
                      int newRunCount = styleRuns.length - styleRunStart;
                      StyleRun[] newRuns = new StyleRun[newRunCount];
                      System.arraycopy(styleRuns, styleRunStart, newRuns, 0,
                                       newRunCount);
                      renderLine(line, text, start, limit, newRuns,
                                 styleRunLimit - styleRunStart);
                  } else {
                      renderLine(line, text, start, limit, styleRuns,
                                 styleRunLimit - styleRunStart);
                  }
                  if (limit == length) {
                      break;
                  }
                  start = limit;
                  styleRunStart = styleRunLimit - 1;
                  if (start >= styleRuns[styleRunStart].limit) {
                      ++styleRunStart;
                  }
              }
          }
      }

      public static void main(String[] args)
      {
          renderParagraph("Some Latin text...", Bidi.LTR, null, 0, 80);
          renderParagraph("Some Hebrew text...", Bidi.RTL, null, 0, 60);
      }
  }

 

Field Summary
static int CLASS_DEFAULT
          Value returned by BidiClassifier when there is no need to override the standard Bidi class for a given code point.
static int DIRECTION_DEFAULT_LEFT_TO_RIGHT
          Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm.
static int DIRECTION_DEFAULT_RIGHT_TO_LEFT
          Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm.
static int DIRECTION_LEFT_TO_RIGHT
          Constant indicating base direction is left-to-right.
static int DIRECTION_RIGHT_TO_LEFT
          Constant indicating base direction is right-to-left.
static short DO_MIRRORING
          option bit for writeReordered(): replace characters with the "mirrored" property in RTL runs by their mirror-image mappings
static short INSERT_LRM_FOR_NUMERIC
          option bit for writeReordered(): surround the run with LRMs if necessary; this is part of the approximate "inverse Bidi" algorithm. This option does not imply corresponding adjustment of the index mappings.
static short KEEP_BASE_COMBINING
          option bit for writeReordered(): keep combining characters after their base characters in RTL runs
static byte LEVEL_DEFAULT_LTR
          Paragraph level setting. Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm.
static byte LEVEL_DEFAULT_RTL
          Paragraph level setting. Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm.
static byte LEVEL_OVERRIDE
          Bit flag for level input.
static byte LTR
          Left-to-right text.
static int MAP_NOWHERE
          Special value which can be returned by the mapping methods when a logical index has no corresponding visual index or vice-versa.
static byte MAX_EXPLICIT_LEVEL
          Maximum explicit embedding level.
static byte MIXED
          Mixed-directional text.
static byte NEUTRAL
          No strongly directional text.
static int OPTION_DEFAULT
          Option value for setReorderingOptions: disable all the options which can be set with this method
static int OPTION_INSERT_MARKS
          Option bit for setReorderingOptions: insert Bidi marks (LRM or RLM) when needed to ensure correct result of a reordering to a Logical order. This option must be set or reset before calling setPara.
static int OPTION_REMOVE_CONTROLS
          Option bit for setReorderingOptions: remove Bidi control characters. This option must be set or reset before calling setPara.
static int OPTION_STREAMING
          Option bit for setReorderingOptions: process the output as part of a stream to be continued. This option must be set or reset before calling setPara.
static short OUTPUT_REVERSE
          option bit for writeReordered(): write the output in reverse order. This has the same effect as calling writeReordered() first without this option, and then calling writeReverse() without mirroring.
static short REMOVE_BIDI_CONTROLS
          option bit for writeReordered(): remove Bidi control characters (this does not affect INSERT_LRM_FOR_NUMERIC). This option does not imply corresponding adjustment of the index mappings.
static short REORDER_DEFAULT
          Reordering mode: Regular Logical to Visual Bidi algorithm according to Unicode.
static short REORDER_GROUP_NUMBERS_WITH_R
          Reordering mode: Logical to Visual algorithm grouping numbers with adjacent R characters (reversible algorithm).
static short REORDER_INVERSE_FOR_NUMBERS_SPECIAL
          Reordering mode: Inverse Bidi (Visual to Logical) algorithm for the REORDER_NUMBERS_SPECIAL Bidi algorithm.
static short REORDER_INVERSE_LIKE_DIRECT
          Reordering mode: Visual to Logical algorithm equivalent to the regular Logical to Visual algorithm.
static short REORDER_INVERSE_NUMBERS_AS_L
          Reordering mode: Visual to Logical algorithm which handles numbers like L (same algorithm as selected by setInverse(true)).
static short REORDER_NUMBERS_SPECIAL
          Reordering mode: Logical to Visual algorithm which handles numbers in a way which mimics the behavior of Windows XP.
static short REORDER_RUNS_ONLY
          Reordering mode: Reorder runs only to transform a Logical LTR string to the logical RTL string with the same display, or vice-versa.
static byte RTL
          Right-to-left text.
 
Constructor Summary
Bidi()
          Allocate a Bidi object.
Bidi(AttributedCharacterIterator paragraph)
          Create Bidi from the given paragraph of text.
Bidi(char[] text, int textStart, byte[] embeddings, int embStart, int paragraphLength, int flags)
          Create Bidi from the given text, embedding, and direction information.
Bidi(int maxLength, int maxRunCount)
          Allocate a Bidi object with preallocated memory for internal structures.
Bidi(String paragraph, int flags)
          Create Bidi from the given paragraph of text and base direction.
 
Method Summary
 boolean baseIsLeftToRight()
          Return true if the base direction is left-to-right
 int countParagraphs()
          Get the number of paragraphs.
 int countRuns()
          Get the number of runs.
 Bidi createLineBidi(int lineStart, int lineLimit)
          Create a Bidi object representing the bidi information on a line of text within the paragraph represented by the current Bidi.
static byte getBaseDirection(CharSequence paragraph)
          Get the base direction of the text provided according to the Unicode Bidirectional Algorithm.
 int getBaseLevel()
          Return the base level (0 if left-to-right, 1 if right-to-left).
 BidiClassifier getCustomClassifier()
          Gets the current custom class classifier used for Bidi class determination.
 int getCustomizedClass(int c)
          Retrieves the Bidi class for a given code point.
 byte getDirection()
          Get the directionality of the text.
 int getLength()
          Get the length of the text.
 byte getLevelAt(int charIndex)
          Get the level for one character.
 byte[] getLevels()
          Get an array of levels for each character.
 int getLogicalIndex(int visualIndex)
          Get the logical text position from a visual position.
 int[] getLogicalMap()
          Get a logical-to-visual index map (array) for the characters in the Bidi (paragraph or line) object.
 BidiRun getLogicalRun(int logicalPosition)
          Get a logical run.
 BidiRun getParagraph(int charIndex)
          Get a paragraph, given a position within the text.
 BidiRun getParagraphByIndex(int paraIndex)
          Get a paragraph, given the index of this paragraph.
 int getParagraphIndex(int charIndex)
          Get the index of a paragraph, given a position within the text.
 byte getParaLevel()
          Get the paragraph level of the text.
 int getProcessedLength()
          Get the length of the source text processed by the last call to setPara().
 int getReorderingMode()
          What is the requested reordering mode for a given Bidi object?
 int getReorderingOptions()
          What are the reordering options applied to a given Bidi object?
 int getResultLength()
          Get the length of the reordered text resulting from the last call to setPara().
 int getRunCount()
          Return the number of level runs.
 int getRunLevel(int run)
          Return the level of the nth logical run in this line.
 int getRunLimit(int run)
          Return the index of the character past the end of the nth logical run in this line, as an offset from the start of the line.
 int getRunStart(int run)
          Return the index of the character at the start of the nth logical run in this line, as an offset from the start of the line.
 char[] getText()
          Get the text.
 String getTextAsString()
          Get the text.
 int getVisualIndex(int logicalIndex)
          Get the visual position from a logical text position.
 int[] getVisualMap()
          Get a visual-to-logical index map (array) for the characters in the Bidi (paragraph or line) object.
 BidiRun getVisualRun(int runIndex)
          Get a BidiRun object according to its index.
static int[] invertMap(int[] srcMap)
          Invert an index map.
 boolean isInverse()
          Is this Bidi object set to perform the inverse Bidi algorithm?
 boolean isLeftToRight()
          Return true if the line is all left-to-right text and the base direction is left-to-right.
 boolean isMixed()
          Return true if the line is not left-to-right or right-to-left.
 boolean isOrderParagraphsLTR()
          Is this Bidi object set to allocate level 0 to block separators so that successive paragraphs progress from left to right?
 boolean isRightToLeft()
          Return true if the line is all right-to-left text, and the base direction is right-to-left
 void orderParagraphsLTR(boolean ordarParaLTR)
          Specify whether block separators must be allocated level zero, so that successive paragraphs will progress from left to right.
static int[] reorderLogical(byte[] levels)
          This is a convenience method that does not use a Bidi object.
static int[] reorderVisual(byte[] levels)
          This is a convenience method that does not use a Bidi object.
static void reorderVisually(byte[] levels, int levelStart, Object[] objects, int objectStart, int count)
          Reorder the objects in the array into visual order based on their levels.
static boolean requiresBidi(char[] text, int start, int limit)
          Return true if the specified text requires bidi analysis.
 void setCustomClassifier(BidiClassifier classifier)
          Set a custom Bidi classifier used by the UBA implementation for Bidi class determination.
 void setInverse(boolean isInverse)
          Modify the operation of the Bidi algorithm such that it approximates an "inverse Bidi" algorithm.
 Bidi setLine(int start, int limit)
          setLine() returns a Bidi object to contain the reordering information, especially the resolved levels, for all the characters in a line of text.
 void setPara(AttributedCharacterIterator paragraph)
          Perform the Unicode Bidi algorithm on a given paragraph, as defined in the Unicode Standard Annex #9, version 13, also described in The Unicode Standard, Version 4.0.
 void setPara(char[] chars, byte paraLevel, byte[] embeddingLevels)
          Perform the Unicode Bidi algorithm.
 void setPara(String text, byte paraLevel, byte[] embeddingLevels)
          Perform the Unicode Bidi algorithm.
 void setReorderingMode(int reorderingMode)
          Modify the operation of the Bidi algorithm such that it implements some variant to the basic Bidi algorithm or approximates an "inverse Bidi" algorithm, depending on different values of the "reordering mode".
 void setReorderingOptions(int options)
          Specify which of the reordering options should be applied during Bidi transformations.
 String writeReordered(int options)
          Take a Bidi object containing the reordering information for a piece of text (one or more paragraphs) set by setPara() or for a line of text set by setLine() and return a string containing the reordered text.
static String writeReverse(String src, int options)
          Reverse a Right-To-Left run of Unicode text.
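
Two of the static helpers in the table above, requiresBidi(char[], int, int) and reorderVisually(byte[], int, Object[], int, int), also exist with the same signatures in the JDK's java.text.Bidi. The sketch below uses the JDK class so that it runs without ICU4J; swapping the import for com.ibm.icu.text.Bidi is expected to give the same results, but that is an assumption. The class name StaticHelpersSketch and its wrapper methods are hypothetical.

```java
import java.text.Bidi;
import java.util.Arrays;

final class StaticHelpersSketch {

    /** True if the text contains characters that make Bidi analysis necessary. */
    static boolean needsBidi(String s) {
        char[] chars = s.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    /** Returns a copy of {@code pieces} in visual order for the given levels. */
    static String[] toVisualOrder(byte[] levels, String[] pieces) {
        String[] copy = pieces.clone(); // reorderVisually() works in place
        Bidi.reorderVisually(levels, 0, copy, 0, copy.length);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(needsBidi("abc"));
        System.out.println(needsBidi("abc \u05D0"));
        // The two pieces at level 1 form one RTL run and swap places.
        byte[] levels = {0, 1, 1, 0};
        String[] pieces = {"a", "B", "C", "d"};
        System.out.println(Arrays.toString(toVisualOrder(levels, pieces)));
    }
}
```

A cheap requiresBidi() check like this is a common way to skip the full algorithm for purely left-to-right text.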
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LEVEL_DEFAULT_LTR

public static final byte LEVEL_DEFAULT_LTR
Paragraph level setting

Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, then set the paragraph level to 0 (left-to-right).

If this value is used in conjunction with reordering modes REORDER_INVERSE_LIKE_DIRECT or REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the text to reorder is assumed to be visual LTR, and the text after reordering is required to be the corresponding logical string with appropriate contextual direction. The direction of the result string will be RTL if either the rightmost or leftmost strong character of the source text is RTL or Arabic Letter; the direction will be LTR otherwise.

If reordering option OPTION_INSERT_MARKS is set, an RLM may be added at the beginning of the result string to ensure round trip (that the result string, when reordered back to visual, will produce the original source text).

See Also:
REORDER_INVERSE_LIKE_DIRECT, REORDER_INVERSE_FOR_NUMBERS_SPECIAL, Constant Field Values
Status:
Stable ICU 3.8.

LEVEL_DEFAULT_RTL

public static final byte LEVEL_DEFAULT_RTL
Paragraph level setting

Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, then set the paragraph level to 1 (right-to-left).

If this value is used in conjunction with reordering modes REORDER_INVERSE_LIKE_DIRECT or REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the text to reorder is assumed to be visual LTR, and the text after reordering is required to be the corresponding logical string with appropriate contextual direction. The direction of the result string will be RTL if either the rightmost or leftmost strong character of the source text is RTL or Arabic Letter, or if the text contains no strong character; the direction will be LTR otherwise.

If reordering option OPTION_INSERT_MARKS is set, an RLM may be added at the beginning of the result string to ensure round trip (that the result string, when reordered back to visual, will produce the original source text).

See Also:
REORDER_INVERSE_LIKE_DIRECT, REORDER_INVERSE_FOR_NUMBERS_SPECIAL, Constant Field Values
Status:
Stable ICU 3.8.
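
The difference between the two default settings only shows when the text has no strong directional character. The following sketch demonstrates this with the JDK's java.text.Bidi and its DIRECTION_DEFAULT_LEFT_TO_RIGHT and DIRECTION_DEFAULT_RIGHT_TO_LEFT flags, which this class also lists in its Field Summary with the same description. Using the JDK class keeps the example runnable without ICU4J; that the ICU constants LEVEL_DEFAULT_LTR and LEVEL_DEFAULT_RTL behave the same way in setPara() is taken from the descriptions above, not verified here. The class name DefaultLevelSketch is hypothetical.

```java
import java.text.Bidi;

final class DefaultLevelSketch {

    /** Resolves the base level, falling back to RTL or LTR if no strong character exists. */
    static int baseLevel(String text, boolean fallBackToRtl) {
        int flags = fallBackToRtl
                ? Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT
                : Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT;
        return new Bidi(text, flags).getBaseLevel();
    }

    public static void main(String[] args) {
        // Digits are weak characters, so only the fallback decides.
        System.out.println(baseLevel("123", false));
        System.out.println(baseLevel("123", true));
        // A strong first character wins regardless of the fallback.
        System.out.println(baseLevel("\u05D0 abc", false));
        System.out.println(baseLevel("abc", true));
    }
}
```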

MAX_EXPLICIT_LEVEL

public static final byte MAX_EXPLICIT_LEVEL
Maximum explicit embedding level. (The maximum resolved level can be up to MAX_EXPLICIT_LEVEL+1).

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

LEVEL_OVERRIDE

public static final byte LEVEL_OVERRIDE
Bit flag for level input. Overrides directional properties.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.
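
As a rough illustration of how a bit flag can be combined with a level in a single byte, the sketch below sets, tests and clears an override bit. It assumes that the flag is the high-order bit of the byte (0x80); check the Constant Field Values page for the actual value. The class LevelOverrideSketch and its methods are hypothetical helpers, not ICU API.

```java
final class LevelOverrideSketch {

    // Assumed value: the high-order bit of a byte. Verify against Constant Field Values.
    static final byte LEVEL_OVERRIDE = (byte) 0x80;

    /** Marks a level as an override of the character's directional properties. */
    static byte withOverride(int level) {
        return (byte) (level | LEVEL_OVERRIDE);
    }

    /** True if the override bit is set in an embeddingLevels entry. */
    static boolean isOverride(byte value) {
        return (value & LEVEL_OVERRIDE) != 0;
    }

    /** The level with the override bit masked out. */
    static int plainLevel(byte value) {
        return value & 0x7F;
    }

    public static void main(String[] args) {
        byte forcedRtl = withOverride(1);
        System.out.println(isOverride(forcedRtl) + " " + plainLevel(forcedRtl));
        System.out.println(isOverride((byte) 1) + " " + plainLevel((byte) 1));
    }
}
```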

MAP_NOWHERE

public static final int MAP_NOWHERE
Special value which can be returned by the mapping methods when a logical index has no corresponding visual index or vice-versa. This may happen for the logical-to-visual mapping of a Bidi control when option OPTION_REMOVE_CONTROLS is specified. This can also happen for the visual-to-logical mapping of a Bidi mark (LRM or RLM) inserted by option OPTION_INSERT_MARKS.

See Also:
getVisualIndex(int), getVisualMap(), getLogicalIndex(int), getLogicalMap(), OPTION_INSERT_MARKS, OPTION_REMOVE_CONTROLS, Constant Field Values
Status:
Stable ICU 3.8.
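
To make the role of such a "no mapping" value concrete, here is a small sketch that inverts an index map containing holes. It is a hand-written stand-in, not the implementation of invertMap(); the value -1 for MAP_NOWHERE is an assumption (see Constant Field Values), and the class name IndexMapSketch is hypothetical.

```java
import java.util.Arrays;

final class IndexMapSketch {

    // Assumed value; verify against Constant Field Values.
    static final int MAP_NOWHERE = -1;

    /** Inverts an index map; entries without a counterpart become MAP_NOWHERE. */
    static int[] invert(int[] srcMap) {
        int max = -1;
        for (int value : srcMap) {
            max = Math.max(max, value);
        }
        int[] dest = new int[max + 1];
        Arrays.fill(dest, MAP_NOWHERE);
        for (int i = 0; i < srcMap.length; i++) {
            if (srcMap[i] >= 0) { // skip holes in the source map
                dest[srcMap[i]] = i;
            }
        }
        return dest;
    }

    public static void main(String[] args) {
        // Logical-to-visual map where logical index 1 was a removed control.
        int[] logicalToVisual = {2, MAP_NOWHERE, 1, 0};
        System.out.println(Arrays.toString(invert(logicalToVisual)));
    }
}
```

Callers that index into the text with a mapped position would need to test for MAP_NOWHERE first, since it is not a valid array index.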

LTR

public static final byte LTR
Left-to-right text.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

RTL

public static final byte RTL
Right-to-left text.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

MIXED

public static final byte MIXED
Mixed-directional text.

As return value for getDirection(), it means that the source string contains both left-to-right and right-to-left characters.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.
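
The three directionality outcomes (LTR, RTL, MIXED) can be told apart as sketched below. The example relies on the JDK's java.text.Bidi and its isMixed() and isRightToLeft() methods, which are also listed in the Method Summary of this class, so it runs without ICU4J; with the ICU class, getDirection() would presumably report the same classification, but that is not verified here. DirectionSketch is a hypothetical name.

```java
import java.text.Bidi;

final class DirectionSketch {

    /** Classifies the text as "LTR", "RTL" or "MIXED". */
    static String classify(String text) {
        // The base direction is taken from the first strong character.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isMixed()) {
            return "MIXED";
        }
        return bidi.isRightToLeft() ? "RTL" : "LTR";
    }

    public static void main(String[] args) {
        System.out.println(classify("abc"));
        System.out.println(classify("\u05D0\u05D1"));
        System.out.println(classify("abc \u05D0\u05D1"));
    }
}
```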

NEUTRAL

public static final byte NEUTRAL
No strongly directional text.

As return value for getBaseDirection(), it means that the source string is missing or empty, or contains neither left-to-right nor right-to-left characters.

See Also:
Constant Field Values
Status:
Draft ICU 4.6.

KEEP_BASE_COMBINING

public static final short KEEP_BASE_COMBINING
option bit for writeReordered(): keep combining characters after their base characters in RTL runs

See Also:
writeReordered(int), Constant Field Values
Status:
Stable ICU 3.8.
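
What "keeping combining characters after their base characters" means when a run is reversed can be sketched in plain Java. This is not the ICU implementation: the helper below is hypothetical, treats only characters of the general categories Mn, Mc and Me as combining, and ignores surrogate pairs for brevity.

```java
final class KeepCombiningSketch {

    /** True for nonspacing, spacing-combining and enclosing marks. */
    static boolean isCombining(char c) {
        int type = Character.getType(c);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    /** Reverses the string but keeps each base character followed by its marks. */
    static String reverseKeepingCombining(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int limit = s.length();
        while (limit > 0) {
            int start = limit - 1;
            // Walk back over combining marks until their base character is reached.
            while (start > 0 && isCombining(s.charAt(start))) {
                start--;
            }
            out.append(s, start, limit); // copy base + marks in their original order
            limit = start;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "e" + combining acute accent, then "x"
        String reversed = reverseKeepingCombining("e\u0301x");
        System.out.println(reversed.equals("xe\u0301"));
    }
}
```

A plain character-by-character reversal of the same input would move the accent in front of its base letter.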

DO_MIRRORING

public static final short DO_MIRRORING
option bit for writeReordered(): replace characters with the "mirrored" property in RTL runs by their mirror-image mappings

See Also:
writeReordered(int), Constant Field Values
Status:
Stable ICU 3.8.
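
The effect of mirroring on a reversed RTL run can be sketched as follows. The mirror table here is a tiny hand-written sample with four bracket pairs; the real mirroring data in the Unicode Character Database covers many more characters, and this helper is a hypothetical illustration rather than what writeReordered() does internally.

```java
final class MirroringSketch {

    // Partners sit next to each other: index 0 pairs with 1, 2 with 3, and so on.
    private static final String PAIRS = "()[]{}<>";

    /** Returns the mirror image of a bracket, or the character itself. */
    static char mirror(char c) {
        int index = PAIRS.indexOf(c);
        return index < 0 ? c : PAIRS.charAt(index ^ 1);
    }

    /** Reverses an RTL run and replaces brackets by their mirror images. */
    static String reverseWithMirroring(String rtlRun) {
        StringBuilder out = new StringBuilder(rtlRun.length());
        for (int i = rtlRun.length() - 1; i >= 0; i--) {
            out.append(mirror(rtlRun.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // alef, "(", bet, ")" in logical order
        String visual = reverseWithMirroring("\u05D0(\u05D1)");
        System.out.println(visual.equals("(\u05D1)\u05D0"));
    }
}
```

Without mirroring, the reversed run would show ")" before the bet and "(" after it, so the parentheses would open in the wrong direction.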

INSERT_LRM_FOR_NUMERIC

public static final short INSERT_LRM_FOR_NUMERIC
option bit for writeReordered(): surround the run with LRMs if necessary; this is part of the approximate "inverse Bidi" algorithm

This option does not imply corresponding adjustment of the index mappings.

See Also:
setInverse(boolean), writeReordered(int), Constant Field Values
Status:
Stable ICU 3.8.

REMOVE_BIDI_CONTROLS

public static final short REMOVE_BIDI_CONTROLS
option bit for writeReordered(): remove Bidi control characters (this does not affect INSERT_LRM_FOR_NUMERIC)

This option does not imply corresponding adjustment of the index mappings.

See Also:
writeReordered(int), INSERT_LRM_FOR_NUMERIC, Constant Field Values
Status:
Stable ICU 3.8.

OUTPUT_REVERSE

public static final short OUTPUT_REVERSE
option bit for writeReordered(): write the output in reverse order

This has the same effect as calling writeReordered() first without this option, and then calling writeReverse() without mirroring. Doing this in the same step is faster and avoids a temporary buffer. An example for using this option is output to a character terminal that is designed for RTL scripts and stores text in reverse order.

See Also:
writeReordered(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_DEFAULT

public static final short REORDER_DEFAULT
Reordering mode: Regular Logical to Visual Bidi algorithm according to Unicode.

See Also:
setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_NUMBERS_SPECIAL

public static final short REORDER_NUMBERS_SPECIAL
Reordering mode: Logical to Visual algorithm which handles numbers in a way which mimics the behavior of Windows XP.

See Also:
setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_GROUP_NUMBERS_WITH_R

public static final short REORDER_GROUP_NUMBERS_WITH_R
Reordering mode: Logical to Visual algorithm grouping numbers with adjacent R characters (reversible algorithm).

See Also:
setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_RUNS_ONLY

public static final short REORDER_RUNS_ONLY
Reordering mode: Reorder runs only to transform a Logical LTR string to the logical RTL string with the same display, or vice-versa.
If this mode is set together with option OPTION_INSERT_MARKS, some Bidi controls in the source text may be removed and other controls may be added to produce the minimum combination which has the required display.

See Also:
OPTION_INSERT_MARKS, setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_INVERSE_NUMBERS_AS_L

public static final short REORDER_INVERSE_NUMBERS_AS_L
Reordering mode: Visual to Logical algorithm which handles numbers like L (same algorithm as selected by setInverse(true)).

See Also:
setInverse(boolean), setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_INVERSE_LIKE_DIRECT

public static final short REORDER_INVERSE_LIKE_DIRECT
Reordering mode: Visual to Logical algorithm equivalent to the regular Logical to Visual algorithm.

See Also:
setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

REORDER_INVERSE_FOR_NUMBERS_SPECIAL

public static final short REORDER_INVERSE_FOR_NUMBERS_SPECIAL
Reordering mode: Inverse Bidi (Visual to Logical) algorithm for the REORDER_NUMBERS_SPECIAL Bidi algorithm.

See Also:
setReorderingMode(int), Constant Field Values
Status:
Stable ICU 3.8.

OPTION_DEFAULT

public static final int OPTION_DEFAULT
Option value for setReorderingOptions: disable all the options which can be set with this method

See Also:
setReorderingOptions(int), Constant Field Values
Status:
Stable ICU 3.8.

OPTION_INSERT_MARKS

public static final int OPTION_INSERT_MARKS
Option bit for setReorderingOptions: insert Bidi marks (LRM or RLM) when needed to ensure correct result of a reordering to a Logical order

This option must be set or reset before calling setPara.

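This option is significant only with reordering modes which generate a result with Logical order, specifically REORDER_RUNS_ONLY, REORDER_INVERSE_NUMBERS_AS_L, REORDER_INVERSE_LIKE_DIRECT, and REORDER_INVERSE_FOR_NUMBERS_SPECIAL.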

If this option is set in conjunction with reordering mode REORDER_INVERSE_NUMBERS_AS_L or with calling setInverse(true), it implies option INSERT_LRM_FOR_NUMERIC in calls to method writeReordered().

For other reordering modes, a minimum number of LRM or RLM characters will be added to the source text after reordering it so as to ensure round trip, i.e. when applying the inverse reordering mode on the resulting logical text with removal of Bidi marks (option OPTION_REMOVE_CONTROLS set before calling setPara() or option REMOVE_BIDI_CONTROLS in writeReordered), the result will be identical to the source text in the first transformation.

This option will be ignored if specified together with option OPTION_REMOVE_CONTROLS. It inhibits option REMOVE_BIDI_CONTROLS in calls to method writeReordered() and it implies option INSERT_LRM_FOR_NUMERIC in calls to method writeReordered() if the reordering mode is REORDER_INVERSE_NUMBERS_AS_L.

See Also:
setReorderingMode(int), setReorderingOptions(int), INSERT_LRM_FOR_NUMERIC, REMOVE_BIDI_CONTROLS, OPTION_REMOVE_CONTROLS, REORDER_RUNS_ONLY, REORDER_INVERSE_NUMBERS_AS_L, REORDER_INVERSE_LIKE_DIRECT, REORDER_INVERSE_FOR_NUMBERS_SPECIAL, Constant Field Values
Status:
Stable ICU 3.8.

OPTION_REMOVE_CONTROLS

public static final int OPTION_REMOVE_CONTROLS
Option bit for setReorderingOptions: remove Bidi control characters.

This option must be set or reset before calling setPara.

This option nullifies option OPTION_INSERT_MARKS. It inhibits option INSERT_LRM_FOR_NUMERIC in calls to method writeReordered() and it implies option REMOVE_BIDI_CONTROLS in calls to that method.
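
The interaction between OPTION_REMOVE_CONTROLS and OPTION_INSERT_MARKS can be pictured as plain bit arithmetic. The following is a minimal sketch, not ICU code: the numeric values are assumptions chosen for illustration (real code should use the constants on the Bidi class), and effectiveOptions is a hypothetical helper.

```java
class ReorderingOptionsSketch {
    // Assumed values for illustration only; use Bidi.OPTION_* in real code.
    static final int OPTION_DEFAULT = 0;
    static final int OPTION_INSERT_MARKS = 1;
    static final int OPTION_REMOVE_CONTROLS = 2;
    static final int OPTION_STREAMING = 4;

    /** Hypothetical helper: drops INSERT_MARKS whenever REMOVE_CONTROLS is requested. */
    static int effectiveOptions(int requested) {
        if ((requested & OPTION_REMOVE_CONTROLS) != 0) {
            return requested & ~OPTION_INSERT_MARKS;
        }
        return requested;
    }

    public static void main(String[] args) {
        int requested = OPTION_INSERT_MARKS | OPTION_REMOVE_CONTROLS;
        // Only the REMOVE_CONTROLS bit survives when both are requested.
        System.out.println(effectiveOptions(requested) == OPTION_REMOVE_CONTROLS);
    }
}
```

Other bits, such as the streaming bit, pass through this helper unchanged.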

See Also:
setReorderingMode(int), setReorderingOptions(int), OPTION_INSERT_MARKS, INSERT_LRM_FOR_NUMERIC, REMOVE_BIDI_CONTROLS, Constant Field Values
Status:
Stable ICU 3.8.

OPTION_STREAMING

public static final int OPTION_STREAMING
Option bit for setReorderingOptions: process the output as part of a stream to be continued.

This option must be set or reset before calling setPara.

This option specifies that the caller is interested in processing a large text object in parts. The results of the successive calls are expected to be concatenated by the caller. Only the call for the last part will have this option bit off.

When this option bit is on, setPara() may process less than the full source text in order to truncate the text at a meaningful boundary. The caller should call getProcessedLength() immediately after calling setPara() in order to determine how much of the source text has been processed. Source text beyond that length should be resubmitted in following calls to setPara. The processed length may be less than the length of the source text if a character preceding the last character of the source text constitutes a reasonable boundary (like a block separator) for text to be continued.
If the last character of the source text constitutes a reasonable boundary, the whole text will be processed at once.
If no such reasonable boundary exists anywhere in the source text, the processed length will be zero.
The caller should check for such an occurrence and do one of the following:

In all cases, this option should be turned off before processing the last part of the text.

When the OPTION_STREAMING option is used, it is recommended to call orderParagraphsLTR(true) before calling setPara() so that later paragraphs may be concatenated to previous paragraphs on the right.
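
The caller-side loop described above might look like the following sketch. It does not call ICU: processedLength() is a hypothetical stand-in for calling setPara() with OPTION_STREAMING on and then getProcessedLength(), and it assumes that a line feed is the only "reasonable boundary". When no boundary is found (processed length zero), the sketch carries the whole text over into a larger next submission, which is only one possible reaction.

```java
import java.util.ArrayList;
import java.util.List;

class StreamingLoopSketch {
    /**
     * Stand-in for setPara() followed by getProcessedLength() in streaming mode:
     * everything up to and including the last line feed counts as processed,
     * or nothing at all when the text contains no line feed.
     */
    static int processedLength(String text) {
        return text.lastIndexOf('\n') + 1;
    }

    /** Feeds the chunks in order and returns the pieces that were processed. */
    static List<String> feed(List<String> chunks) {
        List<String> processed = new ArrayList<>();
        String carry = ""; // text beyond the processed length, to be resubmitted
        for (int i = 0; i < chunks.size(); i++) {
            String text = carry + chunks.get(i);
            boolean lastPart = (i == chunks.size() - 1);
            // Streaming is turned off for the last part, so all of it is processed.
            int n = lastPart ? text.length() : processedLength(text);
            if (n > 0) {
                processed.add(text.substring(0, n));
            }
            carry = text.substring(n);
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> chunks = new ArrayList<>();
        chunks.add("ab\ncd");
        chunks.add("ef\ngh");
        chunks.add("ij");
        // Concatenating the processed pieces restores the original text.
        System.out.println(String.join("", feed(chunks)).equals("ab\ncdef\nghij"));
    }
}
```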

See Also:
setReorderingMode(int), setReorderingOptions(int), getProcessedLength(), Constant Field Values
Status:
Stable ICU 3.8.

CLASS_DEFAULT

public static final int CLASS_DEFAULT
Value returned by BidiClassifier when there is no need to override the standard Bidi class for a given code point.

See Also:
BidiClassifier, Constant Field Values
Status:
Stable ICU 3.8.

DIRECTION_LEFT_TO_RIGHT

public static final int DIRECTION_LEFT_TO_RIGHT
Constant indicating base direction is left-to-right.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

DIRECTION_RIGHT_TO_LEFT

public static final int DIRECTION_RIGHT_TO_LEFT
Constant indicating base direction is right-to-left.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

DIRECTION_DEFAULT_LEFT_TO_RIGHT

public static final int DIRECTION_DEFAULT_LEFT_TO_RIGHT
Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, the base direction is left-to-right.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.

DIRECTION_DEFAULT_RIGHT_TO_LEFT

public static final int DIRECTION_DEFAULT_RIGHT_TO_LEFT
Constant indicating that the base direction depends on the first strong directional character in the text according to the Unicode Bidirectional Algorithm. If no strong directional character is present, the base direction is right-to-left.

See Also:
Constant Field Values
Status:
Stable ICU 3.8.
Constructor Detail

Bidi

public Bidi()
Allocate a Bidi object. Such an object is initially empty. It is assigned the Bidi properties of a piece of text containing one or more paragraphs by setPara() or the Bidi properties of a line within a paragraph by setLine().

This object can be reused.

setPara() and setLine() will allocate additional memory for internal structures as necessary.

Status:
Stable ICU 3.8.

Bidi

public Bidi(int maxLength,
            int maxRunCount)
Allocate a Bidi object with preallocated memory for internal structures. This constructor provides a Bidi object like the default constructor but it also preallocates memory for internal structures according to the sizings supplied by the caller.

The preallocation can be limited to some of the internal memory by setting some values to 0 here. That means that if, e.g., maxRunCount cannot be reasonably predetermined and should not be set to maxLength (the only failproof value) to avoid wasting memory, then maxRunCount could be set to 0 here and the internal structures that are associated with it will be allocated on demand, just like with the default constructor.

Parameters:
maxLength - is the maximum text or line length that internal memory will be preallocated for. An attempt to associate this object with a longer text will fail, unless this value is 0, which leaves the allocation up to the implementation.
maxRunCount - is the maximum anticipated number of same-level runs that internal memory will be preallocated for. An attempt to access visual runs on an object that was not preallocated for as many runs as the text was actually resolved to will fail, unless this value is 0, which leaves the allocation up to the implementation.

The number of runs depends on the actual text and may be anywhere between 1 and maxLength. It is typically small.
Throws:
IllegalArgumentException - if maxLength or maxRunCount is less than 0
Status:
Stable ICU 3.8.

Bidi

public Bidi(String paragraph,
            int flags)
Create Bidi from the given paragraph of text and base direction.
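
As a rough illustration of how the direction flags affect the resolved base level, the sketch below uses the JDK's java.text.Bidi, which offers a constructor and DIRECTION_* flags of the same shape, so that it runs without the ICU4J jar. That the ICU class behaves identically for these calls is an assumption here, and baseLevel is a hypothetical helper.

```java
import java.text.Bidi;

class BaseDirectionSketch {
    /** Returns the base (paragraph) level resolved for the text under the given flag. */
    static int baseLevel(String paragraph, int flags) {
        return new Bidi(paragraph, flags).getBaseLevel();
    }

    public static void main(String[] args) {
        String mixed = "abc \u05D0\u05D1\u05D2"; // Latin letters followed by Hebrew letters
        String digits = "123";                   // contains no strong directional character

        // The first strong character is a Latin letter, so the default is not consulted.
        System.out.println(baseLevel(mixed, Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT));
        // Without any strong character, the requested default decides.
        System.out.println(baseLevel(digits, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT));
        System.out.println(baseLevel(digits, Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT));
    }
}
```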

Parameters:
paragraph - a paragraph of text
flags - a collection of flags that control the algorithm. The algorithm understands the flags DIRECTION_LEFT_TO_RIGHT, DIRECTION_RIGHT_TO_LEFT, DIRECTION_DEFAULT_LEFT_TO_RIGHT, and DIRECTION_DEFAULT_RIGHT_TO_LEFT. Other values are reserved.
See Also:
DIRECTION_LEFT_TO_RIGHT, DIRECTION_RIGHT_TO_LEFT, DIRECTION_DEFAULT_LEFT_TO_RIGHT, DIRECTION_DEFAULT_RIGHT_TO_LEFT
Status:
Stable ICU 3.8.

Bidi

public Bidi(AttributedCharacterIterator paragraph)
Create Bidi from the given paragraph of text.

The RUN_DIRECTION attribute in the text, if present, determines the base direction (left-to-right or right-to-left). If not present, the base direction is computed using the Unicode Bidirectional Algorithm, defaulting to left-to-right if there are no strong directional characters in the text. This attribute, if present, must be applied to all the text in the paragraph.

The BIDI_EMBEDDING attribute in the text, if present, represents embedding level information. Negative values from -1 to -62 indicate overrides at the absolute value of the level. Positive values from 1 to 62 indicate embeddings. Where values are zero or not defined, the base embedding level as determined by the base direction is assumed.

The NUMERIC_SHAPING attribute in the text, if present, converts European digits to other decimal digits before running the bidi algorithm. This attribute, if present, must be applied to all the text in the paragraph.

Note: this constructor calls setPara() internally.
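
The effect of the RUN_DIRECTION attribute can be sketched as follows. To stay runnable with only the JDK, the sketch uses java.text.Bidi, which has a constructor of the same shape; that the ICU class gives the same base levels is an assumption, and baseLevel is a hypothetical helper.

```java
import java.awt.font.TextAttribute;
import java.text.AttributedString;
import java.text.Bidi;

class RunDirectionSketch {
    /**
     * Resolves the base level of non-empty text. A null runDirection leaves the
     * attribute unset, so the base direction is computed from the text itself.
     */
    static int baseLevel(String text, Boolean runDirection) {
        AttributedString styled = new AttributedString(text);
        if (runDirection != null) {
            // Applied to the whole paragraph, as the description requires.
            styled.addAttribute(TextAttribute.RUN_DIRECTION, runDirection);
        }
        return new Bidi(styled.getIterator()).getBaseLevel();
    }

    public static void main(String[] args) {
        System.out.println(baseLevel("abc", null));
        System.out.println(baseLevel("abc", TextAttribute.RUN_DIRECTION_RTL));
    }
}
```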

Parameters:
paragraph - a paragraph of text with optional character and paragraph attribute information
Status:
Stable ICU 3.8.

Bidi

public Bidi(char[] text,
            int textStart,
            byte[] embeddings,
            int embStart,
            int paragraphLength,
            int flags)
Create Bidi from the given text, embedding, and direction information. The embeddings array may be null. If present, the values represent embedding level information. Negative values from -1 to -61 indicate overrides at the absolute value of the level. Positive values from 1 to 61 indicate embeddings. Where values are zero, the base embedding level as determined by the base direction is assumed.

Note: this constructor calls setPara() internally.
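
The sign convention for the embeddings array can be sketched with a few helpers. These are hypothetical and not part of the API; the bound of 61 is taken from the range stated above for this constructor.

```java
class EmbeddingValues {
    static final int MAX_LEVEL = 61; // upper bound stated for this constructor

    private static void checkRange(int level) {
        if (level < 1 || level > MAX_LEVEL) {
            throw new IllegalArgumentException("level out of range: " + level);
        }
    }

    /** Positive value: an embedding at the given level. */
    static byte embedding(int level) {
        checkRange(level);
        return (byte) level;
    }

    /** Negative value: an override at the absolute value of the level. */
    static byte override(int level) {
        checkRange(level);
        return (byte) -level;
    }

    static boolean isOverride(byte value) {
        return value < 0;
    }

    /** Level encoded by the value; 0 stands for the base embedding level. */
    static int levelOf(byte value) {
        return Math.abs(value);
    }

    public static void main(String[] args) {
        // Three characters: base level, embedding at level 2, override at level 1.
        byte[] embeddings = { 0, embedding(2), override(1) };
        for (byte value : embeddings) {
            System.out.println(levelOf(value) + (isOverride(value) ? " (override)" : ""));
        }
    }
}
```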

Parameters:
text - an array containing the paragraph of text to process.
textStart - the index into the text array of the start of the paragraph.
embeddings - an array containing embedding values for each character in the paragraph. This can be null, in which case it is assumed that there is no external embedding information.
embStart - the index into the embedding array of the start of the paragraph.
paragraphLength - the length of the paragraph in the text and embeddings arrays.
flags - a collection of flags that control the algorithm. The algorithm understands the flags DIRECTION_LEFT_TO_RIGHT, DIRECTION_RIGHT_TO_LEFT, DIRECTION_DEFAULT_LEFT_TO_RIGHT, and DIRECTION_DEFAULT_RIGHT_TO_LEFT. Other values are reserved.
Throws:
IllegalArgumentException - if the values in embeddings are not within the allowed range
See Also:
DIRECTION_LEFT_TO_RIGHT, DIRECTION_RIGHT_TO_LEFT, DIRECTION_DEFAULT_LEFT_TO_RIGHT, DIRECTION_DEFAULT_RIGHT_TO_LEFT
Status:
Stable ICU 3.8.
Method Detail

setInverse

public void setInverse(boolean isInverse)
Modify the operation of the Bidi algorithm such that it approximates an "inverse Bidi" algorithm. This method must be called before setPara().

The normal operation of the Bidi algorithm as described in the Unicode Standard Annex #9 is to take text stored in logical (keyboard, typing) order and to determine the reordering of it for visual rendering. Some legacy systems store text in visual order, and for operations with standard, Unicode-based algorithms, the text needs to be transformed to logical order. This is effectively the inverse algorithm of the described Bidi algorithm. Note that there is no standard algorithm for this "inverse Bidi" and that the current implementation provides only an approximation of "inverse Bidi".

With isInverse set to true, this method changes the behavior of some of the subsequent methods so that they can be used for the inverse Bidi algorithm. Specifically, runs of text with numeric characters will be treated in a special way and may need to be surrounded with LRM characters when they are written in reordered sequence.

Output runs should be retrieved using getVisualRun(). Since the actual input for "inverse Bidi" is visually ordered text and getVisualRun() gets the reordered runs, these are actually the runs of the logically ordered output.

Calling this method with argument isInverse set to true is equivalent to calling setReorderingMode with argument reorderingMode set to REORDER_INVERSE_NUMBERS_AS_L.
Calling this method with argument isInverse set to false is equivalent to calling setReorderingMode with argument reorderingMode set to REORDER_DEFAULT.

Parameters:
isInverse - specifies "forward" or "inverse" Bidi operation.
See Also:
setPara(java.lang.String, byte, byte[]), writeReordered(int), setReorderingMode(int), REORDER_INVERSE_NUMBERS_AS_L, REORDER_DEFAULT
Status:
Stable ICU 3.8.

isInverse

public boolean isInverse()
Is this Bidi object set to perform the inverse Bidi algorithm?

Note: calling this method after setting the reordering mode with setReorderingMode will return true if the reordering mode was set to REORDER_INVERSE_NUMBERS_AS_L, false for all other values.

Returns:
true if the Bidi object is set to perform the inverse Bidi algorithm by handling numbers as L.
See Also:
setInverse(boolean), setReorderingMode(int), REORDER_INVERSE_NUMBERS_AS_L
Status:
Stable ICU 3.8.

setReorderingMode

public void setReorderingMode(int reorderingMode)
Modify the operation of the Bidi algorithm such that it implements some variant of the basic Bidi algorithm or approximates an "inverse Bidi" algorithm, depending on the value of the "reordering mode". This method must be called before setPara(), and stays in effect until called again with a different argument.

The normal operation of the Bidi algorithm as described in the Unicode Standard Annex #9 is to take text stored in logical (keyboard, typing) order and to determine how to reorder it for visual rendering.

With the reordering mode set to a value other than REORDER_DEFAULT, this method changes the behavior of some of the subsequent methods in a way such that they implement an inverse Bidi algorithm or some other algorithm variants.

Some legacy systems store text in visual order, and for operations with standard, Unicode-based algorithms, the text needs to be transformed into logical order. This is effectively the inverse algorithm of the described Bidi algorithm. Note that there is no standard algorithm for this "inverse Bidi", so a number of variants are implemented here.

In other cases, it may be desirable to emulate some variant of the Logical to Visual algorithm (e.g. one used in MS Windows), or perform a Logical to Logical transformation.

  • When the Reordering Mode is set to REORDER_DEFAULT, the standard Bidi Logical to Visual algorithm is applied.
  • When the reordering mode is set to REORDER_NUMBERS_SPECIAL, the algorithm used to perform Bidi transformations when calling setPara should approximate the algorithm used in Microsoft Windows XP rather than strictly conform to the Unicode Bidi algorithm.
    The differences between the basic algorithm and the algorithm addressed by this option are as follows:
    • Within text at an even embedding level, the sequence "123AB" (where AB represent R or AL letters) is transformed to "123BA" by the Unicode algorithm and to "BA123" by the Windows algorithm.
    • Arabic-Indic numbers (AN) are handled by the Windows algorithm just like regular numbers (EN).
  • When the reordering mode is set to REORDER_GROUP_NUMBERS_WITH_R, numbers located between LTR text and RTL text are associated with the RTL text. For instance, an LTR paragraph with content "abc 123 DEF" (where upper case letters represent RTL characters) will be transformed to "abc FED 123" (and not "abc 123 FED"), "DEF 123 abc" will be transformed to "123 FED abc" and "123 FED abc" will be transformed to "DEF 123 abc". This makes the algorithm reversible and makes it useful when round trip (from visual to logical and back to visual) must be achieved without adding LRM characters. However, this is a variation from the standard Unicode Bidi algorithm.
    The source text should not contain Bidi control characters other than LRM or RLM.
  • When the reordering mode is set to REORDER_RUNS_ONLY, a "Logical to Logical" transformation must be performed:
    • If the default text level of the source text (argument paraLevel in setPara) is even, the source text will be handled as LTR logical text and will be transformed to the RTL logical text which has the same LTR visual display.
    • If the default level of the source text is odd, the source text will be handled as RTL logical text and will be transformed to the LTR logical text which has the same LTR visual display.
    This mode may be needed when logical text which is basically Arabic or Hebrew, with possible included numbers or phrases in English, has to be displayed as if it had an even embedding level (this can happen if the displaying application treats all text as if it was basically LTR).
    This mode may also be needed in the reverse case, when logical text which is basically English, with possible included phrases in Arabic or Hebrew, has to be displayed as if it had an odd embedding level.
    Both cases could be handled by adding LRE or RLE at the head of the text, if the display subsystem supports these formatting controls. If it does not, the problem may be handled by transforming the source text in this mode before displaying it, so that it will be displayed properly.
    The source text should not contain Bidi control characters other than LRM or RLM.
  • When the reordering mode is set to REORDER_INVERSE_NUMBERS_AS_L, an "inverse Bidi" algorithm is applied. Runs of text with numeric characters will be treated like LTR letters and may need to be surrounded with LRM characters when they are written in reordered sequence (the option INSERT_LRM_FOR_NUMERIC can be used with method writeReordered to this end). This mode is equivalent to calling setInverse() with argument isInverse set to true.
  • When the reordering mode is set to REORDER_INVERSE_LIKE_DIRECT, the "direct" Logical to Visual Bidi algorithm is used as an approximation of an "inverse Bidi" algorithm. This mode is similar to mode REORDER_INVERSE_NUMBERS_AS_L but is closer to the regular Bidi algorithm.
    For example, an LTR paragraph with the content "FED 123 456 CBA" (where upper case represents RTL characters) will be transformed to "ABC 456 123 DEF", as opposed to "DEF 123 456 ABC" with mode REORDER_INVERSE_NUMBERS_AS_L.
    When used in conjunction with option OPTION_INSERT_MARKS, this mode generally adds Bidi marks to the output significantly more sparingly than mode REORDER_INVERSE_NUMBERS_AS_L with option INSERT_LRM_FOR_NUMERIC in calls to writeReordered.
  • When the reordering mode is set to REORDER_INVERSE_FOR_NUMBERS_SPECIAL, the Logical to Visual Bidi algorithm used in Windows XP is used as an approximation of an "inverse Bidi" algorithm.
    For example, an LTR paragraph with the content "abc FED123" (where upper case represents RTL characters) will be transformed to "abc 123DEF".

In all the reordering modes specifying an "inverse Bidi" algorithm (i.e. those with a name starting with REORDER_INVERSE), output runs should be retrieved using getVisualRun(), and the output text with writeReordered(). The caller should keep in mind that in "inverse Bidi" modes the input is actually visually ordered text, so the reordered output returned by getVisualRun() or writeReordered() is actually the runs or character string of the logically ordered output.
For all the "inverse Bidi" modes, the source text should not contain Bidi control characters other than LRM or RLM.

Note that option OUTPUT_REVERSE of writeReordered has no useful meaning and should not be used in conjunction with any value of the reordering mode specifying "inverse Bidi" or with value REORDER_RUNS_ONLY.

Parameters:
reorderingMode - specifies the required variant of the Bidi algorithm.
See Also:
setInverse(boolean), setPara(java.lang.String, byte, byte[]), writeReordered(int), INSERT_LRM_FOR_NUMERIC, OUTPUT_REVERSE, REORDER_DEFAULT, REORDER_NUMBERS_SPECIAL, REORDER_GROUP_NUMBERS_WITH_R, REORDER_RUNS_ONLY, REORDER_INVERSE_NUMBERS_AS_L, REORDER_INVERSE_LIKE_DIRECT, REORDER_INVERSE_FOR_NUMBERS_SPECIAL
Status:
Stable ICU 3.8.

getReorderingMode

public int getReorderingMode()
What is the requested reordering mode for a given Bidi object?

Returns:
the current reordering mode of the Bidi object
See Also:
setReorderingMode(int)
Status:
Stable ICU 3.8.

setReorderingOptions

public void setReorderingOptions(int options)
Specify which of the reordering options should be applied during Bidi transformations.

Parameters:
options - A combination of zero or more of the following reordering options: OPTION_DEFAULT, OPTION_INSERT_MARKS, OPTION_REMOVE_CONTROLS, OPTION_STREAMING.
See Also:
getReorderingOptions(), OPTION_DEFAULT, OPTION_INSERT_MARKS, OPTION_REMOVE_CONTROLS, OPTION_STREAMING
Status:
Stable ICU 3.8.

getReorderingOptions

public int getReorderingOptions()
What are the reordering options applied to a given Bidi object?

Returns:
the current reordering options of the Bidi object
See Also:
setReorderingOptions(int)
Status:
Stable ICU 3.8.

setPara

public void setPara(String text,
                    byte paraLevel,
                    byte[] embeddingLevels)
Perform the Unicode Bidi algorithm. It is defined in the Unicode Standard Annex #9, version 13, also described in The Unicode Standard, Version 4.0.

This method takes a piece of plain text containing one or more paragraphs, with or without externally specified embedding levels from styled text and computes the left-right-directionality of each character.

If the entire text is all of the same directionality, then the method may not perform all the steps described by the algorithm, i.e., some levels may not be the same as if all steps were performed. This is not relevant for unidirectional text.
For example, in pure LTR text with numbers the numbers would get a resolved level of 2 higher than the surrounding text according to the algorithm. This implementation may set all resolved levels to the same value in such a case.

The text can be composed of multiple paragraphs. Occurrence of a block separator in the text terminates a paragraph, and whatever comes next starts a new paragraph. The exception to this rule is when a Carriage Return (CR) is followed by a Line Feed (LF). Both CR and LF are block separators, but in that case, the pair of characters is considered as terminating the preceding paragraph, and a new paragraph will be started by a character coming after the LF. Although the text is passed here as a String, it is stored internally as an array of characters. Therefore the documentation will refer to indexes of the characters in the text.
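
The paragraph-splitting rule, including the CR LF exception, can be sketched without ICU. The helper below is hypothetical and only illustrates the rule as stated; it assumes that the JDK's Character.getDirectionality classifies block separators in the same way, and it returns the limit (the index just after the last character) of each paragraph.

```java
import java.util.ArrayList;
import java.util.List;

class ParagraphLimitsSketch {
    static boolean isBlockSeparator(char c) {
        return Character.getDirectionality(c) == Character.DIRECTIONALITY_PARAGRAPH_SEPARATOR;
    }

    /** Returns the limit of each paragraph found in the text. */
    static List<Integer> paragraphLimits(String text) {
        List<Integer> limits = new ArrayList<>();
        int n = text.length();
        for (int i = 0; i < n; i++) {
            char c = text.charAt(i);
            if (!isBlockSeparator(c)) {
                continue;
            }
            // A CR followed by an LF terminates one paragraph, not two.
            if (c == '\r' && i + 1 < n && text.charAt(i + 1) == '\n') {
                i++;
            }
            limits.add(i + 1);
        }
        // Text after the last separator forms a final, unterminated paragraph.
        if (n > 0 && (limits.isEmpty() || limits.get(limits.size() - 1) != n)) {
            limits.add(n);
        }
        return limits;
    }

    public static void main(String[] args) {
        System.out.println(paragraphLimits("ab\r\ncd\nef")); // prints [4, 7, 9]
    }
}
```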

Parameters:
text - contains the text that the Bidi algorithm will be performed on. This text can be retrieved with getText() or getTextAsString().
paraLevel - specifies the default level for the text; it is typically 0 (LTR) or 1 (RTL). If the method shall determine the paragraph level from the text, then paraLevel can be set to either LEVEL_DEFAULT_LTR or LEVEL_DEFAULT_RTL; if the text contains multiple paragraphs, the paragraph level shall be determined separately for each paragraph; if a paragraph does not include any strongly typed character, then the desired default is used (0 for LTR or 1 for RTL). Any other value between 0 and MAX_EXPLICIT_LEVEL is also valid, with odd levels indicating RTL.
embeddingLevels - (in) may be used to preset the embedding and override levels, ignoring characters like LRE and PDF in the text. A level overrides the directional property of its corresponding (same index) character if the level has the LEVEL_OVERRIDE bit set.

Except for that bit, each value must satisfy paraLevel<=embeddingLevels[]<=MAX_EXPLICIT_LEVEL, with one exception: a level of zero may be specified for a paragraph separator even if paraLevel>0 when multiple paragraphs are submitted in the same call to setPara().

Caution: A reference to this array, not a copy of the levels, will be stored in the Bidi object; the embeddingLevels should not be modified to avoid unexpected results on subsequent Bidi operations. However, the setPara() and setLine() methods may modify some or all of the levels.

Note: the embeddingLevels array must have one entry for each character in text.
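
The constraints on embeddingLevels can be expressed as a pre-check that a caller might run before setPara(). This is a sketch under stated assumptions, not ICU code: the two constant values are assumed for illustration (real code should use Bidi.LEVEL_OVERRIDE and Bidi.MAX_EXPLICIT_LEVEL), isValid is a hypothetical helper, paraLevel is assumed to be a concrete level rather than one of the LEVEL_DEFAULT_XXX values, and paragraph separators are recognized with the JDK's Character.getDirectionality.

```java
class EmbeddingLevelCheck {
    // Assumed values for illustration only; use the Bidi constants in real code.
    static final int LEVEL_OVERRIDE = 0x80;
    static final int MAX_EXPLICIT_LEVEL = 61;

    static boolean isOverride(byte value) {
        return (value & LEVEL_OVERRIDE) != 0;
    }

    /** The level with the override bit masked out. */
    static int level(byte value) {
        return value & 0x7F;
    }

    static boolean isValid(char[] text, int paraLevel, byte[] embeddingLevels) {
        // One entry is needed for each character of the text.
        if (embeddingLevels.length < text.length) {
            return false;
        }
        for (int i = 0; i < text.length; i++) {
            int level = level(embeddingLevels[i]);
            boolean separator = Character.getDirectionality(text[i])
                    == Character.DIRECTIONALITY_PARAGRAPH_SEPARATOR;
            // Exception: level zero is allowed on a paragraph separator.
            if (level == 0 && separator) {
                continue;
            }
            if (level < paraLevel || level > MAX_EXPLICIT_LEVEL) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte overrideAtLevel1 = (byte) (1 | LEVEL_OVERRIDE);
        System.out.println(isValid("ab".toCharArray(), 0, new byte[] { 0, overrideAtLevel1 }));
    }
}
```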
Throws:
IllegalArgumentException - if the values in embeddingLevels are not within the allowed range
See Also:
LEVEL_DEFAULT_LTR, LEVEL_DEFAULT_RTL, LEVEL_OVERRIDE, MAX_EXPLICIT_LEVEL
Status:
Stable ICU 3.8.

setPara

public void setPara(char[] chars,
                    byte paraLevel,
                    byte[] embeddingLevels)
Perform the Unicode Bidi algorithm. It is defined in the Unicode Standard Annex #9, version 13, also described in The Unicode Standard, Version 4.0.

This method takes a piece of plain text containing one or more paragraphs, with or without externally specified embedding levels from styled text and computes the left-right-directionality of each character.

If the entire text is all of the same directionality, then the method may not perform all the steps described by the algorithm, i.e., some levels may not be the same as if all steps were performed. This is not relevant for unidirectional text.
For example, in pure LTR text with numbers the numbers would get a resolved level of 2 higher than the surrounding text according to the algorithm. This implementation may set all resolved levels to the same value in such a case.

The text can be composed of multiple paragraphs. Occurrence of a block separator in the text terminates a paragraph, and whatever comes next starts a new paragraph. The exception to this rule is when a Carriage Return (CR) is followed by a Line Feed (LF). Both CR and LF are block separators, but in that case, the pair of characters is considered as terminating the preceding paragraph, and a new paragraph will be started by a character coming after the LF. The text is stored internally as an array of characters. Therefore the documentation will refer to indexes of the characters in the text.

Parameters:
chars - contains the text that the Bidi algorithm will be performed on. This text can be retrieved with getText() or getTextAsString().
paraLevel - specifies the default level for the text; it is typically 0 (LTR) or 1 (RTL). If the method shall determine the paragraph level from the text, then paraLevel can be set to either LEVEL_DEFAULT_LTR or LEVEL_DEFAULT_RTL; if the text contains multiple paragraphs, the paragraph level shall be determined separately for each paragraph; if a paragraph does not include any strongly typed character, then the desired default is used (0 for LTR or 1 for RTL). Any other value between 0 and MAX_EXPLICIT_LEVEL is also valid, with odd levels indicating RTL.
embeddingLevels - (in) may be used to preset the embedding and override levels, ignoring characters like LRE and PDF in the text. A level overrides the directional property of its corresponding (same index) character if the level has the LEVEL_OVERRIDE bit set.

Except for that bit, each value must satisfy paraLevel<=embeddingLevels[]<=MAX_EXPLICIT_LEVEL, with one exception: a level of zero may be specified for a paragraph separator even if paraLevel>0 when multiple paragraphs are submitted in the same call to setPara().

Caution: A reference to this array, not a copy of the levels, will be stored in the Bidi object; the embeddingLevels should not be modified to avoid unexpected results on subsequent Bidi operations. However, the setPara() and setLine() methods may modify some or all of the levels.

Note: the embeddingLevels array must have one entry for each character in text.
Throws:
IllegalArgumentException - if the values in embeddingLevels are not within the allowed range
See Also:
LEVEL_DEFAULT_LTR, LEVEL_DEFAULT_RTL, LEVEL_OVERRIDE, MAX_EXPLICIT_LEVEL
Status:
Stable ICU 3.8.

setPara

public void setPara(AttributedCharacterIterator paragraph)
Perform the Unicode Bidi algorithm on a given paragraph, as defined in the Unicode Standard Annex #9, version 13, also described in The Unicode Standard, Version 4.0.

This method takes a paragraph of text and computes the left-right-directionality of each character. The text should not contain any Unicode block separators.

The RUN_DIRECTION attribute in the text, if present, determines the base direction (left-to-right or right-to-left). If not present, the base direction is computed using the Unicode Bidirectional Algorithm, defaulting to left-to-right if there are no strong directional characters in the text. This attribute, if present, must be applied to all the text in the paragraph.

The BIDI_EMBEDDING attribute in the text, if present, represents embedding level information. Negative values from -1 to -62 indicate overrides at the absolute value of the level. Positive values from 1 to 62 indicate embeddings. Where values are zero or not defined, the base embedding level as determined by the base direction is assumed.

The NUMERIC_SHAPING attribute in the text, if present, converts European digits to other decimal digits before running the bidi algorithm. This attribute, if present, must be applied to all the text in the paragraph.

If the entire text is all of the same directionality, then the method may not perform all the steps described by the algorithm, i.e., some levels may not be the same as if all steps were performed. This is not relevant for unidirectional text.
For example, in pure LTR text with numbers the numbers would get a resolved level of 2 higher than the surrounding text according to the algorithm. This implementation may set all resolved levels to the same value in such a case.

Parameters:
paragraph - a paragraph of text with optional character and paragraph attribute information
Status:
Stable ICU 3.8.

orderParagraphsLTR

public void orderParagraphsLTR(boolean ordarParaLTR)
Specify whether block separators must be allocated level zero, so that successive paragraphs will progress from left to right. This method must be called before setPara(). Paragraph separators (B) may appear in the text. Setting them to level zero means that all paragraph separators (including one possibly appearing in the last text position) are kept in the reordered text after the text that they follow in the source text. When this feature is not enabled, a paragraph separator at the last position of the text before reordering will go to the first position of the reordered text when the paragraph level is odd.

Parameters:
ordarParaLTR - specifies whether paragraph separators (B) must receive level 0, so that successive paragraphs progress from left to right.
See Also:
setPara(java.lang.String, byte, byte[])
Status:
Stable ICU 3.8.

isOrderParagraphsLTR

public boolean isOrderParagraphsLTR()
Is this Bidi object set to allocate level 0 to block separators so that successive paragraphs progress from left to right?

Returns:
true if the Bidi object is set to allocate level 0 to block separators.
Status:
Stable ICU 3.8.

getDirection

public byte getDirection()
Get the directionality of the text.

Returns:
a value of LTR, RTL or MIXED that indicates if the entire text represented by this object is unidirectional, and which direction, or if it is mixed-directional.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
LTR, RTL, MIXED
Status:
Stable ICU 3.8.

getTextAsString

public String getTextAsString()
Get the text.

Returns:
A String containing the text that the Bidi object was created for.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
setPara(java.lang.String, byte, byte[]), setLine(int, int)
Status:
Stable ICU 3.8.

getText

public char[] getText()
Get the text.

Returns:
A char array containing the text that the Bidi object was created for.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
setPara(java.lang.String, byte, byte[]), setLine(int, int)
Status:
Stable ICU 3.8.

getLength

public int getLength()
Get the length of the text.

Returns:
The length of the text that the Bidi object was created for.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getProcessedLength

public int getProcessedLength()
Get the length of the source text processed by the last call to setPara(). This length may be different from the length of the source text if option OPTION_STREAMING has been set.
Note that whenever the length of the text affects the execution or the result of a method, it is the processed length which must be considered, except for setPara (which receives unprocessed source text) and getLength (which returns the original length of the source text).
In particular, the processed length is the one to consider in the following cases:
  • maximum value of the limit argument of setLine
  • maximum value of the charIndex argument of getParagraph
  • maximum value of the charIndex argument of getLevelAt
  • number of elements in the array returned by getLevels
  • maximum value of the logicalStart argument of getLogicalRun
  • maximum value of the logicalIndex argument of getVisualIndex
  • number of elements returned by getLogicalMap
  • length of text processed by writeReordered

Returns:
The length of the part of the source text processed by the last call to setPara.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
setPara(java.lang.String, byte, byte[]), OPTION_STREAMING
Status:
Stable ICU 3.8.

getResultLength

public int getResultLength()
Get the length of the reordered text resulting from the last call to setPara(). This length may be different from the length of the source text if option OPTION_INSERT_MARKS or option OPTION_REMOVE_CONTROLS has been set.
This resulting length is the one to consider in the following cases:
  • maximum value of the visualIndex argument of getLogicalIndex
  • number of elements returned by getVisualMap
Note that this length stays identical to the source text length if Bidi marks are inserted or removed using option bits of writeReordered, or if option REORDER_INVERSE_NUMBERS_AS_L has been set.

Returns:
The length of the reordered text resulting from the last call to setPara.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
setPara(java.lang.String, byte, byte[]), OPTION_INSERT_MARKS, OPTION_REMOVE_CONTROLS, REORDER_INVERSE_NUMBERS_AS_L
Status:
Stable ICU 3.8.

getParaLevel

public byte getParaLevel()
Get the paragraph level of the text.

Returns:
The paragraph level. If there are multiple paragraphs, their level may vary if the required paraLevel is LEVEL_DEFAULT_LTR or LEVEL_DEFAULT_RTL. In that case, the level of the first paragraph is returned.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
LEVEL_DEFAULT_LTR, LEVEL_DEFAULT_RTL, getParagraph(int), getParagraphByIndex(int)
Status:
Stable ICU 3.8.

countParagraphs

public int countParagraphs()
Get the number of paragraphs.

Returns:
The number of paragraphs.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getParagraphByIndex

public BidiRun getParagraphByIndex(int paraIndex)
Get a paragraph, given the index of this paragraph. This method returns information about a paragraph.

Parameters:
paraIndex - is the number of the paragraph, in the range [0..countParagraphs()-1].
Returns:
a BidiRun object with the details of the paragraph:
start will receive the index of the first character of the paragraph in the text.
limit will receive the limit of the paragraph.
embeddingLevel will receive the level of the paragraph.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if paraIndex is not in the range [0..countParagraphs()-1]
See Also:
BidiRun
Status:
Stable ICU 3.8.

getParagraph

public BidiRun getParagraph(int charIndex)
Get a paragraph, given a position within the text. This method returns information about a paragraph.
Note: if the paragraph index is known, it is more efficient to retrieve the paragraph information using getParagraphByIndex().

Parameters:
charIndex - is the index of a character within the text, in the range [0..getProcessedLength()-1].
Returns:
a BidiRun object with the details of the paragraph:
start will receive the index of the first character of the paragraph in the text.
limit will receive the limit of the paragraph.
embeddingLevel will receive the level of the paragraph.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if charIndex is not within the legal range
See Also:
BidiRun, getParagraphByIndex(int), getProcessedLength()
Status:
Stable ICU 3.8.

getParagraphIndex

public int getParagraphIndex(int charIndex)
Get the index of a paragraph, given a position within the text.

Parameters:
charIndex - is the index of a character within the text, in the range [0..getProcessedLength()-1].
Returns:
The index of the paragraph containing the specified position, starting from 0.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if charIndex is not within the legal range
See Also:
BidiRun, getProcessedLength()
Status:
Stable ICU 3.8.

setCustomClassifier

public void setCustomClassifier(BidiClassifier classifier)
Set a custom Bidi classifier used by the UBA implementation for Bidi class determination.

Parameters:
classifier - A new custom classifier. This can be null.
See Also:
getCustomClassifier()
Status:
Stable ICU 3.8.

getCustomClassifier

public BidiClassifier getCustomClassifier()
Gets the current custom classifier used for Bidi class determination.

Returns:
An instance of class BidiClassifier
See Also:
setCustomClassifier(com.ibm.icu.text.BidiClassifier)
Status:
Stable ICU 3.8.

getCustomizedClass

public int getCustomizedClass(int c)
Retrieves the Bidi class for a given code point.

If a BidiClassifier is defined and returns a value other than CLASS_DEFAULT, that value is used; otherwise the default class determination mechanism is invoked.

Parameters:
c - The code point to get a Bidi class for.
Returns:
The Bidi class for the character c that is in effect for this Bidi instance.
See Also:
BidiClassifier
Status:
Stable ICU 3.8.

setLine

public Bidi setLine(int start,
                    int limit)
setLine() returns a Bidi object to contain the reordering information, especially the resolved levels, for all the characters in a line of text. This line of text is specified by referring to a Bidi object representing this information for a piece of text containing one or more paragraphs, and by specifying a range of indexes in this text.

In the new line object, the indexes will range from 0 to limit-start-1.

This is used after calling setPara() for a piece of text, and after line-breaking on that text. It is not necessary if each paragraph is treated as a single line.

After line-breaking, rules (L1) and (L2) for the treatment of trailing WS and for reordering are performed on a Bidi object that represents a line.

Important: the line Bidi object may reference data within the global text Bidi object. You should not alter the content of the global text object until you are finished using the line object.

Parameters:
start - is the line's first index into the text.
limit - is just behind the line's last index into the text (its last index +1).
Returns:
a Bidi object that will now represent a line of the text.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara
IllegalArgumentException - if start and limit are not in the range 0<=start<limit<=getProcessedLength(), or if the specified line crosses a paragraph boundary
See Also:
setPara(java.lang.String, byte, byte[]), getProcessedLength()
Status:
Stable ICU 3.8.

getLevelAt

public byte getLevelAt(int charIndex)
Get the level for one character.

Parameters:
charIndex - the index of a character.
Returns:
The level for the character at charIndex.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if charIndex is not in the range 0<=charIndex<getProcessedLength()
See Also:
getProcessedLength()
Status:
Stable ICU 3.8.

getLevels

public byte[] getLevels()
Get an array of levels for each character.

Note that this method may allocate memory under some circumstances, unlike getLevelAt().

Returns:
The levels array for the text, or null if an error occurs.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getLogicalRun

public BidiRun getLogicalRun(int logicalPosition)
Get a logical run. This method returns information about a run and is used to retrieve runs in logical order.

This is especially useful for line-breaking on a paragraph.

Parameters:
logicalPosition - is a logical position within the source text.
Returns:
a BidiRun object filled with start containing the index of the first character of the run, limit containing the limit of the run, and embeddingLevel containing the level of the run.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if logicalPosition is not in the range 0<=logicalPosition<getProcessedLength()
See Also:
BidiRun, BidiRun.getStart(), BidiRun.getLimit(), BidiRun.getEmbeddingLevel()
Status:
Stable ICU 3.8.

countRuns

public int countRuns()
Get the number of runs. This method may invoke the actual reordering on the Bidi object, after setPara() may have resolved only the levels of the text. Therefore, countRuns() may have to allocate memory, and may throw an exception if it fails to do so.

Returns:
The number of runs.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getVisualRun

public BidiRun getVisualRun(int runIndex)
Get a BidiRun object according to its index. BidiRun methods may be used to retrieve the run's logical start, length and level, which can be even for an LTR run or odd for an RTL run. In an RTL run, the character at the logical start is visually on the right of the displayed run. The length is the number of characters in the run.

countRuns() is normally called before the runs are retrieved.

Example:

  Bidi bidi = new Bidi();
  String text = "abc 123 DEFG xyz";
  bidi.setPara(text, Bidi.RTL, null);
  int i, count=bidi.countRuns(), logicalStart, visualIndex=0, length;
  BidiRun run;
  for (i = 0; i < count; ++i) {
      run = bidi.getVisualRun(i);
      logicalStart = run.getStart();
      length = run.getLength();
      if ((run.getEmbeddingLevel() & 1) == 0) { // even level (0, 2, ...): LTR run
          do { // LTR
              show_char(text.charAt(logicalStart++), visualIndex++);
          } while (--length > 0);
      } else {
          logicalStart += length;  // logicalLimit
          do { // RTL
              show_char(text.charAt(--logicalStart), visualIndex++);
          } while (--length > 0);
      }
  }
 

Note that in right-to-left runs, code like this places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters.

Use of writeReordered(int), optionally with the KEEP_BASE_COMBINING option, can be considered in order to avoid these issues.

Parameters:
runIndex - is the number of the run in visual order, in the range [0..countRuns()-1].
Returns:
a BidiRun object containing the details of the run. The directionality of the run is LTR==0 or RTL==1, never MIXED.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if runIndex is not in the range 0<=runIndex<countRuns()
See Also:
countRuns(), BidiRun, BidiRun.getStart(), BidiRun.getLength(), BidiRun.getEmbeddingLevel()
Status:
Stable ICU 3.8.

getVisualIndex

public int getVisualIndex(int logicalIndex)
Get the visual position from a logical text position. If such a mapping is used many times on the same Bidi object, then calling getLogicalMap() is more efficient.

The value returned may be MAP_NOWHERE if there is no visual position because the corresponding text character is a Bidi control removed from output by the option OPTION_REMOVE_CONTROLS.

When the visual output is altered by using options of writeReordered() such as INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, the visual position returned may not be correct. It is advised to use, when possible, reordering options such as OPTION_INSERT_MARKS and OPTION_REMOVE_CONTROLS.

Note that in right-to-left runs, this mapping places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters. Use of writeReordered(int), optionally with the KEEP_BASE_COMBINING option can be considered instead of using the mapping, in order to avoid these issues.

Parameters:
logicalIndex - is the index of a character in the text.
Returns:
The visual position of this character.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if logicalIndex is not in the range 0<=logicalIndex<getProcessedLength()
See Also:
getLogicalMap(), getLogicalIndex(int), getProcessedLength(), MAP_NOWHERE, OPTION_REMOVE_CONTROLS, writeReordered(int)
Status:
Stable ICU 3.8.

getLogicalIndex

public int getLogicalIndex(int visualIndex)
Get the logical text position from a visual position. If such a mapping is used many times on the same Bidi object, then calling getVisualMap() is more efficient.

The value returned may be MAP_NOWHERE if there is no logical position because the corresponding text character is a Bidi mark inserted in the output by option OPTION_INSERT_MARKS.

This is the inverse method to getVisualIndex().

When the visual output is altered by using options of writeReordered() such as INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, the logical position returned may not be correct. It is advised to use, when possible, reordering options such as OPTION_INSERT_MARKS and OPTION_REMOVE_CONTROLS.

Parameters:
visualIndex - is the visual position of a character.
Returns:
The index of this character in the text.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if visualIndex is not in the range 0<=visualIndex<getResultLength()
See Also:
getVisualMap(), getVisualIndex(int), getResultLength(), MAP_NOWHERE, OPTION_INSERT_MARKS, writeReordered(int)
Status:
Stable ICU 3.8.

getLogicalMap

public int[] getLogicalMap()
Get a logical-to-visual index map (array) for the characters in the Bidi (paragraph or line) object.

Some values in the map may be MAP_NOWHERE if the corresponding text characters are Bidi controls removed from the visual output by the option OPTION_REMOVE_CONTROLS.

When the visual output is altered by using options of writeReordered() such as INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, the visual positions returned may not be correct. It is advised to use, when possible, reordering options such as OPTION_INSERT_MARKS and OPTION_REMOVE_CONTROLS.

Note that in right-to-left runs, this mapping places second surrogates before first ones (which is generally a bad idea) and combining characters before base characters. Use of writeReordered(int), optionally with the KEEP_BASE_COMBINING option can be considered instead of using the mapping, in order to avoid these issues.

Returns:
an array of getProcessedLength() indexes which will reflect the reordering of the characters.

The index map will result in indexMap[logicalIndex]==visualIndex, where indexMap represents the returned array.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
getVisualMap(), getVisualIndex(int), getProcessedLength(), MAP_NOWHERE, OPTION_REMOVE_CONTROLS, writeReordered(int)
Status:
Stable ICU 3.8.
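
The relation indexMap[logicalIndex]==visualIndex can be illustrated with a small plain-Java sketch that applies such a map to a string. This is not ICU code: the class name, the hand-written map and the NOWHERE constant (a stand-in for MAP_NOWHERE, assumed here to be -1) are illustrative assumptions. With ICU4J, the map would come from getLogicalMap().

```java
class LogicalMapSketch {
    // Stand-in for Bidi.MAP_NOWHERE; the value -1 is an assumption of this sketch.
    static final int NOWHERE = -1;

    // Places each character at the visual position given by a logical-to-visual map.
    static String toVisual(String text, int[] logicalToVisual) {
        int visualLength = 0;
        for (int v : logicalToVisual) {
            if (v != NOWHERE) {
                visualLength = Math.max(visualLength, v + 1);
            }
        }
        char[] out = new char[visualLength];
        for (int logical = 0; logical < logicalToVisual.length; logical++) {
            int visual = logicalToVisual[logical];
            if (visual != NOWHERE) {
                out[visual] = text.charAt(logical); // indexMap[logical] == visual
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // Hypothetical map for levels {0,0,1,1,1,0}: the middle three characters are reversed.
        System.out.println(toVisual("abXYZc", new int[] {0, 1, 4, 3, 2, 5}));
    }
}
```

As the note above says, a char-by-char mapping like this would split surrogate pairs and separate combining marks from their bases inside right-to-left runs, so it is only suitable for simple text.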

getVisualMap

public int[] getVisualMap()
Get a visual-to-logical index map (array) for the characters in the Bidi (paragraph or line) object.

Some values in the map may be MAP_NOWHERE if the corresponding text characters are Bidi marks inserted in the visual output by the option OPTION_INSERT_MARKS.

When the visual output is altered by using options of writeReordered() such as INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, the logical positions returned may not be correct. It is advised to use, when possible, reordering options such as OPTION_INSERT_MARKS and OPTION_REMOVE_CONTROLS.

Returns:
an array of getResultLength() indexes which will reflect the reordering of the characters.

The index map will result in indexMap[visualIndex]==logicalIndex, where indexMap represents the returned array.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
getLogicalMap(), getLogicalIndex(int), getResultLength(), MAP_NOWHERE, OPTION_INSERT_MARKS, writeReordered(int)
Status:
Stable ICU 3.8.

reorderLogical

public static int[] reorderLogical(byte[] levels)
This is a convenience method that does not use a Bidi object. It is intended for cases where an application has already determined the levels of objects (character sequences) and only needs to have them reordered (L2). This is equivalent to using getLogicalMap() on a Bidi object.

Parameters:
levels - is an array of levels that have been determined by the application.
Returns:
an array of levels.length indexes which will reflect the reordering of the characters.

The index map will result in indexMap[logicalIndex]==visualIndex, where indexMap represents the returned array.

Status:
Stable ICU 3.8.

reorderVisual

public static int[] reorderVisual(byte[] levels)
This is a convenience method that does not use a Bidi object. It is intended for cases where an application has already determined the levels of objects (character sequences) and only needs to have them reordered (L2). This is equivalent to using getVisualMap() on a Bidi object.

Parameters:
levels - is an array of levels that have been determined by the application.
Returns:
an array of levels.length indexes which will reflect the reordering of the characters.

The index map will result in indexMap[visualIndex]==logicalIndex, where indexMap represents the returned array.

Status:
Stable ICU 3.8.
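
To make the reordering step (L2) concrete, here is a minimal plain-Java sketch of that rule: from the highest level down to the lowest odd level, every contiguous sequence at that level or higher is reversed. It is an illustration of the rule as stated in UAX #9, not ICU's implementation, and the class and method names are hypothetical. The result has the same shape as the array described above, indexMap[visualIndex]==logicalIndex.

```java
class ReorderVisualSketch {
    // Builds a visual-to-logical index map from resolved levels using rule L2.
    static int[] visualToLogical(byte[] levels) {
        int n = levels.length;
        int[] map = new int[n];
        int max = 0;
        int minOdd = Integer.MAX_VALUE; // stays MAX_VALUE when there is no odd level
        for (int i = 0; i < n; i++) {
            map[i] = i;
            if (levels[i] > max) max = levels[i];
            if ((levels[i] & 1) != 0 && levels[i] < minOdd) minOdd = levels[i];
        }
        for (int level = max; level >= minOdd; level--) {
            int i = 0;
            while (i < n) {
                if (levels[map[i]] < level) {
                    i++;
                    continue;
                }
                // Find the end of the sequence at this level or higher, then reverse it.
                int j = i;
                while (j + 1 < n && levels[map[j + 1]] >= level) j++;
                for (int lo = i, hi = j; lo < hi; lo++, hi--) {
                    int t = map[lo];
                    map[lo] = map[hi];
                    map[hi] = t;
                }
                i = j + 1;
            }
        }
        return map;
    }

    public static void main(String[] args) {
        int[] map = visualToLogical(new byte[] {0, 0, 1, 1, 1, 0});
        System.out.println(java.util.Arrays.toString(map));
    }
}
```

With ICU4J on the classpath, Bidi.reorderVisual(levels) is the call documented above for this purpose; inverting its result gives the logical-to-visual map that reorderLogical returns.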

invertMap

public static int[] invertMap(int[] srcMap)
Invert an index map. The index mapping of the argument map is inverted and returned as an array of indexes that we will call the inverse map.

Parameters:
srcMap - is an array whose elements define the original mapping from a source array to a destination array. Some elements of the source array may have no mapping in the destination array. In that case, their value will be the special value MAP_NOWHERE. All elements must be >=0 or equal to MAP_NOWHERE. Some elements in the source map may have a value greater than the srcMap.length if the destination array has more elements than the source array. There must be no duplicate indexes (two or more elements with the same value except MAP_NOWHERE).
Returns:
an array representing the inverse map. This array has a number of elements equal to 1 + the highest value in srcMap. If the element with index i in srcMap has a value k different from MAP_NOWHERE, this means that element i of the source array maps to element k in the destination array, and the inverse map will have value i in its k-th element. For all elements of the destination array which do not map to an element in the source array, the corresponding element in the inverse map will have a value equal to MAP_NOWHERE.
See Also:
MAP_NOWHERE
Status:
Stable ICU 3.8.
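
The contract described above can be sketched in a few lines of plain Java. This is an illustration of the documented behaviour, not ICU's source; the class name is hypothetical and NOWHERE is a stand-in for MAP_NOWHERE (assumed here to be -1).

```java
import java.util.Arrays;

class InvertMapSketch {
    // Stand-in for Bidi.MAP_NOWHERE; the value -1 is an assumption of this sketch.
    static final int NOWHERE = -1;

    static int[] invert(int[] srcMap) {
        int max = -1;
        for (int v : srcMap) {
            if (v > max) max = v;
        }
        // The inverse map has 1 + (highest value in srcMap) elements.
        int[] inverse = new int[max + 1];
        Arrays.fill(inverse, NOWHERE);
        for (int i = 0; i < srcMap.length; i++) {
            if (srcMap[i] != NOWHERE) {
                inverse[srcMap[i]] = i; // source element i maps to destination element k
            }
        }
        return inverse;
    }

    public static void main(String[] args) {
        // Source element 1 has no destination; destination elements 1 and 3 have no source.
        System.out.println(Arrays.toString(invert(new int[] {2, NOWHERE, 0, 4})));
    }
}
```

With ICU4J, the documented call is Bidi.invertMap(srcMap).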

createLineBidi

public Bidi createLineBidi(int lineStart,
                           int lineLimit)
Create a Bidi object representing the bidi information on a line of text within the paragraph represented by the current Bidi. This call is not required if the entire paragraph fits on one line.

Parameters:
lineStart - the offset from the start of the paragraph to the start of the line.
lineLimit - the offset from the start of the paragraph to the limit of the line.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara
IllegalArgumentException - if lineStart and lineLimit are not in the range 0<=lineStart<lineLimit<=getProcessedLength(), or if the specified line crosses a paragraph boundary
Status:
Stable ICU 3.8.
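
The following sketch shows the intended call pattern: analyse a paragraph once, then create one line object per line after line-breaking. To stay runnable without ICU4J, it uses the JDK class java.text.Bidi, which has a createLineBidi method with the same signature. Swapping the import to com.ibm.icu.text.Bidi is expected to work the same way, but that is an assumption of this sketch. The line break positions are hypothetical.

```java
import java.text.Bidi;

class LineBidiSketch {
    // Analyses the paragraph with an LTR base direction and returns the Bidi for one line.
    static Bidi line(String paragraph, int lineStart, int lineLimit) {
        Bidi para = new Bidi(paragraph, Bidi.DIRECTION_LEFT_TO_RIGHT);
        return para.createLineBidi(lineStart, lineLimit);
    }

    public static void main(String[] args) {
        String text = "abc \u05D0\u05D1\u05D2 def"; // Latin, Hebrew, Latin
        Bidi first = line(text, 0, 4);   // "abc "
        Bidi second = line(text, 4, 11); // Hebrew followed by " def"
        System.out.println(first.isLeftToRight());
        System.out.println(second.getRunCount());
    }
}
```

Indexes reported by the line object are offsets from the start of the line, not of the paragraph.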

isMixed

public boolean isMixed()
Return true if the line is not left-to-right or right-to-left. This means it either has mixed runs of left-to-right and right-to-left text, or the base direction differs from the direction of the only run of text.

Returns:
true if the line is not left-to-right or right-to-left.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara
Status:
Stable ICU 3.8.

isLeftToRight

public boolean isLeftToRight()
Return true if the line is all left-to-right text and the base direction is left-to-right.

Returns:
true if the line is all left-to-right text and the base direction is left-to-right.
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara
Status:
Stable ICU 3.8.

isRightToLeft

public boolean isRightToLeft()
Return true if the line is all right-to-left text and the base direction is right-to-left.

Returns:
true if the line is all right-to-left text, and the base direction is right-to-left
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara
Status:
Stable ICU 3.8.
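
The three predicates isLeftToRight, isRightToLeft and isMixed are mutually exclusive, which the sketch below uses to classify a string. It relies on the JDK class java.text.Bidi, whose methods of the same names document the same behaviour; that the ICU class behaves identically is an assumption here. The helper name kind is hypothetical.

```java
import java.text.Bidi;

class DirectionKindSketch {
    // Classifies text as "LTR", "RTL" or "MIXED" for the given base-direction flag.
    static String kind(String text, int flags) {
        Bidi bidi = new Bidi(text, flags);
        if (bidi.isLeftToRight()) return "LTR";
        if (bidi.isRightToLeft()) return "RTL";
        return "MIXED"; // bidi.isMixed() is true in this case
    }

    public static void main(String[] args) {
        String hebrew = "\u05D0\u05D1\u05D2";
        System.out.println(kind("abc", Bidi.DIRECTION_LEFT_TO_RIGHT));
        System.out.println(kind(hebrew, Bidi.DIRECTION_RIGHT_TO_LEFT));
        System.out.println(kind("abc " + hebrew, Bidi.DIRECTION_LEFT_TO_RIGHT));
        // A single RTL run under an LTR base direction also counts as mixed.
        System.out.println(kind(hebrew, Bidi.DIRECTION_LEFT_TO_RIGHT));
    }
}
```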

baseIsLeftToRight

public boolean baseIsLeftToRight()
Return true if the base direction is left-to-right.

Returns:
true if the base direction is left-to-right
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getBaseLevel

public int getBaseLevel()
Return the base level (0 if left-to-right, 1 if right-to-left).

Returns:
the base level
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.
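
A short sketch of how the base level depends on the requested base direction, in particular for the "default" flags that fall back to a direction only when the text has no strong character. It uses the JDK class java.text.Bidi and its DIRECTION_* flags so that it runs without ICU4J; treating the ICU class as equivalent for these calls is an assumption.

```java
import java.text.Bidi;

class BaseLevelSketch {
    // Returns the base (paragraph) level resolved for the text under the given flag.
    static int baseLevel(String text, int flags) {
        return new Bidi(text, flags).getBaseLevel();
    }

    public static void main(String[] args) {
        // First strong character is Hebrew, so the default-LTR flag still yields level 1.
        System.out.println(baseLevel("\u05D0 abc", Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT));
        // No strong character at all: the fallback direction of the flag decides.
        System.out.println(baseLevel("123", Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT));
        System.out.println(baseLevel("123", Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT));
    }
}
```

baseIsLeftToRight() corresponds to an even base level.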

getRunCount

public int getRunCount()
Return the number of level runs.

Returns:
the number of level runs
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
Status:
Stable ICU 3.8.

getRunLevel

public int getRunLevel(int run)
Return the level of the nth logical run in this line.

Parameters:
run - the index of the run, between 0 and countRuns()-1
Returns:
the level of the run
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if run is not in the range 0<=run<countRuns()
Status:
Stable ICU 3.8.

getRunStart

public int getRunStart(int run)
Return the index of the character at the start of the nth logical run in this line, as an offset from the start of the line.

Parameters:
run - the index of the run, between 0 and countRuns()-1
Returns:
the start of the run
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if run is not in the range 0<=run<countRuns()
Status:
Stable ICU 3.8.

getRunLimit

public int getRunLimit(int run)
Return the index of the character past the end of the nth logical run in this line, as an offset from the start of the line. For example, this will return the length of the line for the last run on the line.

Parameters:
run - the index of the run, between 0 and countRuns()-1
Returns:
the limit of the run
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
IllegalArgumentException - if run is not in the range 0<=run<countRuns()
Status:
Stable ICU 3.8.
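
getRunCount, getRunStart, getRunLimit and getRunLevel are typically used together to walk the runs in logical order. The sketch below does this with the JDK class java.text.Bidi, which offers methods of the same names; that the ICU class reports the same runs is an assumption. The "start-limit@level" output format is purely illustrative.

```java
import java.text.Bidi;

class LogicalRunsSketch {
    // Describes each logical run as "start-limit@level", separated by spaces.
    static String describe(String text, int flags) {
        Bidi bidi = new Bidi(text, flags);
        StringBuilder sb = new StringBuilder();
        for (int run = 0; run < bidi.getRunCount(); run++) {
            if (run > 0) sb.append(' ');
            sb.append(bidi.getRunStart(run)).append('-')
              .append(bidi.getRunLimit(run)).append('@')
              .append(bidi.getRunLevel(run));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin, Hebrew, Latin under an LTR base direction: three runs are expected.
        System.out.println(describe("abc \u05D0\u05D1\u05D2 def", Bidi.DIRECTION_LEFT_TO_RIGHT));
    }
}
```

The limit of the last run equals the length of the line, as noted for getRunLimit above.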

requiresBidi

public static boolean requiresBidi(char[] text,
                                   int start,
                                   int limit)
Return true if the specified text requires bidi analysis. If this returns false, the text will display left-to-right. Clients can then avoid constructing a Bidi object. Text in the Arabic Presentation Forms area of Unicode is presumed to already be shaped and ordered for display, and so will not cause this method to return true.

Parameters:
text - the text containing the characters to test
start - the start of the range of characters to test
limit - the limit of the range of characters to test
Returns:
true if the range of characters requires bidi analysis
Status:
Stable ICU 3.8.
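
A typical use is a cheap pre-check before constructing a Bidi object. The sketch below uses the JDK's java.text.Bidi.requiresBidi, which has the same signature; the ICU method is assumed to answer the same way for these inputs. The wrapper name needsBidi is hypothetical.

```java
import java.text.Bidi;

class RequiresBidiSketch {
    // Returns true if any character in the string calls for bidi analysis.
    static boolean needsBidi(String s) {
        char[] chars = s.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        System.out.println(needsBidi("hello 123"));  // Latin letters and digits only
        System.out.println(needsBidi("abc \u05D0")); // contains a Hebrew letter
    }
}
```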

reorderVisually

public static void reorderVisually(byte[] levels,
                                   int levelStart,
                                   Object[] objects,
                                   int objectStart,
                                   int count)
Reorder the objects in the array into visual order based on their levels. This is a utility method to use when you have a collection of objects representing runs of text in logical order, each run containing text at a single level. The elements at indexes objectStart up to, but not including, objectStart + count in the objects array will be reordered into visual order, assuming each run of text has the level indicated by the corresponding element in the levels array (for the object at index i, the level at index i - objectStart + levelStart).

Parameters:
levels - an array representing the bidi level of each object
levelStart - the start position in the levels array
objects - the array of objects to be reordered into visual order
objectStart - the start position in the objects array
count - the number of objects to reorder
Status:
Stable ICU 3.8.
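
As an illustration, the sketch below reorders four placeholder run objects whose levels are already known. It calls the JDK's java.text.Bidi.reorderVisually, which has the same parameter list; equivalence with the ICU method is assumed. The run strings are placeholders, not real text.

```java
import java.text.Bidi;

class ReorderVisuallySketch {
    // Returns a copy of the runs in visual order; the input array is left untouched.
    static String[] toVisual(String[] runs, byte[] levels) {
        String[] copy = runs.clone();
        Bidi.reorderVisually(levels, 0, copy, 0, copy.length); // reorders in place
        return copy;
    }

    public static void main(String[] args) {
        String[] runs = {"abc ", "ALEF", "BET", " def"};
        byte[] levels = {0, 1, 1, 0}; // the two middle runs are right-to-left
        System.out.println(String.join("|", toVisual(runs, levels)));
    }
}
```

Only the order of the objects changes. The content of each run is not reversed, so a caller still has to render right-to-left runs from their logical end.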

writeReordered

public String writeReordered(int options)
Take a Bidi object containing the reordering information for a piece of text (one or more paragraphs) set by setPara() or for a line of text set by setLine() and return a string containing the reordered text.

The text may have been aliased (only a reference was stored without copying the contents), thus it must not have been modified since the setPara() call.

This method preserves the integrity of characters with multiple code units and (optionally) combining characters. Characters in RTL runs can be replaced by mirror-image characters in the returned string. Note that "real" mirroring has to be done in a rendering engine by glyph selection and that for many "mirrored" characters there are no Unicode characters as mirror-image equivalents. There are also options to insert or remove Bidi control characters; see the descriptions of the return value and the options parameter, and of the option bit flags.

Parameters:
options - A bit set of options for the reordering that control how the reordered text is written. The options include mirroring the characters on a code point basis and inserting LRM characters, which is used especially for transforming visually stored text to logically stored text (although this is still an imperfect implementation of an "inverse Bidi" algorithm because it uses the "forward Bidi" algorithm at its core). The available options are: DO_MIRRORING, INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, STREAMING
Returns:
The reordered text. If the INSERT_LRM_FOR_NUMERIC option is set, then the length of the returned string could be as large as getLength()+2*countRuns().
If the REMOVE_BIDI_CONTROLS option is set, then the length of the returned string may be less than getLength().
If none of these options is set, then the length of the returned string will be exactly getProcessedLength().
Throws:
IllegalStateException - if this call is not preceded by a successful call to setPara or setLine
See Also:
DO_MIRRORING, INSERT_LRM_FOR_NUMERIC, KEEP_BASE_COMBINING, OUTPUT_REVERSE, REMOVE_BIDI_CONTROLS, OPTION_STREAMING, getProcessedLength()
Status:
Stable ICU 3.8.

writeReverse

public static String writeReverse(String src,
                                  int options)
Reverse a Right-To-Left run of Unicode text. This method preserves the integrity of characters with multiple code units and (optionally) combining characters. Characters can be replaced by mirror-image characters in the destination buffer. Note that "real" mirroring has to be done in a rendering engine by glyph selection and that for many "mirrored" characters there are no Unicode characters as mirror-image equivalents. There are also options to insert or remove Bidi control characters. This method is the implementation for reversing RTL runs as part of writeReordered(); see that method for detailed descriptions of the parameters. Since no Bidi controls are inserted here, the output string length will never exceed src.length().

Parameters:
src - The RTL run text.
options - A bit set of options for the reordering that control how the reordered text is written. See the options parameter in writeReordered().
Returns:
The reordered text. If the REMOVE_BIDI_CONTROLS option is set, then the length of the returned string may be less than src.length(). If this option is not set, then the length of the returned string will be exactly src.length().
Throws:
IllegalArgumentException - if src is null.
See Also:
writeReordered(int)
Status:
Stable ICU 3.8.

getBaseDirection

public static byte getBaseDirection(CharSequence paragraph)
Get the base direction of the text provided according to the Unicode Bidirectional Algorithm. The base direction is derived from the first character in the string with bidirectional character type L, R, or AL. If the first such character has type L, LTR is returned. If the first such character has type R or AL, RTL is returned. If the string does not contain any character of these types, then NEUTRAL is returned. This is a lightweight function for use when only the base direction is needed and no further bidi processing of the text is needed.

Parameters:
paragraph - the text whose paragraph level direction is needed.
Returns:
LTR, RTL, NEUTRAL
See Also:
LTR, RTL, NEUTRAL
Status:
Draft ICU 4.6.
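
The first-strong rule described above can be sketched in plain Java with java.lang.Character.getDirectionality. This is an illustration of the rule, not ICU's implementation: the class, the Dir enum and the method name are hypothetical, and the JDK's character data may be based on a different Unicode version than ICU's.

```java
class BaseDirectionSketch {
    enum Dir { LTR, RTL, NEUTRAL }

    // Returns the direction of the first character with Bidi type L, R or AL.
    static Dir firstStrong(CharSequence text) {
        int i = 0;
        while (i < text.length()) {
            int cp = Character.codePointAt(text, i);
            switch (Character.getDirectionality(cp)) {
                case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                    return Dir.LTR;
                case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC:
                    return Dir.RTL;
                default:
                    break; // digits, spaces, punctuation: keep looking
            }
            i += Character.charCount(cp);
        }
        return Dir.NEUTRAL; // no strong character found
    }

    public static void main(String[] args) {
        System.out.println(firstStrong("123 abc"));
        System.out.println(firstStrong(" \u05D0 abc"));
        System.out.println(firstStrong("123 !?"));
    }
}
```

With ICU4J, the documented call is Bidi.getBaseDirection(paragraph), which returns LTR, RTL or NEUTRAL.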


Copyright (c) 2011 IBM Corporation and others.