Class BreakIteratorWrapper
java.lang.Object
org.apache.lucene.analysis.icu.segmentation.BreakIteratorWrapper
Wraps RuleBasedBreakIterator, making object reuse convenient and emitting a rule status for emoji
sequences.
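The reuse-plus-status pattern described above can be sketched with JDK classes only. This is an illustration, not the Lucene source: it swaps in `java.text.BreakIterator` for ICU's `RuleBasedBreakIterator`, and the class name `ReusableWordBreaker`, the `STATUS_*` constants, and the `words` helper are hypothetical. Only the shape follows the members documented on this page (`setText`, `current`, `next`, `getRuleStatus`, `calcStatus`).

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for the wrapper pattern, built on the JDK's BreakIterator. */
final class ReusableWordBreaker {
  static final int STATUS_NONE = 0; // hypothetical: segment has no letter or digit
  static final int STATUS_WORD = 1; // hypothetical: segment contains a letter or digit

  private final BreakIterator delegate = BreakIterator.getWordInstance(Locale.ROOT);
  private char[] text;
  private int start;
  private int status = STATUS_NONE;

  /** Points the same wrapper at a new slice of a buffer, so one instance can be reused. */
  void setText(char[] text, int start, int length) {
    this.text = text;
    this.start = start;
    // Copying into a String keeps the sketch short; offsets are relative to 'start'.
    delegate.setText(new String(text, start, length));
    status = STATUS_NONE;
  }

  int current() {
    return delegate.current();
  }

  int getRuleStatus() {
    return status;
  }

  /** Advances to the next boundary and records a status for the segment just passed. */
  int next() {
    int current = delegate.current();
    int next = delegate.next();
    status = calcStatus(current, next);
    return next;
  }

  /** Classifies the text between two boundaries (assumed rule: any letter or digit means WORD). */
  private int calcStatus(int current, int next) {
    if (next == BreakIterator.DONE) {
      return STATUS_NONE;
    }
    int end = start + next;
    for (int i = start + current; i < end; ) {
      int cp = Character.codePointAt(text, i, end);
      if (Character.isLetterOrDigit(cp)) {
        return STATUS_WORD;
      }
      i += Character.charCount(cp);
    }
    return STATUS_NONE;
  }

  /** Convenience helper for the demo: collects the segments whose status is WORD. */
  List<String> words(char[] buffer, int offset, int length) {
    setText(buffer, offset, length);
    List<String> out = new ArrayList<>();
    int from = current();
    for (int to = next(); to != BreakIterator.DONE; from = to, to = next()) {
      if (getRuleStatus() == STATUS_WORD) {
        out.add(new String(buffer, offset + from, to - from));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    ReusableWordBreaker breaker = new ReusableWordBreaker();
    System.out.println(breaker.words("xx hello world".toCharArray(), 3, 11));
    // The same instance is reused for a second buffer.
    System.out.println(breaker.words("one two".toCharArray(), 0, 7));
  }
}
```

Computing the status inside `next()` means the caller reads it from `getRuleStatus()` right after advancing, which mirrors the `status` field and accessor listed on this page.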
-
Field Summary
Fields
Modifier and Type    Field    Description
(package private) static final com.ibm.icu.text.UnicodeSet    EMOJI
(package private) static final com.ibm.icu.text.UnicodeSet    EMOJI_RK
private final com.ibm.icu.text.RuleBasedBreakIterator    rbbi
private int    start
private int    status
private char[]    text
private final CharArrayIterator    textIterator
-
Constructor Summary
Constructors
(package private) BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
-
Method Summary
Modifier and Type    Method    Description
private int    calcStatus(int current, int next)    Returns the current rule status for the text between breaks.
(package private) int    current()
(package private) int    getRuleStatus()
private boolean    isEmoji(int current, int next)    Returns true if the current text represents an emoji character or sequence.
(package private) int    next()
(package private) void    setText(char[] text, int start, int length)
-
Field Details
-
textIterator
private final CharArrayIterator textIterator -
rbbi
private final com.ibm.icu.text.RuleBasedBreakIterator rbbi -
text
private char[] text -
start
private int start -
status
private int status -
EMOJI_RK
static final com.ibm.icu.text.UnicodeSet EMOJI_RK -
EMOJI
static final com.ibm.icu.text.UnicodeSet EMOJI
-
-
Constructor Details
-
BreakIteratorWrapper
BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
-
-
Method Details
-
current
int current() -
getRuleStatus
int getRuleStatus() -
next
int next() -
calcStatus
private int calcStatus(int current, int next)
Returns the current rule status for the text between breaks (determines the token type). -
isEmoji
private boolean isEmoji(int current, int next)
Returns true if the current text represents an emoji character or sequence. -
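One plausible shape for such a check is sketched below using only JDK classes. Several things here are assumptions rather than facts from this page: the hard-coded code point ranges are a rough stand-in for the `EMOJI` set, and reading `EMOJI_RK` as "characters that count as emoji only when followed by a variation selector or keycap mark" is a guess at its purpose. The class name `EmojiCheckSketch` and its helpers are hypothetical.

```java
/** Hypothetical sketch of an emoji check over a char[] slice; not the Lucene implementation. */
final class EmojiCheckSketch {
  static final int VARIATION_SELECTOR_16 = 0xFE0F;      // emoji presentation selector
  static final int COMBINING_ENCLOSING_KEYCAP = 0x20E3; // keycap mark, as in "1" + U+FE0F + U+20E3

  /** Assumption: '#', '*', and ASCII digits need a trailing selector or keycap to count. */
  static boolean inRestrictedSet(int cp) {
    return cp == '#' || cp == '*' || (cp >= '0' && cp <= '9');
  }

  /** Assumption: a coarse approximation of an emoji set, not real Unicode property data. */
  static boolean inEmojiSet(int cp) {
    return inRestrictedSet(cp)
        || (cp >= 0x2600 && cp <= 0x27BF)    // miscellaneous symbols and dingbats
        || (cp >= 0x1F300 && cp <= 0x1FAFF); // main pictograph blocks
  }

  /** Returns true if text[begin, end) starts with an emoji or an emoji-forming sequence. */
  static boolean isEmoji(char[] text, int begin, int end) {
    if (begin >= end) {
      return false;
    }
    int cp = Character.codePointAt(text, begin, end);
    if (!inEmojiSet(cp)) {
      return false;
    }
    if (inRestrictedSet(cp)) {
      // Plain "1" or "#" is ordinary text; require evidence of an emoji sequence.
      int trailer = begin + Character.charCount(cp);
      return trailer < end
          && (text[trailer] == VARIATION_SELECTOR_16
              || text[trailer] == COMBINING_ENCLOSING_KEYCAP);
    }
    return true;
  }
}
```

A production check would consult Unicode property data (as the `UnicodeSet` fields on this page suggest) instead of fixed ranges; the ranges here only keep the sketch dependency-free.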
setText
void setText(char[] text, int start, int length)
-