Class CharsetEncoderICU

java.lang.Object
java.nio.charset.CharsetEncoder
com.ibm.icu.charset.CharsetEncoderICU
Direct Known Subclasses:
CharsetASCII.CharsetEncoderASCII, CharsetBOCU1.CharsetEncoderBOCU, CharsetCompoundText.CharsetEncoderCompoundText, CharsetHZ.CharsetEncoderHZ, CharsetISCII.CharsetEncoderISCII, CharsetISO2022.CharsetEncoderISO2022CN, CharsetISO2022.CharsetEncoderISO2022JP, CharsetISO2022.CharsetEncoderISO2022KR, CharsetLMBCS.CharsetEncoderLMBCS, CharsetMBCS.CharsetEncoderMBCS, CharsetSCSU.CharsetEncoderSCSU, CharsetUTF16.CharsetEncoderUTF16, CharsetUTF32.CharsetEncoderUTF32, CharsetUTF7.CharsetEncoderUTF7, CharsetUTF8.CharsetEncoderUTF8

public abstract class CharsetEncoderICU extends CharsetEncoder
An abstract class that provides framework methods for the encoding operations of concrete subclasses. In the future, this class will contain API that implements the converter semantics of ICU4C.
  • Field Details

    • MISSING_CHAR_MARKER

      static final char MISSING_CHAR_MARKER
    • errorBuffer

      byte[] errorBuffer
    • errorBufferLength

      int errorBufferLength
    • fromUnicodeStatus

      int fromUnicodeStatus
      Conversion state used by encodeLoopICU.
    • fromUChar32

      int fromUChar32
    • useSubChar1

      boolean useSubChar1
    • useFallback

      boolean useFallback
    • EXT_MAX_UCHARS

      static final int EXT_MAX_UCHARS
    • preFromUFirstCP

      int preFromUFirstCP
    • preFromUArray

      char[] preFromUArray
    • preFromUBegin

      int preFromUBegin
    • preFromULength

      int preFromULength
    • invalidUCharBuffer

      char[] invalidUCharBuffer
    • invalidUCharLength

      int invalidUCharLength
    • fromUContext

      Object fromUContext
    • onUnmappableInput

      private CharsetCallback.Encoder onUnmappableInput
    • onMalformedInput

      private CharsetCallback.Encoder onMalformedInput
    • fromCharErrorBehaviour

      CharsetCallback.Encoder fromCharErrorBehaviour
    • EMPTY

      private static final CharBuffer EMPTY
  • Constructor Details

    • CharsetEncoderICU

      CharsetEncoderICU(CharsetICU cs, byte[] replacement)
  • Method Details

    • isFallbackUsed

      public boolean isFallbackUsed()
      Is this Encoder allowed to use fallbacks? A fallback mapping is a mapping that will convert a Unicode codepoint sequence to a byte sequence, but the encoded byte sequence will round trip convert to a different Unicode codepoint sequence.
      Returns:
      true if the converter uses fallback, false otherwise.
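      The difference between a roundtrip mapping and a fallback mapping can be sketched with a toy converter. Everything in this block is hypothetical: the class name, the one-entry "tables", and the choice of U+FF21 (FULLWIDTH LATIN CAPITAL LETTER A) as a fallback to byte 0x41 are illustrative assumptions, not mappings taken from any ICU converter.

```java
// Toy converter with one roundtrip mapping and one fallback mapping.
// All mappings here are hypothetical and exist only for illustration.
class FallbackSketch {
    /** Returns the byte for codePoint, or -1 if it is unmappable. */
    static int encode(int codePoint, boolean useFallback) {
        if (codePoint == 0x0041) {
            return 0x41; // roundtrip mapping: 'A' <-> 0x41
        }
        if (useFallback && codePoint == 0xFF21) {
            return 0x41; // fallback mapping: fullwidth 'A' -> 0x41, one way only
        }
        return -1;
    }

    /** Decoding knows only the roundtrip mapping. */
    static int decode(int b) {
        return b == 0x41 ? 0x0041 : -1;
    }

    public static void main(String[] args) {
        int b = encode(0xFF21, true);
        // The byte decodes to U+0041, not back to the original U+FF21.
        System.out.printf("U+FF21 -> 0x%02X -> U+%04X%n", b, decode(b));
    }
}
```

      With fallbacks disabled, the same input is simply unmappable, which is the behaviour that isFallbackUsed() and setFallbackUsed(boolean) let the caller choose between.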
    • setFallbackUsed

      public void setFallbackUsed(boolean usesFallback)
      Sets whether this Encoder can use fallbacks.
      Parameters:
      usesFallback - true if the user wants the converter to take advantage of the fallback mapping, false otherwise.
    • isFromUUseFallback

      final boolean isFromUUseFallback(int c)
    • isFromUUseFallback

      static final boolean isFromUUseFallback(boolean iUseFallback, int c)
      Returns whether a fallback mapping from Unicode to the codepage should be used: true when iUseFallback is set or when c is a private-use code point.
    • isUnicodePrivateUse

      private static final boolean isUnicodePrivateUse(int c)
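      The rule described for isFromUUseFallback can be sketched as follows. This is a minimal illustration, not the class's actual code: the class and method names are hypothetical, and the private-use ranges are the ones assigned by the Unicode Standard (the BMP Private Use Area and planes 15 and 16), which may differ in edge cases from the check the class really performs.

```java
// Hypothetical stand-ins for isUnicodePrivateUse and isFromUUseFallback.
class FallbackPolicySketch {
    /** True for code points in the Unicode private-use ranges. */
    static boolean isPrivateUse(int c) {
        return (c >= 0xE000 && c <= 0xF8FF)        // BMP Private Use Area
            || (c >= 0xF0000 && c <= 0xFFFFD)      // Supplementary Private Use Area-A
            || (c >= 0x100000 && c <= 0x10FFFD);   // Supplementary Private Use Area-B
    }

    /** Fallbacks apply when enabled, and always for private-use code points. */
    static boolean useFallback(boolean fallbackEnabled, int c) {
        return fallbackEnabled || isPrivateUse(c);
    }
}
```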
    • implOnMalformedInput

      protected void implOnMalformedInput(CodingErrorAction newAction)
      Sets the action to be taken if an illegal sequence is encountered
      Overrides:
      implOnMalformedInput in class CharsetEncoder
      Parameters:
      newAction - action to be taken
      Throws:
      IllegalArgumentException
    • implOnUnmappableCharacter

      protected void implOnUnmappableCharacter(CodingErrorAction newAction)
      Sets the action to be taken if an unmappable character is encountered
      Overrides:
      implOnUnmappableCharacter in class CharsetEncoder
      Parameters:
      newAction - action to be taken
      Throws:
      IllegalArgumentException
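      Callers do not invoke the two impl methods directly. In java.nio.charset.CharsetEncoder, the public onMalformedInput and onUnmappableCharacter methods store the new action and then call the corresponding impl hook. The sketch below shows the effect of each action using the JDK's own US-ASCII encoder, as an assumption-free stand-in for an ICU encoder; the class name ErrorActionSketch is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ErrorActionSketch {
    /** Encodes text as US-ASCII; returns null if the encoder reports an error. */
    static String encodeAscii(String text, CodingErrorAction action) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)          // triggers implOnMalformedInput
                .onUnmappableCharacter(action);    // triggers implOnUnmappableCharacter
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            // Decode the bytes again so the result is easy to print.
            return StandardCharsets.US_ASCII.decode(out).toString();
        } catch (CharacterCodingException e) {
            return null; // REPORT turns the error into an exception
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // U+00E9 has no US-ASCII mapping
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE)); // caf?
        System.out.println(encodeAscii(text, CodingErrorAction.IGNORE));  // caf
        System.out.println(encodeAscii(text, CodingErrorAction.REPORT));  // null
    }
}
```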
    • setFromUCallback

      public final void setFromUCallback(CoderResult err, CharsetCallback.Encoder newCallback, Object newContext)
      Sets the callback encoder method and context to be used if an illegal sequence is encountered. You would normally call this twice, once for the malformed-input error and once for the unmappable-character error. In that case, newContext should be the same in both calls, since each call replaces the context set by the previous one.
      Parameters:
      err - CoderResult
      newCallback - CharsetCallback.Encoder
      newContext - Object
    • setFromUContext

      public final void setFromUContext(Object newContext)
      Sets fromUContext used in callbacks.
      Parameters:
      newContext - Object
      Throws:
      IllegalArgumentException - The object is an illegal argument for UContext.
    • getCallback

      private static CharsetCallback.Encoder getCallback(CodingErrorAction action)
    • implFlush

      protected CoderResult implFlush(ByteBuffer out)
      Flushes any characters saved in the converter's internal buffer and resets the converter.
      Overrides:
      implFlush in class CharsetEncoder
      Parameters:
      out - the buffer that receives the flushed output
      Returns:
      The result of the flush, which completes the encoding of all input. Returns CoderResult.UNDERFLOW if the action succeeds.
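      implFlush is reached through the public CharsetEncoder.flush(ByteBuffer) method, which is valid only after an encode call with endOfInput set to true. The sketch below shows that call order with the JDK's UTF-8 encoder standing in for an ICU encoder; the class name FlushSketch and the 16 bytes of headroom are hypothetical choices for the example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class FlushSketch {
    /** Encodes all of text in one pass, then flushes the encoder. */
    static byte[] encodeAll(String text, Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        int capacity = (int) Math.ceil(text.length() * encoder.maxBytesPerChar()) + 16;
        ByteBuffer out = ByteBuffer.allocate(capacity);

        // Step 1: final encode call, endOfInput = true.
        CoderResult result = encoder.encode(CharBuffer.wrap(text), out, true);
        // Step 2: flush writes any bytes still held in the encoder's state.
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            throw new IllegalStateException("encoding failed: " + result);
        }

        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }
}
```

      For stateless charsets such as UTF-8 the flush writes nothing, but stateful encodings may need it to emit a closing sequence, so the call should not be skipped.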
    • implReset

      protected void implReset()
      Resets the from-Unicode state of the converter.
      Overrides:
      implReset in class CharsetEncoder
    • fromUnicodeReset

      private void fromUnicodeReset()
    • encodeLoop

      protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out)
      Encodes one or more chars. By default, the converter stops and reports the error when it encounters one in the input stream. To set a different behaviour, see CharsetEncoder.onMalformedInput(CodingErrorAction).
      Specified by:
      encodeLoop in class CharsetEncoder
      Parameters:
      in - buffer of chars to encode
      out - buffer to populate with the encoded result
      Returns:
      Result of the encoding action. Returns CoderResult.UNDERFLOW if the encoding action succeeds or if more input is needed to complete it.
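      encodeLoop is invoked by the public CharsetEncoder.encode(CharBuffer, ByteBuffer, boolean) method, and its CoderResult tells the caller what to do next: UNDERFLOW means the input was consumed, OVERFLOW means the output buffer must be drained before continuing. A minimal sketch of a caller-side loop follows, using the JDK's UTF-8 encoder as a stand-in; the class name, the helper drain, and the chunk-size limit are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class OverflowLoopSketch {
    /** Encodes text to UTF-8 through a small, reused output buffer. */
    static byte[] encodeInChunks(String text, int chunkSize) {
        if (chunkSize < 4) {
            // A UTF-8 code point needs up to 4 bytes; smaller buffers cannot make progress.
            throw new IllegalArgumentException("chunkSize must be at least 4");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(chunkSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        while (true) {
            CoderResult result = encoder.encode(in, out, true);
            drain(out, sink);
            if (result.isUnderflow()) {
                break; // all input consumed
            }
            if (result.isError()) {
                throw new IllegalArgumentException("cannot encode input: " + result);
            }
            // Otherwise OVERFLOW: the buffer was drained, so loop and continue.
        }
        while (encoder.flush(out).isOverflow()) {
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    /** Moves the bytes written so far into sink and empties the buffer. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.position(), out.remaining());
        out.clear();
    }
}
```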
    • encodeLoop

      abstract CoderResult encodeLoop(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush)
    • encode

      final CoderResult encode(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush)
    • fromUnicodeWithCallback

      final CoderResult fromUnicodeWithCallback(CharBuffer source, ByteBuffer target, IntBuffer offsets, boolean flush)
    • isLegalReplacement

      public boolean isLegalReplacement(byte[] repl)
      Overrides the superclass method.
      Overrides:
      isLegalReplacement in class CharsetEncoder
    • fromUWriteBytes

      static final CoderResult fromUWriteBytes(CharsetEncoderICU cnv, byte[] bytesArray, int bytesBegin, int bytesLength, ByteBuffer out, IntBuffer offsets, int sourceIndex)
    • fromUCountPending

      int fromUCountPending()
    • setSourcePosition

      private final void setSourcePosition(CharBuffer source)
      Parameters:
      source -
    • cbFromUWriteSub

      CoderResult cbFromUWriteSub(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets)
    • cbFromUWriteUChars

      CoderResult cbFromUWriteUChars(CharsetEncoderICU encoder, CharBuffer source, ByteBuffer target, IntBuffer offsets)
    • handleSurrogates

      final CoderResult handleSurrogates(CharBuffer source, char lead)

      Handles a common situation where a character has been read and it may be a lead surrogate followed by a trail surrogate. This method can change the source position and will modify fromUChar32.

      If null is returned, a surrogate pair was read successfully and its code point is stored in fromUChar32. The caller should reset fromUChar32 to 0 after reading it.

      Parameters:
      source - The encoding source.
      lead - A character that may be the first in a surrogate pair.
      Returns:
      CoderResult.malformedForLength(1) or CoderResult.UNDERFLOW if there is a problem, or null if there isn't.
    • handleSurrogates

      final CoderResult handleSurrogates(char[] sourceArray, int sourceIndex, int sourceLimit, char lead)

      Same as handleSurrogates(CharBuffer, char), but with arrays. As an added requirement, the calling method must also increment the index if this method returns null.

      Parameters:
      lead - A character that may be the first in a surrogate pair.
      source - The encoding source.
      Returns:
      CoderResult.malformedForLength(1) or CoderResult.UNDERFLOW if there is a problem, or null if there isn't.
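      The contract of the array overload can be sketched as below. This is an illustration of the documented behaviour under stated assumptions, not the class's actual code: the class SurrogateSketch is hypothetical, its fromUChar32 field merely stands in for the field of the same name, and the sketch assumes the caller passes the index of the char that follows lead and calls the method only when lead is a surrogate.

```java
import java.nio.charset.CoderResult;

class SurrogateSketch {
    int fromUChar32; // stand-in for the encoder field of the same name

    /**
     * sourceIndex is assumed to point at the char after lead.
     * Returns null on success, otherwise an error or UNDERFLOW result.
     */
    CoderResult handleSurrogates(char[] sourceArray, int sourceIndex, int sourceLimit, char lead) {
        if (!Character.isHighSurrogate(lead)) {
            fromUChar32 = lead;                 // unpaired trail surrogate
            return CoderResult.malformedForLength(1);
        }
        if (sourceIndex >= sourceLimit) {
            fromUChar32 = lead;                 // keep the lead until more input arrives
            return CoderResult.UNDERFLOW;
        }
        char trail = sourceArray[sourceIndex];
        if (!Character.isLowSurrogate(trail)) {
            fromUChar32 = lead;                 // lead without a matching trail
            return CoderResult.malformedForLength(1);
        }
        fromUChar32 = Character.toCodePoint(lead, trail);
        return null;                            // caller must now advance past the trail
    }
}
```

      A caller that receives null would read the code point from fromUChar32, reset the field to 0, and increment its own index past the trail surrogate.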
    • maxCharsPerByte

      public final float maxCharsPerByte()
      Returns the maxCharsPerByte value for the Charset that created this encoder.
      Returns:
      maxCharsPerByte
    • getMaxBytesForString

      public static int getMaxBytesForString(int length, int maxCharSize)
      Calculates the size of a buffer for conversion from Unicode to a charset. The calculated size is guaranteed to be sufficient for this conversion. It takes into account initial and final non-character bytes that are output by some converters. It does not take into account callbacks which output more than one charset character sequence per call, like escape callbacks. The default (substitution) callback only outputs one charset character sequence.
      Parameters:
      length - Number of chars to be converted.
      maxCharSize - Return value from maxBytesPerChar for the converter that will be used.
      Returns:
      Size of a buffer that is large enough to hold the output bytes.
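      As a rough illustration of how such a bound can be used to size an output buffer, the sketch below defines a hypothetical helper, maxBytesForString, that mirrors the formula of ICU4C's UCNV_GET_MAX_BYTES_FOR_STRING macro, (length + 10) * maxCharSize. That the Java method uses the same formula is an assumption here, and the class name BufferSizeSketch is invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharsetEncoder;

class BufferSizeSketch {
    /**
     * Hypothetical helper. Assumption: same formula as ICU4C's
     * UCNV_GET_MAX_BYTES_FOR_STRING, where the extra 10 characters
     * leave room for initial and final non-character bytes.
     */
    static int maxBytesForString(int length, int maxCharSize) {
        return (length + 10) * maxCharSize;
    }

    /** Allocates an output buffer sized for encoding text with encoder. */
    static ByteBuffer allocateFor(String text, CharsetEncoder encoder) {
        // maxBytesPerChar() returns a float, so round up to a whole byte count.
        int maxCharSize = (int) Math.ceil(encoder.maxBytesPerChar());
        return ByteBuffer.allocate(maxBytesForString(text.length(), maxCharSize));
    }
}
```

      As the description notes, a bound of this kind does not cover callbacks that write more than one charset character sequence per call, so code that installs such callbacks still needs to handle CoderResult.OVERFLOW.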