Class CharsetUTF8
java.lang.Object
java.nio.charset.Charset
com.ibm.icu.charset.CharsetICU
com.ibm.icu.charset.CharsetUTF8
- All Implemented Interfaces:
Comparable<Charset>
- Direct Known Subclasses:
CharsetCESU8
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
(package private) class
(package private) class
-
Field Summary
Fields
Modifier and Type    Field    Description
private static final int[] BITMASK_FROM_UTF8
private static final byte[] fromUSubstitution
private final boolean isCESU8
Fields inherited from class CharsetICU
codepage, conversionType, hasFromUnicodeFallback, hasToUnicodeFallback, icuCanonicalName, maxBytesPerChar, maxCharsPerByte, minBytesPerChar, name, options, platform, ROUNDTRIP_AND_FALLBACK_SET, ROUNDTRIP_SET, subChar, subChar1, subCharLen, unicodeMask
-
Constructor Summary
Constructors
Constructor    Description
CharsetUTF8(String icuCanonicalName, String javaCanonicalName, String[] aliases)
-
Method Summary
Modifier and Type    Method    Description
private static final byte encodeHeadOf1(int char32)
private static final byte encodeHeadOf2(int char32)
private static final byte encodeHeadOf3(int char32)
private static final byte encodeHeadOf4(int char32)
private static final byte encodeLastTail(int char32)
private static final byte encodeSecondToLastTail(int char32)
private static final byte encodeThirdToLastTail(int char32)
(package private) void getUnicodeSetImpl(UnicodeSet setFillIn, int which)
This follows the ucnv.c function ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored.
Methods inherited from class CharsetICU
contains, forNameICU, getCharset, getCompleteUnicodeSet, getNonSurrogateUnicodeSet, getUnicodeSet, isFixedWidth, isSurrogate
Methods inherited from class Charset
aliases, availableCharsets, canEncode, compareTo, decode, defaultCharset, displayName, displayName, encode, encode, equals, forName, hashCode, isRegistered, isSupported, name, toString
-
Field Details
-
fromUSubstitution
private static final byte[] fromUSubstitution -
BITMASK_FROM_UTF8
private static final int[] BITMASK_FROM_UTF8 -
isCESU8
private final boolean isCESU8
-
-
Constructor Details
-
CharsetUTF8
-
-
Method Details
-
encodeHeadOf1
private static final byte encodeHeadOf1(int char32) -
encodeHeadOf2
private static final byte encodeHeadOf2(int char32) -
encodeHeadOf3
private static final byte encodeHeadOf3(int char32) -
encodeHeadOf4
private static final byte encodeHeadOf4(int char32) -
encodeThirdToLastTail
private static final byte encodeThirdToLastTail(int char32) -
encodeSecondToLastTail
private static final byte encodeSecondToLastTail(int char32) -
encodeLastTail
private static final byte encodeLastTail(int char32) -
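The seven private helpers above carry no description on this page. Judging by their names alone, they appear to produce the lead ("head") byte and the trailing ("tail") bytes of a UTF-8 sequence. The following is a minimal sketch of that standard bit layout, not the ICU4J source: the class `Utf8HeadTailSketch`, its method names, and the `encode` driver are hypothetical, and the input is assumed to be a valid Unicode scalar value (no surrogate or range checks are made).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of UTF-8 lead/trail byte construction.
// This is NOT the ICU4J implementation; names only mirror the helpers above.
final class Utf8HeadTailSketch {
    // 1-byte sequence (U+0000..U+007F): 0xxxxxxx, the code point itself.
    static byte headOf1(int c) { return (byte) c; }
    // 2-byte lead (U+0080..U+07FF): 110xxxxx carries bits 10..6.
    static byte headOf2(int c) { return (byte) (0xC0 | (c >> 6)); }
    // 3-byte lead (U+0800..U+FFFF): 1110xxxx carries bits 15..12.
    static byte headOf3(int c) { return (byte) (0xE0 | (c >> 12)); }
    // 4-byte lead (U+10000..U+10FFFF): 11110xxx carries bits 20..18.
    static byte headOf4(int c) { return (byte) (0xF0 | (c >> 18)); }
    // Trail bytes are always 10xxxxxx, each carrying six bits.
    static byte thirdToLastTail(int c) { return (byte) (0x80 | ((c >> 12) & 0x3F)); }
    static byte secondToLastTail(int c) { return (byte) (0x80 | ((c >> 6) & 0x3F)); }
    static byte lastTail(int c) { return (byte) (0x80 | (c & 0x3F)); }

    // Assembles the full sequence for one code point (assumed valid).
    static byte[] encode(int c) {
        if (c < 0x80) {
            return new byte[] { headOf1(c) };
        }
        if (c < 0x800) {
            return new byte[] { headOf2(c), lastTail(c) };
        }
        if (c < 0x10000) {
            return new byte[] { headOf3(c), secondToLastTail(c), lastTail(c) };
        }
        return new byte[] {
            headOf4(c), thirdToLastTail(c), secondToLastTail(c), lastTail(c)
        };
    }

    public static void main(String[] args) {
        int[] samples = { 0x41, 0xE9, 0x20AC, 0x1F600 };
        for (int cp : samples) {
            // Cross-check against the JDK's own UTF-8 encoder.
            byte[] expected = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            System.out.println(Integer.toHexString(cp) + " matches JDK: "
                + Arrays.equals(expected, encode(cp)));
        }
    }
}
```

Splitting the work into one function per byte position keeps each shift and mask visible, which is presumably why the class exposes separate head and tail helpers.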
newDecoder
- Specified by:
newDecoder in class Charset
-
newEncoder
- Specified by:
newEncoder in class Charset
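Because newDecoder and newEncoder are specified by java.nio.charset.Charset, any Charset instance can be driven through the same calls. The sketch below shows a round trip through an encoder and a decoder. It uses the JDK's built-in StandardCharsets.UTF_8 as a stand-in so that it runs without ICU4J on the classpath; the class `CoderRoundTripSketch` and its methods are hypothetical, and whether an ICU-provided instance behaves identically in every edge case is an assumption not verified here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: encodes text with cs.newEncoder() and decodes the
// result with cs.newDecoder(). Works for any java.nio.charset.Charset.
final class CoderRoundTripSketch {
    static ByteBuffer encode(Charset cs, String text) {
        CharsetEncoder encoder = cs.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return encoder.encode(CharBuffer.wrap(text));
        } catch (CharacterCodingException e) {
            // Surface coding problems as an unchecked exception.
            throw new IllegalArgumentException("cannot encode input", e);
        }
    }

    static String roundTrip(Charset cs, String text) {
        CharsetDecoder decoder = cs.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(encode(cs, text)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("cannot decode bytes", e);
        }
    }

    public static void main(String[] args) {
        // "h\u00e9llo" plus one supplementary code point (a surrogate pair in UTF-16).
        String sample = "h\u00e9llo \uD83D\uDE00";
        System.out.println(sample.equals(roundTrip(StandardCharsets.UTF_8, sample)));
    }
}
```

Setting both error actions to REPORT makes malformed or unmappable input fail loudly instead of being silently replaced with a substitution sequence.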
-
getUnicodeSetImpl
Description copied from class: CharsetICU
This follows the ucnv.c function ucnv_detectUnicodeSignature() to detect the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes of the BOM of the indicated Unicode charset. 0 is returned when no Unicode signature is recognized.
- Specified by:
getUnicodeSetImpl in class CharsetICU
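The description above speaks of detecting a Unicode signature at the start of a byte stream and returning its length, or 0 when none is found. A minimal sketch of that idea follows. It is not the ICU routine: the class `BomSketch` and the method `signatureLength` are hypothetical, and only the UTF-8, UTF-16, and UTF-32 signatures are covered, which is an assumed subset of what ucnv_detectUnicodeSignature() recognizes.

```java
// Hypothetical sketch of Unicode signature (BOM) detection.
// Covers only UTF-8, UTF-16BE/LE and UTF-32BE/LE; not the ICU implementation.
final class BomSketch {
    // Returns the number of leading bytes that form a recognized BOM, or 0.
    static int signatureLength(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return 3; // UTF-8
        }
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) {
            return 4; // UTF-32BE
        }
        // UTF-32LE must be tested before UTF-16LE: both begin with FF FE.
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) {
            return 4; // UTF-32LE
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return 2; // UTF-16BE
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return 2; // UTF-16LE
        }
        return 0; // no signature recognized
    }

    // Compares the first bytes of data (as unsigned values) with prefix.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61 };
        byte[] plainAscii = { 0x61, 0x62 };
        System.out.println(signatureLength(utf8WithBom));
        System.out.println(signatureLength(plainAscii));
    }
}
```

A caller would typically skip the returned number of bytes before decoding, so that U+FEFF does not appear as the first character of the result.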
-