java.lang.Object
  sun.text.normalizer.UCharacter
public final class UCharacter
The UCharacter class provides extensions to the java.lang.Character class. These extensions provide support for Unicode 3.2 properties and, together with the UTF16 class, support for supplementary characters (those with code points above U+FFFF).
Code points are represented in this API as ints. While it would be more convenient in Java to have a separate primitive datatype for them, ints suffice in the meantime.
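The following is a minimal sketch of why an int is needed for supplementary characters: a code point above U+FFFF occupies two chars (a lead and a trail surrogate) in a Java String, and the pair combines into a single int. It uses only java.lang.Character rather than UCharacter itself; the class name CodePointSketch and the helper combine are hypothetical and merely mirror the shape of getCodePoint(char lead, char trail) documented on this page.

```java
public class CodePointSketch {
    // Hypothetical helper: combines a lead (high) and a trail (low) surrogate
    // into one int code point, rejecting anything that is not a valid pair.
    static int combine(char lead, char trail) {
        if (!Character.isHighSurrogate(lead) || !Character.isLowSurrogate(trail)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        // Same arithmetic as Character.toCodePoint(lead, trail).
        return ((lead - 0xD800) << 10) + (trail - 0xDC00) + 0x10000;
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF is stored as two chars in a String.
        String clef = "\uD834\uDD1E";
        int cp = combine(clef.charAt(0), clef.charAt(1));
        System.out.println(Integer.toHexString(cp));                 // 1d11e
        System.out.println(clef.length());                           // 2
        System.out.println(clef.codePointCount(0, clef.length()));   // 1
    }
}
```

The value 0x1D11E does not fit in a 16-bit char, which is why the methods on this page take and return int.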
To use this class, add the jar file icu4j.jar to the class path, since it contains the data files that supply the information used by this class.
E.g. in Windows:
set CLASSPATH=%CLASSPATH%;$JAR_FILE_PATH/ucharacter.jar
Alternatively, copy the files uprops.dat and unames.icu from the icu4j source subdirectory $ICU4J_SRC/src/com.ibm.icu.impl.data to your class directory $ICU4J_CLASS/com.ibm.icu.impl.data.
Aside from the additions for UTF-16 support, and the updated Unicode 3.1 properties, the main differences between UCharacter and Character are:
Further detailed differences can be determined by running the program com.ibm.icu.dev.test.lang.UCharacterCompare.
This class is not subclassable.
See Also:
com.ibm.icu.lang.UCharacterEnums
| Nested Class Summary | |
|---|---|
| static interface | UCharacter.ECharacterCategory - Deprecated. This is a draft API and might change in a future release of ICU. |
| static interface | UCharacter.HangulSyllableType - Hangul Syllable Type constants. |
| static interface | UCharacter.NumericType - Numeric Type constants. |
| Field Summary | |
|---|---|
| static int | MAX_VALUE - The highest Unicode code point value (scalar value) according to the Unicode Standard. |
| static int | MIN_VALUE - The lowest Unicode code point value. |
| static double | NO_NUMERIC_VALUE - Special value that is returned by getUnicodeNumericValue(int) when no numeric value is defined for a code point. |
| static int | SUPPLEMENTARY_MIN_VALUE - The minimum value for Supplementary code points. |
| Method Summary | |
|---|---|
| static int | digit(int ch, int radix) - Retrieves the numeric value of a decimal digit code point. |
| static String | foldCase(String str, boolean defaultmapping) - The given string is mapped to its case folding equivalent according to UnicodeData.txt and CaseFolding.txt; if any character has no case folding equivalent, the character itself is returned. |
| static VersionInfo | getAge(int ch) - Get the "age" of the code point. |
| static int | getCodePoint(char lead, char trail) - Returns a code point corresponding to the two UTF16 characters. |
| static int | getDirection(int ch) - Returns the Bidirection property of a code point. |
| static int | getIntPropertyValue(int ch, int type) - Gets the property value for a Unicode property type of a code point. |
| static int | getType(int ch) - Returns a value indicating a code point's Unicode category. |
| static double | getUnicodeNumericValue(int ch) - Get the numeric value for a Unicode code point as defined in the Unicode Character Database. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final int MIN_VALUE
public static final int MAX_VALUE
public static final int SUPPLEMENTARY_MIN_VALUE
public static final double NO_NUMERIC_VALUE
See Also:
getUnicodeNumericValue(int), Constant Field Values
| Method Detail |
|---|
public static int digit(int ch, int radix)
Retrieves the numeric value of a decimal digit code point, following the semantics of java.lang.Character.digit(). Note that this will return positive values for code points for which isDigit returns false, just like java.lang.Character.
ch - the code point to query
radix - the radix
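The note about isDigit can be illustrated with java.lang.Character, whose semantics this method is documented to share. This is only a sketch using the standard JDK class as a stand-in; the class name DigitSketch and the helper hexValue are hypothetical, and UCharacter itself is not called.

```java
public class DigitSketch {
    // Hypothetical helper: value of a code point as a hexadecimal digit, or -1.
    static int hexValue(int ch) {
        return Character.digit(ch, 16);
    }

    public static void main(String[] args) {
        // 'a' is not a digit, yet it has the value 10 in radix 16.
        System.out.println(Character.isDigit('a'));        // false
        System.out.println(hexValue('a'));                 // 10
        // In radix 10 the same code point is not a valid digit.
        System.out.println(Character.digit('a', 10));      // -1
        // A non-ASCII decimal digit: U+0663 ARABIC-INDIC DIGIT THREE.
        System.out.println(Character.digit(0x0663, 10));   // 3
    }
}
```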
public static double getUnicodeNumericValue(int ch)
Get the numeric value for a Unicode code point as defined in the Unicode Character Database.
A "double" return type is necessary because some numeric values are fractions, negative, or too large for int.
For characters without any numeric values in the Unicode Character Database, this function will return NO_NUMERIC_VALUE.
API Change: In release 2.2 and earlier, this API had a return type of int and returned -1 when the argument ch did not have a corresponding numeric value. This was changed to stay in sync with ICU4C.
This corresponds to the ICU4C function u_getNumericValue.
ch - Code point to get the numeric value for.
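The limitation that motivates the double return type can be seen in the int-returning java.lang.Character.getNumericValue, which can only signal that a fractional value exists without reporting it. The sketch below uses that standard JDK method as a stand-in and does not call UCharacter; the class name NumericValueSketch and the helper describe are hypothetical.

```java
public class NumericValueSketch {
    // Hypothetical helper: reports what an int-based API is able to express.
    static String describe(int ch) {
        int v = Character.getNumericValue(ch);
        if (v == -1) {
            return "no numeric value";
        }
        if (v == -2) {
            // The value exists but is not a nonnegative integer (e.g. a fraction).
            return "not representable as int";
        }
        return Integer.toString(v);
    }

    public static void main(String[] args) {
        System.out.println(describe('7'));      // 7
        System.out.println(describe(0x00BD));   // U+00BD VULGAR FRACTION ONE HALF: not representable as int
        System.out.println(describe('!'));      // no numeric value
    }
}
```

With a double return type, a fraction such as one half can be returned directly, and NO_NUMERIC_VALUE takes over the role of the "no value" sentinel.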
public static int getType(int ch)
Returns a value indicating a code point's Unicode category.
ch - code point whose type is to be determined
public static int getCodePoint(char lead, char trail)
Returns a code point corresponding to the two UTF16 characters.
lead - the lead char
trail - the trail char
IllegalArgumentException - thrown when the argument characters do not form a valid code point
public static int getDirection(int ch)
Returns the Bidirection property of a code point.
ch - the code point whose direction is to be determined
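public static String foldCase(String str, boolean defaultmapping)
The given string is mapped to its case folding equivalent according to UnicodeData.txt and CaseFolding.txt; if any character has no case folding equivalent, the character itself is returned.
str - the String to be converted
defaultmapping - indicates whether all mappings defined in CaseFolding.txt are to be used; otherwise the mappings for dotted I and dotless i marked with 'I' in CaseFolding.txt are skipped.
See Also:
foldCase(int, boolean)
public static VersionInfo getAge(int ch)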
Get the "age" of the code point.
The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.
This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.
The data is from the UCD file DerivedAge.txt.
ch - The code point.
public static int getIntPropertyValue(int ch, int type)
Gets the property value for a Unicode property type of a code point. Also returns binary and mask property values.
Unicode, especially in version 3.2, defines many more properties than the original set in UnicodeData.txt.
The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR). For details about the properties see http://www.unicode.org/.
For names of Unicode properties see the UCD file PropertyAliases.txt.
Sample usage:
    int ea = UCharacter.getIntPropertyValue(c, UProperty.EAST_ASIAN_WIDTH);
    int ideo = UCharacter.getIntPropertyValue(c, UProperty.IDEOGRAPHIC);
    boolean b = (ideo == 1);
ch - code point to test.
type - UProperty selector constant, identifies which binary property to check. Must be UProperty.BINARY_START <= type < UProperty.BINARY_LIMIT or UProperty.INT_START <= type < UProperty.INT_LIMIT or UProperty.MASK_START <= type < UProperty.MASK_LIMIT.
See Also:
UProperty, hasBinaryProperty, getIntPropertyMinValue, getIntPropertyMaxValue, getUnicodeVersion