Jamaica 3.2 release 62

sun.text.normalizer
Class UCharacterProperty

java.lang.Object
  extended by sun.text.normalizer.UCharacterProperty
All Implemented Interfaces:
Trie.DataManipulate

public final class UCharacterProperty
extends Object
implements Trie.DataManipulate

Internal class used for Unicode character property database.

This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns bytes of information when required.

Because of the form most commonly used for retrieval, an array of char is used to store the binary data.

UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.

Responsibility for molding the binary data into a more meaningful form lies with UCharacter.

Since:
release 2.1, February 1st 2002

Field Summary
static int EXC_CASE_FOLDING_
          Exception indicator for case folding type
static int EXC_COMBINING_CLASS_
          EXC_COMBINING_CLASS_ is not found in ICU.
static int EXC_DENOMINATOR_VALUE_
          Exception indicator for denominator type
static int EXC_LOWERCASE_
          Exception indicator for lowercase type
static int EXC_MIRROR_MAPPING_
          Exception indicator for mirror type
static int EXC_NUMERIC_VALUE_
          Exception indicator for numeric type
static int EXC_SPECIAL_CASING_
          Exception indicator for special casing type
static int EXC_TITLECASE_
          Exception indicator for titlecase type
static int EXC_UNUSED_
          Exception indicator for digit type
static int EXC_UPPERCASE_
          Exception indicator for uppercase type
static int EXCEPTION_MASK
          Exception test mask
static char LATIN_SMALL_LETTER_I_
          Latin lowercase i
 int[] m_property_
          Character property table
 CharTrie m_trie_
          Trie data
 char[] m_trieData_
          Optimization CharTrie data array
 char[] m_trieIndex_
          Optimization CharTrie index array
 int m_trieInitialValue_
          Optimization CharTrie data offset
 VersionInfo m_unicodeVersion_
          Unicode version
static int TYPE_MASK
          Character type mask
 
Method Summary
 UnicodeSet addPropertyStarts(UnicodeSet set)
           
 int getAdditional(int codepoint)
          Gets the unicode additional properties.
 VersionInfo getAge(int codepoint)
          Get the "age" of the code point.
 int getException(int index, int etype)
          Gets the exception value at the index, assuming that data type is available.
static int getExceptionIndex(int prop)
          Getting the exception index for argument property
 void getFoldCase(int index, int count, StringBuffer str)
          Gets the folded case value at the index
 int getFoldingOffset(int value)
          Called by com.ibm.icu.util.Trie to extract from a lead surrogate's data the index array offset of the indexes for that lead surrogate.
 UnicodeSet getInclusions()
           
static UCharacterProperty getInstance()
          Loads the property data and initializes the UCharacterProperty instance.
 int getProperty(int ch)
          Gets the property value at the index.
static int getRawSupplementary(char lead, char trail)
          Forms a supplementary code point from the argument characters. Note: this is for internal use, hence the validity of the surrogate characters is not checked.
static int getSignedValue(int prop)
          Getting the signed numeric value of a character embedded in the property argument
 boolean hasExceptionValue(int index, int indicator)
          Determines if the exception value passed in has the kind of information which the indicator wants, e.g. if the exception value contains the digit value of the character
static boolean isRuleWhiteSpace(int c)
          Checks if the argument c is to be treated as a white space in ICU rules.
 void setIndexData(CharTrie.FriendAgent friendagent)
          Java friends implementation
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_trie_

public CharTrie m_trie_
Trie data


m_trieIndex_

public char[] m_trieIndex_
Optimization CharTrie index array


m_trieData_

public char[] m_trieData_
Optimization CharTrie data array


m_trieInitialValue_

public int m_trieInitialValue_
Optimization CharTrie data offset


m_property_

public int[] m_property_
Character property table


m_unicodeVersion_

public VersionInfo m_unicodeVersion_
Unicode version


EXC_UPPERCASE_

public static final int EXC_UPPERCASE_
Exception indicator for uppercase type

See Also:
Constant Field Values

EXC_LOWERCASE_

public static final int EXC_LOWERCASE_
Exception indicator for lowercase type

See Also:
Constant Field Values

EXC_TITLECASE_

public static final int EXC_TITLECASE_
Exception indicator for titlecase type

See Also:
Constant Field Values

EXC_UNUSED_

public static final int EXC_UNUSED_
Exception indicator for digit type

See Also:
Constant Field Values

EXC_NUMERIC_VALUE_

public static final int EXC_NUMERIC_VALUE_
Exception indicator for numeric type

See Also:
Constant Field Values

EXC_DENOMINATOR_VALUE_

public static final int EXC_DENOMINATOR_VALUE_
Exception indicator for denominator type

See Also:
Constant Field Values

EXC_MIRROR_MAPPING_

public static final int EXC_MIRROR_MAPPING_
Exception indicator for mirror type

See Also:
Constant Field Values

EXC_SPECIAL_CASING_

public static final int EXC_SPECIAL_CASING_
Exception indicator for special casing type

See Also:
Constant Field Values

EXC_CASE_FOLDING_

public static final int EXC_CASE_FOLDING_
Exception indicator for case folding type

See Also:
Constant Field Values

EXC_COMBINING_CLASS_

public static final int EXC_COMBINING_CLASS_
EXC_COMBINING_CLASS_ is not found in ICU. Used to retrieve the combining class of the character in the exception value

See Also:
Constant Field Values

LATIN_SMALL_LETTER_I_

public static final char LATIN_SMALL_LETTER_I_
Latin lowercase i

See Also:
Constant Field Values

TYPE_MASK

public static final int TYPE_MASK
Character type mask

See Also:
Constant Field Values

EXCEPTION_MASK

public static final int EXCEPTION_MASK
Exception test mask

See Also:
Constant Field Values
Method Detail

setIndexData

public void setIndexData(CharTrie.FriendAgent friendagent)
Java friends implementation


getFoldingOffset

public int getFoldingOffset(int value)
Called by com.ibm.icu.util.Trie to extract from a lead surrogate's data the index array offset of the indexes for that lead surrogate.

Specified by:
getFoldingOffset in interface Trie.DataManipulate
Parameters:
value - data value for a surrogate from the trie, including the folding offset
Returns:
data offset or 0 if there is no data for the lead surrogate

getProperty

public int getProperty(int ch)
Gets the property value at the index. This is optimized. Note that this is a little different from CharTrie: the index into m_trieData_ is never negative.

Parameters:
ch - code point whose property value is to be retrieved
Returns:
property value of code point

getSignedValue

public static int getSignedValue(int prop)
Getting the signed numeric value of a character embedded in the property argument

Parameters:
prop - the character property value
Returns:
signed numeric value

getExceptionIndex

public static int getExceptionIndex(int prop)
Getting the exception index for argument property

Parameters:
prop - character property
Returns:
exception index

hasExceptionValue

public boolean hasExceptionValue(int index,
                                 int indicator)
Determines if the exception value passed in has the kind of information which the indicator wants, e.g. if the exception value contains the digit value of the character

Parameters:
index - exception index
indicator - type indicator
Returns:
true if the type value exists

getException

public int getException(int index,
                        int etype)
Gets the exception value at the index, assuming that data type is available. Result is undefined if data is not available. Use hasExceptionValue() to determine data's availability.

Parameters:
index - exception index
etype - exception data type
Returns:
exception data type value at index
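
Because the result of getException() is undefined when the data is absent, callers are expected to test with hasExceptionValue() first. The following is only a minimal sketch of that check-then-read pattern. The ExceptionData interface, the toyData() table, and the exceptionOrDefault() helper are hypothetical stand-ins that mirror the two documented signatures; they are not part of this class. The sketch also assumes the same type indicator is passed to both calls.

```java
import java.util.HashMap;
import java.util.Map;

final class ExceptionLookupSketch {

    /** Hypothetical stand-in that mirrors the two documented method signatures. */
    interface ExceptionData {
        boolean hasExceptionValue(int index, int indicator);
        int getException(int index, int etype);
    }

    /** Reads the exception value only after confirming that it is available. */
    static int exceptionOrDefault(ExceptionData data, int index, int etype, int fallback) {
        return data.hasExceptionValue(index, etype)
                ? data.getException(index, etype)
                : fallback;
    }

    /** Toy in-memory table; the index and type numbers are invented for illustration. */
    static ExceptionData toyData() {
        final Map<Long, Integer> values = new HashMap<Long, Integer>();
        values.put(key(3, 1), 0x0041);
        return new ExceptionData() {
            public boolean hasExceptionValue(int index, int indicator) {
                return values.containsKey(key(index, indicator));
            }
            public int getException(int index, int etype) {
                Integer v = values.get(key(index, etype));
                if (v == null) {
                    // The documented behaviour is "undefined"; the toy table fails loudly instead.
                    throw new IllegalStateException("no exception value stored");
                }
                return v.intValue();
            }
        };
    }

    /** Packs an (index, type) pair into a single map key. */
    private static long key(int index, int type) {
        return ((long) index << 32) | (type & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        ExceptionData data = toyData();
        System.out.println(exceptionOrDefault(data, 3, 1, -1)); // stored entry
        System.out.println(exceptionOrDefault(data, 4, 1, -1)); // missing entry, falls back
    }
}
```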

getFoldCase

public void getFoldCase(int index,
                        int count,
                        StringBuffer str)
Gets the folded case value at the index

Parameters:
index - of the case value to be retrieved
count - number of characters to retrieve
str - string buffer to which to append the result

getAdditional

public int getAdditional(int codepoint)
Gets the additional Unicode properties. The C version is getUnicodeProperties.

Parameters:
codepoint - codepoint whose additional properties are to be retrieved
Returns:
unicode properties

getAge

public VersionInfo getAge(int codepoint)

Get the "age" of the code point.

The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.

This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.

The data is from the UCD file DerivedAge.txt.

This API does not check the validity of the codepoint.

Parameters:
codepoint - The code point.
Returns:
the Unicode version number

getRawSupplementary

public static int getRawSupplementary(char lead,
                                      char trail)
Forms a supplementary code point from the argument characters. Note: this is for internal use, hence the validity of the surrogate characters is not checked.

Parameters:
lead - lead surrogate character
trail - trailing surrogate character
Returns:
code point of the supplementary character
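
As an illustration of what combining a surrogate pair without validation involves, here is a minimal sketch of the standard UTF-16 arithmetic. It assumes the method follows the usual formula; the page does not show the actual implementation. The class name, the rawSupplementary() helper, and the constant names are hypothetical. The result is compared against the public java.lang.Character.toCodePoint, which likewise performs no validation.

```java
final class RawSupplementarySketch {

    // Each surrogate carries 10 payload bits; the lead holds the upper ten.
    private static final int LEAD_SHIFT = 10;

    // Folds the removal of both surrogate bases (0xD800, 0xDC00) and the
    // addition of the supplementary base (0x10000) into one constant.
    private static final int SURROGATE_OFFSET =
            0x10000 - (0xD800 << LEAD_SHIFT) - 0xDC00;

    /** Combines a lead and a trail surrogate without checking either one. */
    static int rawSupplementary(char lead, char trail) {
        return (lead << LEAD_SHIFT) + trail + SURROGATE_OFFSET;
    }

    public static void main(String[] args) {
        char lead = '\uD83D';
        char trail = '\uDE00';
        int cp = rawSupplementary(lead, trail);
        System.out.println(Integer.toHexString(cp));                   // 1f600
        System.out.println(cp == Character.toCodePoint(lead, trail));  // true
    }
}
```

Because nothing is validated, passing characters that are not a lead/trail surrogate pair yields a meaningless number rather than an error.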

getInstance

public static UCharacterProperty getInstance()
                                      throws RuntimeException
Loads the property data and initializes the UCharacterProperty instance.

Throws:
RuntimeException - when data is missing or data has been corrupted

isRuleWhiteSpace

public static boolean isRuleWhiteSpace(int c)
Checks if the argument c is to be treated as a white space in ICU rules. Usually ICU rule white spaces are ignored unless quoted.

Parameters:
c - codepoint to check
Returns:
true if c is an ICU white space
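
The page does not list which code points count as rule white space. The sketch below assumes the fixed Pattern_White_Space list from UAX #31 (U+0009..U+000D, U+0020, U+0085, U+200E, U+200F, U+2028, U+2029), which ICU rule parsers are generally understood to use. The class name is hypothetical, and the actual method in this release may differ.

```java
final class RuleWhiteSpaceSketch {

    /**
     * Tests membership in the fixed Pattern_White_Space list (an assumption):
     * U+0009..U+000D, U+0020, U+0085, U+200E, U+200F, U+2028, U+2029.
     */
    static boolean isRuleWhiteSpace(int c) {
        return c >= 0x0009 && c <= 0x2029
                && (c <= 0x000D || c == 0x0020 || c == 0x0085
                    || c == 0x200E || c == 0x200F || c >= 0x2028);
    }

    public static void main(String[] args) {
        System.out.println(isRuleWhiteSpace(' '));     // true
        System.out.println(isRuleWhiteSpace(0x2028));  // true
        System.out.println(isRuleWhiteSpace(0x00A0));  // false: NO-BREAK SPACE is not in the list
        System.out.println(isRuleWhiteSpace('a'));     // false
    }
}
```

A fixed list keeps rule parsing stable across Unicode versions, since it does not depend on the loaded property data.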

addPropertyStarts

public UnicodeSet addPropertyStarts(UnicodeSet set)

getInclusions

public UnicodeSet getInclusions()


aicas GmbH, Karlsruhe - Germany    www.aicas.com
Copyright 2001-2008 aicas GmbH. All Rights Reserved.