The CharacterEncoding class defines the interface of the byte and character encodings for predicates and conversions.
inherits
Behaviour supers: All
inherits
Behaviour supers: All
methods
deferred String name; |
Return the name of this encoding.
deferred char decode byte b; |
Return the decoded byte b, i.e. the Unicode character corresponding to the byte b in the receiving encoding.
deferred byte encode char c; |
Return the byte encoding of the character c. If the character c has no byte equivalent in the receiving encoding, an encoding-condition is signaled; the resulting byte is then the byteValue of the object returned, or 127 if nil is returned.
deferred boolean isAlpha byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a letter.
deferred boolean isDigit byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a digit.
deferred boolean isLower byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a lowercase letter.
deferred boolean isPunct byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a punctuation character.
deferred boolean isSpace byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a space character.
deferred boolean isUpper byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is an uppercase letter.
deferred byte toLower byte b; |
Return the lowercase version of the byte b, according to the receiving encoding. If the character is not uppercase, it is returned unchanged.
deferred byte toUpper byte b; |
Return the uppercase version of the byte b, according to the receiving encoding. If the character is not lowercase, it is returned unchanged.
deferred int digitValue byte b; |
Return the numeric value of the digit denoted by the byte b in the receiving encoding.
deferred int alphaValue byte b; |
Return the index of the letter b relative to the start of its letter range. Thus, 'a' returns 0, 'f' returns 5, etc.
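To make the contract of the deferred methods concrete, here is a minimal sketch, written in Python rather than TOM and assuming a plain US-ASCII encoding. The class name AsciiSketch and the snake_case method names are hypothetical stand-ins for the methods above. Two behaviours are assumptions and not stated in this reference: uppercase letters are indexed from 'A' in alpha_value, and a ValueError is raised for bytes that are not digits or letters.

```python
class AsciiSketch:
    """Hypothetical stand-in for a CharacterEncoding implementation (US-ASCII only)."""

    def is_upper(self, b: int) -> bool:
        return 0x41 <= b <= 0x5A  # 'A'..'Z'

    def is_lower(self, b: int) -> bool:
        return 0x61 <= b <= 0x7A  # 'a'..'z'

    def is_alpha(self, b: int) -> bool:
        return self.is_upper(b) or self.is_lower(b)

    def is_digit(self, b: int) -> bool:
        return 0x30 <= b <= 0x39  # '0'..'9'

    def to_lower(self, b: int) -> int:
        # Bytes that are not uppercase letters are returned unchanged.
        return b + 0x20 if self.is_upper(b) else b

    def to_upper(self, b: int) -> int:
        # Bytes that are not lowercase letters are returned unchanged.
        return b - 0x20 if self.is_lower(b) else b

    def digit_value(self, b: int) -> int:
        if not self.is_digit(b):
            raise ValueError("not a digit")  # assumption: error handling is unspecified
        return b - 0x30

    def alpha_value(self, b: int) -> int:
        # Index of the letter relative to the start of its letter range.
        if self.is_lower(b):
            return b - 0x61
        if self.is_upper(b):
            return b - 0x41  # assumption: uppercase is indexed from 'A'
        raise ValueError("not a letter")
```

With this sketch, alpha_value of the byte for 'f' is 5 and to_lower of the byte for '1' is that same byte, matching the descriptions above.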
An instance of the CharEncoding class maintains information on a particular mapping for encoding a subset of Unicode characters to 8-bit bytes. One such mapping is iso-8859-1, the well-known Western European byte encoding, of which US-ASCII is a subset.
inherits
State supers: State, Constants, Conditions, CharacterEncoding
variables
Currently known encodings.
methods
ByteArray loadBytes int num from String name extension String ext; |
Load num bytes from the file with the given name and the extension ext (without the dot). The full path of the file is obtained from the main Bundle.
instance (id) named String name; |
Return the CharEncoding known by the given name. This always succeeds, because a CharEncoding reads the resources it needs on demand.
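The reason a lookup by name cannot fail can be illustrated with a small sketch, again in Python rather than TOM. Everything here is hypothetical: the class LazyEncodingSketch, the loader callback standing in for reading a resource file, and the assumption that the table of known encodings hands back the same object for repeated lookups. The point is only that creating the object touches no files; loading is deferred until a map is first requested.

```python
class LazyEncodingSketch:
    """Hypothetical model of a named encoding that loads its tables on demand."""

    _known = {}  # currently known encodings, keyed by name

    def __init__(self, name):
        self.name = name
        self._decoding = None  # not loaded until first use

    @classmethod
    def named(cls, name):
        # Creating the object needs no resources, so this step cannot fail
        # on a missing file; any such failure would surface later.
        enc = cls._known.get(name)
        if enc is None:
            enc = cls(name)
            cls._known[name] = enc
        return enc

    def decoding(self, loader):
        # `loader` is a hypothetical callback: name -> 256-entry table.
        if self._decoding is None:
            self._decoding = loader(self.name)
        return self._decoding
```

A consequence of this design is that a misspelled encoding name would only be noticed when a map or predicate is first used, not when the encoding is requested.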
variables
The name of this encoding.
The decoding map.
The encoding map.
The byte map for conversion to lower case within the encoding.
The byte map for conversion to upper case within the encoding.
The byte map for conversion to title case within the encoding.
The bitmap for testing whether a byte is a digit.
The bitmap for testing whether a byte is a letter.
The bitmap for testing whether a byte is lower case.
The bitmap for testing whether a byte is a punctuation character.
The bitmap for testing whether a byte is a space character.
The bitmap for testing whether a byte is upper case.
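The byte maps and bitmaps listed above can be pictured with the following sketch in Python. It assumes a 256-bit bitmap stored in 32 bytes, with bit (b mod 8) of byte (b div 8) representing byte value b, and a 256-entry table for each conversion. The actual layout of the resource files is not described in this reference, so the bit order and the helper names make_bitmap and bit_is_set are assumptions.

```python
def make_bitmap(members):
    """Build a 32-byte bitmap with one bit set per member byte value."""
    bitmap = bytearray(32)
    for b in members:
        bitmap[b >> 3] |= 1 << (b & 7)  # assumed bit order, least significant first
    return bytes(bitmap)


def bit_is_set(bitmap, b):
    """Predicate test: is byte value b a member of the bitmap?"""
    return bool(bitmap[b >> 3] & (1 << (b & 7)))


# A digit bitmap and a lower-case conversion map for the US-ASCII range.
DIGITS = make_bitmap(range(0x30, 0x3A))
TO_LOWER = bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in range(256))
```

A predicate then costs one index and one mask, and a case conversion is a single table lookup such as TO_LOWER[b].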
methods
id init String n; |
Designated initializer.
char decode byte b; |
Return the decoded byte b, i.e. the Unicode character corresponding to the byte b in the receiving encoding.
CharArray decoding; |
Return the decoding map, reading it if necessary.
byte encode char c; |
Return the byte encoding of the character c. If the character c has no byte equivalent in the receiving encoding, an encoding-condition is signaled; the resulting byte is then the byteValue of the object returned, or 127 if nil is returned.
IntDictionary encoding; |
Return the encoding map, creating it from the decoding map if necessary.
protected ByteArray loadConversion String conversion; |
Load and return the table for the named conversion of the receiving encoding.
protected ByteArray loadPredicateSet String predicate; |
Load and return the set for the named predicate of the receiving encoding.
boolean isAlpha byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a letter.
boolean isDigit byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a digit.
boolean isLower byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a lowercase letter.
boolean isPunct byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a punctuation character.
boolean isSpace byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is a space character.
boolean isUpper byte b; |
Return TRUE if the character denoted by the byte b in the receiving encoding is an uppercase letter.
byte toLower byte b; |
Return the lowercase version of the byte b, according to the receiving encoding. If the character is not uppercase, it is returned unchanged.
byte toUpper byte b; |
Return the uppercase version of the byte b, according to the receiving encoding. If the character is not lowercase, it is returned unchanged.
int digitValue byte b; |
Return the numeric value of the digit denoted by the byte b in the receiving encoding.
int alphaValue byte b; |
Return the index of the letter b relative to the start of its letter range. Thus, 'a' returns 0, 'f' returns 5, etc.
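The relationship between the decoding map, the encoding map derived from it, and the fallback byte 127 can be sketched as follows, in Python rather than TOM. The function names and the on_missing callback are hypothetical; the callback stands in for signaling the encoding-condition and returns either a replacement byte or None (the counterpart of nil). Preferring the lowest byte when two bytes decode to the same character is also an assumption.

```python
def build_encoding(decoding):
    """Invert a 256-entry decoding map (byte -> code point) into a dict."""
    encoding = {}
    for byte, code_point in enumerate(decoding):
        # Assumption: if two bytes decode to the same character, keep the lowest.
        encoding.setdefault(code_point, byte)
    return encoding


def encode(encoding, ch, on_missing=None):
    """Return the byte for ch, falling back as described for encode above."""
    byte = encoding.get(ord(ch))
    if byte is not None:
        return byte
    # Stand-in for signaling the encoding-condition.
    replacement = on_missing(ch) if on_missing is not None else None
    return 127 if replacement is None else replacement


# iso-8859-1 decodes every byte to the code point with the same value.
LATIN1 = build_encoding(list(range(256)))
```

For example, encode(LATIN1, "\u00e9") yields 0xE9, while a character outside the encoding, such as "\u20ac", yields 127 unless the callback supplies another byte.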
The USASCIIEncoding class is a replacement for a real CharEncoding, used during program initialization.
inherits
State supers: State, CharacterEncoding
variables
The one and only USASCIIEncoding object.
methods
instance (id) shared; |
Return the one and only USASCIIEncoding object.
methods
String name; |
We're really a dummy, so we do not have a name. In fact, that is how we're recognized.
char decode byte b; |
This is acceptable for iso-8859-1.
byte encode char c; |
This is acceptable for iso-8859-1.
boolean isAlpha byte b; |
Undocumented.
boolean isDigit byte b; |
Undocumented.
boolean isLower byte b; |
Undocumented.
boolean isPunct byte b; |
Undocumented.
boolean isSpace byte b; |
Undocumented.
boolean isUpper byte b; |
Undocumented.
byte toLower byte b; |
Undocumented.
byte toUpper byte b; |
Undocumented.
int digitValue byte b; |
Undocumented.
int alphaValue byte b; |
Undocumented.
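The remark that decode and encode are "acceptable for iso-8859-1" presumably refers to an identity mapping between byte values and character codes; the source of the stand-in is not shown here, so that reading is an assumption. Under it, the following Python sketch (with hypothetical function names) shows why the shortcut is safe: iso-8859-1 assigns each byte the Unicode code point with the same numeric value. Raising ValueError outside that range is also an assumption.

```python
def decode_identity(b: int) -> str:
    """Treat the byte value itself as the Unicode code point."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte")
    return chr(b)


def encode_identity(c: str) -> int:
    """Treat the code point itself as the byte value (valid below 256 only)."""
    code_point = ord(c)
    if code_point > 0xFF:
        raise ValueError("character has no single-byte equivalent")
    return code_point


# The identity mapping agrees with Python's own latin-1 codec for every byte.
AGREES_WITH_LATIN1 = all(
    decode_identity(b) == bytes([b]).decode("latin-1") for b in range(256)
)
```

Because no tables need to be loaded for this mapping, it suits a stand-in that must work before resources are available.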