A basic understanding of coded character sets, or code pages, is essential for dealing with multiple languages in information processing systems.

A simple definition of a coded character set: it is a set of numbers, or code points, together with the character assigned to each of them.
These assignments are often documented as a grid of 16 rows by as many columns as needed on a single printed page. Each cell has an associated number (called a code position or code point) determined by its row and column location, and the character assigned to that code point is shown in the cell. Such a document was called a code page, and the term "code page" has become synonymous with coded character set. Figure 1 shows an example of an IBM EBCDIC code page in which the code point x'C1' is assigned the character 'A'.

Figure 1.
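
The idea of a code point and its assigned character can be tried out in a few lines of Python. This is only a minimal sketch under stated assumptions: Python's built-in `cp037` codec is used as a stand-in for the EBCDIC code page in Figure 1 (the figure's actual code page number is not stated here), and the helper names `grid_position` and `char_at` are hypothetical, introduced purely for illustration.

```python
# Minimal sketch: a coded character set as a mapping from code points to characters.
# Assumption: the "cp037" codec stands in for the EBCDIC code page of Figure 1.

def grid_position(code_point):
    """Split a single-byte code point into its (high, low) hexadecimal digits.

    In a 16-row grid, one digit selects the column and the other the row.
    """
    return code_point >> 4, code_point & 0xF


def char_at(code_point, code_page="cp037"):
    """Return the character that the code page assigns to a single-byte code point."""
    return bytes([code_point]).decode(code_page)


high, low = grid_position(0xC1)
print(f"x'C1': high digit {high:X}, low digit {low:X}, character {char_at(0xC1)!r}")

# The same character has a different code point in a different coded character set.
print(f"'A' in cp037: x'{'A'.encode('cp037')[0]:02X}'")
print(f"'A' in ASCII: x'{'A'.encode('ascii')[0]:02X}'")
```

The last two lines illustrate why the code page must be known before bytes can be interpreted: the character 'A' is x'C1' in this EBCDIC code page but x'41' in ASCII.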
Of course, larger coded character sets cannot be documented in the form of a single simple grid. The Unicode book (as well as the ISO/IEC 10646 standard), for example, shows the assignments as code charts spread over many pages, each chart having 16 rows and a varying number of columns (mostly 8) per page. Large coded character sets are also shown in many other formats.
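
The same row and column arithmetic extends to a large coded character set. The sketch below assumes that a 16-row chart labels each column with the code point minus its last hexadecimal digit, and each row with that last digit. The helper name `chart_cell` is hypothetical, and the character names come from Python's standard `unicodedata` module.

```python
import unicodedata


def chart_cell(code_point):
    """Return (column label, row digit) for a code point in a 16-row chart.

    Assumption: columns are labelled with the code point minus its last
    hexadecimal digit, and rows with that last digit.
    """
    return f"{code_point >> 4:03X}", f"{code_point & 0xF:X}"


# Sample characters: Latin capital A, e with acute accent, euro sign.
for ch in ("A", "\u00e9", "\u20ac"):
    cp = ord(ch)
    column, row = chart_cell(cp)
    print(f"U+{cp:04X}  column {column}  row {row}  {unicodedata.name(ch)}")
```

Because each chart page holds only a few columns, a character's page is found from the column label and its position on that page from the row digit.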


