Globalize your On Demand Business

How JavaScript supports the internationalization of Web content.
Text processing

JavaScript String objects store sequences of 16-bit Unicode code units (which do not have to be well-formed UTF-16). A small set of standard operations is available for them, such as concatenation and lowercasing. Single characters are simply represented as short strings, or occasionally as integers holding their Unicode code point values.
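As a minimal sketch of these points, the snippet below uses only standard String methods (concatenation, `toLowerCase`, `charCodeAt`, `String.fromCharCode`). The variable names and sample words are illustrative assumptions, not part of the original text.

```javascript
// A String is a sequence of 16-bit code units; it need not be well-formed UTF-16.
var word = "Stra\u00DFe";                 // "Straße", written with a \uxxxx escape
var lone = "\uD800";                      // an unpaired surrogate is still a valid String value

// Standard operations: concatenation and lowercasing.
var joined = word + "!";
var lower = "\u00C4RGER".toLowerCase();   // U+00C4 (Ä) becomes U+00E4 (ä)

// A single character as an integer, and back as a short string.
var unit = word.charCodeAt(4);            // code unit of "ß" at index 4
var ch = String.fromCharCode(0xDF);       // one-character string "ß"
```

Here `unit` is the integer 0xDF and `ch` is the corresponding one-character string, which shows the two representations of a single character mentioned above.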

Most "interesting" string manipulations are done with regular expressions, which are a basic feature of JavaScript. They support prefix/suffix tests, complicated pattern matches, search and replace, and tests for certain classes of characters. However, Unicode is supported only at the most basic level. Outside the ASCII range there are no predefined character classes, so a script has to define expressions with explicitly listed ranges of characters. For example, there are more than a thousand uppercase characters in Unicode; if a script needs to find them in a string, it has to define a regular expression with a character class that lists all of them.
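The following sketch illustrates what such an explicitly listed character class can look like. It is deliberately incomplete: it covers only ASCII, Latin-1, basic Greek, and basic Cyrillic uppercase letters, and the names `upperSample`, `findUppercase`, and `startsWithHttp` are hypothetical helpers introduced for this example.

```javascript
// Partial, illustrative list of uppercase ranges (NOT all uppercase characters in Unicode):
//   A-Z                          ASCII
//   \u00C0-\u00D6 \u00D8-\u00DE  Latin-1 (skips U+00D7, the multiplication sign)
//   \u0391-\u03A1 \u03A3-\u03A9  basic Greek (U+03A2 is unassigned)
//   \u0410-\u042F                basic Cyrillic
var upperSample = /[A-Z\u00C0-\u00D6\u00D8-\u00DE\u0391-\u03A1\u03A3-\u03A9\u0410-\u042F]/g;

// Return every character of `text` that matches the sample class, in order.
function findUppercase(text) {
  var matches = text.match(upperSample);
  return matches ? matches : [];
}

// A simple prefix test, also written as a regular expression.
function startsWithHttp(text) {
  return /^http:/.test(text);
}

var found = findUppercase("\u00C4rger in \u03A9mega und \u041C\u043E\u0441\u043A\u0432\u0430");
```

With this input, `found` holds the three characters U+00C4, U+03A9, and U+041C. A complete uppercase class would need many more ranges, which is the maintenance burden the paragraph above describes.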

The ECMAScript language is somewhat limited in that the specification is written entirely in terms of 16-bit Unicode code units. Supplementary characters (those with code point values above 0xffff) are represented in UTF-16 with pairs of special "surrogate" code units (or with pairs of \uxxxx escapes in a script), and can be used in strings, but not in identifiers or in a meaningful way in regular expressions. Historically, this is similar to other early Unicode implementations because supplementary characters were not assigned until Unicode 3.1 (and the current edition 3 of ECMAScript predates that). Only a small minority of texts requires any supplementary characters, but some of the 45,000 supplementary Chinese characters are increasingly in demand. A script has to use custom functions to handle them.
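One possible shape for such custom functions is sketched below, assuming only `charCodeAt` and `String.fromCharCode` are available. The helper names `codePointAtIndex`, `stringFromCodePoint`, and `countCodePoints` are hypothetical, and the sketch treats unpaired surrogates as ordinary code units.

```javascript
// Read the code point that starts at `index`, combining a surrogate pair if present.
function codePointAtIndex(str, index) {
  var hi = str.charCodeAt(index);
  if (hi >= 0xD800 && hi <= 0xDBFF && index + 1 < str.length) {
    var lo = str.charCodeAt(index + 1);
    if (lo >= 0xDC00 && lo <= 0xDFFF) {
      return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    }
  }
  return hi; // BMP character or unpaired surrogate
}

// Build a string from one code point, emitting a surrogate pair above 0xFFFF.
function stringFromCodePoint(cp) {
  if (cp <= 0xFFFF) {
    return String.fromCharCode(cp);
  }
  var offset = cp - 0x10000;
  return String.fromCharCode(0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF));
}

// Count code points rather than 16-bit code units.
function countCodePoints(str) {
  var count = 0;
  for (var i = 0; i < str.length; i++) {
    if (codePointAtIndex(str, i) > 0xFFFF) {
      i++; // skip the trailing surrogate of the pair
    }
    count++;
  }
  return count;
}

var han = stringFromCodePoint(0x20000);   // a supplementary Chinese character
var text = "a" + han + "b";
```

In this example `han` has a `length` of 2 because it is stored as the surrogate pair `\uD840\uDC00`, while `countCodePoints(text)` reports 3 characters for a string whose `length` is 4.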

