Unicode has provided a foundation for communicating textual data. However, the locale-dependent data used to drive features such as collation and date/time formatting may be incorrect or inconsistent between systems. Such discrepancies are not only irritating for users; they can also prevent accurate data transfer.
The Common Locale Data Repository (CLDR) is a step towards solving these problems: it provides an interchange format for locale data and is building up a repository of such data. The first version of the CLDR, Version 1.0, was released on January 16, 2004 and is available for use. The data files are currently hosted by IBM (links to references are at the end of this article).
Traditionally, the data associated with locales provides support for formatting and parsing dates, times, numbers, and currencies; for the default units of currency; for measurement units; and for collation (sorting). It also includes translated names for time zones, languages, countries, and scripts. CLDR supplies locale data for a wide variety of these types of information. In the future it should also supply data for text boundaries (character, word, line, and sentence), for text transformations (including transliterations), and for other services.
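To make the role of locale data concrete, the following minimal Python sketch formats and parses a number using a tiny hand-written table of number symbols. The table, the `NumberSymbols` class, and the `format_number` and `parse_number` helpers are hypothetical names invented for this illustration; they are not read from CLDR files and do not reflect the CLDR interchange format. The sketch only shows how the same text can be interpreted differently when two systems disagree about locale data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberSymbols:
    """Separators a locale uses when writing numbers."""
    decimal: str
    group: str


# Hypothetical, hand-written table for illustration only (not CLDR data files).
LOCALE_DATA = {
    "en_US": NumberSymbols(decimal=".", group=","),
    "de_DE": NumberSymbols(decimal=",", group="."),
}


def format_number(value: float, locale_id: str) -> str:
    """Format value with two decimals using the locale's separators."""
    symbols = LOCALE_DATA[locale_id]
    # Start from the ASCII form ("1,234.50"), then swap in the locale symbols.
    # A placeholder keeps the two replacements from clobbering each other.
    text = f"{value:,.2f}"
    text = text.replace(",", "\x00").replace(".", symbols.decimal)
    return text.replace("\x00", symbols.group)


def parse_number(text: str, locale_id: str) -> float:
    """Parse text under the assumption that it was written for locale_id."""
    symbols = LOCALE_DATA[locale_id]
    return float(text.replace(symbols.group, "").replace(symbols.decimal, "."))


if __name__ == "__main__":
    german = format_number(1234.56, "de_DE")
    print(german)                         # 1.234,56
    print(parse_number(german, "de_DE"))  # 1234.56
    print(parse_number(german, "en_US"))  # 1.23456 (wrong locale, wrong value)
```

The last line shows the data-transfer problem in miniature: text written with German conventions and read with US English conventions parses without any error but yields a different number.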
Examples of platforms with their own locale data are ICU, OpenOffice.org, and POSIX and POSIX-like operating systems such as Linux, Solaris, and AIX.