Globalize your On Demand Business

The Mozilla project provides globalized, open-source tools for Web browsing, communications and composition.
Internationalization

In addition to letting users switch the user interface to a different language with a full product installation, Mozilla can adapt to changing HTML specifications for Web page encoding in text display and input. Mozilla widgets and HTML rendering support the input and display of most languages used on the Web. Mozilla can display all of the HTML 4.0 character entity references and numeric character references independently of the document's character encoding. Mozilla can also display hexadecimal character references, using Unicode as its internal encoding. All internal processing is done in UTF-16. Fig. 5 shows the Mozilla data flow:

Fig. 5: Data flow in Mozilla.
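As a rough illustration of why character references do not depend on the document's encoding, the Python sketch below decodes a named, a decimal and a hexadecimal reference with the standard library's `html.unescape`. This is not Mozilla code, and the sample string is made up for the example.

```python
import html

# A named entity, two decimal/hexadecimal references for CJK characters and a
# hexadecimal reference for a Hebrew letter. The markup itself is pure ASCII,
# so it reads the same in any ASCII-compatible document encoding.
markup = "caf&eacute; &#20013;&#x6587; &#x5E9;"

# Resolve every reference to the Unicode character it names.
text = html.unescape(markup)

# Encode the result as UTF-16 to see the 16-bit code units that a
# UTF-16 based engine would hold internally (2 bytes per unit here).
utf16_bytes = text.encode("utf-16-be")
utf16_units = len(utf16_bytes) // 2

print(text)
print(len(text), "characters,", utf16_units, "UTF-16 code units")
```

Because the references are resolved to Unicode code points before display, the same markup produces the same characters whether the page is served as ISO-8859-1, Shift_JIS or UTF-8.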

Mozilla can use characters from several Unicode ranges to display a single Web page, and appears to be able to query the operating system to identify fonts that include characters from any required Unicode range. Mozilla URL bookmarks are kept in a UTF-8 encoded HTML file, and the browsing history data are kept in UTF-16. The Mozilla International Library (libi18n) is the core component for rendering and processing Unicode and other character encodings. Development of the library has a long history, going back to Netscape Navigator 1.1 in 1995. Libi18n provides the underlying internationalization utility functions that Mozilla uses to support international Web browsing and Internet Mail/News functionality. The functions that libi18n provides to other Mozilla modules include:

  • Character Code Conversion
  • Finding Character Boundaries
  • Handling I18N related HTTP Headers
  • Line/Word Breaking (for text layout support)
  • Locale Sensitive Operations (collation, date/time formatting)
  • Mail/News Header Processing
  • Platform Independent String Resources
  • String Comparison
  • Unicode String Functions
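The first item in the list, character code conversion, can be sketched in a few lines of Python. The example uses Python's built-in codecs rather than libi18n, and the `convert` helper is a hypothetical name introduced only for this illustration.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one character encoding to another via Unicode."""
    # Step 1: decode the legacy bytes into Unicode text.
    text = data.decode(source)
    # Step 2: encode the Unicode text in the target encoding.
    return text.encode(target)


# "Japanese language" written with escapes so the file itself stays ASCII.
sample = "\u65e5\u672c\u8a9e"

# Bytes as they might arrive from a server using a legacy Japanese encoding.
legacy_bytes = sample.encode("shift_jis")

# The same text in the two storage encodings the article mentions.
as_utf8 = convert(legacy_bytes, "shift_jis", "utf-8")       # bookmarks file
as_utf16 = convert(legacy_bytes, "shift_jis", "utf-16-le")  # history data

print(len(legacy_bytes), len(as_utf8), len(as_utf16))
```

Routing every conversion through Unicode means that each supported encoding needs only one decoder and one encoder, rather than a converter for every pair of encodings.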

To see Unicode characters displayed on a Web page, view our Unicode samples page.

Recent versions of Mozilla include support for Hebrew, Arabic, and other bidirectional (BiDi) languages, in which text is displayed and input in a bidirectional format. The actual implementation of the Unicode BiDi Algorithm was taken from IBM's International Components for Unicode (ICU), with some necessary modifications. Fig. 6 shows a BiDi (Arabic language) user registration page in Mozilla 1.6:

Fig. 6: Arabic language Web page in Mozilla.
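The Unicode BiDi Algorithm starts from the directional class that the Unicode Character Database assigns to each character. The Python sketch below shows only that first step, using the standard `unicodedata` module. It is neither the full algorithm nor the ICU-derived code in Mozilla, and the helper names are hypothetical.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


def has_rtl(text):
    """True if the text contains any strongly right-to-left character."""
    # "R" covers Hebrew-style letters, "AL" covers Arabic letters.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


# Latin letters, a Hebrew word, an Arabic word and European digits.
sample = "abc \u05e9\u05dc\u05d5\u05dd \u0633\u0644\u0627\u0645 123"

for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")

print(has_rtl(sample), has_rtl("plain ASCII"))
```

A full implementation goes on to resolve embedding levels and reorder runs of characters for display, which is the part Mozilla took from ICU.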

Mozilla also supports Internationalized Domain Names, which means that there is no need to use a plug-in to process non-ASCII domain names. An Internationalized Domain Name (IDN) is a domain or host name that uses non-ASCII characters. Until recently, domain names allowed only a subset of 7-bit ASCII characters.
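To make the idea concrete, the Python sketch below converts a non-ASCII host name to the ASCII form that is sent over DNS and back again. It relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules, and says nothing about how Mozilla performs the conversion. The helper names and the sample host name are assumptions for the example.

```python
def to_ascii(hostname: str) -> str:
    """Convert a Unicode host name to its ASCII-compatible (Punycode) form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Convert an ASCII-compatible host name back to Unicode."""
    return hostname.encode("ascii").decode("idna")


# "b\u00fccher" is the German word for "books", with a u-umlaut.
unicode_name = "b\u00fccher.example"

ascii_name = to_ascii(unicode_name)
print(ascii_name)              # xn--bcher-kva.example
print(to_unicode(ascii_name))  # the original Unicode name
```

Labels that are already plain ASCII pass through unchanged, so only the non-ASCII parts of a name gain the `xn--` prefix.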

As the Internet has spread to non-English-speaking people around the world, it has become increasingly clear that forcing them to use domain names written only in a subset of the Latin alphabet is not ideal. Mozilla supports international keyboard layouts on all platforms and offers complete Global IME support on Windows. Because Mozilla uses Unicode throughout and draws its own text fields rather than using the ones supplied by the operating system, there are virtually no Global IME input limitations for text fields under any version of Windows.

The latest Mozilla build integrates the Google translation service, which can currently translate Web pages between English, French, German, Spanish, Portuguese and Italian. Fig. 7 shows an original French Web page followed by its English translation within Mozilla.

Fig. 7: French Web page (above) and page translated into English.

Mozilla provides good default fonts that match the detected language of the retrieved document, and has built-in intelligent fallback algorithms for missing glyphs. Mozilla can also list the fonts installed on a user's system at runtime. From the end user's perspective, Mozilla can automatically choose fonts for most Unicode ranges and writing systems, but for some encodings you can specify the font that you want Mozilla to use, as shown in Fig. 8.

Fig. 8: Font selection.

Mozilla provides powerful algorithms for detecting the correct character encoding. The encoding detection methods include:

  • HTTP response header charset
  • Web document charset, i.e. the charset information in an HTML meta http-equiv element
  • Cached charset: from cache or bookmarks
  • Mozilla default view charset
  • Detect by character analysis
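The methods above can be combined into a simple fallback chain, sketched here in Python. The source lists the methods but does not state their priority, so the order used below is an assumption, with character analysis tried before the default so that it has a chance to run. The function names and regular expressions are hypothetical, and the "analysis" step is only a UTF-8 validity check, far cruder than a real detector.

```python
import re

# Matches charset=... in a Content-Type header value.
HEADER_CHARSET = re.compile(r'charset\s*=\s*"?([A-Za-z0-9_.:-]+)', re.IGNORECASE)
# Matches charset=... inside a <meta> element near the top of the document.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def guess_by_analysis(body: bytes):
    """Crude character analysis: report UTF-8 if the bytes decode cleanly."""
    try:
        body.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None


def pick_charset(content_type, body: bytes, cached=None, default="iso-8859-1"):
    """Return the first charset found by the methods, in the assumed order."""
    # 1. HTTP response header charset.
    if content_type:
        match = HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    # 2. Charset declared in the document's meta element.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Charset remembered from the cache or a bookmark.
    if cached:
        return cached.lower()
    # 4. Character analysis, then 5. the default view charset.
    return guess_by_analysis(body) or default


print(pick_charset("text/html; charset=Shift_JIS", b""))
```

Explicit declarations are checked first in this sketch because they are cheap to read, while byte-level analysis requires scanning the content and can still guess wrong.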

Mozilla caches the five most recently used encodings for fast encoding selection. The encoding menu is also dynamic, and additional encodings can be added via Mozilla XPCOM (Cross Platform Component Object Model, a framework for writing cross-platform, modular software) pluggable components. Mozilla allows the user to change the Web page encoding scheme by selecting a specific language, as shown in Fig. 9.

Fig. 9: Web page encoding.
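A cache of the five most recently used encodings can be modeled as a small most-recently-used list. The Python sketch below is one plausible way to keep such a list and is not taken from the Mozilla source. The class name and the sample encodings are assumptions.

```python
class RecentEncodings:
    """Keep the most recently used encoding names, newest first."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self._items = []

    def use(self, encoding):
        """Record that an encoding was just selected."""
        name = encoding.lower()
        # Move an existing entry to the front instead of duplicating it.
        if name in self._items:
            self._items.remove(name)
        self._items.insert(0, name)
        # Drop the oldest entries once the list exceeds its capacity.
        del self._items[self.capacity:]

    def items(self):
        """Return a copy of the list, newest first."""
        return list(self._items)


recent = RecentEncodings()
for name in ["UTF-8", "Shift_JIS", "EUC-KR", "Big5", "windows-1252", "KOI8-R"]:
    recent.use(name)

# Selecting an encoding already in the list moves it to the front.
recent.use("Big5")
print(recent.items())
```

With a capacity of five, the sixth distinct encoding pushes the oldest one ("utf-8" in this run) out of the list.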

