Globalize your On Demand Business

Beyond the sheer number of languages and scripts involved, Indic languages challenge developers because of the complexity of their orthography.
Vowel modification in Indic languages

Below are examples of vowel modification and conjunct-formation in the Unicode representation.

Hindi examples
1. Consonant + vowel sign
ह (consonant 'ha') + ◌ि (vowel sign 'i') = हि (syllable 'hi')
2. Consonant + consonant
न (consonant 'na') + ◌् ('halant') + द (consonant 'da') = न्द (syllable 'nda')
Note the ligature formation: the two consonants render as a single conjunct glyph.
3. Consonant + consonant + vowel sign
न (consonant 'na') + ◌् ('halant') + द (consonant 'da') + ◌ी (vowel sign 'ii') = न्दी (syllable 'ndii')
4. Combination of 1 + 3
ह (consonant 'ha') + ◌ि (vowel sign 'i') + न (consonant 'na') + ◌् ('halant') + द (consonant 'da') + ◌ी (vowel sign 'ii') = हिन्दी (word 'hindii')
Note that the word Hindi, when written in Devanagari script, is spelled with a long 'ii' at the end.
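The Hindi examples can be reproduced in code by concatenating Unicode code points. The following is a minimal Python sketch, assuming the standard Devanagari code points for the characters named above; the variable names and the `describe` helper are illustrative and not part of the original article.

```python
import unicodedata

# Devanagari code points for the characters named in the examples
# (assumed from the Unicode Devanagari block).
HA, SIGN_I = "\u0939", "\u093F"
NA, HALANT, DA, SIGN_II = "\u0928", "\u094D", "\u0926", "\u0940"

hi = HA + SIGN_I                    # example 1: syllable 'hi'
ndii = NA + HALANT + DA + SIGN_II   # example 3: syllable 'ndii'
hindii = hi + ndii                  # example 4: word 'hindii'

def describe(text):
    """List each stored code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for line in describe(hindii):
    print(line)
```

The stored string holds six code points in logical order, even though a renderer draws the vowel sign 'i' to the left of 'ha' and merges 'na' and 'da' into one conjunct. Unicode names the halant "SIGN VIRAMA".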
Tamil example
த (consonant 'ta') + ம (consonant 'ma') + ◌ி (vowel sign 'i') + ழ (consonant 'zha') + ◌் (special modifier 'pulli') = தமிழ் (word 'Tamizh')
Gujarati example
ગ (consonant 'ga') + ◌ુ (vowel sign 'u') + જ (consonant 'ja') + ર (consonant 'ra') + ◌ા (vowel sign 'aa') + ત (consonant 'ta') + ◌ી (vowel sign 'ii') = ગુજરાતી (word 'Gujaratii')
Punjabi example
ਪ (consonant 'pa') + ◌ੰ (special modifier 'tippi') + ਜ (consonant 'ja') + ◌ਾ (vowel sign 'aa') + ਬ (consonant 'ba') + ◌ੀ (vowel sign 'ii') = ਪੰਜਾਬੀ (word 'Punjabii')
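The same pattern holds across scripts: base consonants are stored as letters, while vowel signs and special modifiers are stored as dependent marks that follow them. The Python sketch below illustrates this with the Tamil, Gujarati, and Punjabi words above, assuming the standard code points for each script; `WORDS` and `split_roles` are hypothetical names used only for illustration.

```python
import unicodedata

# Words from the examples above, written as code point sequences
# (assumed from the Tamil, Gujarati, and Gurmukhi Unicode blocks).
WORDS = {
    "Tamil": "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD",
    "Gujarati": "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0",
    "Punjabi": "\u0A2A\u0A70\u0A1C\u0A3E\u0A2C\u0A40",
}

def split_roles(word):
    """Return (base letters, dependent signs) using Unicode general categories."""
    letters = [ch for ch in word if unicodedata.category(ch).startswith("L")]
    signs = [ch for ch in word if unicodedata.category(ch).startswith("M")]
    return letters, signs

for script, word in WORDS.items():
    letters, signs = split_roles(word)
    print(script, word, len(letters), "letters,", len(signs), "signs")
```

Checking the general category (letters start with `L`, marks with `M`) is a coarse test. It does not distinguish a vowel sign from a modifier such as the pulli or tippi.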

Ligatures
As shown above, in many instances Indic syllables form new glyphs. These glyphs are called ligatures.

Character reshaping often follows simple rules. In some cases, however, the resulting ligatures bear no visual resemblance to their constituents, and an untrained reader cannot identify them.

Unlike bidirectional languages, no major layout transformation is required for Indic scripts when the Unicode approach is followed. Since character reshaping occurs at the level of the individual conjunct cluster, the scope of complexity is localized. This means that once the behavior of a specific cluster is known, it is relatively easy to render it as expected.
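To illustrate how localized this is, the following Python sketch splits a Devanagari string into conjunct clusters using a deliberately simplified rule: a dependent mark joins the preceding cluster, and a character that follows a halant joins it as well. This is an assumption made for illustration only. Production shaping engines apply far more detailed per-script rules, and `conjunct_clusters` is a hypothetical helper, not an existing API.

```python
import unicodedata

HALANT = "\u094D"  # Devanagari sign virama (halant)

def conjunct_clusters(text):
    """Group code points into clusters under a simplified rule (see lead-in)."""
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        # Join the previous cluster if this is a dependent mark, or if the
        # previous cluster ends in a halant (the consonant forms a conjunct).
        joins_previous = bool(clusters) and (is_mark or clusters[-1].endswith(HALANT))
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# The word 'hindii' splits into two clusters: 'hi' and 'ndii'.
print(conjunct_clusters("\u0939\u093F\u0928\u094D\u0926\u0940"))
```

Each resulting cluster can be reshaped without reference to its neighbors, which is why the rendering complexity stays local.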

Note: Bidirectional processing will be required for languages such as Kashmiri, Sindhi and Urdu when written in the Urdu script. These cases are not addressed in this article.

Continue to "Storage, input, and display of Indic text"

