Linguini is a vector-space-based categorizer used for high-precision language identification.
Language identification is a critical component in producing automated translations for the vast quantities of Web-based documents, and Linguini is a significant step toward achieving accurate identification. During testing, Linguini identified document languages with 100% accuracy after scanning as little as five to ten percent of an average document.
Click here to download Linguini: Language Identification for Multilingual Documents.
For more information about Linguini, email global@ibm.com or visit the globalization discussion forum.

The document is in Adobe (PDF) format (231k). Click here to download a free copy.
This is a reprint of a paper published in the Journal of Management Information Systems, Vol. 16, No. 3, Winter 2000.