Decoding Unicode: An overview of Unicode Transformation Formats

Once you have decided to use the Unicode standard to encode character data within an application, the next step is to decide which of the Unicode encoding schemes you will use to store that data. Although Unicode is a single unified standard for encoding the wide range of characters used in languages today, there are three widely accepted schemes, or Unicode transformation formats (UTFs), that you might use when processing Unicode data: UTF-8, UTF-16, and UTF-32. Each has inherent advantages and disadvantages, depending on the types of characters you intend to handle and whether memory or disk space is plentiful. In this article, we take a closer look at the structure of Unicode and at each of the transformation formats.
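
To make the trade-off concrete, here is a minimal sketch in Python that counts how many bytes a single character occupies in each of the three formats. The helper name encoded_sizes and the sample characters are illustrative assumptions, not part of the original article. The big-endian codec variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each transformation format.

    Hypothetical helper for illustration. The "-be" codecs are chosen so that
    Python does not prepend a byte order mark, which would inflate the counts.
    """
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-be")),
        "UTF-32": len(text.encode("utf-32-be")),
    }


# Sample characters chosen for illustration only.
samples = {
    "Latin capital A (U+0041)": "A",
    "Greek capital omega (U+03A9)": "\u03a9",
    "CJK ideograph (U+4E2D)": "\u4e2d",
    "Musical G clef (U+1D11E)": "\U0001d11e",
}

for label, character in samples.items():
    print(label, encoded_sizes(character))
```

In this sketch the Latin letter takes 1, 2, and 4 bytes in UTF-8, UTF-16, and UTF-32 respectively, while the CJK ideograph takes 3, 2, and 4. This is the kind of difference that makes the choice depend on which characters dominate your data.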

John Emmons
Globalization Architect
AIX Operating System


Topic contents

Introduction

General structure of Unicode

Multiple formats

Summary
