NetRexx

Technical detail

Characters and encodings

In the definition of a programming language it is important to emphasise the distinction between a character and the coded representation[1] (encoding) of a character. The character 'A', for example, is the first letter of the English (Roman) alphabet, and this meaning is independent of any specific coded representation of that character. Different coded character sets (such as the ASCII[2] and EBCDIC[3] codes) use quite different encodings for this character (decimal values 65 and 193, respectively).
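
The distinction can be made concrete with a short sketch. It is written in Python purely for illustration and is not NetRexx. The helper name encoded_value is hypothetical, and the use of Python's "cp037" codec to stand in for EBCDIC is an assumption, since EBCDIC exists in several code-page variants.

```python
# Illustration only: one abstract character, two coded representations.
# Assumption: the "cp037" codec is used as a representative EBCDIC code page.


def encoded_value(character: str, encoding: str) -> int:
    """Return the decimal value of a character's single-byte encoding."""
    encoded = character.encode(encoding)
    if len(encoded) != 1:
        raise ValueError("expected a single-byte encoding")
    return encoded[0]


if __name__ == "__main__":
    # The character is the same; only its coded representation differs.
    print("ASCII :", encoded_value("A", "ascii"))
    print("EBCDIC:", encoded_value("A", "cp037"))
```

Decoding works the same way in reverse: the byte value alone does not identify a character until the coded character set is known.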

Except where stated otherwise, this document uses characters to convey meaning and not to imply a specific character code (the exceptions are certain operations that specifically convert between characters and their representations). At no time is NetRexx concerned with the glyph (actual appearance) of a character.

Character sets

Programming in the NetRexx language can be considered to involve the use of two character sets. The first is used for expressing the NetRexx program itself, and is the relatively small set of characters described in the next section. The second is the set of characters that can be used as character data by a particular implementation of a NetRexx language processor. This character set may be limited in size (sometimes to 256 different characters, which have a convenient 8-bit representation), or it may be much larger. Version 1.0 of the Unicode[4] character set, for example, allows for 65536 characters, each encoded in 16 bits.
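
The sizes quoted above follow directly from the number of bits in each encoding. The sketch below is again Python for illustration only, not NetRexx. The function names are hypothetical, and Latin-1 and UTF-16 are assumed here as convenient examples of an 8-bit and a 16-bit encoding; a NetRexx implementation is not required to use either.

```python
# Illustration only: how encoding width bounds the size of a character set.
# Assumptions: "latin-1" represents a 256-character data character set and
# "utf-16-be" represents a 16-bit encoding.


def character_capacity(bits: int) -> int:
    """Number of distinct characters a fixed-width encoding can represent."""
    return 2 ** bits


def fits_in_character_set(character: str, encoding: str) -> bool:
    """Report whether the character has a representation in the encoding."""
    try:
        character.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print(character_capacity(8))    # 256
    print(character_capacity(16))   # 65536
    # Greek capital omega is outside the 256-character Latin-1 set.
    print(fits_in_character_set("A", "latin-1"))
    print(fits_in_character_set("\u03a9", "latin-1"))
    print(len("\u03a9".encode("utf-16-be")), "bytes in the 16-bit encoding")
```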

Usually, most or all of the characters in the second (data) character set are also allowed within a NetRexx program, but only within commentary or immediate (literal) data.

The NetRexx language explicitly defines the first character set, in order that programs will be portable and understandable; at the same time it avoids restrictions due to the language itself on the character set used for data. However, where the language itself manipulates or inspects the data (as when carrying out arithmetic operations), there may be requirements on the data character set (for example, numbers can only be expressed if there are digit characters in the set).

Footnotes:

[1]  These terms have the meanings defined by the International Organization for Standardization in ISO 2382, Data processing -- Vocabulary.

[2]  American Standard Code for Information Interchange.

[3]  Extended Binary Coded Decimal Interchange Code.

[4]  The Unicode Standard: Worldwide Character Encoding, Version 1.0. Volume 1, ISBN 0-201-56788-1, 1991, and Volume 2, ISBN 0-201-60845-6, 1992, Addison-Wesley, Reading, MA.
