Extending the Boundaries with XML Support on TPF
Carrie J. Evans, IBM TPF ID Core Team
Imagine a world in which the relationships between items are both necessary
and optional; a world that allows you to pull a marble from the bottom
of a jar without having any impact on the other marbles. Imagine the complexity
of consistency erased.
Today, great advancements in technology introduce new ideas, theories,
languages, terminology, and acronyms. Extensible Markup Language (XML)
is one of these new technologies that holds the potential of being a great
solution for sharing information across different computing platforms.
While there is no guarantee that this is the last solution, XML does offer
a seemingly revolutionary approach to data categorization and communication.
The relationships between pieces of data are defined, yet flexible so data
can more easily be changed, moved, and deleted. XML simplifies data sharing
between programs, platforms, and users. The following paragraphs will help
you understand basic XML concepts, the XML support that is added to the
TPF 4.1 system when you apply APAR PJ27634, and how XML can favorably
affect the TPF community.
What Is XML?
XML is a markup language that combines the power of Standard Generalized
Markup Language (SGML) and the simplicity of Hypertext Markup Language
(HTML). Both XML and HTML are children of SGML:
-
HTML is an application of SGML that is used to define the
display properties of data for use primarily on the Internet. It has a
quiet simplicity that allows for the formatting of data using predefined
tags.
-
XML is a subset of SGML that can be used to categorize data
with regard to either presentation or data retrieval. XML retains the restrictive
and powerful nature of SGML but, as a subset developed by experienced SGML
users, is easier to write and maintain. XML can be used to display Web
pages if the browser has the ability to parse (or read) the XML language.
It can also be used to manage data for easy retrieval. XML is already being
used for publishing as well as for data storage and retrieval, data interchange
between heterogeneous platforms, data transformations, and data displays.
As it evolves and becomes more powerful, XML may allow for single-source
data retrieval and data display.
XML is a recommendation from the World Wide Web Consortium (W3C). (Go to
http://www.w3.org for more information about the W3C.) "A W3C Recommendation
is a technical report that is the end result of extensive consensus building
inside and outside of the W3C about a particular technology or policy.
The W3C considers that the ideas or technology specified by a Recommendation
are appropriate for widespread deployment and promote W3C's mission" (go
to http://www.w3.org/Consortium/Process-20010208/tr.html).
How Does XML Work?
XML documents are made up of elements, attributes, and values:
-
An element is an opening tag, a closing tag, and the contents between the
two. In the following example, there are three elements: name, first, and last:
<name>
  <first>Mickey</first>
  <last>Mouse</last>
</name>
-
An attribute is a name-value pair specified as part of the opening tag
of an element. For example, countryCode, areaCode, and pNumber are all
attributes of the element PhoneNumber:
<PhoneNumber countryCode="44" areaCode="340" pNumber="635 3343" />
-
A value is the data contained between an opening tag and a closing tag,
or the part of an attribute enclosed in quotation marks (" ").
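To see how a program might tell these three pieces apart, here is a minimal sketch that reads the two samples above. It uses Python's standard xml.etree.ElementTree module purely for illustration (it is not the TPF parser), and the enclosing contact element is a hypothetical root added so that both samples fit in one document.

```python
import xml.etree.ElementTree as ET

# The two samples from the text, wrapped in a hypothetical <contact> root.
DOCUMENT = """
<contact>
  <name>
    <first>Mickey</first>
    <last>Mouse</last>
  </name>
  <PhoneNumber countryCode="44" areaCode="340" pNumber="635 3343" />
</contact>
"""

root = ET.fromstring(DOCUMENT)

# Elements: every tag pair (or empty-element tag) in document order.
element_names = [element.tag for element in root.iter()]

# Values held as element content.
first = root.findtext("name/first")
last = root.findtext("name/last")

# Attributes: the name-value pairs on the PhoneNumber opening tag.
phone_attributes = dict(root.find("PhoneNumber").attrib)

print(element_names)
print(first, last)
print(phone_attributes)
```

The element content and the attribute values both come back as plain strings; deciding whether a given piece of data belongs in an element or an attribute is a design choice for whoever defines the document type.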
An XML document can be well-formed, valid, or neither. A well-formed
document is one that follows the basic rules for writing XML markup; it
may be invalid against a specified Document Type Definition (DTD), or it
may simply not have been checked for validity. A well-formed document
adheres to the following rules:
-
Every XML document must have exactly one root element.
-
All tags must be opened and closed. XML also allows you to write empty
elements by adding an ending slash before the closing bracket; for example,
<address />.
-
Tags must follow nesting rules.
-
Either single quotation marks (' ') or double quotation marks (" ") must
surround the value of an attribute.
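The rules above can be tried out with any conforming parser, because a parser must reject a document that is not well-formed. The following sketch assumes Python's standard xml.etree.ElementTree module (used here only for illustration); the helper name is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; the labels describe what each string demonstrates.
samples = {
    "single root, tags closed": "<name><first>Mickey</first></name>",
    "empty element": "<address />",
    "two root elements": "<first>Mickey</first><last>Mouse</last>",
    "unclosed tag": "<name><first>Mickey</name>",
    "bad nesting": "<name><first>Mickey</name></first>",
    "unquoted attribute": "<PhoneNumber areaCode=340 />",
    "single-quoted attribute": "<PhoneNumber areaCode='340' />",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first, second, and last samples are accepted. Note that this checks well-formedness only; it says nothing about whether a document is valid against a DTD.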
A valid XML document, which must also be (by definition) well-formed,
is one that conforms to the rules of the associated DTD. A DTD is a type
of schema that defines the acceptable element tags and attributes (and
nesting order for those elements and attributes). In other words, it defines
both the vocabulary and the syntax for an XML document type. A DTD is written
in a specific and strict syntax and can either reside inside the XML document
or in a separate file. While a DTD is not required for a well-formed
XML document, it is necessary if you want to validate your XML document.
The following figure shows a DTD and an XML document that is based on
that DTD. The encoding declaration identifies the code page in which the
document is written. The root element must be identified properly, and all
other elements and attributes must be nested inside it. The DTD declaration
in the XML document identifies the path name of the DTD against which the
document is validated. Finally, the figure shows how the declarations of
two different elements in the DTD correspond to their appearance in the
XML document.
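A much smaller pairing of a DTD and a document can be sketched as follows. This is a hypothetical example, not the pnr.xml document discussed in this article: the DTD is placed inside the document rather than in a separate file, and Python's standard xml.dom.minidom module is used only for illustration. Be aware that minidom is a nonvalidating parser, so it reads the DTD without enforcing it.

```python
from xml.dom import minidom

# Hypothetical document: an XML declaration with an encoding, a DTD held
# inside the document, and a root element that contains everything else.
DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE name [
  <!ELEMENT name (first, last)>
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
]>
<name>
  <first>Mickey</first>
  <last>Mouse</last>
</name>
"""

doc = minidom.parseString(DOCUMENT)

# The DOCTYPE declaration names the root element that the DTD expects.
declared_root = doc.doctype.name
actual_root = doc.documentElement.tagName

# This comparison is only the most basic consistency check; it is not
# validation, which requires a validating parser.
print(declared_root, actual_root, declared_root == actual_root)
```

Each ELEMENT declaration in the DTD corresponds to an element in the document: name must contain first followed by last, and each of those holds character data.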
While there are a variety of editors available for writing XML documents
and DTDs, you can also write them in any basic text editor. However, reading
an XML document can be slightly more difficult because not all Web browsers
are able to parse and display XML. Microsoft Internet Explorer 5 can parse
some XML documents, depending on their encoding. Our XML document, pnr.xml,
would be shown in this browser as follows, indicating that the XML document
is well-formed:
Working with XML is a process of developing the language that most closely
maps to the information you want to store. Oftentimes, the process of planning
which elements to declare in your DTD will prove to be much more difficult
than the process of writing the actual XML document. However, once the
initial DTD is written, it is easy to write and change many XML documents
based on the same DTD so that your data can be displayed, read by an application,
or stored in a database.
The W3C Web site (http://www.w3.org/) contains the complete XML specification.
XML.com (http://www.xml.com/) has an annotated version of the specification
at http://www.xml.com/axml/testaxml.htm.
XML Support on TPF
The XML parser (APAR PJ27634) is the XML Parser for C++ (XML4C) Version
3.1.2 ported to the TPF 4.1 system. The parser is XML Version 1.0 compliant
and allows TPF 4.1 applications written in the C++ language to do the following:
-
Parse XML documents using the Document Object Model (DOM) Version 1.0 specification.
Notes:
-
The DOM can also assist in using TPF applications to modify and create
XML documents.
-
Some APIs from the DOM Version 2.0 specification are provided for experimental
use. They are not supported for production work.
-
Parse XML documents using the Simple API for XML (SAX) Version 1.0 specification.
-
Parse XML documents with or without validation against a specified DTD.
Applications on the TPF 4.1 system interact with XML documents that are
in the file system, coming in through standard input (stdin), or residing
in memory. This interaction is made possible through application
programming interfaces (APIs) specified by either the DOM or the SAX
specification, and parsing can be either nonvalidating or validating
against a DTD. The API definitions are contained in a set of header files
that application programmers need to have in their #include (or search) path.
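The difference between the two API styles is easier to see side by side. The following sketch parses one in-memory document both ways, using Python's standard xml.dom.minidom and xml.sax modules as stand-ins; it does not show the XML4C header files or C++ classes, and the NameHandler class is a hypothetical name chosen for this example.

```python
import xml.sax
from xml.dom import minidom

DOCUMENT = b"<name><first>Mickey</first><last>Mouse</last></name>"

# DOM: the whole document is built as a tree in memory, then navigated.
dom = minidom.parseString(DOCUMENT)
dom_first = dom.getElementsByTagName("first")[0].firstChild.data


class NameHandler(xml.sax.ContentHandler):
    """SAX: the parser calls these methods as it reads the document."""

    def __init__(self):
        super().__init__()
        self.current = None
        self.values = {}

    def startElement(self, name, attrs):
        self.current = name

    def characters(self, content):
        # Character data may arrive in several pieces, so accumulate it.
        if self.current in ("first", "last"):
            self.values[self.current] = self.values.get(self.current, "") + content

    def endElement(self, name):
        self.current = None


handler = NameHandler()
xml.sax.parseString(DOCUMENT, handler)

print(dom_first)
print(handler.values)
```

As a rough guide, the DOM style suits applications that need to move around in, modify, or create a document, while the SAX style suits applications that read a document once from start to finish and want to avoid holding all of it in memory.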
IBM contributed the XML4C parser to the Apache XML Project (see http://xml.apache.org)
as open source in November 1999. The XML Parser for C++ (XML4C) Version
3.1.2 is based on Xerces-C Version 1.1 and is fully compliant with the
Unicode 3.0 specification. While the Xerces-C parser can be updated by
the open source community, the XML4C parser is maintained only by IBM and
may differ slightly from the Xerces-C parser.
XML and the TPF Community
The benefits of using XML vary, but overall, marked-up data and the
ability to read and interpret that data provide the following benefits
for the TPF community:
-
With XML, TPF applications can more easily read information from a variety
of platforms. The data is platform-independent, so now the sharing of data
between you and your customers can be simplified.
-
Companies that work in the business-to-business (B2B) environment are developing
DTDs for their industry. The ability to parse XML documents gives TPF an
opportunity to be exploited in the B2B environment.
-
XML data can be read even if you do not have a detailed picture of how
that data is structured. Your clients will no longer need to go through
complex processes to update how they interpret the data that you send to
them, because the DTD describes how that data is structured.
-
Changing the content and structure of data is easier with XML. The data
is tagged so you can add and remove elements without impacting existing
elements. You will be able to change the data without having to change
the application.
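The last benefit can be illustrated with a small sketch. It assumes an application that looks up the data it needs by element name; the function full_name, the added middle element, and both sample documents are hypothetical, and Python's standard xml.etree.ElementTree module is used only for illustration.

```python
import xml.etree.ElementTree as ET


def full_name(document):
    """Hypothetical application logic: read only the elements it knows about."""
    root = ET.fromstring(document)
    return f"{root.findtext('first')} {root.findtext('last')}"


ORIGINAL = "<name><first>Mickey</first><last>Mouse</last></name>"

# A later version of the data adds a <middle> element and changes the
# order of the elements; the lookups by element name are unaffected.
EXTENDED = "<name><last>Mouse</last><middle>T.</middle><first>Mickey</first></name>"

print(full_name(ORIGINAL))
print(full_name(EXTENDED))
```

Both calls produce the same result. If the documents are validated against a DTD, the DTD would also need to allow the new element and ordering, but the application code itself would not have to change.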
Despite all the benefits of using XML, there are some things to be aware
of. First of all, working with marked-up data can be additional work when
writing applications because it requires more pieces to work together.
Your XML document must be well-formed and may need to comply with
the rules of a given DTD (and therefore be valid); the encoding
of your documents must be supported on TPF; and you must decide which API
(DOM or SAX) to use to parse your documents. Given the benefits of using
XML, this additional work up front can reduce the amount of work needed
to make a change in the future. Second, although it is a recommendation
developed by the W3C, XML is still a developing technology. As with any
new technology, there will be bumps along its development road.
The TPF development lab is working with an XML task force that is sponsored
through the TPF Users Group (TPFUG). The intention of the task force is
to explore XML support on TPF and identify requirements for additional
support. If you have an interest in XML on TPF, we encourage you to join
this task force. For more information about the TPFUG, go to http://www.tpfug.org/.
For More Information
The technical documentation for APAR PJ27634 is exclusively online in
browser-readable files. XML on TPF: An Online User's Guide will
provide you with additional information about XML as well as the technical
details of XML support on the TPF 4.1 system. The guide also includes a
list of resources for learning XML and a tutorial that will walk you step-by-step
through the process of using a TPF application written in the C++ language
to access XML data. To view the guide, do either of the following:
-
Go to http://www.ibm.com/tpf/pubs/tpfpubs.htm
-
View the contents of the IBM Online Library: TPF Systems Collection
CD-ROM, which contains the TPF 4.1 books in Portable Document Format (PDF)
and BookManager book format. This CD-ROM also contains the written information
for the XML parser as a set of HTML files. Instructions for using TPF Systems
Collection are included with the CD-ROM.
With SGML as a parent and HTML as a sibling, XML holds the potential of
providing new and improved capabilities for both data interchange and data
display. While the technology is growing and changing, there are many opportunities
to exploit the technology in an effort to increase efficiency and simplify
consistent communication. Relationships between data remain necessary,
but may become more flexible; and suddenly, the jar of marbles is slightly
more stable.
Second Quarter 2001 - Table of Contents