  TPF : Library : TPF Newsletters

 

Extending the Boundaries with XML Support on TPF

Carrie J. Evans, IBM TPF ID Core Team

Imagine a world in which the relationships between items are both necessary and optional; a world that allows you to pull a marble from the bottom of a jar without having any impact on the other marbles. Imagine the complexity of consistency erased.

Today, great advancements in technology introduce new ideas, theories, languages, terminology, and acronyms. Extensible Markup Language (XML) is one of these new technologies that holds the potential of being a great solution for sharing information across different computing platforms. While there is no guarantee that this is the last solution, XML does offer a seemingly revolutionary approach to data categorization and communication. The relationships between pieces of data are defined, yet flexible so data can more easily be changed, moved, and deleted. XML simplifies data sharing between programs, platforms, and users. The following paragraphs will help you gain an understanding of basic XML concepts, the XML support that is added to the TPF 4.1 system when applying APAR PJ27634, and insight into how XML can favorably impact the TPF community.

What Is XML?

XML is a markup language that combines the power of Standard Generalized Markup Language (SGML) and the simplicity of Hypertext Markup Language (HTML). Both XML and HTML are the children of the SGML markup language:

  • HTML is an application of SGML that is used to define the display properties of data for use primarily on the Internet. It has a quiet simplicity that allows for the formatting of data using predefined tags.
  • XML is a subset of SGML that can be used to categorize data with regard to either presentation or data retrieval. XML retains the restrictive and powerful nature of SGML but, as a subset developed by experienced SGML users, is easier to write and maintain. XML can be used to display Web pages if the browser has the ability to parse (or read) the XML language. It can also be used to manage data for easy retrieval. XML is already being used for publishing as well as for data storage and retrieval, data interchange between heterogeneous platforms, data transformations, and data displays. As it evolves and becomes more powerful, XML may allow for single-source data retrieval and data display.
XML is a recommendation from the World Wide Web Consortium (W3C). (Go to http://www.w3.org for more information about the W3C.) "A W3C Recommendation is a technical report that is the end result of extensive consensus building inside and outside of the W3C about a particular technology or policy. The W3C considers that the ideas or technology specified by a Recommendation are appropriate for widespread deployment and promote W3C's mission" (go to http://www.w3.org/Consortium/Process-20010208/tr.html).

How Does XML Work?

XML documents are made up of elements, attributes, and values:

  • An element is an opening tag, a closing tag, and the contents between the two. In the following example, there are three elements: name, first, and last:

    <name>
      <first>Mickey</first>
      <last>Mouse</last>
    </name>

  • An attribute is a name-value pair specified as part of the opening tag of an element. For example, countryCode, areaCode, and pNumber are all attributes of the element PhoneNumber:

    <PhoneNumber countryCode="44" areaCode="340" pNumber="635 3343" />

  • A value is the data contained between an opening tag and a closing tag or the piece of an attribute between double quotations (" ").
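
The three building blocks above can also be seen from a program's point of view. The following is a minimal sketch that uses Python's standard-library xml.etree.ElementTree module, not the TPF parser; the <contact> root element is a hypothetical wrapper added only so that the two snippets from the text form a single document.

```python
import xml.etree.ElementTree as ET

# The two snippets from the text, wrapped in a hypothetical <contact>
# root element (an assumption; it does not appear in the article).
DOC = """
<contact>
  <name>
    <first>Mickey</first>
    <last>Mouse</last>
  </name>
  <PhoneNumber countryCode="44" areaCode="340" pNumber="635 3343" />
</contact>
"""

root = ET.fromstring(DOC)

# Element values: the data between an opening tag and its closing tag.
first = root.find("name/first").text
last = root.find("name/last").text

# Attribute values: the quoted part of each name-value pair.
phone = dict(root.find("PhoneNumber").attrib)

print(first, last)
print(phone)
```

Here first and last hold element values, while phone holds the three attribute values keyed by attribute name.
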
An XML document can be well-formed, valid, or neither. A well-formed document follows the basic rules for writing XML markup; it may be invalid against a specified Document Type Definition (DTD), or it may simply not have been checked for validity. A well-formed document adheres to the following rules:
  • Every XML document must have a root element.
  • All tags must be opened and closed. XML also allows you to write empty elements by adding an ending slash before the closing bracket; for example, <address />.
  • Tags must follow nesting rules.
  • Either single quotations (' ') or double quotations (" ") must surround the value of an attribute.
A valid XML document, which must also be (by definition) well-formed, is one that conforms to the rules of the associated DTD. A DTD is a type of schema that defines the acceptable element tags and attributes (and nesting order for those elements and attributes). In other words, it defines both the vocabulary and the syntax for an XML document type. A DTD is written in a specific and strict syntax and can either reside inside the XML document or in a separate file. While a DTD is not required for a well-formed XML document, it is necessary if you want to validate your XML document.
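
The well-formedness rules can be tried out with any XML parser. The sketch below assumes Python's standard-library xml.etree.ElementTree as a stand-in; the helper name is_well_formed and the sample strings are illustrative only. Note that this parser checks well-formedness but does not validate against a DTD, so it says nothing about whether a document is valid.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative samples: the first three follow the rules, the rest break one.
samples = {
    "nested correctly": "<name><first>Mickey</first></name>",
    "empty element": "<address />",
    "single-quoted attribute": "<PhoneNumber countryCode='44' />",
    "unclosed tag": "<name><first>Mickey</name>",
    "overlapping tags": "<a><b></a></b>",
    "unquoted attribute": "<PhoneNumber countryCode=44 />",
    "two root elements": "<a></a><b></b>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, "->", "well-formed" if ok else "not well-formed")
```
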

The following figure shows a DTD and an XML document that is based on that DTD. It illustrates three points: the encoding declaration identifies the code page in which the document is written; the root element must be identified properly, with all other elements and attributes nested inside it; and the DTD declaration in the XML document identifies the path name of the DTD against which the document is validated. Finally, the figure pairs the declarations of two different elements in the DTD with the way those elements appear in the XML document.

[Figure: DTD elements]

While a variety of editors are available for writing XML documents and DTDs, you can also write them in any basic text editor. Reading an XML document can be slightly more difficult, however, because not all Web browsers can parse XML. Microsoft Internet Explorer 5 can parse some XML documents, depending on their encoding. Our XML document, pnr.xml, would be shown in this browser as follows, indicating that the XML document is well-formed:

[Figure: XML document]

Working with XML is a process of developing the language that most closely maps to the information you want to store. Oftentimes, the process of planning which elements to declare in your DTD will prove to be much more difficult than the process of writing the actual XML document. However, once the initial DTD is written, it is easy to write and change many XML documents based on the same DTD so that your data can be displayed, read by an application, or stored in a database.

The W3C Web site (http://www.w3.org/) contains the complete XML specification. XML.com (http://www.xml.com/) has an annotated version of the specification at http://www.xml.com/axml/testaxml.htm.

XML Support on TPF

The XML parser (APAR PJ27634) is the XML Parser for C++ (XML4C) Version 3.1.2 ported to the TPF 4.1 system. The parser is XML Version 1.0 compliant and allows TPF 4.1 applications written in C++ to do the following:

  • Parse XML documents using the Document Object Model (DOM) Version 1.0 specification.

    Notes:
    1. The DOM can also assist in using TPF applications to modify and create XML documents.
    2. Some APIs from the DOM Version 2.0 specification are provided for experimental use. They are not supported for production work.
  • Parse XML documents using the Simple API for XML (SAX) Version 1.0 specification.
  • Parse XML documents with or without validation against a specified DTD.
Applications on the TPF 4.1 system interact with XML documents that are in the file system, arrive through standard input (stdin), or reside in memory. This interaction is made possible through application programming interfaces (APIs) defined by the DOM and SAX specifications, and parsing can be either nonvalidating or validating against a DTD. The API definitions are contained in a set of header files that application programmers need to have in their #include (or search) path.
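
The difference between the two API styles is easiest to see side by side. The sketch below is not the XML4C C++ API; it assumes Python's standard-library xml.dom.minidom and xml.sax modules as stand-ins, and the class name NameHandler is hypothetical. With DOM, the parser builds the whole document as an in-memory tree that the application can navigate and modify. With SAX, the parser reports events while it reads, and the application keeps only what it needs.

```python
import xml.dom.minidom
import xml.sax

# A document residing in memory (one of the three sources named in the text).
DOC = "<name><first>Mickey</first><last>Mouse</last></name>"

# DOM style: parse everything into a tree, then navigate it.
dom = xml.dom.minidom.parseString(DOC)
last_node = dom.getElementsByTagName("last")[0]
dom_last = last_node.firstChild.data

# The tree can also be modified and written back out.
last_node.firstChild.data = "Mouse Jr."
dom_output = dom.documentElement.toxml()


# SAX style: the parser calls back into a handler as it reads.
class NameHandler(xml.sax.ContentHandler):
    """Hypothetical handler that records tag names and character data."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def startElement(self, name, attrs):
        self.tags.append(name)

    def characters(self, content):
        # Character data may arrive in several chunks, so collect them all.
        self.text.append(content)


handler = NameHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_tags = handler.tags
sax_text = "".join(handler.text)

print(dom_last, "|", dom_output)
print(sax_tags, "|", sax_text)
```

As a rule of thumb, the tree-based style suits applications that need to modify or revisit a document, while the event-based style avoids holding the whole document in memory.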

IBM contributed the XML4C parser to the Apache XML Project (see http://xml.apache.org) as open source in November 1999. The XML Parser for C++ (XML4C) Version 3.1.2 is based on Xerces-C Version 1.1 and is fully compliant with the Unicode 3.0 specification. While the Xerces-C parser can be updated by the open source community, the XML4C parser is maintained only by IBM and may differ slightly from the Xerces-C parser.

XML and the TPF Community

The benefits of using XML vary, but overall, marked-up data and the ability to read and interpret that data provide the following benefits for the TPF community:

  • With XML, TPF applications can more easily read information from a variety of platforms. The data is platform-independent, so now the sharing of data between you and your customers can be simplified.
  • Companies that work in the business-to-business (B2B) environment are developing DTDs for their industry. The ability to parse XML documents gives TPF an opportunity to be exploited in the B2B environment.
  • XML data can be read even if you do not have a detailed picture of how that data is structured. Your clients will no longer need to go through complex processes to update how to interpret data that you send to them because the DTD gives the ability to understand the information.
  • Changing the content and structure of data is easier with XML. The data is tagged so you can add and remove elements without impacting existing elements. You will be able to change the data without having to change the application.
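
The last point can be sketched as follows, again assuming Python's standard-library xml.etree.ElementTree rather than the TPF parser. The function read_last_name and the added <title> and <middle> elements are hypothetical. Because the consumer looks up data by tag name rather than by position, it keeps working when elements are added. If documents are validated, the DTD would also need to allow the new elements.

```python
import xml.etree.ElementTree as ET


def read_last_name(document):
    """Hypothetical consumer: finds <last> by tag name, not by position."""
    return ET.fromstring(document).findtext("last")


ORIGINAL = "<name><first>Mickey</first><last>Mouse</last></name>"

# A later revision adds <title> and <middle>; this consumer ignores them.
EXTENDED = (
    "<name><title>Mr.</title><first>Mickey</first>"
    "<middle>T.</middle><last>Mouse</last></name>"
)

before = read_last_name(ORIGINAL)
after = read_last_name(EXTENDED)
print(before, after)
```
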
Despite all the benefits of using XML, there are some things to be aware of. First, working with marked-up data can mean additional work when writing applications because it requires more pieces to work together: your XML document must be well-formed and may need to comply with the rules of a given DTD (and therefore be valid); the encoding of your documents must be supported on TPF; and you must decide which API (DOM or SAX) to use to parse your documents. Given the benefits of using XML, this additional work up front can reduce the amount of work needed to make a change in the future. Second, although it is a recommendation developed by the W3C, XML is still a developing technology. As with any new technology, there will be bumps along its development road.

The TPF development lab is working with an XML task force that is sponsored through the TPF Users Group (TPFUG). The intention of the task force is to explore XML support on TPF and identify requirements for additional support. If you have an interest in XML on TPF, we encourage you to join this task force. For more information about the TPFUG, go to http://www.tpfug.org/.

For More Information

The technical documentation for APAR PJ27634 is available exclusively online in browser-readable files. XML on TPF: An Online User's Guide provides additional information about XML as well as the technical details of XML support on the TPF 4.1 system. The guide also includes a list of resources for learning XML and a tutorial that walks you step by step through the process of using a TPF application written in C++ to access XML data. To view the guide, do either of the following:

  • Go to http://www.ibm.com/tpf/pubs/tpfpubs.htm
  • View the contents of the IBM Online Library: TPF Systems Collection CD-ROM, which contains the TPF 4.1 books in Portable Document Format (PDF) and BookManager book format. This CD-ROM also contains the written information for the XML parser as a set of HTML files. Instructions for using TPF Systems Collection are included with the CD-ROM.
With SGML as a parent and HTML as a sibling, XML holds the potential of providing new and improved capabilities for both data interchange and data display. While the technology is growing and changing, there are many opportunities to exploit the technology in an effort to increase efficiency and simplify consistent communication. Relationships between data remain necessary, but may become more flexible; and suddenly, the jar of marbles is slightly more stable.

Second Quarter 2001 - Table of Contents