Issue No 5:
METADATA AND XML
The Importance of Metadata and XML for System and Database
Intercommunication within and between Enterprises
Metadata and XML for Business
- Why is Metadata Important?
- How is Metadata used with XML?
- Available Metadata Courses
PERTH, AUSTRALIA March 15, 1999: Welcome to the fifth issue of The Enterprise Newsletter issued quarterly to help
you prepare today to be one of the winners of tomorrow. My Mission is for The Enterprise
Newsletter to become a key vehicle to communicate innovative applications of Information
Technology to Enterprises. My objective is to help you and your organization become a
"best practice" example of the application of IT to business, to become
"10 out of 10". For emphasis, I use the acronym "TEN" to refer to The
Enterprise Newsletter.
METADATA AND XML FOR BUSINESS
In previous issues we briefly looked at how XML (Extensible
Markup Language) will help businesses not only survive the coming Competitive Armageddon
but also grow and prosper in the resulting turmoil (see TEN#3 below). This issue
discusses related topics, Metadata and XML, and their importance to business.
1. Why is Metadata Important?
In this issue we address a common problem. How can we
convince managers to plan, budget and apply resources for metadata management? What is
metadata and why is it important? What technologies are involved? Internet and Intranet
technologies are part of the answer and will get the immediate attention of management.
XML is the other technology.
Every country is now interconnected in a vast, global telephone network. We are now
able to telephone anywhere in the world. We can phone a number, and the telephone assigned
to that number would ring in Russia, or China, or in Outer Mongolia. But when it is
answered, we may not understand the person at the other end. They may speak a different
language. So we can be connected, but what is said has no meaning. We cannot share
information.
Today, we also use a computer and the World Wide Web. We enter a web site address into
a browser on our desktop machine: a unique address in words that is analogous to a
telephone number. We can then be connected immediately to a computer assigned to that
address and attached to the Internet anywhere in the world. That computer sends a web page
based on the address we have supplied, to be displayed in our browser. This is typically
in English, but may be in another language. We are connected but, as in the telephone
analogy, if the page is in another language, what is said has no meaning. We cannot share
information.
Now consider the reason why it is difficult for some of the systems used in an
organization to communicate with and share information with other systems. Technically,
the programs in each system are able to be interconnected and so can communicate with
other programs. But they use different terms to refer to the same data that needs to be
shared. For example, an accounting system may use the term "customer" to refer
to a person or organization that buys products or services. Another system may refer to
the same person or organization as a "client". Sales may use the term
"prospect". They all use different terminology, a different language,
to refer to the same data and information. But if they use different languages, again they
cannot share information.
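The terminology mismatch described above can be sketched in a few lines of Python. This is only a minimal illustration: the record layouts, the field values and the choice of "customer" as the agreed common term are assumptions made for the example, not taken from any real system.

```python
# Hypothetical mapping from each system's local term to one agreed common term.
COMMON_TERM = {
    "customer": "customer",   # term used by the accounting system
    "client": "customer",     # term used by another system
    "prospect": "customer",   # term used by sales
}


def to_common_terms(record):
    """Return a copy of record with local field names replaced by common ones.

    Field names that have no agreed common term are kept unchanged.
    """
    return {COMMON_TERM.get(field, field): value for field, value in record.items()}


# Three systems describe the same organization under three different names.
accounting_record = {"customer": "Acme Pty Ltd", "balance": 1200}
service_record = {"client": "Acme Pty Ltd", "contract": "C-17"}
sales_record = {"prospect": "Acme Pty Ltd", "region": "WA"}

translated = [
    to_common_terms(r) for r in (accounting_record, service_record, sales_record)
]

# After translation every record names the organization with the same field.
shared_names = {r["customer"] for r in translated}
print(shared_names)
```

Once every system agrees on the common term, the three records can be recognized as describing one organization, which is the first step towards sharing the data.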
But the problem is even worse. Consider terminology used in different parts of the
business. Accountants use a "jargon", a technical language which is
difficult for non-accountants to understand. So also the jargon used by engineers, or
production people, or sales and marketing people, or managers is difficult for others to
understand. They all speak a different "language". What is said has no meaning.
They cannot easily share common information. In fact in some enterprises it is a miracle
that people manage to communicate meaning at all!
Each organization has its own internal language, its own jargon, which has evolved over
time so that similar people can communicate meaning. As we saw above, there can be more
than one language used in an organization. Metadata identifies an organization's own
"language". Where different terms refer to the same thing, a common term is
agreed for all to use. Then people can communicate more clearly. And systems and programs
can intercommunicate with meaning. But without a clear definition and without common use
of an organization's metadata, information cannot be shared effectively throughout
the enterprise.
Previously each part of the business maintained its own version of
"customer", or "client" or "prospect". They defined
processes and assigned staff to add new customers, clients or prospects to
their own files and databases. When common details about customers, clients or prospects
changed, each redundant version of that data also had to be changed. Staff were needed to
make these changes. Yet these are all redundant processes making the same changes to
redundant data versions. This is enormously expensive in time and people. It is also quite
The importance of metadata can now be seen. Metadata defines the common language
used within an enterprise so that all people, systems and programs can communicate
precisely. Confusion disappears. Common data is shared. And enormous cost savings are
made, because the redundant processes (used to keep redundant data versions
up to date) are eliminated as the redundant data versions are integrated into a common
data version for all to share.
2. How is Metadata Used with XML?
Much effort has already gone into the definition and
implementation of Electronic Data Interchange (EDI) standards to address this problem of
intercommunication between dissimilar systems and databases. EDI has now been widely used
for business-to-business commerce for many years. It works well, but it is quite complex
and very expensive. As a result, it is generally cost-justifiable only for large
organizations.
Once an organization's metadata is defined and documented, all programs can use it
to communicate. EDI was the mechanism that was used previously. But now this
intercommunication has become much easier.
Extensible Markup Language (XML) is a new Internet technology that has been developed
to address this problem. XML can be used to document the metadata used by one system so
that it can be integrated with the metadata used by other systems. This is analogous to
language dictionaries that are used throughout the world, so that people from different
countries can communicate. Legacy files and other databases can now be integrated more
readily. Systems throughout the business can now coordinate their activities more
effectively as a direct result of XML and management support for metadata.
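The "language dictionary" idea can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal illustration under stated assumptions: the tag names and the dictionary entries are hypothetical, and a real integration would also have to reconcile data formats, not only tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag dictionary: one system's tag names -> agreed common tag names.
TAG_DICTIONARY = {
    "client": "customer",
    "client_name": "customer_name",
}


def translate_tags(xml_text, dictionary):
    """Parse xml_text and rename every tag that has an entry in dictionary."""
    root = ET.fromstring(xml_text)
    for element in root.iter():  # iter() visits the root and all descendants
        element.tag = dictionary.get(element.tag, element.tag)
    return root


# A document as one (hypothetical) system might publish it.
source_xml = "<client><client_name>Acme Pty Ltd</client_name></client>"

common = translate_tags(source_xml, TAG_DICTIONARY)
result = ET.tostring(common, encoding="unicode")
print(result)
```

The data itself is untouched; only the vocabulary used to describe it changes, so a second system that understands the common tags can now read the document.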
XML now provides the capability that was previously only available to large
organizations through the use of EDI. XML allows the metadata used by each program and
database to be published as the language to be used for this intercommunication. But
distinct from EDI, XML is simple to use and inexpensive to implement for both small and
large organizations. Because of this simplicity, we like to think of XML as:
"XML is EDI for the Rest of Us"
XML will become a major part of the application development mainstream. It provides a
bridge between structured databases and unstructured text, delivered via XML then
converted to HTML during a transition period for display in web browsers. It includes the
following components:
XML || Extensible Markup Language || Defines document content using metadata tags and namespaces
DTD || Document Type Definition || Defines XML document structure (analogous to database schema)
XSL || Extensible Stylesheet Language || XSL or Cascading Style Sheets (CSS) separate layout from data
XLL || Extensible Linking Language || XLL implements multi-directional links (single or multiple)
DOM || Document Object Model || Standard programming interface for processing XML in any language
RDF || Resource Description Framework || W3C interoperability project for data content interchange
Metadata is used to define the structure of an XML document or file. Metadata is
published in a Document Type Definition (DTD) file for reference by other systems. A DTD
file defines the structure of an XML file or document. It is analogous to the Data
Definition Language (DDL) file that is used to define the structure of a database, but
with a different syntax.
An example of an XML document identifying data retrieved from a PERSON database
follows. This includes metadata markup tags (surrounded by < and >, such as
<person_name>) that provide various details about a person. From this, we can see
that it is easy to find specific contact information in <contact_details>, such as
<email>, <phone>, <fax> and <mobile> (cell phone) numbers.
Although I have not shown it, the DTD also specifies whether certain tags must exist or
are optional, and whether some tags can exist more than once, such as the multiple
<phone> and <mobile> tags below.
<PERSON person_id="p1100" sex="M">
Information Engineering Services Pty Ltd
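To show how such a document might be read by a program, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The tag and attribute names follow the description above, but the document itself is a hypothetical stand-in with invented values. The cardinality rules are also assumptions: they only imitate the kind of "required, optional, repeatable" constraints a DTD would declare, and are checked by hand because this module does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical PERSON document. Tag names follow the text; all values are invented.
PERSON_XML = """
<PERSON person_id="p1100" sex="M">
  <person_name>Example Person</person_name>
  <contact_details>
    <email>person@example.com</email>
    <phone>+61 8 5550 0001</phone>
    <phone>+61 8 5550 0002</phone>
    <fax>+61 8 5550 0003</fax>
    <mobile>+61 400 555 001</mobile>
    <mobile>+61 400 555 002</mobile>
  </contact_details>
</PERSON>
""".strip()

# Assumed cardinality rules of the kind a DTD would declare:
# (minimum occurrences, maximum occurrences or None for unlimited).
CONTACT_RULES = {
    "email": (1, 1),
    "phone": (0, None),
    "fax": (0, 1),
    "mobile": (0, None),
}


def check_contact_details(person):
    """Return a list of rule violations found in <contact_details>."""
    contact = person.find("contact_details")
    if contact is None:
        return ["contact_details is missing"]
    problems = []
    for tag, (minimum, maximum) in CONTACT_RULES.items():
        count = len(contact.findall(tag))
        if count < minimum:
            problems.append(f"{tag}: expected at least {minimum}, found {count}")
        if maximum is not None and count > maximum:
            problems.append(f"{tag}: expected at most {maximum}, found {count}")
    return problems


person = ET.fromstring(PERSON_XML)
phones = [p.text for p in person.find("contact_details").findall("phone")]
print(person.get("person_id"), phones, check_contact_details(person))
```

Because every value is wrapped in a tag that names it, the program finds the phone numbers by asking for <phone> rather than by knowing their position in the file.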
Metadata that is used by various industries, communities or bodies can be used with
XML, XSL and XLL to define markup vocabularies. The World Wide Web Consortium (W3C) has
developed a standard framework that can be used to define these vocabularies. This is
called the Resource Description Framework (RDF). It is a model for metadata
applications that support XML. RDF was initiated by the W3C to build standards for XML
applications so that they can inter-operate and intercommunicate more easily, avoiding
communication problems that we discussed earlier.
There is considerable effort in various industries to define their own standard
language, called a markup vocabulary, using XML for their metadata. These become
unique languages for intercommunication between participants in an industry. Markup
vocabularies include: Mathematical Markup Language (MathML); Chemical Markup
Language (CML); Open Financial Exchange (OFX); Information and Content Exchange
(ICE); Voice Recognition Markup Language (VML); JavaBean Markup Language
(JBL); Synchronized Multimedia Integration Language (SMIL); and Wireless Markup
Language (WML). Other markup languages have been defined for Channel Definition
Format (CDF), Meta Content Framework (MCF), Open Software Description
(OSD) and Web Interface Definition Language (WIDL). For example, the Channel
Definition Format, which was delivered as part of Microsoft Internet Explorer 4.0
and is now widely used, is based on XML.
The W3C and RDF web sites are two good starting points for more information about the
above markup languages. The RDF web site is at http://www.w3.org/Metadata/RDF/. RDF is now
a W3C Recommendation, the status the W3C gives to a specification it has endorsed. The W3C web site is at
http://www.w3.org/. Both sites will point you to specific web sites that provide additional
details about the above markup languages.
With XML, even more effective applications become possible. For example, an
organization can define the unique metadata used by its suppliers' legacy inventory
systems. This will enable that organization to place orders via the Internet directly with
those suppliers' systems, for automatic fulfillment of product orders. This application
and eight other typical XML applications are available from the Microsoft XML Scenarios
web site at http://microsoft.com/xml/scenario/intro.asp.
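As a rough sketch of the ordering scenario, the following Python builds an order document using a supplier's vocabulary. Everything specific here is an assumption for illustration: the tag names (purchase_order, order_line, part_number, quantity), the part numbers and the order identifier are hypothetical, and the step of actually sending the document to the supplier over the Internet is left out.

```python
import xml.etree.ElementTree as ET


def build_order(order_id, supplier_part_numbers):
    """Build a purchase order using a (hypothetical) supplier vocabulary.

    supplier_part_numbers maps the supplier's own part number to a quantity.
    """
    order = ET.Element("purchase_order", {"order_id": order_id})
    for part_number, quantity in supplier_part_numbers.items():
        line = ET.SubElement(order, "order_line")
        ET.SubElement(line, "part_number").text = part_number
        ET.SubElement(line, "quantity").text = str(quantity)
    return order


# Order two parts, identified by the supplier's own (invented) part numbers.
order = build_order("po-001", {"SP-42": 10, "SP-77": 3})
order_xml = ET.tostring(order, encoding="unicode")
print(order_xml)
```

The point of the sketch is that the ordering organization writes the order in terms the supplier's inventory system already uses, so the supplier does not need to change its legacy system to accept it.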
XML is an enabling technology to integrate unstructured text and structured databases for
next generation E-Commerce and EDI applications. Web sites will evolve over time to use
XML to provide the capability and functionality presently offered by HTML, but with far
greater power and flexibility. Netscape Communicator 5.0 and Microsoft Internet Explorer
5.0 browsers will soon be released. Microsoft Office 2000 will also be released in the
second quarter 1999. All of these will support XML. New XML development tools will also be
released in 1999 to enable XML applications to be developed more easily.
The acceptance and application of XML is progressing rapidly, as it offers a very
simple - yet extremely powerful - way to intercommunicate between different databases and
systems, both within and outside an organization. This is structured data that is
available from databases and legacy files. Yet for most enterprises, over 90% of the
knowledge resources exist not in structured databases and files, but in unstructured text
documents, in graphics and images, as well as in audio and video files.
How well an organization accesses and uses its knowledge resources often determines its
competitive advantage and future prosperity. The use and application of knowledge will
become even more important in the future competitive Armageddon that we are all about to
The tools are coming, but a greater task still remains to be completed. This is
the definition of your own metadata, your common enterprise language for
intercommunication, so that you can use these tools effectively. The definition of
metadata depends on knowledge of data modeling, previously carried out by IT people. But
this is not just a task for IT. As it is vitally dependent on business knowledge, it also
requires the involvement of business experts. Not by interview, but by their active
participation. While data modeling has until now been a technical IT discipline, business
data modeling is not. It can be learned by business people as well as IT staff.
Business experts with detailed knowledge of an enterprise can draw on that knowledge to
define the unique language used in different parts of the organization. This language is
defined as metadata by using business data modeling methods. These methods require a
knowledge of the business, not of computers. They can be learned through self-study,
business-driven courses that take 8 to 12 hours to complete. A data modeling case
study workshop is also available. This uses a supplied CASE modeling tool to capture the
defined metadata and assess the understanding of data modeling concepts.
These self-study courses and workshop are part of the Certified Business Data Modeler
(CBDM) course series. You can view the course outlines and review introductory slides
online by visiting the IES web site at http://www.ies.aust.com/~ieinfo/ or the Visible
Australia web site at http://www.visible.com.au/ and clicking on the "Store"
link from any page as described below in "Available Metadata Courses".
Successful completion of the workshop will enable your business and IT people to be
assessed to determine whether they qualify as Certified Business Data Modelers (CBDM).
Following this training, they can apply their new skills to the definition of your own
metadata. The competition is building, the tools are coming, the technology is XML. But
these all assume that your metadata has already been defined. There is no time to lose.
Available Metadata Courses
We offer a variety of courses and books on data modeling
and metadata from our Online Store. These have been designed for use by business staff as
well as IT. PowerPoint is used for individual training. For larger numbers of people, the
courses can be delivered across a corporate Intranet to train hundreds or thousands of
staff. They learn how to work together in a design partnership to define the metadata used
throughout an enterprise. We also have CASE modeling tools to help you define and capture
your metadata. You can review these by clicking on the "Store"
link from the following web sites:
- IES: http://www.ies.aust.com/~ieinfo/
- Visible Australia: http://www.visible.com.au/
Place the items you want in your shopping basket, then supply all required
purchase, credit card and delivery information by clicking on the "Checkout"
icon. Purchases can be made securely online by credit card. Or instead you can purchase
via credit card offline by printing and faxing the Order Form. Or we will invoice larger
orders if you prefer, when you fax the Order Form together with a Corporate Purchase
Order.