Methodologies and Technologies for Rapid Enterprise Architecture Delivery


THE ENTERPRISE NEWSLETTER

Issue No 5:
METADATA AND XML FOR BUSINESS

 The Importance of Metadata and XML for System and Database Intercommunication within and between Enterprises


Contents

Metadata and XML for Business

  1. Why is Metadata Important?
  2. How is Metadata used with XML?
  3. Available Metadata Courses

PERTH, AUSTRALIA – March 15, 1999: Welcome to the fifth issue of The Enterprise Newsletter – issued quarterly to help you prepare today to be one of the winners of tomorrow. My Mission is for The Enterprise Newsletter to become a key vehicle to communicate innovative applications of Information Technology to Enterprises. My objective is to help you and your organization become a "best practice" example of the application of IT to business – to become "10 out of 10". For emphasis, I use the acronym "TEN" to refer to The Enterprise Newsletter.

Clive Finkelstein
TEN - The Enterprise Newsletter



METADATA AND XML FOR BUSINESS

In previous issues we briefly looked at how XML (Extensible Markup Language) will help businesses not only survive the coming Competitive Armageddon – but also grow and prosper in the resulting turmoil (see TEN#3 below). This issue discusses related topics, Metadata and XML, and their importance to business.


1. Why is Metadata Important?

In this issue we address a common problem. How can we convince managers to plan, budget and apply resources for metadata management? What is metadata and why is it important? What technologies are involved? Internet and Intranet technologies are part of the answer and will get the immediate attention of management. XML is the other technology.

Every country is now interconnected in a vast, global telephone network. We are now able to telephone anywhere in the world. We can phone a number, and the telephone assigned to that number will ring in Russia, or China, or Outer Mongolia. But when it is answered, we may not understand the person at the other end. They may speak a different language. So we can be connected, but what is said has no meaning. We cannot share information.

Today, we also use a computer and the World Wide Web. We enter a web site address into a browser on our desktop machine – a unique address in words that is analogous to a telephone number. We can then be connected immediately to a computer assigned to that address and attached to the Internet anywhere in the world. That computer sends a web page based on the address we have supplied, to be displayed in our browser. This is typically in English, but may be in another language. We are connected, but like the telephone analogy – if it is in another language, what is said has no meaning. We cannot share information.

Now consider the reason why it is difficult for some of the systems used in an organization to communicate with and share information with other systems. Technically, the programs in each system are able to be interconnected and so can communicate with other programs. But they use different terms to refer to the same data that needs to be shared. For example, an accounting system may use the term "customer" to refer to a person or organization that buys products or services. Another system may refer to the same person or organization as a "client". Sales may use the term "prospect". They all use different terminology, a different language, to refer to the same data and information. Because they do not use a common language, again they cannot share information.

But the problem is even worse. Consider terminology used in different parts of the business. Accountants use a "jargon" – a technical language – which is difficult for non-accountants to understand. So also the jargon used by engineers, or production people, or sales and marketing people, or managers is difficult for others to understand. They all speak a different "language". What is said has no meaning. They cannot easily share common information. In fact in some enterprises it is a miracle that people manage to communicate meaning at all!

Each organization has its own internal language, its own jargon, which has evolved over time so that similar people can communicate meaning. As we saw above, there can be more than one language used in an organization. Metadata identifies an organization’s own "language". Where different terms refer to the same thing, a common term is agreed for all to use. Then people can communicate more clearly. And systems and programs can intercommunicate with meaning. But without a clear definition and without common use of an organization’s metadata, information cannot be shared effectively throughout the enterprise.
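The idea of agreeing a common term can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glossary, the record layouts and the company name are hypothetical, and a real enterprise glossary would come out of the data modeling work described later in this issue.

```python
# Hypothetical glossary: each system's local term mapped to the agreed common term.
COMMON_TERMS = {
    "customer": "customer",  # term used by the accounting system
    "client": "customer",    # term used by another system
    "prospect": "customer",  # term used by sales
}


def to_common_terms(record, glossary=COMMON_TERMS):
    """Return a copy of the record with local field names replaced by common terms."""
    return {glossary.get(field, field): value for field, value in record.items()}


# Three hypothetical records that describe the same organization in different terms.
accounting = {"customer": "Acme Pty Ltd", "balance": 1200}
service = {"client": "Acme Pty Ltd", "open_tickets": 2}
sales = {"prospect": "Acme Pty Ltd", "region": "WA"}

normalized = [to_common_terms(r) for r in (accounting, service, sales)]

# Once every record uses the common term, the systems can be matched on it.
shared_names = {record["customer"] for record in normalized}
print(shared_names)
```

Fields that have no entry in the glossary, such as "balance", pass through unchanged, so only the terms that were actually agreed are renamed.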

Previously each part of the business maintained its own version of "customer", "client" or "prospect". Each defined processes, and assigned staff, to add new customers, clients or prospects to its own files and databases. When common details about customers, clients or prospects changed, every redundant version of that data also had to be changed, and staff were needed to make those changes. Yet these are all redundant processes making the same changes to redundant versions of the data. This is enormously expensive in time and people. It is also quite unnecessary.

The importance of metadata can now be seen. Metadata defines the common language used within an enterprise so that all people, systems and programs can communicate precisely. Confusion disappears. Common data is shared. And enormous cost savings are made, because the redundant data versions are integrated into a common data version for all to share, and the redundant processes that kept those versions up to date are eliminated.



2. How Is Metadata Used with XML?

Much effort has previously gone into the definition and implementation of Electronic Data Interchange (EDI) standards to address this problem of intercommunication between dissimilar systems and databases. EDI has now been widely used for business-to-business commerce for many years. It works well, but it is quite complex and very expensive. As a result, it is generally cost-justifiable only for large corporations.

Once an organization’s metadata is defined and documented, all programs can use it to communicate. EDI was the mechanism that was used previously. But now this intercommunication has become much easier.

Extensible Markup Language (XML) is a new Internet technology that has been developed to address this problem. XML can be used to document the metadata used by one system so that it can be integrated with the metadata used by other systems. This is analogous to language dictionaries that are used throughout the world, so that people from different countries can communicate. Legacy files and other databases can now be integrated more readily. Systems throughout the business can now coordinate their activities more effectively as a direct result of XML and management support for metadata.

XML now provides the capability that was previously only available to large organizations through the use of EDI. XML allows the metadata used by each program and database to be published as the language to be used for this intercommunication. But unlike EDI, XML is simple to use and inexpensive to implement for both small and large organizations. Because of this simplicity, we like to say:

"XML is EDI for the Rest of Us"

XML will become a major part of the application development mainstream. It provides a bridge between structured databases and unstructured text. During a transition period, content delivered as XML will be converted to HTML for display in web browsers. XML includes the following components:

XML (Extensible Markup Language): defines document content using metadata tags and namespaces
DTD (Document Type Definition): defines XML document structure (analogous to a database schema)
XSL (Extensible Stylesheet Language): XSL or Cascading Style Sheets (CSS) separate layout from data
XLL (Extensible Linking Language): implements multi-directional links (single or multiple)
DOM (Document Object Model): standard programming interface for processing XML from any language
RDF (Resource Description Framework): W3C interoperability project for data content interchange
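The conversion from XML to HTML mentioned above would normally be the job of an XSL style sheet. As a rough illustration only, the Python sketch below does the same kind of conversion by hand using the standard library. The function name, the sample person and the HTML layout are all assumptions made for this example, not part of any XML standard.

```python
import xml.etree.ElementTree as ET
from html import escape


def person_to_html(xml_text):
    """Convert a small PERSON document into an HTML fragment for a browser."""
    person = ET.fromstring(xml_text)
    name = " ".join(
        person.findtext(f"person_name/{tag}", default="")
        for tag in ("given_name", "surname")
    )
    contacts = person.find("contact_details")
    children = list(contacts) if contacts is not None else []
    # The data (tag names and values) stays in the XML; the layout is chosen here.
    items = "".join(
        f"<li>{escape(child.tag)}: {escape(child.text or '')}</li>"
        for child in children
    )
    return f"<h1>{escape(name)}</h1><ul>{items}</ul>"


# Hypothetical sample document, kept short for the example.
sample = (
    '<PERSON person_id="p1"><person_name><given_name>Ann</given_name>'
    "<surname>Lee</surname></person_name>"
    "<contact_details><email>ann@example.com</email>"
    "<phone>555-0100</phone></contact_details></PERSON>"
)
html_page = person_to_html(sample)
print(html_page)
```

Because the layout lives in one function, the same XML data could be given a different presentation without changing the data itself, which is the separation that XSL and CSS are intended to provide.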

Metadata is used to define the structure of an XML document or file. Metadata is published in a Document Type Definition (DTD) file for reference by other systems. A DTD file defines the structure of an XML file or document. It is analogous to the Database Definition Language (DDL) file that is used to define the structure of a database, but with a different syntax.
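To make the idea of a published structure concrete, the Python sketch below checks the children of one element against a small table of rules: which tags are required, which are optional, and which may repeat. The parser in the Python standard library does not validate against a DTD, so this hand-written check is only a stand-in for what a validating parser would do. The rules themselves are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for the children of <contact_details>:
# tag -> (minimum occurrences, maximum occurrences); None means unbounded.
CONTACT_RULES = {
    "email": (1, 1),
    "phone": (1, None),
    "fax": (0, 1),
    "mobile": (0, None),
}


def check_children(element, rules):
    """Return a list of rule violations for the direct children of element."""
    counts = {}
    for child in element:
        counts[child.tag] = counts.get(child.tag, 0) + 1

    problems = []
    for tag in counts:
        if tag not in rules:
            problems.append(f"unexpected <{tag}>")
    for tag, (low, high) in rules.items():
        n = counts.get(tag, 0)
        if n < low:
            problems.append(f"<{tag}> occurs {n} times, expected at least {low}")
        elif high is not None and n > high:
            problems.append(f"<{tag}> occurs {n} times, expected at most {high}")
    return problems


good = ET.fromstring(
    "<contact_details><email>a@example.com</email>"
    "<phone>1</phone><phone>2</phone></contact_details>"
)
bad = ET.fromstring("<contact_details><fax>1</fax><fax>2</fax></contact_details>")

print(check_children(good, CONTACT_RULES))
print(check_children(bad, CONTACT_RULES))
```

The first document satisfies every rule. The second is missing the required email and phone tags and repeats a tag that may appear only once, so three problems are reported.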

An example of an XML document identifying data retrieved from a PERSON database follows. This includes metadata markup tags (surrounded by < … >, such as <person_name>) that provide various details about a person. From this, we can see that it is easy to find specific contact information in <contact_details>, such as <email>, <phone>, <fax> and <mobile> (cell phone) numbers. Although I have not shown it, the DTD also specifies whether certain tags must exist or are optional, and whether some tags can exist more than once, such as the multiple <phone> and <mobile> tags below.

<PERSON person_id="p1100" sex="M">
  <person_name>
    <given_name>Clive</given_name>
    <surname>Finkelstein</surname>
  </person_name>
  <company>Information Engineering Services Pty Ltd</company>
  <country>Australia</country>
  <contact_details>
    <email>cfink@ies.aust.com</email>
    <phone>+61-8-9402-8300</phone>
    <phone>(08) 9309-6163</phone>
    <fax>+61-8-9402-8322</fax>
    <mobile>+61-411-472-375</mobile>
    <mobile>0411-472-375</mobile>
  </contact_details>
</PERSON>
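Because every value is labeled by its tag, a program can pick out exactly the details it needs. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, with the PERSON document above embedded as a string. The variable names are my own choices for this example.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """
<PERSON person_id="p1100" sex="M">
  <person_name>
    <given_name>Clive</given_name>
    <surname>Finkelstein</surname>
  </person_name>
  <company>Information Engineering Services Pty Ltd</company>
  <country>Australia</country>
  <contact_details>
    <email>cfink@ies.aust.com</email>
    <phone>+61-8-9402-8300</phone>
    <phone>(08) 9309-6163</phone>
    <fax>+61-8-9402-8322</fax>
    <mobile>+61-411-472-375</mobile>
    <mobile>0411-472-375</mobile>
  </contact_details>
</PERSON>
"""

person = ET.fromstring(PERSON_XML.strip())

# Attributes on the root element.
person_id = person.get("person_id")

# Nested tags are reached by path.
full_name = " ".join(
    person.findtext(f"person_name/{tag}") for tag in ("given_name", "surname")
)

# Repeated tags come back as lists, in document order.
contacts = person.find("contact_details")
email = contacts.findtext("email")
phones = [phone.text for phone in contacts.findall("phone")]
mobiles = [mobile.text for mobile in contacts.findall("mobile")]

print(person_id, full_name, email)
print(phones)
print(mobiles)
```

The program never depends on the position of a value in the file, only on its tag, so extra tags could be added to the document without breaking this code.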

Metadata that is used by various industries, communities or bodies can be used with XML, XSL and XLL to define markup vocabularies. The World Wide Web Consortium (W3C) has developed a standard framework that can be used to define these vocabularies. This is called the Resource Description Framework (RDF). It is a model for metadata applications that support XML. RDF was initiated by the W3C to build standards for XML applications so that they can inter-operate and intercommunicate more easily, avoiding the communication problems that we discussed earlier.

There is considerable effort in various industries to define their own standard language, called a markup vocabulary, using XML for their metadata. These become unique languages for intercommunication between participants in an industry. Markup vocabularies include: Mathematical Markup Language (MathML); Chemical Markup Language (CML); Open Financial Exchange (OFX); Information and Content Exchange (ICE); Voice Recognition Markup Language (VML); JavaBean Markup Language (JBL); Synchronized Multimedia Integration Language (SMIL); and Wireless Markup Language (WML). Other markup languages have been defined for Channel Definition Format (CDF), Meta Content Framework (MCF), Open Software Description (OSD) and Web Interface Definition Language (WIDL). For example, the Channel Definition Format – which was delivered as part of Microsoft Internet Explorer 4.0 and is now widely used – is based on XML.

The W3C and RDF web sites are two good starting points for more information about the above markup languages. The W3C web site is at http://www.w3.org/ and the RDF web site is at http://www.w3.org/Metadata/RDF/. RDF is now a W3C Recommendation, the W3C's equivalent of a published standard. Both sites will point you to specific web sites that provide additional details about the above markup languages.

With XML, even more effective applications become possible. For example, an organization can define the unique metadata used by its suppliers' legacy inventory systems. This will enable that organization to place orders via the Internet directly with those suppliers' systems, for automatic fulfillment of product orders. This application and eight other typical XML applications are available from the Microsoft XML Scenarios web site at http://microsoft.com/xml/scenario/intro.asp.
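The ordering scenario above can be sketched as follows. Once a supplier's metadata is known, an order held under our own field names can be rewritten using that supplier's tag names and sent as XML. Everything specific here is hypothetical: the mapping, the tag names, the product code and the order layout are invented for the example, and the step of actually sending the document over the Internet is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from our internal field names to one supplier's tag names.
SUPPLIER_TAGS = {
    "product_code": "part_no",
    "quantity": "qty",
    "deliver_by": "required_date",
}


def build_order_xml(order, tag_map):
    """Build an <order> document using the supplier's tag names."""
    root = ET.Element("order")
    for field, value in order.items():
        # A KeyError here means a field has no agreed supplier term yet.
        child = ET.SubElement(root, tag_map[field])
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical order expressed in our own terms.
order = {"product_code": "W-204", "quantity": 50, "deliver_by": "1999-04-30"}
order_xml = build_order_xml(order, SUPPLIER_TAGS)
print(order_xml)
```

A second supplier with different terminology would need only a second mapping table, not a second program.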

XML is an enabling technology to integrate unstructured text and structured databases for next generation E-Commerce and EDI applications. Web sites will evolve over time to use XML to provide the capability and functionality presently offered by HTML, but with far greater power and flexibility. Netscape Communicator 5.0 and Microsoft Internet Explorer 5.0 browsers will soon be released. Microsoft Office 2000 will also be released in the second quarter of 1999. All of these will support XML. New XML development tools will also be released in 1999 to enable XML applications to be developed more easily.

The acceptance and application of XML is progressing rapidly, as it offers a very simple - yet extremely powerful - way to intercommunicate between different databases and systems, both within and outside an organization. The data exchanged in this way is structured data that is available from databases and legacy files. Yet for most enterprises, over 90% of the knowledge resources exist not in structured databases and files, but in unstructured text documents, in graphics and images, as well as in audio and video files.

How well an organization accesses and uses its knowledge resources often determines its competitive advantage and future prosperity. The use and application of knowledge will become even more important in the future competitive Armageddon that we are all about to enter.

The tools are coming, but a greater task still remains to be completed. This is the definition of your own metadata, your common enterprise language for intercommunication, so that you can use these tools effectively. The definition of metadata depends on knowledge of data modeling, previously carried out by IT people. But this is not just a task for IT. As it is vitally dependent on business knowledge, it also requires the involvement of business experts. Not by interview, but by their active participation. While data modeling has until now been a technical IT discipline, business data modeling is not. It can be learned by business people as well as IT staff.

Business experts with detailed knowledge of an enterprise can draw on that knowledge to define the unique language used in different parts of the organization. This language is defined as metadata by using business data modeling methods. These methods require a knowledge of the business, not of computers. They can be learned through self-study business-driven courses that take 8 – 12 hours to complete. A data modeling case study workshop is also available. This uses a supplied CASE modeling tool to capture the defined metadata and assess the understanding of data modeling concepts.

These self-study courses and workshop are part of the Certified Business Data Modeler (CBDM) course series. You can view the course outlines and review introductory slides online by visiting the IES web site at http://www.ies.aust.com/~ieinfo/ or the Visible Australia web site at http://www.visible.com.au/ and clicking on the "Store" link from any page as described below in "Available Metadata Courses".

Successful completion of the workshop will enable your business and IT people to be assessed to determine whether they qualify as Certified Business Data Modelers (CBDM). Following this training, they can apply their new skills to the definition of your own metadata. The competition is building, the tools are coming, the technology is XML. But these all assume that your metadata has already been defined. There is no time to lose.



3. Available Metadata Courses

We offer a variety of courses and books on data modeling and metadata from our Online Store. These have been designed for use by business staff as well as IT. PowerPoint is used for individual training. For larger numbers of people, the courses can be delivered across a corporate Intranet to train hundreds or thousands of staff. They learn how to work together in a design partnership to define the metadata used throughout an enterprise. We also have CASE modeling tools to help you define and capture your metadata. You can review these by clicking on the "Store" link from the following Web Sites:

IES web site: http://www.ies.aust.com/~ieinfo/
Visible Australia web site: http://www.visible.com.au/

Select the items you want into your shopping basket - then supply all required purchase, credit card and delivery information by clicking on the "Checkout" icon. Purchases can be made securely online by credit card. Or instead you can purchase via credit card offline by printing and faxing the Order Form. Or we will invoice larger orders if you prefer, when you fax the Order Form together with a Corporate Purchase Order.


AUTHOR

Clive Finkelstein is the "Father" of Information Engineering (IE), developed by him from 1976. He is an International Consultant and Instructor, and was the Managing Director of Information Engineering Services Pty Ltd (IES) in Australia. 

Clive Finkelstein's books, online interviews, courses and details are available at http://www.ies.aust.com/cbfindex.htm.

For More Information, Contact:

  Clive Finkelstein
59B Valentine Ave
Dianella, Perth WA 6059 Australia
 
Details: http://www.ies.aust.com/cbfindex.htm
Web Site: http://www.ies.aust.com/
Phone: +61-8-9275-3459
Fax: +61-8-6210-1579
Email: clive.finkelstein@ies.aust.com

(c) Copyright 1995-2015 Clive Finkelstein. All Rights Reserved.



(c) Copyright 2004-2009 Information Engineering Services Pty Ltd. All Rights Reserved.