Issue No 12:
TECHNOLOGIES FOR ENTERPRISE APPLICATION INTEGRATION (EAI)
Enterprise Application Integration (EAI) Technologies for Data Content Analysis™, Inter-Enterprise
Data Integration™ and HyperRelational Analysis™
AUSTRALIA – September 7, 2000:
We discussed in the last issue that XML is not a silver bullet, but that it offers
great benefit for Enterprise Application Integration (EAI) within and across enterprises if used
effectively. This issue continues the theme. It discusses additional products
and technologies for EAI. It also announces upcoming conferences and seminars
that will assist you.
TEN - The Enterprise Newsletter
TECHNOLOGIES FOR ENTERPRISE APPLICATION INTEGRATION (EAI)
One of the
biggest problems facing enterprises today is the question of integration of
application systems and databases within and across enterprises. These may be
legacy databases and systems that were developed years ago for a specific
purpose, and which are still being used very effectively in the enterprise. Or
they may have been recently developed, but are difficult to integrate with other
databases and systems that contain much the same data. These may all be
redundant versions of the same data, each version of which must be kept
up-to-date with any changes so that all data versions are current.
We discussed in earlier issues that XML
can be used to achieve real-time updates of these redundant data versions, using
technologies such as Microsoft BizTalk. In this issue we will discuss
various technologies – including XML – that assist Enterprise
Application Integration. The first approach uses “Data Content Analysis™” for normalization of live
databases or files, to reverse engineer third normal form data structures and
database designs directly from the live data content. This can permit EAI to be
achieved more effectively than by using the unnormalized databases and files.
The second approach is based on XML, which is used to expose all aspects of
databases, including business rules. This is called “Inter-Enterprise Data
Integration™”.
The third approach analyzes the implicit relationships between tables that
reside in databases developed by the enterprise, as well as databases developed
by Enterprise Resource Planning (ERP) vendors such as SAP, Baan and others. This
is called “HyperRelational Analysis™”.
We have all
encountered legacy databases and application systems that were developed many
years ago, but the database design and application design were never documented.
Or they were originally documented, but changes have since been made to the
applications or the databases – yet the documentation was never updated to
reflect those changes. As a result, little is known today of the structure of those
databases.
Of course it
is possible to reverse engineer these undocumented legacy databases to determine
their structure by using CASE modeling tools. These extract from the database
catalog various details about the tables and columns that comprise those
databases. With this knowledge of the database structure, legacy database
designs can be integrated with other databases. They can then be reengineered
for new database environments. But the problem becomes more complex when it is
necessary to reengineer databases that were unnormalized for performance.
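As a rough illustration of what reading a database catalog involves, the sketch below lists the tables and columns of a SQLite database. It is not how any particular CASE tool works. The `customer_order` table and the `describe_catalog` helper are hypothetical names, created only so that the example runs, and other DBMS products expose their catalogs through different views.

```python
import sqlite3

def describe_catalog(conn):
    """Return {table: [(column, declared_type, is_primary_key), ...]}."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    catalog = {}
    for table in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(c[1], c[2], bool(c[5])) for c in columns]
    return catalog

# Hypothetical unnormalized legacy table, created here only for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_order ("
             "order_no INTEGER PRIMARY KEY, customer_no INTEGER, "
             "customer_address TEXT, product_price REAL)")

catalog = describe_catalog(conn)
for table, columns in catalog.items():
    print(table, columns)
```

The catalog reveals names, types and declared keys, but nothing about which columns are redundant copies of data held elsewhere. That is why unnormalized designs need further analysis.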
You know the
problem. Many of these legacy databases did not store details about customers,
or orders, or products only in the relevant Customer, Order or Product tables as
normalized data. Instead these details were combined together in common tables
as unnormalized data, hoping in this way to avoid perceived performance
problems. This may indeed have enabled improved database performance, but it was
achieved often at the expense of creating redundant data versions throughout the
enterprise. The problem emerges when redundant data changes. For example, if a
customer’s address is changed, or a product price is changed, each redundant
data version has to be updated so that all versions reflect the same status of
the data.
Enterprise Application Integration (EAI) brings all of these redundant data versions
together, so that relevant customer, or product, or other details exist in only
one place – yet can be shared throughout the enterprise. When a change occurs,
the change then only needs to be made once. The single, updated data version is
then immediately available at its latest status for everyone who is authorized
to use it.
The difficulty is greatest when both of these problems occur together: unnormalized data versions that are
scattered redundantly throughout the enterprise, plus an absence of
documentation of those unnormalized database designs. To resolve this problem
requires enormous expenditure of effort. Examining the database catalogs and the
live data content – to infer data dependencies and so derive normalized
database designs for EAI – is largely a manual task.
However, new technologies are emerging to assist this analysis, based on the application
of Data Content Analysis. Products such as Axio from Evoke Software (http://www.evokesoft.com/)
analyze live databases to infer data dependencies. All of the data values in a
column are first analyzed for data value consistency and data quality. For
example, the same address column may have some rows that seem to be different
– appearing as “100 Fillmore”, and also as “100 Fillmore Street”. When
quality problems like this are detected, these different values can be changed
so that only consistent data values exist (using only “100 Fillmore Street”,
for example).
Other products are also available to assist this data quality analysis. However, Evoke Axio
takes this analysis further. It also examines the data values in each row of a
table to identify columns that are dependent on the values of other columns in
the same row. This dependency analysis of data values identifies possible
primary and foreign keys. It enables those columns to be normalized to third
normal form (3NF). It eliminates data redundancy by deriving 3NF database
designs and 3NF data models, working from the live data content of the database.
The end result is the automatic generation of 3NF Data Definition Language (DDL)
schema scripts to install the 3NF databases using appropriate Data Base
Management Systems (DBMS) products. This in turn enables more accurate
Enterprise Application Integration.
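The idea of inferring dependencies from live data values can be sketched in a few lines. This is a minimal illustration only, not the algorithm used by Evoke Axio: it tests single-column dependencies over a small sample, and the function name and sample rows are assumptions made for the example.

```python
from itertools import permutations

def single_column_dependencies(rows, columns):
    """Return (determinant, dependent) pairs where every determinant value
    maps to exactly one dependent value in the given rows."""
    found = []
    for det, dep in permutations(columns, 2):
        seen = {}
        holds = True
        for row in rows:
            # Remember the first dependent value seen for each determinant value.
            if seen.setdefault(row[det], row[dep]) != row[dep]:
                holds = False
                break
        if holds:
            found.append((det, dep))
    return found

# Hypothetical unnormalized order rows that repeat the customer address.
rows = [
    {"order_no": 1, "customer_no": 10, "customer_address": "100 Fillmore Street"},
    {"order_no": 2, "customer_no": 10, "customer_address": "100 Fillmore Street"},
    {"order_no": 3, "customer_no": 20, "customer_address": "7 Main Road"},
]
deps = single_column_dependencies(
    rows, ["order_no", "customer_no", "customer_address"])
print(deps)
```

Here `order_no` determines every other column, which marks it as a possible primary key. The address depends on `customer_no` rather than directly on the order, which suggests moving it to a separate Customer table with `customer_no` as a foreign key. Dependencies found this way are only candidates: a small sample can show coincidental ones (the address also appears to determine `customer_no` here), so they still need review.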
A second approach to Enterprise Application Integration is called Inter-Enterprise Data
Integration. This is a technology that is used by infoShark (http://www.infoshark.com/).
It is based on the use of XML in their latest product: XMLShark. This enables
rapid exchange of data between enterprise databases and internet-based users by
generating relational XML data directly from legacy data sources. XMLShark
enables XML data to be securely used, shared and exchanged on the Web and then
parsed and returned to a corporate source.
XMLShark automatically scans existing data sources to understand their structure and
underlying business rules, and then generates data mappings. This renders
seemingly incompatible data sources instantly interoperable. Says Barbara
Bouldin, CTO of infoShark: “XMLShark will broker, cache, and synchronize
information via the Internet in real-time or batch, based on business rules as
well as preserve the integrity and security of the data.”
The XML data
generated by XMLShark is formatted in the industry-accepted XML CARD (Commerce
Accelerated Relational Data) Schema, which contains both the data and the database
structure. XMLShark converts relational data to and from XML. It translates the
relational data in real-time to an XML-based information
cache that enables the bi-directional exchange of relational data with the
Internet, or anywhere in an enterprise.
infoShark created the CARD schema to represent relational data and its metadata in XML.
This schema conforms to the current working draft of the W3C and,
according to infoShark, has been accepted by BizTalk as a standard. CARD is
freely available on numerous websites, including http://www.xml.org,
http://www.biztalk.org and http://www.infoShark.com.
An XML document adhering to the CARD schema can provide all the necessary information to
recreate relational databases and populate them with their data. This
information includes such things as primary/foreign key relationships, indices,
constraints, and native data types. This schema can be used by a business to
provide a subset of a production database to business partners. It contains
commerce-related information for pricing individual pieces of data contained
within the document. By setting a
business value for information, companies can easily sell their data in an
e-commerce environment. The main goal of the CARD schema is to provide a common
language (complete with basic business rules) for bi-directional XML-based
exchange of relational data.
For example, an Oracle constraint might be a rule such that a manager’s base salary must be
a minimum of $75,000 per year but cannot exceed $100,000. The rule is carried as a CDATA section:
<![CDATA[salary >= 75000 and salary <= 100000]]>
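To show how an XML document that carries both data and metadata could be used to recreate a table with its business rule intact, here is a small sketch. The element and attribute names (`table`, `column`, `constraint`, `row`, `primaryKey`) are invented for this example and are not the actual CARD schema. SQLite stands in for the target DBMS, and the document is assumed to come from a trusted source because its text is placed directly into SQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, invented for this sketch.
# They are NOT taken from the real CARD schema.
DOCUMENT = """
<table name="manager">
  <column name="emp_no" type="INTEGER" primaryKey="true"/>
  <column name="salary" type="INTEGER"/>
  <constraint name="salary_range"><![CDATA[salary >= 75000 and salary <= 100000]]></constraint>
  <row emp_no="1" salary="80000"/>
</table>
"""

def build_table(conn, xml_text):
    """Recreate the table, its check constraints and its rows from the XML."""
    root = ET.fromstring(xml_text)
    table = root.get("name")
    types = {c.get("name"): c.get("type") for c in root.findall("column")}
    parts = []
    for col in root.findall("column"):
        pk = " PRIMARY KEY" if col.get("primaryKey") == "true" else ""
        parts.append(col.get("name") + " " + col.get("type") + pk)
    for con in root.findall("constraint"):
        # The parser returns CDATA content as ordinary element text.
        parts.append("CONSTRAINT " + con.get("name") + " CHECK (" + con.text + ")")
    conn.execute("CREATE TABLE " + table + " (" + ", ".join(parts) + ")")
    for row in root.findall("row"):
        names = list(row.attrib)
        # XML attributes are strings, so convert values for INTEGER columns.
        values = [int(row.get(n)) if types[n] == "INTEGER" else row.get(n)
                  for n in names]
        marks = ", ".join("?" for _ in names)
        conn.execute("INSERT INTO " + table + " (" + ", ".join(names) + ")"
                     " VALUES (" + marks + ")", values)

conn = sqlite3.connect(":memory:")
build_table(conn, DOCUMENT)
try:
    # A salary outside the permitted range should violate the constraint.
    conn.execute("INSERT INTO manager (emp_no, salary) VALUES (2, 120000)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The salary rule travels with the data, so the receiving database enforces the same limits as the source.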
Data Content Analysis (described above for Evoke Axio) and Inter-Enterprise Data Integration
(as used by XMLShark) are two examples of Enterprise Application Integration
technologies. A third technology that approaches EAI from a completely different
perspective is that of HyperRelational Analysis – used for Enterprise
Portals.
There is growing interest in Enterprise Portals, also called Corporate
Portals. Quite distinct from Internet Portals such as Yahoo, an Enterprise
Portal provides a single gateway to an enterprise that is tailored to the
requirements of each individual. A general definition follows:
“An Enterprise Portal is a single gateway – accessed via the corporate Intranet,
or via a secure Extranet used by customers, suppliers and business partners, or
via the Internet – to the relevant workflows, application systems and
databases – typically integrated using XML and tailored to the specific job
responsibilities of each individual.”
For example, an Employee Portal enables employees to access the processes, the systems and
the databases – via Intranet or Internet – that they need to carry out
assigned job responsibilities, with full security and firewall protection.
A Customer Portal is a single gateway across the Internet, or via a secure
Extranet, to details about products and services, catalogs, and order and
invoice status for customers – all
integrated using XML and tailored to the unique requirements of each customer.
It offers clear opportunities for customer personalization and management with
one-to-one Customer Relationship Management (CRM).
The problem, however, is in achieving a level of effective application and database
integration so that the single point of access of an Enterprise Portal appears
seamless. Each database to be accessed in this environment may have been
originally designed for use by specific application systems. But they may not be
easily integrated with other databases, as they were never required to work
together. We discussed in earlier articles that XML could assist this Enterprise
Application Integration. But another technology is also available: HyperRelational
Analysis.
HyperRelational Analysis is a patented database integration technology that is used by TopTier
Software (now part of SAP http://www.sap.com/) to
analyze explicit and implicit database structures. It uses primary and foreign
keys in a database catalog to analyze explicit relationships that are defined by
primary and foreign key constraints. It analyzes these keys to identify other
relationships that are implicit. It uses them to integrate dissimilar databases
in an “Enterprise Integration Portal”.
For example, if Table A is related to Table B and also Table B is related to Table C, then
Table A is implicitly related to Table C. In another example, a relationship may
be explicitly defined from an Order table to a Customer table based on a common
key of Customer-Number. This same Customer-Number key may also exist in other
tables in the database, and in other databases throughout the enterprise.
HyperRelational Analysis thus identifies both explicit and also implicit
relationships based on this common key.
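The two kinds of implicit relationship described above, by transitivity and by a common key, can be sketched as follows. This is an illustration of the general idea only and makes no claim to reproduce TopTier's patented method. The function name, the table names other than Order and Customer, and the column sets are hypothetical.

```python
from itertools import combinations

def implicit_relationships(explicit, table_columns, key):
    """Return table pairs that are related implicitly, either through a
    chain of explicit relationships or by sharing the given key column."""
    pairs = [tuple(p) for p in explicit]
    explicit_set = {frozenset(p) for p in pairs}
    neighbours = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    implicit = set()
    # Transitivity: collect every table reachable through explicit links.
    for start in neighbours:
        stack, seen = [start], {start}
        while stack:
            for nxt in neighbours[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        implicit |= {frozenset((start, t)) for t in seen if t != start}
    # Common key: any two tables that both carry the key column.
    holders = [t for t, cols in table_columns.items() if key in cols]
    implicit |= {frozenset(p) for p in combinations(holders, 2)}
    return implicit - explicit_set

# Hypothetical catalog: explicit foreign-key links plus column names.
explicit = [("Order", "Customer"), ("Order", "OrderLine")]
table_columns = {
    "Order": {"Order-Number", "Customer-Number"},
    "Customer": {"Customer-Number", "Address"},
    "OrderLine": {"Order-Number", "Product-Number"},
    "Invoice": {"Invoice-Number", "Customer-Number"},
}
result = implicit_relationships(explicit, table_columns, "Customer-Number")
for pair in sorted(sorted(p) for p in result):
    print(pair)
```

In this sample, Customer and OrderLine are related through Order, while Invoice is related to both Order and Customer only because it carries the same Customer-Number key.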
Following this database analysis, TopTier then supports integration across databases by
using a “drag-and-relate” access technique from their Enterprise
Integration Portal interface. The power of this integration access is
dramatic. For example, a Customer-Number or a Product-Number key value from an
SAP R/3 database can be dragged by an end-user onto relevant Customer or Product
tables in a Baan ERP database, or in the enterprise’s own
databases. The result is direct access to details of that customer or product
across different ERP vendor and enterprise databases.
Another example further illustrates this power. A Return-ID from a Shipping Return (with
a foreign key of Shipment-Tracking-Number) is dragged by an end-user onto the
Federal Express icon in a TopTier Enterprise Integration Portal interface – to
drill down automatically to details retrieved from the FedEx web site of the
FedEx delivery for that tracking number. The product CD provided by TopTier
includes movies that dramatically show the jaw-dropping power of this technology.
We have all
seen the double spread advertisement by SAP of a beautiful woman looking out
from the page of a newspaper or magazine with four simple words: “You Can. It
Does”. This refers to the flexibility of mySAP.com, a portal capability that
was developed by SAP based largely on the power of TopTier HyperRelational
Analysis. The flexible TopTier drag-and-relate database integration capability is an
integral component of mySAP.com, and of other environments within and across
enterprises. Information on mySAP.com and on TopTier is available from http://www.mysap.com/
and also http://www.toptier.com/.