Information Architecture
For many applications, information architecture and data modelling are limited to providing the
application logic a means of robust persistence. Thus the design of the information architecture
is driven purely by application logic and programming concerns. However, for effective business
management systems of the class we are concerned with, we must assume that the data is valuable in
its own right and carries uses beyond the simple transaction processing logic of the system.
Data is a first class concern for our overall approach to business management system design. We
predicate this elevation of importance on the following assumptions:
-
The business management system database will be the system of record for the majority of
material business records. This may be on a company, divisional, or departmental basis.
-
Third party applications will need to consume data and may produce data which properly is
recorded in the system of record database. There may be many such applications.
-
We expect that third party reporting and business intelligence tools will be used to
provide specialized presentations and insights into the data.
-
The business management system we build may be subordinate to a more fundamental business
management system which acts as the system of record. This will likely be true if our
application is only supporting a division or department of a larger organization.
In all of these cases we’re describing scenarios where our business management system is planned
to be part of a larger ecosystem of collaborating applications. This is called the
“best-of-breed” approach to business systems architecture and is a common feature of enterprise
application deployments.
1 - Data Organization
Business Management Systems often contain a large number of relations, frequently reaching into the
hundreds of database tables, with some systems exceeding one thousand relations. And while the number of
relations required to support the broad functional concerns of the typical business management
system can be large, this data can be generalized into a relatively small handful of categories of
data.
Principal Classifications
Most broadly speaking, our basic data breaks down into “business domain objects”: the records
defining customers, products, warehouses, etc. which the application users maintain as needed; and
“transaction” records: the recorded actions which are taken by the business referencing the
business domain objects. We can refine these broad definitions further.
Master Data
Master Data are relations which enumerate business domain objects along with the attributes which
configure each object for use by the system. The Master Data establishes the definition of the
business domain object in the system. Examples of Master Data include relations defining
Entities, Relationships, and Products.
The primary trait of each Master Data record is that it represents information which is true
“now”, meaning in the current moment. If the business domain object represented by a Master
Record changes, such as a customer changing its mailing address, the Master Data record is changed
to reflect the new reality so that at any given moment a Master Data record represents the
authoritative definition of the specific business domain object being described. When the Master
Record changes, there is no sense of history kept; the record behaves as if the updated data was
always the data.
Master Data records have life-cycle stages which determine how they can be used by the system.
These stages can be generalized as:
-
Preliminary Planning
In this stage the record exists, may be needed for certain longer range planning usage, but
isn’t available for regular transaction processing or reporting. This life-cycle stage
indicates future intention.
-
Regular Use (Active)
Records in this stage are actively used in daily business activities and reporting.
-
Obsolescence
Over time, some records will represent specific business domain objects which are planned to
go out of use. As that time approaches, certain transaction processing should cease (e.g.
purchasing transactions), while others remain available as the Out of Use stage approaches.
-
Out of Use (Inactive)
At the end of the Obsolescence stage, the record is not available for use in regular
transaction processing. The record will still be visible to business users as there is
expected to be historical/reporting relevance.
-
Purge Eligible
Once the record is no longer referenced by prior transaction histories, the record may become
eligible to be deleted from the database outright. It is important to reiterate that any
existing transaction history referencing the record should prevent this stage from being
reached to keep the data integral.
While all Master Data follows these general stages, any specific Master Data record or relation
may only do so informally within the business, and the business management system will not always
have well defined recognition of these stages. The business management system may provide alternative
stage names, subdivide the stages, or allow the definition of the recognized stages to be
configured as suits the specific purpose at hand. In all cases however, the stages as listed are
the functionally distinct stages which will matter during the course of executing application
logic.
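The stages above can be sketched as a small enumeration with a table of permitted activities. This is a minimal illustration only: the activity names (`planning`, `purchasing`, `sales`, `reporting`) and the function names are assumptions made for the example, and which activities wind down during Obsolescence is a business decision that a real system would make configurable.

```python
from enum import Enum


class MasterDataStage(Enum):
    PRELIMINARY_PLANNING = "preliminary_planning"
    ACTIVE = "active"
    OBSOLESCENCE = "obsolescence"
    INACTIVE = "inactive"
    PURGE_ELIGIBLE = "purge_eligible"


# Hypothetical activity names. Purchasing is the only activity the text
# names as ceasing during Obsolescence; the rest is assumed.
ALLOWED_ACTIVITIES = {
    MasterDataStage.PRELIMINARY_PLANNING: {"planning"},
    MasterDataStage.ACTIVE: {"planning", "purchasing", "sales", "reporting"},
    MasterDataStage.OBSOLESCENCE: {"sales", "reporting"},
    MasterDataStage.INACTIVE: {"reporting"},
    MasterDataStage.PURGE_ELIGIBLE: set(),
}


def is_allowed(stage, activity):
    """Return True when a record in `stage` may take part in `activity`."""
    return activity in ALLOWED_ACTIVITIES[stage]


def may_become_purge_eligible(stage, referencing_transactions):
    """An inactive record may be purged only once nothing references it."""
    return stage is MasterDataStage.INACTIVE and referencing_transactions == 0
```

With this table, a product in Obsolescence can still be sold but no longer purchased, and an Inactive product with remaining transaction history stays in the database.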
Because Master Data records are reused during the course of multiple business activities, Master
Data relations will constitute significantly less of the overall application data retained when
compared to other kinds of data. Record counts may be low, in the tens of records, but may
commonly be in the thousands of records. It is feasible, though rare, for some Master Data
relations to grow into the
Supporting Data
Supporting Data relations carry records which exist to support other, more fundamental, kinds of
data such as Master Data or Transaction Data. Supporting Data is Master Data-like in that the
records also possess the Master Data primary trait of being a representation of the present state
of the Supporting Data.
Supporting Data comes in some basic sub-types:
-
Simple Enumerations
Most (if not all) business management systems use “lists of values” to provide predefined
acceptable values used by attributes in our primary data records. Examples of these Simple
Enumerations include lists of available order statuses, approval process states, and product
categorization.
It is not uncommon for business management systems to allow many of these Simple Enumerations
to be configured, as needed, by the user as current business requirements dictate. While user
management of Simple Enumerations suggests the possibility of life-cycle stages, most often
the business management system mandates that all values existing as records of the Simple
Enumeration are considered “Active” and the records are simply deleted when no longer of use.
Naturally, systems which recognize all existing records as “Active” should only allow for the
deletion of Simple Enumeration records when such records are no longer referenced.
-
Quantitative Data
For certain Master Data records, it can be convenient to track summarized Quantitative Data.
Consider the simplified example of a Master Data relation defining products sold from a single
warehouse. While the Master Data records for a product will define the configuration of the
product, there is also Quantitative Data such as how much quantity on hand of the product is
currently present, the value of the product on hand, etc.
Any one Quantitative Data record will always correspond to a single Master Data “parent”
record. This data could be stored in the same relation as the corresponding Master Data, and
in some business management systems it is. However, there can be technical considerations for
storing this quantitative data separately using Quantitative Data relations. Quantitative
Data tends to be updated much more frequently than the corresponding Master Data; this can
give rise to lock contention in the database due to the competing uses. In addition
Quantitative Data usually consists of a small number of numeric fields taking much less space
per row than the corresponding Master Data, which can have many more fields including text
fields; since updating a row causes the copying of all row data, we can be more efficient by
separating our frequently changing data from our infrequently changing data.
Quantitative Data doesn’t express any sort of life-cycle stages. Any Quantitative Data record
will assume the life-cycle stage of its corresponding Master Data parent record.
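As a rough sketch of the separation described for Quantitative Data, the following uses Python's built-in `sqlite3` module with two hypothetical tables, `product` (Master Data) and `product_quantity` (Quantitative Data). The table and column names are assumptions for illustration. SQLite is used only because it is self-contained; it does not exhibit the row-level lock contention discussed above, so the example shows the shape of the schema rather than the performance effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product (              -- Master Data: wide, rarely updated
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        description TEXT
    );
    CREATE TABLE product_quantity (     -- Quantitative Data: narrow, updated often
        product_id    INTEGER PRIMARY KEY REFERENCES product(id),
        qty_on_hand   INTEGER NOT NULL DEFAULT 0,
        value_on_hand INTEGER NOT NULL DEFAULT 0   -- minor currency units
    );
""")
conn.execute(
    "INSERT INTO product (id, name, description) VALUES (1, 'Widget', 'Long text...')"
)
conn.execute(
    "INSERT INTO product_quantity (product_id, qty_on_hand, value_on_hand) "
    "VALUES (1, 10, 2500)"
)


def receive_stock(conn, product_id, qty, unit_cost):
    """Update only the narrow quantitative row; the product row is untouched."""
    conn.execute(
        "UPDATE product_quantity "
        "SET qty_on_hand = qty_on_hand + ?, value_on_hand = value_on_hand + ? "
        "WHERE product_id = ?",
        (qty, qty * unit_cost, product_id),
    )


receive_stock(conn, 1, 5, 250)
row = conn.execute(
    "SELECT qty_on_hand, value_on_hand FROM product_quantity WHERE product_id = 1"
).fetchone()
```

The primary key on `product_quantity.product_id` enforces the rule that each Quantitative Data record corresponds to exactly one Master Data parent.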
Supporting Data retention needs are of trivial concern in the broader context of the application
as a whole.
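The deletion rule for Simple Enumerations, that a value may only be deleted once nothing references it, can be delegated to the database with a foreign key. The sketch below again uses `sqlite3`; the `order_status` and `sales_order` tables and the `delete_enum_value` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE order_status (
        code  TEXT PRIMARY KEY,
        label TEXT NOT NULL
    );
    CREATE TABLE sales_order (
        id          INTEGER PRIMARY KEY,
        status_code TEXT NOT NULL REFERENCES order_status(code)
    );
    INSERT INTO order_status VALUES ('open', 'Open'), ('on_hold', 'On Hold');
    INSERT INTO sales_order VALUES (1, 'open');
""")


def delete_enum_value(conn, code):
    """Try to delete an enumeration value; refuse if it is still referenced."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM order_status WHERE code = ?", (code,))
        return True
    except sqlite3.IntegrityError:
        return False


deleted_open = delete_enum_value(conn, "open")        # still used by order 1
deleted_on_hold = delete_enum_value(conn, "on_hold")  # unreferenced
```

Here the referenced value `open` survives the attempted deletion while the unused `on_hold` value is removed.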
Transaction Data
Transaction Data relations describe specific instances of business activity. Examples of
Transaction Data include sales orders, order fulfillment and shipping, customer and vendor
invoices, and customer support tickets. Transaction Data records depend on Master Data and make
use of Supporting Data to describe these business activities.
The primary trait of Transaction Data is that each Transaction Data record represents an instance
of a business activity which is finite in time. While the business activity is underway, the
Transaction Data records allow for the coordination of the business operations required to execute
the business activity. Once the business activity is concluded the Transaction Data acts as a
historical reference as to what business operations were performed, to provide data for later
analysis, and to support the resolution of any later disputes that might arise related to the
performance of the business activity.
Transaction Data Use of Master Data
The nature of Master Data is that it consists of long lived records which evolve over time to
describe how business domain objects exist “now”. The nature of Transaction Data, however, is to
record business activities as they happened at the time which means that once a business activity
is concluded the Transaction Data record representing it is static over time.
This disparity between the natures of Master Data and Transaction Data means that the Transaction
Data records must either duplicate many of the attributes obtained from the Master Data or
reference time-boxed, immutable versions of the master data. What you cannot do is directly
reference the simple Master Data as that is expected to change over time.
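The first of the two options, duplicating the relevant Master Data attributes onto the transaction, might look like the following sketch. The `Customer` and `SalesOrder` classes and their fields are assumptions for illustration; a real system would copy whichever attributes matter to the business activity.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Master Data: mutable, always describes the customer as it is now."""
    id: int
    name: str
    mailing_address: str


@dataclass(frozen=True)
class SalesOrder:
    """Transaction Data: keeps its own copy of the attributes it used."""
    id: int
    customer_id: int
    customer_name: str
    ship_to_address: str


def create_sales_order(order_id, customer):
    # Copy the values at the time of the transaction rather than
    # holding a live reference to the Customer object.
    return SalesOrder(
        id=order_id,
        customer_id=customer.id,
        customer_name=customer.name,
        ship_to_address=customer.mailing_address,
    )


customer = Customer(1, "Acme Ltd", "1 Main St")
order = create_sales_order(100, customer)
customer.mailing_address = "9 Harbour Rd"  # the Master Data moves on
```

After the address change the order still shows "1 Main St", which is what was true when the activity took place. The alternative, referencing immutable versions of the Master Data, trades the duplicated columns for a versioned Master Data relation.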
Transaction Data has a generalized life-cycle consisting of the following stages:
-
Preliminary Planning
During this stage, a Transaction Data record is being authored, awaiting approvals, or
otherwise not yet actionable. During this time the Transaction Data record is not generally
visible to business operations or involved third parties.
-
In Progress (Open)
Once all preliminary work is completed the Transaction Data record may be opened and made
visible/usable to the various business operations, including third parties if appropriate,
that will work the transaction to completion. Arriving in this stage may be given a number of
names: “opening”, “releasing”, “posting”, etc.; they all indicate that the Transaction Data
record is actionable.
-
Cancelled
Certain kinds of opened business activities may be terminated prior to successful completion.
When this happens the Transaction Data record is no longer actionable and becomes part of the
historical record, but only insofar as unsuccessful activities are concerned.
Note that this ending stage is only appropriate for business activities which never reached
any state of completion. Some business activities may be partially successful, for example a
sales order which shipped 5 out of 10 units of an ordered item. All business activities which
are partially completed prior to termination are, for our purposes, not considered cancelled.
At the point of being Cancelled, the Transaction Data record should be considered immutable,
except for the possibility of deletion once the record is no longer useful for reporting or
analytics.
-
Closed
Upon the successful, or partially successful, completion of a business activity, the
associated Transaction Data record will be considered “Closed”. Closed transactions are not
eligible for further business operations to be performed and the Transaction Data becomes part
of the historical record for analysis and reference purposes.
Closed Transaction Data records may be referenced by new, related Transactions. For example a
closed sales order reference may be required to process a new customer return transaction.
At the point of being Closed, the Transaction Data record should be considered immutable,
except for the possibility of deletion once the record is no longer useful for reporting or
analytics.
Exceptions to Closed Transaction Immutability
For a variety of reasons, some good and some bad, many business management systems support the
idea of re-opening previously Closed transactions for further business activities to be conducted.
However, in principle Closed transactions should be considered final and we will adopt this as an
axiomatic assumption in this documentation.
-
Purge Eligible
Transaction Data records may be purged when their history is no longer relevant to supporting
business reporting or analysis. Such records may be set as eligible to be purged by any
process or batch job which runs to delete the records from the database. The conditions which
allow Transaction Data records to become Purge Eligible are:
-
The record is already in the Preliminary Planning, Closed, or Cancelled life-cycle stage.
-
The record is not referenced by other Transaction Data records which are not themselves
Purge Eligible.
In practice, Transaction Data in the Preliminary Planning and Cancelled stages have little
barrier to being purged from the system, but Closed transactions will usually have time based
constraints on Purge Eligibility; detailed Transaction Data must be retained for various
periods of time to support financial and tax audits and for reference when communicating with
different business partners.
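The life-cycle above can be sketched as a small state machine plus a purge-eligibility check. All names here (`TxStage`, `Transaction`, `advance`, `terminate`, `may_mark_purge_eligible`) are hypothetical, and the single `closed_retention_days` parameter is a simplifying assumption standing in for whatever audit and retention rules apply in practice.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class TxStage(Enum):
    PRELIMINARY_PLANNING = "preliminary_planning"
    OPEN = "open"
    CANCELLED = "cancelled"
    CLOSED = "closed"


# Cancelled and Closed are terminal: re-opening is deliberately not modelled.
ALLOWED_TRANSITIONS = {
    TxStage.PRELIMINARY_PLANNING: {TxStage.OPEN},
    TxStage.OPEN: {TxStage.CANCELLED, TxStage.CLOSED},
    TxStage.CANCELLED: set(),
    TxStage.CLOSED: set(),
}


@dataclass
class Transaction:
    id: int
    stage: TxStage = TxStage.PRELIMINARY_PLANNING
    ended_on: Optional[date] = None
    referenced_by: List["Transaction"] = field(default_factory=list)
    purge_eligible: bool = False


def advance(tx, new_stage, on=None):
    """Move a transaction to a new stage, rejecting disallowed transitions."""
    if new_stage not in ALLOWED_TRANSITIONS[tx.stage]:
        raise ValueError(f"{tx.stage.name} -> {new_stage.name} is not allowed")
    tx.stage = new_stage
    if new_stage in (TxStage.CANCELLED, TxStage.CLOSED):
        tx.ended_on = on


def terminate(tx, any_work_completed, on):
    """Partially completed activities are Closed, never Cancelled."""
    advance(tx, TxStage.CLOSED if any_work_completed else TxStage.CANCELLED, on)


def may_mark_purge_eligible(tx, today, closed_retention_days):
    """Apply the two purge conditions plus a retention period for Closed records."""
    if tx.stage is TxStage.OPEN:
        return False
    if tx.stage is TxStage.CLOSED:
        if tx.ended_on is None or (today - tx.ended_on).days < closed_retention_days:
            return False
    # Every referencing transaction must itself already be Purge Eligible.
    return all(other.purge_eligible for other in tx.referenced_by)
```

For example, a sales order that shipped 5 of 10 units and was then terminated ends up Closed, and it cannot be marked Purge Eligible while an open customer return still references it, however old it is.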
Transaction Data constitutes the majority of the data retained by the application. Care must be
taken in structuring and managing this data at the database level to ensure acceptable application
performance in operations, reporting, and analytics. Transaction Data relations can reach into
the billions of records for the class of application contemplated here.
Secondary Classifications
There are some classes of data which exist for technical reasons and/or are optional components
which are not essential to business management system operations.
Analytic Data
Analytic Data exists to facilitate reporting and analytic workloads. The Analytic Data consists
of summarized Transaction Data with some facts drawn from the Master Data. In terms of structure,
Analytic Data resembles typical data warehousing tables which serve the same purpose.
The goals of Analytic Data in the business management system include:
-
Allowing for the long term reporting of otherwise Purge Eligible Transaction data.
-
Providing the means to report contextually relevant Analytic Data within the user interface
of the business management system. Examples might include monthly customer sales on a
customer form or weekly sales of items on an item form.
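As an illustration of the "monthly customer sales" example, the sketch below summarizes closed orders into one total per customer per month. The record layout (`customer_id`, `closed_on`, `total_cents`) is an assumption; in a real system the summary would more likely be maintained in the database as transactions close.

```python
from collections import defaultdict
from datetime import date


def monthly_customer_sales(closed_orders):
    """Summarize closed orders into totals keyed by (customer, year, month)."""
    totals = defaultdict(int)
    for order in closed_orders:
        closed_on = order["closed_on"]
        key = (order["customer_id"], closed_on.year, closed_on.month)
        totals[key] += order["total_cents"]
    return dict(totals)


orders = [
    {"customer_id": 1, "closed_on": date(2024, 1, 5), "total_cents": 120000},
    {"customer_id": 1, "closed_on": date(2024, 1, 20), "total_cents": 30000},
    {"customer_id": 1, "closed_on": date(2024, 2, 2), "total_cents": 45000},
    {"customer_id": 2, "closed_on": date(2024, 1, 9), "total_cents": 80000},
]
summary = monthly_customer_sales(orders)
```

Because the summary rows no longer depend on the detailed orders, they can remain available for reporting after the underlying Transaction Data has been purged.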
It is not a goal of business management system Analytic Data to make proper data warehouses and
analytic tools unnecessary. Maintaining Analytic Data in the transaction processing system does
come with database and application performance penalties. Taking in-application Analytic Data too
far risks overall application usability; choosing what Analytic Data should be available in the
application must be done with care.
When Analytic Data doesn’t enhance the normal transaction processing functions of the application,
but may still be of analytic value within the business, a data warehouse solution with the
appropriate reporting tools should be considered.
Analytic Data may constitute a significant portion of the overall data retained in the database,
but should still be smaller than the Transaction Data (assuming the limitations on Analytic Data
capture previously discussed).
System Data
There is a need to retain certain data simply to facilitate the technical operations of the
business management system itself. This is the role of System Data. System Data can include
relations for system oriented configurations and relations that exist to manage user logins or
auditing.
Typically System Data will consume an insignificant amount of data storage space.
2 - Business Relationships
The modelling of business relationships has evolved over time, moving from rather simple and naive
ideas to more correct representations of real world business relationships. Here we examine this
history and establish
Historical Perspective
When working within a company, it is not uncommon to think about “our customers”, “our vendors”,
“our partners”, and “our employees” as though these are specific kinds of distinct entities.
However these terms are not describing classes of entities, but are describing relationships that
exist between two entities, our company (an Entity in its own right) and the external party. This
distinction between thinking of a “customer” as an entity vs. thinking of the “customer” as a
relationship is subtle, but appreciating the nuance of the distinction can lead to important
insights in how we might build business management systems.
Understanding the history of modelling business relationships can inform our approach to Business
Relationships and allow for capabilities which more representative models bring.
Speculative, Generalized Content Ahead
The following description of business relationship modelling over time has not been formally
researched and is purely anecdotal.
In addition, the description of historic systems and current systems is very generalized. There
are many, and have been many, business systems in existence, each with its own
individualized approach to modelling business relationship management.
In both cases, what is written below is the author’s experience of working with a variety of
business systems over a substantial length of time in a wide variety of scenarios. While
the generalized trends and practices described below are informed by reality, they may depend too
much on the author’s own direct experiences and assumptions and thus be limited thereby.
Early Models
Many early business management systems were designed using the naive, but common sense approach of
representing customers, vendors, etc. as specific kinds of entities. These systems would
implement each kind of entity with different kinds of records (tables) in the system:
This works well enough in most cases, but the real world is more complicated than this model
allows. For example a single external company may be a customer in some transactions, but also a
vendor in others.
Under the early business systems model you would have to create two records to handle a situation
like the example just discussed, one for the external company as customer and one for it as
vendor; including duplicating all of the common attributes such as company name, addresses, etc.
But maintenance of duplicated data is only one of the practical issues that arises out of this
model. Representing single, external companies with multiple, unrelated records in the business
system also partitions the knowledge of the complete relationship with the external company. The
only way to create the complete picture in such a system is to know outside of the system that
the relationship with the external company is multi-faceted and to run independent (or specially
constructed) reports to combine outside of the system.
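The early model and its duplication problem can be sketched with two unrelated record types. The class and field names are hypothetical and chosen only to illustrate the shape of the data.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    address: str
    credit_limit_cents: int


@dataclass
class Vendor:
    name: str
    address: str
    payment_terms: str


# The same external company has to be entered twice.
acme_as_customer = Customer("Acme Ltd", "1 Main St", 5000000)
acme_as_vendor = Vendor("Acme Ltd", "1 Main St", "Net 30")

# An address change applied to one record silently leaves the other stale,
# and nothing in the data links the two records to each other.
acme_as_customer.address = "9 Harbour Rd"
records_agree = acme_as_customer.address == acme_as_vendor.address
```

Nothing in the system knows that the two records describe the same company, which is why the complete picture can only be assembled outside of it.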
Recent Models
While these complex relationships with external companies are not the most common scenario, they
happen frequently enough that business systems evolved an improved model of business
relationships. The updated model represents the external company as an
Entity in the same sense that we defined in
the Business Relationships section while associating it
with a record which represents the relationship:
This recent model much more closely matches the reality of business relationships found in the
real world and is probably the most common model adopted by currently popular business systems.
Many older business systems which were originally designed using the early model simply tacked on
the “Entity” record type and linked it to the preexisting records representing entity classes. In
these cases the business system typically only allows a single customer/vendor/etc. relationship
to be defined for any single Entity. However in practice, though rare, there exist scenarios
where a single, large external Entity can have multiple, simultaneously active Relationships of
the same kind with the first party Entity; this happens when the corporate structure of the
external Entity is organized into substantially independent divisions or departments. This forces
the users of these systems to create Entities representing the divisions/departments
independently; naturally incurring the cost that a full view of the Entity is effectively broken.
Indeed, while the more recent models of business relationship data do reflect real world realities
of this data better, they still fail to model the most advanced scenarios and extended business
organizations which are seen in practice. This weakness stems from an implicit assumption in the
recent models that ideas such as customer or vendor are simply an extension of the Entity’s
description: the Entity “is a” customer, the Entity “is a” vendor. While better than the old
model where the customer/vendor/etc. ideas represented completely different entities, the recent
models still fail to appreciate what we’re actually modelling.
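A minimal sketch of the recent model follows, again with hypothetical names. The common attributes live once on an `Entity`, and customer or vendor details hang off it. Keying the `customers` and `vendors` dictionaries by entity id mirrors the "tacked on" systems that allow only one relationship of each kind per Entity; note also that the first party, "us", appears nowhere in the data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    id: int
    name: str
    address: str


@dataclass
class CustomerRecord:
    entity_id: int
    credit_limit_cents: int


@dataclass
class VendorRecord:
    entity_id: int
    payment_terms: str


entities: Dict[int, Entity] = {1: Entity(1, "Acme Ltd", "1 Main St")}
# One slot per entity: a second customer relationship would overwrite the first.
customers: Dict[int, CustomerRecord] = {1: CustomerRecord(1, 5000000)}
vendors: Dict[int, VendorRecord] = {1: VendorRecord(1, "Net 30")}


def roles_of(entity_id: int) -> List[str]:
    """List the ways this entity relates to the implicit first party."""
    roles = []
    if entity_id in customers:
        roles.append("customer")
    if entity_id in vendors:
        roles.append("vendor")
    return roles


# The address is stored once, so a change is seen from both roles.
entities[1].address = "9 Harbour Rd"
```

This removes the duplication of the early model, but the Entity "is a" customer only relative to an unmodelled first party.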
Our Approach
The reason the recent models described above succeed as well as they do is that the model isn’t
wrong, it’s merely incomplete. Properly understood, the recent model is not describing an Entity
with its extensions into more topically focused descriptions where needed, it’s really modelling
an Entity which has different Relationships with an implied, unmodelled Entity: the first party
Entity or “us”.
By recognizing and being mindful that we are modelling Entities and the Relationships between
them, and making that explicit in the data modelling, we can avoid the limitations of assuming too
many facts about reality.
Explicit Model
Our basic modelling technique makes the complete Relationship picture explicit in the data model.
However, because we are not assuming an implicit Entity “us”, we can model relationships between
arbitrary Entities. This becomes useful when we want to use our business system to manage the
businesses of multiple Entities. For example, consider a company with subsidiaries; each
subsidiary may operate with significant independence, yet each subsidiary’s financials are
consolidated with the parent company and may act as a group in some scenarios, such as purchasing.
In cases such as that, being able to explicitly model Entity/Entity Relationships allows the use
of the same business system for management activities across the conglomerate while also allowing
independence where needed. This same-system/independent-existence property of our model can
facilitate other business structures, but those will be discussed elsewhere.
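The explicit model can be sketched by giving the Relationship both of its ends. The names below (`Relationship`, `owner_id`, `counterpart_id`, `label`, and the example companies) are assumptions made for illustration, not a definitive schema; the point is only that "us" becomes an ordinary Entity and that several Relationships of the same kind may exist between the same pair.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    id: int
    name: str


@dataclass(frozen=True)
class Relationship:
    owner_id: int        # the Entity from whose point of view the kind is named
    kind: str            # e.g. "customer": the counterpart is the owner's customer
    counterpart_id: int
    label: str = ""      # tells apart several relationships of the same kind


entities = [
    Entity(1, "Parent Co"),
    Entity(2, "Subsidiary A"),
    Entity(3, "Globex"),
]

relationships: List[Relationship] = [
    Relationship(1, "customer", 3, "Globex Aerospace division"),
    Relationship(1, "customer", 3, "Globex Marine division"),
    Relationship(1, "vendor", 3),
    Relationship(2, "customer", 3),  # a subsidiary with its own relationship
]


def relationships_between(owner_id: int, counterpart_id: int) -> List[Relationship]:
    """All relationships one first-party Entity holds with a counterpart."""
    return [
        r for r in relationships
        if r.owner_id == owner_id and r.counterpart_id == counterpart_id
    ]


def full_view(counterpart_id: int) -> List[Relationship]:
    """Every relationship any first-party Entity has with the counterpart."""
    return [r for r in relationships if r.counterpart_id == counterpart_id]
```

Here the parent company holds two customer Relationships and one vendor Relationship with the same external Entity, while the subsidiary keeps its own, and `full_view` still recovers the complete picture across the group.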