BLOG:  Digital Financial Reporting

This is a blog for information relating to digital financial reporting.  This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting.  This is my brainstorming platform.  This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.

Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.

Entries in Comparing XBRL and XML (3)

Many Different Forms of XML

This is a series of posts where I am providing information relating to figuring out which data format is best to use and why. Basically, when is XML better, when is XBRL better, and when is RDF/OWL better?

In another blog post I looked at different information exchange formats. In that post I mentioned that the world was standardizing on XML.  But which form of XML?  XML can come in many, many different forms.

I took a small data set which I had in a database and generated XML from that data set.  The data set is simple enough: the population of each U.S. state. This PDF shows what the data set looks like in a rendered format.

Simple enough. Here are some forms of XML which I generated from the same Microsoft Access database information:

So, what is the point here?  Well actually, I have several points which I will list and discuss.

  1. Every one of those forms of XML represents the exact same set of information, the information which you can see in that PDF.  While the syntax of each of the files (the different XML forms above) is different, the semantics of the information (the meaning of the information) is exactly the same.
  2. Some information is expressed more explicitly than others in each of the different forms of XML.  For example, the population data is an estimate as of July 1, 2008.  The point, though, is that this fact (that the information is estimated, and as of what point in time) is sometimes very explicit and other times somewhat implicit within the different forms of XML.
  3. The populations of each state are supposed to add up to the total for all the states. Here is another version of the first variation of XML with an error in it.  Can you see the error?  The last two digits of the total have been transposed.  Some formats are better than others at communicating the fact that the information adds up.  Meaning, you could in XBRL communicate that the information adds up quite easily, and get a report which shows that the information does add up. This is a validation report.
  4. The states are related to each other in different ways.  For example, you can break down the states by say region: South, Northeast, West, Midwest, and so forth.  That information is not communicated in any of these XML formats.  However, any of these forms of XML could communicate that information, in XML, in whatever way they may desire.
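As a rough illustration of points 1 and 2 above, here is a minimal Python sketch. The two XML forms, the state names ("Alpha", "Beta"), and the numbers are all invented for this example; they are not the files generated from the Access database. The sketch reads an element-centric form and an attribute-centric form and shows that both reduce to the same set of facts, even though one states the "estimate as of a date" information once at the top and the other repeats it on every state.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: names and numbers are invented for illustration only.
# Form 1: element-centric; the date and basis are stated once on the root.
ELEMENT_FORM = """
<populations asOf="2008-07-01" basis="estimate">
  <state><name>Alpha</name><population>100</population></state>
  <state><name>Beta</name><population>250</population></state>
</populations>
"""

# Form 2: attribute-centric; the date and basis are repeated on every state.
ATTRIBUTE_FORM = """
<populations>
  <state name="Alpha" population="100" asOf="2008-07-01" basis="estimate"/>
  <state name="Beta" population="250" asOf="2008-07-01" basis="estimate"/>
</populations>
"""

def read_element_form(xml_text):
    """Map (state, as-of date, basis) -> population for form 1."""
    root = ET.fromstring(xml_text)
    return {
        (s.findtext("name"), root.get("asOf"), root.get("basis")):
            int(s.findtext("population"))
        for s in root.findall("state")
    }

def read_attribute_form(xml_text):
    """Map (state, as-of date, basis) -> population for form 2."""
    root = ET.fromstring(xml_text)
    return {
        (s.get("name"), s.get("asOf"), s.get("basis")):
            int(s.get("population"))
        for s in root.findall("state")
    }

facts_a = read_element_form(ELEMENT_FORM)
facts_b = read_attribute_form(ATTRIBUTE_FORM)

# Different syntax, same meaning: the extracted facts are identical.
print(facts_a == facts_b)
```

Note that each form needed its own reader function. That is the cost of every XML format having its own shape: the meaning is the same, but the code to get at it is not.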

Which form of XML is the best?  Well, that all depends on what you need from the information all things considered.  On the one extreme, if you just want to make a simple set of information available to a small group of people, any old XML will do.  In fact, you could use pretty much any data format.  But XML works well over the Web, it is in vogue, it is a good general format.

If you are, say, a government agency or other enterprise and you want to work with one data set and you don't need to exchange that information with other government agencies and you will only have one data format, traditional XML could work for you.  But what if you want to verify that numeric information adds up correctly?  Well, you could build your own validation mechanism because your data set is small and you don't have complex computations.
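To give a feel for what "build your own validation mechanism" could mean for a small data set, here is a minimal sketch in Python. The XML layout, the state names, and the numbers are hypothetical, made up for this example; the reported total deliberately has its last two digits transposed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names and numbers are invented. The reported total
# has its last two digits transposed (726 instead of 762).
DOC = """<populations>
  <state name="Alpha">100</state>
  <state name="Beta">250</state>
  <state name="Gamma">412</state>
  <total>726</total>
</populations>"""

def validate_total(xml_text):
    """Return (is_valid, computed_sum, reported_total) for one document."""
    root = ET.fromstring(xml_text)
    computed = sum(int(s.text) for s in root.findall("state"))
    reported = int(root.findtext("total"))
    return computed == reported, computed, reported

is_valid, computed, reported = validate_total(DOC)
print(f"valid={is_valid} computed={computed} reported={reported}")
```

This works, but the rule "the states add up to the total" lives in the Python code, not in the data format. Anyone else receiving this XML would have to be told the rule and write their own check.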

But how many government agencies or other enterprises don't have to interact with other government agencies or enterprises, subsidiaries, etc?  If you interact with others, you have to agree.  To agree, you need some sort of framework to agree on.  For example, the National Information Exchange Model (NIEM) is a framework to help government agencies involved with public safety and security to create XML which is easier to share.  The framework adds discipline to creating their XML formats.  Rather than each agency creating point solutions for exchanging information, the framework provides the discipline needed to create a canonical standard format which makes exchanging information easier.  (Their introduction document does a great job of explaining this.)

XBRL is also a framework for agreement.  For example, the US GAAP Taxonomy Architecture is part of a framework for using XBRL in a specific way, creating what amounts to an application profile (i.e. no XBRL tuples, no XBRL typed dimensions, no use of the XBRL scenario context element, build [Table]s in a specific way, etc.)  Also, the XBRL framework provides mechanisms for achieving things which are commonly needed in business reporting.  For example, it provides the ability to: add labels, add multiple labels, express computations between numeric information, express additional types of relations between concepts, etc.  If you need this and you are using XML, you would have to build these things yourself.
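To make "build these things yourself" a bit more concrete, here is a small Python sketch of the kinds of pieces involved: more than one label per concept, and a computation relation kept apart from the concept definitions. This is not XBRL syntax and not any XBRL library's API; the concept names, label roles, and the weight-based calculation structure are assumptions made up for illustration, only loosely modeled on the idea of labels and calculations.

```python
# Concepts are defined once, as a flat list with no built-in hierarchy.
concepts = ["Total", "PartA", "PartB"]  # hypothetical names

# Labels live apart from the concepts; one concept can have several,
# each identified by a (made-up) role.
labels = {
    ("Total", "standard"): "Total",
    ("Total", "verbose"): "Total of all parts",
    ("PartA", "standard"): "Part A",
    ("PartB", "standard"): "Part B",
}

# Computation relations also live apart: parent -> [(child, weight)].
calculations = {"Total": [("PartA", 1), ("PartB", 1)]}

# Reported values (invented numbers).
facts = {"Total": 300, "PartA": 120, "PartB": 180}

def label_for(concept, role="standard"):
    """Look up a label by role, falling back to the standard label."""
    return labels.get((concept, role), labels[(concept, "standard")])

def check_calculations(calculations, facts):
    """Return {parent: True/False} for each expressed computation."""
    report = {}
    for parent, children in calculations.items():
        expected = sum(weight * facts[child] for child, weight in children)
        report[parent] = facts[parent] == expected
    return report

print(label_for("Total", "verbose"))
print(check_calculations(calculations, facts))
```

With plain XML, each project ends up inventing its own version of these structures. The value of a framework is that everyone uses the same ones, so the same software can read anyone's labels and check anyone's computations.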

Sharing information with a large number of users is one thing.  While a framework helps make these systems work better, what if you want to connect information between all these systems, with some people using traditional XML, some using XBRL, and some using other formats?  That is what RDF/OWL and the Semantic Web are all about.  For example, this Data.gov project has converted numerous data sets into RDF/OWL. (This is a great book for understanding how the Semantic Web will be changing your life.)

The bottom line here as I see it is this: When you build your information exchange systems, be sure you are considering the right things for the long term.  I see four groups of XML:

This is not to say that one type of XML is better than another; it is more about understanding what you need to be considering when you try to determine your needs. Using the wrong type of XML is like trying to fit a square peg in a round hole.  You can do it, but it isn't pretty.

Interactive Information Hypercube

I have been fiddling around with how to best use XBRL and have consolidated many, many, many other ideas into something that I am referring to as an "interactive information hypercube".

Breaking this down, this is what the notion of an interactive information hypercube is based on.  Again, these really are not my base ideas, I am just combining many other ideas together in order to achieve something which I believe needs to be achieved.  These ideas are:

  • Interactive Information: The notion of interactive information comes from the term "interactive data" which was to the best of my knowledge coined by the US SEC.  I believe that "information" is a more appropriate term than "data" in the context in which I am working.
  • Hypercube: Ever since I started trying to understand XBRL Dimensions, I had never really understood the difference between a cube and a hypercube.  A couple of months ago I read something which clarified this, at least in my mind.  Everyone can probably visualize what a cube is.  A cube has three dimensions; it is a physical thing.  Some business data has three or fewer dimensions, which can be made to fit into the three physical dimensions of a cube.  However, other business information has more than three dimensions, which makes it difficult to visualize in the form of a cube.  A hypercube is something which can represent any number of dimensions.
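One way to picture a hypercube without trying to draw it is as a list of facts, each one identified by a value for every dimension. The following Python sketch assumes five dimensions with invented names and invented numbers; it is only an illustration of the idea, not how XBRL Dimensions or any viewer actually stores information. Fixing some dimensions "slices" the hypercube, and summing over one dimension reorganizes it.

```python
from collections import defaultdict

# Five hypothetical dimensions: more than a physical cube could show.
DIMENSIONS = ("concept", "period", "entity", "region", "scenario")

# Invented facts; each one has a member for every dimension plus a value.
facts = [
    {"concept": "Sales", "period": "2008", "entity": "ABC",
     "region": "North", "scenario": "Actual", "value": 100},
    {"concept": "Sales", "period": "2008", "entity": "ABC",
     "region": "South", "scenario": "Actual", "value": 150},
    {"concept": "Sales", "period": "2009", "entity": "ABC",
     "region": "North", "scenario": "Actual", "value": 110},
    {"concept": "Sales", "period": "2009", "entity": "ABC",
     "region": "North", "scenario": "Budget", "value": 120},
]

def slice_cube(facts, **fixed):
    """Keep only the facts whose dimensions match the fixed members."""
    return [f for f in facts if all(f[d] == m for d, m in fixed.items())]

def roll_up(facts, by):
    """Total the values along one dimension."""
    totals = defaultdict(int)
    for f in facts:
        totals[f[by]] += f["value"]
    return dict(totals)

actuals = slice_cube(facts, scenario="Actual")
print(roll_up(actuals, by="region"))
print(roll_up(actuals, by="period"))
```

The same four facts can be viewed by region or by period just by changing one argument. On paper, each of those views would be a separate, fixed table.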

Now, you really have to stretch your imagination a bit with this graphic.  But really take a look at the graphic. Imagine information expressed in that sort of form rather than on a piece of two dimensional paper!  That is the idea.  Clearly an application to view information would not look like that graphic; the point is that it does help one see the limitations of paper in communicating information.

This is a prototype "interactive information viewer" which I have been experimenting with during the process of creating XBRLS.  The prototype takes what I had referred to as "neutral format tables" in XBRLS, modifies the tables slightly, and organizes the "99-Combined" XBRLS meta pattern (which is really a combination of all the XBRLS meta patterns into one XBRL taxonomy and XBRL instance to test the patterns).  On the left, you can click on a hypercube from the XBRL taxonomy, and on the right information relating to that hypercube is rendered in the form of a neutral format table.  The prototype is simply PNG images from an Excel spreadsheet.  The renderings were created manually in order to test the idea.  The next step is to automatically create the rendering from information in the XBRL taxonomy and XBRL instance.

The prototype condenses down into an easier to work with set of hypercubes which you can view in this PDF.  A better example of the use case I am trying to make work is a financial statement.  This PDF is from the "comprehensive example" which I had created for XBRLS.  The larger example looks more like a financial statement and is therefore easier to relate to.  However, the XBRLS patterns in the 99-Combined example actually cover 100% of what is in the larger comprehensive example.  That is the point of the XBRLS meta patterns: that small set of meta patterns can be used to express literally anything which I have come across in my experience with either financial reporting or other areas of business reporting.  Impossible, you say?  Well, isn't it interesting that the fundamental concepts of addition, subtraction, multiplication and division in mathematics work in the domains of physics, business, chemistry, engineering, etc.?  It is the simplicity of the meta patterns which offers the best evidence that they could be right.  Time and experimentation will tell.

An earlier version of this comprehensive example included the following experiment.  You can see the results of the experiment within this PDF, which is similar enough to the PDF of the financial statement above to help you see the point I am about to make, but it is different (meaning, there is not a one-to-one correlation between the PDF files). The experiment was to express 100% of a financial statement within Excel pivot tables.  I did that and "printed" screen shots of the pivot tables organized within a Word document.

The point is this: A financial statement is a collection of hypercubes.

What I want to do is go back and redo the XBRLS comprehensive example using the same form as the prototype interactive information hypercube viewer from above.  That will be much easier for people to relate to and see that, in fact, (a) financial statements are collections of hypercubes and (b) that there are advantages to working with them as hypercubes, the primary benefit being that you can easily reorganize the financial information as you desire.

There are two things needed to make this work: an information model and a way to communicate flow.

XBRLS is the information model (at least one information model) which makes this work.  The COREP taxonomy will likely work this way also.

Flow is simply a mechanism for organizing the individual hypercubes in an order that you want.  That is actually easy to do: you can use an XBRL taxonomy to express flow.  I will go into that later.

There is another advantage to the notion of an interactive information hypercube that I can see.  Maybe I am right, maybe I am wrong.  Today, there is no one "multidimensional model".  Each vendor implementing Business Intelligence (BI) software has their own model.  The models are similar, but different enough to make life more complicated than it needs to be for business users.  See "Getting Started with ADAPT".  The BI solutions provider Symmetry Corp outlined the issue, its ramifications, and their solution for it in that white paper.

What if one multidimensional model could be created which all software vendors used?  There is one SQL model.  Not perfect, but significantly more consistent between software vendors than the multidimensional model.  Who knows.

XBRL Builds On Top of XML

In 2004, Rene van Egmond and I wrote a white paper called Comparing XBRL and Native XML. That information made its way into the book I wrote, Financial Reporting Using XBRL, in 2006 (see section 4.11.2). Both iterations were very helpful in trying to grasp what the differences between XML and XBRL were and in explaining these differences to others. These comparisons pretty much had an "XML versus XBRL" bent. In retrospect, I have come to realize that the XML versus XBRL approach to comparing the two was not necessarily the best approach.

Here we are in 2009 and I have an updated version of the analysis of XBRL as contrasted with XML. XBRL is an approach to using XML and a layer on top of what most XML languages generally provide.

  • XBRL is XML. XBRL uses the XML syntax. Therefore, XBRL can leverage the entire family of XML specifications.
  • XBRL expresses semantics (meaning) in a standard format. XML only articulates syntax. For others to do what XBRL does with XML, you would basically have to reinvent what XBRL has already created.  Because these semantics are expressed in a standard format they can be exchanged.
  • XBRL allows content validation against the expressed meaning. Because the meaning (semantics) is expressed, it is possible to validate XBRL instances against that meaning.  And XBRL has created standard mechanisms for performing this validation, such as calculations and XBRL Formulas.
  • XBRL separates concept definitions from the content model. Typically with XML languages, the concept definitions and the content model are mixed together. Further, XML provides you with only one implicit set of relations (because it has only one content model) and the definition of those relations is mixed with the definition of elements and attributes. XBRL, on the other hand, uses an atomic approach (flat XML content model) in defining concepts and moves the expression of relations away from the XML schema. This separation of concept and relation definition leads to the next benefit of XBRL, you can express more than one set of relations and each of those sets of relations can be explicitly identified as being for a specific purpose.
  • XBRL can express multiple hierarchies of explicit relations. Because XBRL separates concept and relation definitions, you can define more than one hierarchy of such relations. Further, the hierarchies of relations defined are explicit rather than XML's implicit content model.
  • XBRL provides prescriptive extensibility. XML's greatest strength is also its greatest weakness. XML is extensible everywhere, in every direction. XBRL is extensible in a specific, prescriptive, and therefore predictable manner. As such, the extensibility is usable without modifying software for the extension. You can think of this as XBRL always having the same "shape".
  • XBRL easily fits into relational databases. XML can be made to easily fit into a relational database. Because of XBRL's separation of concept definition and relations and because the extensibility is predictable giving XBRL a consistent shape, XBRL taxonomies and XBRL instances are significantly easier to model within a relational database as compared to more traditional approaches of using XML. This is particularly true if you use a well-thought-out strategy to create your XBRL architecture. Getting XBRL into and out of relational databases is important because there are a lot of relational databases that XBRL must interact with.
  • XBRL provides a multidimensional model. The multidimensional model is used by online analytical processing (OLAP) type systems, providing flexible presentation of information and the ability to "slice and dice" information. Business intelligence systems in particular are big users of the multidimensional model. Although XML can be made to fit into a multidimensional model in many cases, doing so can be a struggle. XBRL fits quite nicely into existing applications, such as these business intelligence applications, which make use of the multidimensional model. As with fitting into relational databases, this is particularly true if you use a well-thought-out strategy and create your XBRL architecture to do so. Alternatively, you could use an existing architecture and application profile that's specifically intended to fit into an application which makes use of the multidimensional model. Getting information into applications which make use of the multidimensional model is important because more and more applications, such as business intelligence applications, are leveraging the characteristics of the multidimensional model to provide flexible (think "interactive") presentation of information.
  • XBRL enables "intelligent", metadata-driven connections to information. With XBRL, connections to information can be created by business users adjusting metadata rather than by requiring technical people to write code. As such, rather than building multiple point solutions, XBRL enables the creation of effective and efficient solutions that allow extensibility and don't require programming modifications to connect to new information or new information models. Again, this is because of the prescriptive manner of XBRL's extensibility: the "shape" of XBRL is always the same. With XML, every new connection pretty much has to be enabled by a programmer writing code, because XML only communicates technical syntax and does so at the data level, not the meaning level, of information, and because the shape of each XML implementation can be so varied.
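The "same shape" and relational database points in the list above can be sketched with a few lines of Python and an in-memory SQLite database. The table layout, column names, and data here are assumptions invented for illustration; they are not a standard XBRL database schema. The idea shown is that when every fact has the same shape, one table holds them all, a new (extension) concept is just another row rather than a schema change, and "slice and dice" becomes an ordinary GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One hypothetical fact table: every fact has the same shape.
conn.execute(
    "CREATE TABLE fact ("
    " concept TEXT, period TEXT, dimension TEXT, member TEXT, value INTEGER)"
)

# Invented facts for illustration.
rows = [
    ("Sales", "2008", "Region", "North", 100),
    ("Sales", "2008", "Region", "South", 150),
    ("Sales", "2009", "Region", "North", 110),
]
conn.executemany("INSERT INTO fact VALUES (?, ?, ?, ?, ?)", rows)

# An "extension": a concept nobody planned for. No new table or column
# is needed because the shape of a fact has not changed.
conn.execute(
    "INSERT INTO fact VALUES (?, ?, ?, ?, ?)",
    ("Headcount", "2008", "Region", "North", 12),
)

# Slice and dice: total Sales by region across all periods.
by_region = dict(
    conn.execute(
        "SELECT member, SUM(value) FROM fact"
        " WHERE concept = ? GROUP BY member ORDER BY member",
        ("Sales",),
    ).fetchall()
)

# The same query works unchanged for the extension concept.
concept_count = conn.execute(
    "SELECT COUNT(DISTINCT concept) FROM fact"
).fetchone()[0]

print(by_region)
print(concept_count)
```

With a typical XML language, a new element usually means a schema change, a new column or table, and new code to load it. Here the software did not have to change at all, which is the benefit the "shape" argument is getting at.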

It would be great to get the perspective of people from the XML community who have gained a good understanding of XBRL and to hear their view of this comparison.

It seems to me that it should be possible to draw some "line" and better understand when XBRL is a better solution to a problem and when creating a specific XML language is a better approach.