BLOG:  Digital Financial Reporting

This is a blog for information relating to digital financial reporting.  This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting.  This is my brainstorming platform.  This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.

Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.

Entries in metadata (3)

Understanding Fact Groups, Metadata "Levels", Information as Contrast to Data, Measure/Member Relations

People might not get this, but this rendering of XBRL instance information (I am calling this a Level 1 Fact Group Rendering) is actually quite interesting from a number of perspectives.  I am not going to cover all the perspectives, just four. I hope I can make my points. I apologize for the colors; an artist I am not.

Fact Groups or Fact Tables

The rendering above shows a flat list of facts which are from one XBRL instance. There are several different fact groups.  Each fact group has different Measures (some people call these dimensions, axes, or aspects). Basically, each fact group (some people call these fact tables) is a fully complete table.  The different fact groups have different columns because each column contains the Measures appropriate for that specific fact group.

These fact groups are similar to business intelligence or online analytical processing (OLAP) "cubes".  XBRL uses the term hypercube rather than cube because a cube has only 3 dimensions, whereas a hypercube can have any number of dimensions.
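To make the idea of fact groups a bit more concrete, here is a minimal Python sketch. It is not how any XBRL processor works; the entity, concept, and region names and all values are made up for illustration, and `group_facts` is a hypothetical helper. It simply shows that facts sharing the same set of Measures fall naturally into the same table with the same columns.

```python
from collections import defaultdict

# Hypothetical facts: each fact carries its Measures (Measure -> Member) and a value.
facts = [
    {"measures": {"Entity": "ABC", "Period": "2009", "Concept": "Sales", "Region": "Asia"}, "value": 100},
    {"measures": {"Entity": "ABC", "Period": "2009", "Concept": "Sales", "Region": "Europe"}, "value": 250},
    {"measures": {"Entity": "ABC", "Period": "2009", "Concept": "Cash"}, "value": 75},
]

def group_facts(facts):
    """Group facts so that every fact in a group shares the same set of Measures."""
    groups = defaultdict(list)
    for fact in facts:
        # The sorted Measure names act as the column set that identifies the group.
        key = tuple(sorted(fact["measures"]))
        groups[key].append(fact)
    return dict(groups)

fact_groups = group_facts(facts)

# Print each fact group as a small table: one header row, then one row per fact.
for columns, rows in fact_groups.items():
    print(columns)
    for row in rows:
        print("  ", [row["measures"][c] for c in columns], row["value"])
```

The two Sales facts end up in one group with a Region column, while the Cash fact lands in a separate group without one, which is the "different columns per fact group" point made above.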

While it is true that the rendering is not all that great to use, it is certainly far better than looking at the XML of an XBRL instance. What if there were some default style sheet which rendered all XBRL as fact tables, in a consistent "more readable" format?  Would that be a good thing?  Maybe.  It could even be all you need for many different types of XBRL-based information.  See the discussion about CSV files below.  Others need more rendering.  See the discussion about "Measure/Member Relations" below.

It seems to me that every XBRL instance can be expressed as a "more human readable fact table".  This includes XBRL which has no XBRL Dimensions (the context information is the dimensions), XBRL which has XBRL Dimensions (either explicit or typed members) and even tuples (the tuple is the Measure, the key concept is the Member collection of that Measure, and the other concepts are of the Measure-Concepts). I won't bore you with further details, but even non-XBRL dimensions can be expressed in this manner (i.e. the stuff you can put into the <segment> and <scenario> portions of an XBRL context).

Metadata "Levels"

You may want to go back and have a quick look at the graphic on this blog post. As I explain, the closer you are to the top of this inverted pyramid, the better the comparability you will experience.  Go back to the fact group rendering. Look at the namespaces table under "Report". There are five namespaces in that namespace table. Each of the namespaces corresponds to that diagram from the blog post (except I don't have a namespace for Regulator).

The point here is that who issues the metadata matters quite a bit. Each piece of metadata in the fact tables is explicitly identified as to who provides it, except for two [Measure] values (these are sometimes referred to as "Members"): brm:ReportingEntityMeasure and brm:CalendarTimeMeasure.  Basically, to reiterate and show more clearly than the blog post above, the more broadly used the metadata (the [Measure]s and the [Member]s), the higher the level of comparability which can be created.

Data as Contrast to Information

Take a look at this text file. For this you may want to go back and review this document which I referred to in another blog post. If you look at the text file you will probably notice a number of things. First, it looks like a CSV (comma separated values) file.  Business people use these all the time to exchange information.  CSV files are easily loaded into spreadsheet applications such as Excel.

Compare that text file with the information in the fact group "[Network] gaap: http://xasb.org/gaap/SalesAnalysisByGeographicArea". Notice two things. First, the fact group rendering is way more explicit in describing the values than the text file.  Some of the context is missing from the text file, whereas the fact group is rather explicit.  Second, not everything is explicit, even in the fact group.  You don't know if the information is audited or unaudited. You don't know if the information is actual or budgeted.

One point here is that information needs to be made as explicit as you might need it to be. It is up to the creators of the XBRL taxonomy and the consumers of the XBRL instance to figure this out.  The other point is that XBRL can do everything that CSV can, but better.
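The difference between implied and explicit context can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the CSV content, the entity "ABC", the period, the unit, and the helper `csv_to_facts` are all hypothetical and are not taken from the text file or the fact group discussed above.

```python
import csv
import io

# Hypothetical CSV: what the columns mean lives only in the reader's head.
csv_text = "Region,Sales\nAsia,100\nEurope,250\n"

# Context the CSV leaves out; every value here is an illustrative assumption.
implied_context = {"Entity": "ABC", "Period": "2009", "Unit": "USD"}

def csv_to_facts(text, context, member_column, value_column):
    """Attach the missing context to each CSV row so every value describes itself."""
    facts = []
    for row in csv.DictReader(io.StringIO(text)):
        fact = dict(context)                      # start from the shared context
        fact[member_column] = row[member_column]  # e.g. Region = Asia
        fact["Concept"] = value_column            # the column header names the concept
        fact["Value"] = float(row[value_column])
        facts.append(fact)
    return facts

explicit_facts = csv_to_facts(csv_text, implied_context, "Region", "Sales")
for fact in explicit_facts:
    print(fact)
```

Note that even the explicit version still says nothing about audited versus unaudited, or actual versus budgeted. If you need those, they have to be added as further keys, which is the "as explicit as you need it to be" point.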

Measure/Member Relations

Notice on the fact group rendering that the fact groups contain lists of facts.  There are no relations between the facts. Now take a look at this page which does show the relations between the Members used within the fact groups. You don't have to use them, but XBRL provides a way to document these or other relations.  CSV files (like above) don't have these relations expressed; that is one of the drawbacks of the CSV or table type formats.  They are flat.

You can apply the Measure/Member Relations and create far more useful renderings as is shown in the straw man implementation of the business reporting logical model.
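As a rough illustration of what Member relations add to a flat list, here is a small Python sketch. The region names, the values, the "AllRegions" parent, and the assumption that a parent is simply the sum of its children are all hypothetical; real relations can mean other things than a roll up.

```python
# Hypothetical flat facts for one Measure (Region); the values are made up.
values = {"Asia": 100, "Europe": 250, "Americas": 400}

# Hypothetical Member relations: parent -> ordered list of children.
relations = {"AllRegions": ["Americas", "Asia", "Europe"]}

def render(member, depth=0):
    """Return (indented lines, total) for a member, rolling children up into the parent.

    Assumes a parent's value is its own value (if any) plus the sum of its children.
    """
    lines = []
    total = values.get(member, 0)
    for child in relations.get(member, []):
        child_lines, child_total = render(child, depth + 1)
        lines.extend(child_lines)
        total += child_total
    # Put the parent line first so children appear indented beneath it.
    lines.insert(0, "  " * depth + f"{member}: {total}")
    return lines, total

lines, total = render("AllRegions")
print("\n".join(lines))
```

The flat `values` dictionary is all a CSV file gives you. The `relations` dictionary is the extra piece that lets software produce the indented, totalled view.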

Making your XBRL Unambiguous: Clues from the Semantic Web

In order for your XBRL information to work on the Semantic Web or within your internal semantic web, or in any computer system for that matter, your data and metadata need to be unambiguous.

Before I get started here, I want to explain a few terms to business people.  Business people need a working knowledge of these terms in order to understand what is important to making your systems work, to making your XBRL unambiguous.

Why is this Important?

You may have heard terms like "metadata" and "semantic web".  But what do these terms mean and how do they relate to you?  In his book Pull, David Siegel explains these two important terms and how they will change the Web.  While the terms are defined in the book, what provides you the understanding are the countless examples of what having a "semantic web" will mean to you.

For anyone who lived through the beginning of the Web, to say there was hype surrounding the notion of how the Web would change life as we know it on planet earth is an understatement.  However, you have to admit that a lot of things have changed.  Just because there is hype does not mean that the Web is "empty", nor is it the case that "the Semantic Web" is empty.  In fact as I understand it, the Semantic Web was Sir Tim Berners-Lee's vision of what the Web needs to be, the Web as we know it today is just an interim step in that direction.

Metadata

It has been my experience that technical people like to complicate the notion of "metadata".  Perhaps they like to keep things mysterious.  You can go search the Web for a definition, in fact here is an explanation of metadata on Wikipedia.  I even hear techies use the term "meta-metadata"!

So what is metadata?  Metadata is just data.  It is just at a different level from what you normally think of as data.  Metadata, like data, describes something.  That is it.  What is more important is to understand why metadata or data is important.  Computers are not magical things.  They can do magical things, but all this is enabled by the data and metadata which is provided by and linked together by humans.  For example, if you have a list of files on your computer, you can only sort them in ways for which you have information about those files, the "data" or "metadata" about a file, such as the date you saved the file or the name of the file or the type of file.  The more data or metadata you have, the more a computer can do with data.
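The file list example can be written down in a few lines of Python. The file names, types, and dates are invented, and `sort_files` is a hypothetical helper; the point is only that the computer can sort by what has been recorded and nothing else.

```python
from datetime import date

# Hypothetical file records; each key is a piece of metadata about the file.
files = [
    {"name": "budget.xlsx", "type": "spreadsheet", "saved": date(2010, 1, 5)},
    {"name": "audit.docx", "type": "document", "saved": date(2009, 12, 20)},
    {"name": "notes.txt", "type": "text", "saved": date(2010, 1, 2)},
]

def sort_files(files, by):
    """Sort files by one piece of metadata; fail if that metadata was never recorded."""
    if any(by not in f for f in files):
        raise KeyError(f"cannot sort by {by!r}: no such metadata recorded")
    return sorted(files, key=lambda f: f[by])

by_date = [f["name"] for f in sort_files(files, "saved")]
by_name = [f["name"] for f in sort_files(files, "name")]
print(by_date)
print(by_name)
```

Asking for `sort_files(files, "author")` raises an error, because nobody recorded an author. Add that metadata and the computer can suddenly do one more thing.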

Semantic Web

Metadata and data are the foundation of the Semantic Web.  David Siegel gives a very simple explanation of the Semantic Web by posing two simple questions:

  1. Is it unambiguous?
  2. Is it on the Web?

So now we have a number of other terms floating around here: semantic and unambiguous. If you are a glutton for punishment, you can go here and read about semantic.  Fundamentally, semantic is about unambiguous meaning.

When I say Semantic Web, you can look at the scope of the "web" in a number of different ways.  It could be the "Semantic Web", meaning open to the public on the Web; it could mean "semantic web", meaning only available within your organization; or it could just be some smaller subset of users in some sort of closed system, not open to everyone.  The type of web makes no difference.

As David Siegel explains in his book,

Data that is semantic means exactly the same thing to any system or person who uses it.

That is the key to making data usable to a computer.  You need to be unambiguous.  This is not to say that everyone has to interpret the information in the same way; this is about consistency in the meaning of the data and metadata.

Making XBRL Unambiguous

So, if you are creating XBRL you want to be unambiguous.  This does not mean unambiguous to you, it means unambiguous in general, to everyone.  There are ways to test to see if your information is unambiguous. (As a business person you don't actually have to do these things yourself; you can ask the technical people implementing the system if they have done any of these.)

  • Try to express it in RDF (Resource Description Framework)/OWL (Web Ontology Language).  If you can, and it makes sense, then it is unambiguous.
  • Try and express your information model in UML (Unified Modeling Language) and see if it makes sense.  Again, if the UML model makes sense, then the data and metadata will likely make sense.
  • Try and use the data.  If the system works, then the data and metadata are unambiguous and logical.  If not, then something needs to be fixed.

If you have not tested your systems there is a high probability that your system is not providing information which is unambiguous.  It is the testing which helps you achieve this desired goal and makes your system usable.

This is particularly true when your system allows extensibility, where those submitting information can make any sort of adjustment to an XBRL taxonomy.  Those extending your XBRL taxonomy have to be given guidance to create those extensions as you had anticipated and consistently with one another.

This testing is best done during the time when you are architecting your XBRL taxonomy and other aspects of your system, not when your system is live.  Waiting until your system goes live to do this testing may mean the need to re-architect your system.

Another Step Toward Understanding OWL, RDF and How They Relate To XBRL

This is another step in the journey of understanding OWL, RDF, and how they relate to XBRL.  You can see other blog posts relating to this here.

OK, so I think I am getting more and more dialed in as to what OWL, RDF, and XBRL offers.  At this point I am not totally sure about this.  Consider this blog post brainstorming. I will use this post to get feedback from others who understand these things better than I do and then tune my perceptions.  So be sure to check back later to see where these perceptions end up.

RDF, OWL, and XBRL are all "graphs".  Or maybe I should say that they all share the same approach, graphs, to articulate information of some sort.  That approach was chosen because of the flexibility of graphs in expressing information.  You can do a lot with "subject-predicate-object" relations.  These "subject-predicate-object" relations were not invented by computer science.  Aristotle and Socrates, long ago, used these types of relations in philosophy it seems.

RDF is a "global standard way" of articulating subject-predicate-object relations.  It is a general tool.  It can articulate any subject-predicate-object relation.  It is very flexible.  It is very verbose.  There are lots of different syntaxes for expressing RDF; one syntax which seems to have been catching on is the XML format of RDF. (I found another useful primer on all this here.)
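For readers who have not seen subject-predicate-object relations written down, here is a minimal Python sketch. This is plain Python tuples, not RDF syntax and not any RDF library; the predicate names "isA" and "isMemberOf" and the `match` helper are made up for illustration.

```python
# Each statement is a (subject, predicate, object) tuple. All names are made up.
triples = {
    ("Sales", "isA", "Concept"),
    ("Asia", "isMemberOf", "Region"),
    ("Europe", "isMemberOf", "Region"),
    ("Region", "isA", "Measure"),
}

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern, sorted; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# "Which things are members of Region?"
members = [s for s, _, _ in match(triples, predicate="isMemberOf", obj="Region")]
print(members)
```

The collection of triples forms a graph: subjects and objects are the nodes and predicates are the labelled edges. Asking a question amounts to matching a pattern against that graph.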

One issue with RDF is that you can express ANY subject-predicate-object relation, whether it is logical or not logical.  OWL is used to express constraints on the subject-predicate-object relations.  (It is also somewhat of a "short cut" approach to creating "common" relations it seems.  But, let's not focus on that.)  It seems as though OWL is sort of like a "schema" for the RDF if you understand XML Schema or database schemas.  Basically OWL expresses constraints on the RDF which software can use to determine if the RDF expressed "follows the rules" so to speak.  This is critically important just like a database schema is important or an XML schema is important.
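The "schema for RDF" idea can be sketched in Python as a rule check over triples. This is a deliberate simplification and an assumption on my part about how to illustrate it: the type names, the `rules` table, and the `violations` helper are hypothetical, and an OWL reasoner, as far as I understand it, tends to draw inferences from such declarations rather than reject data the way a schema validator does.

```python
# Hypothetical type assignments for the things we talk about.
types = {"Asia": "Member", "Region": "Measure", "Sales": "Concept"}

# Hypothetical rules: predicate -> (required subject type, required object type).
rules = {"isMemberOf": ("Member", "Measure")}

def violations(triples, types, rules):
    """Return the triples whose subject or object has the wrong type for the predicate."""
    problems = []
    for s, p, o in triples:
        if p not in rules:
            continue  # no rule for this predicate, so nothing to check
        want_s, want_o = rules[p]
        if types.get(s) != want_s or types.get(o) != want_o:
            problems.append((s, p, o))
    return problems

good = [("Asia", "isMemberOf", "Region")]
bad = [("Region", "isMemberOf", "Sales")]  # expressible, but not logical

print(violations(good, types, rules))
print(violations(bad, types, rules))
```

Both statements can be written as triples without complaint. Only the rules let software tell the sensible one from the nonsensical one.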

OWL is very, very "powerful".  It goes far, far beyond what a database schema can express or an XML schema can express.  OWL can be used to express semantic meaning, it seems like ANY semantic meaning which can be expressed can be expressed in OWL.

XBRL's architecture uses "graphs" to express many things.  Those that created XBRL did this because XBRL needed flexibility.  The approach of using graphs gives you the flexibility.  You have subject-predicate-object relations in XBRL.  Again, remember that pretty much anything can be expressed in this manner, so clearly XBRL should be expressible in this manner.  And in fact some technical people "expressed XBRL in OWL" (I say this loosely).  You can see those OWL ontologies here.  It is something only your mother could love.  Basically, what they did is run a style sheet over the XBRL schemas and converted them into OWL.  They took one syntax, XML Schema, and converted it into another syntax, OWL.  Not really that useful for business people.  Might have some sort of technical use, that will be seen later.

Now, XBRL took some additional "short cuts".  RDF is built to express anything.  XBRL is built to express business information, a subset of "anything".  Could RDF/OWL be used to express the XBRL syntax?  Yes.  Is this useful?  I think it does have some utility.  But, there are better uses for OWL than expressing a logical model of the XBRL syntax.

So, XBRL is a "short cut" to expressing business information in a form that computers can make use of.  XBRL is a general format.  No one uses "XBRL".  Everyone uses some subset of XBRL, some application profile. This is why the COREP taxonomy does not work with the US GAAP Taxonomy.  Every XBRL taxonomy or system has a different "application profile" because it uses a different architecture.

It seems to me that one thing OWL can be used for is to "express" or "document" those different application profiles.  Other things can be used to model an application profile of XBRL, UML for example could do this.  OWL could be a very, very valuable tool for documenting an application profile.  Today application profiles are either not documented at all or rather poorly documented in a Word document or PDF.  No computer can read those documents.  Computer programmers have to read the documents, extract information, and build applications to work with the different XBRL application profiles.  For example, here is the US GAAP Taxonomy architecture.  Here is the SEC test suite.  Here is the CEBS FINREP taxonomy architecture.

I really don't know if it is possible for an OWL ontology to be constructed and for software to automatically generate "tests" of the XBRL application profile.  That could be nice.  But, having a consistent way to document XBRL application profiles could be nice.  UML could be that way.  OWL could also be that way, it seems.

But there is another thing that OWL can be used for.  What I am seeing is that anything can be expressed in OWL.  Well...almost anything.  There is one big constraint.  What you want to express has to be logical.  If it is not logical, it cannot be expressed.  Therefore, if it IS logical it CAN be expressed.  Additionally, you can see WHAT is expressed!  That is even more interesting and has more utility.

For example, OWL can be used to articulate where a taxonomy can be extended, what the information model can look like, what is allowed, what is not allowed, and so forth.  OWL basically can document how things work.  For example, you could use OWL to document how the US SEC XBRL filings "work".  By seeing that, you can determine things like does it work the way you WANT it to work.  Or, is there a better way.

Another way OWL might help XBRL is adding additional information to XBRL.  Now, XBRL can be used to add additional information.  For example, the definition linkbase of XBRL can be used to express "arcroles" and things which can be used to express meaning.  For example, the XBRL Dimensions specification did this.  But is XBRL really the best way to express additional information?  The XBRL way has its pros and cons.  The OWL way also has its pros and cons.  Does it really matter?  XBRL and OWL are only syntax.  The important thing is that what is expressed is logical, it is done in as standard a way as possible, and it works.

One final thing which I want to mention before I wrap up this post is my realization that RDF/OWL does seem to have one very significant thing "missing".  I really don't know if I really should say it is missing, more it is something that to really make use of RDF/OWL information you have to build domain specific software on top of it.  Like I said, RDF/OWL can be used to express anything.  Its biggest strength is also its biggest weakness.  Because it can be used to express anything, you have to write software which understands the specific subjects, the specific predicates, and the specific objects and does useful things with those relations.  There is one thing which I admit that I don't really grasp yet (at least one thing, could be more).  Software can be built which "learns" from the relations, building additional relations.  This can be useful, but I cannot grasp this right now.  This is in the realm of artificial intelligence.  Maybe this will work, maybe this will not.
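One simple form of software "building additional relations" from existing ones is following a transitive relation. The sketch below is only that one narrow case, written in plain Python. The "partOf" predicate, the place names, and the `infer_transitive` helper are hypothetical, and real reasoners do far more than this.

```python
def infer_transitive(triples, predicate):
    """Add (a, p, c) whenever (a, p, b) and (b, p, c) exist, until nothing new appears."""
    known = set(triples)
    while True:
        new = {
            (a, predicate, d)
            for (a, p1, b) in known if p1 == predicate
            for (c, p2, d) in known if p2 == predicate and b == c
        } - known
        if not new:
            return known
        known |= new

# Relations a human stated explicitly (all made up for illustration).
stated = {
    ("Tokyo", "partOf", "Japan"),
    ("Japan", "partOf", "Asia"),
    ("Asia", "partOf", "World"),
}

# Relations nobody stated, but which follow from the ones above.
inferred = infer_transitive(stated, "partOf") - stated
print(sorted(inferred))
```

Nobody ever said Tokyo is part of Asia, yet the software can work it out. The catch, as noted above, is that someone had to tell the software that "partOf" is the kind of relation that can be chained.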

By contrast, an XBRL processor only has to understand XBRL.  An XBRL processor can easily convert XBRL into RDF/OWL.  To understand the RDF/OWL syntax, but more importantly the SEMANTICS, as well as an XBRL processor does, you would basically have to rebuild the functionality which exists in an XBRL processor into an RDF/OWL processor.  Why would you do that?  Besides, XBRL processors don't even work at a high enough level; they still deal mostly with syntax, not enough with semantics.  And I am not talking about things like XBRL Formula which verifies if the semantics is correct.  I am talking about DOING USEFUL THINGS with the expressed semantics.  No general XBRL tool does this at the level a business person would find particularly useful these days.  I would also point out that XBRL is ahead of RDF/OWL in regard to having a working method of validating semantics.  XBRL has XBRL Formula; the Semantic Web folks have something in the works (i.e. they do realize that this is important).

It seems that no matter what happens, XBRL is going to have to fit into the Semantic Web.  Getting XBRL into RDF/OWL is trivial for an XBRL processor.  Doing anything useful with the RDF/OWL is going to take more than what XBRL processors offer these days.  What an RDF/OWL engine can learn from, say, a data dump from XBRL from something like the entire SEC XBRL filing database could be quite interesting.
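To give a feel for why the conversion itself is the easy part, here is a minimal Python sketch that turns one already-parsed fact into subject-predicate-object triples. It assumes the hard work of reading the XBRL has been done elsewhere; the fact, the identifier "fact1", and the `fact_to_triples` helper are hypothetical, and the output is plain tuples rather than real RDF.

```python
def fact_to_triples(fact_id, fact):
    """Turn one fact (a dict of aspect -> value) into (subject, predicate, object) triples."""
    return [(fact_id, aspect, str(value)) for aspect, value in sorted(fact.items())]

# A hypothetical fact as an XBRL processor might hand it over after parsing.
fact = {"Concept": "Sales", "Entity": "ABC", "Period": "2009", "Region": "Asia", "Value": 100}

triples_out = fact_to_triples("fact1", fact)
for triple in triples_out:
    print(triple)
```

Producing the triples takes one line. Nothing in that output tells software what a "Concept" or a "Period" is or what to do with them, and that is where the real work lies.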

So that is what I seem to be seeing.  Not totally sure if I am correct on all of this or any of this.  The next step is to float these ideas by some people who grasp these things better than I do and see what they have to say.  Discussions with some of them have yielded what I have thus far (i.e. this blog entry).  But, there is still a ways to go.

One thing that I can say with a pretty high level of confidence is that if you live in this information age and you are a business person and you don't understand what metadata is and what you can do with it you are at a distinct disadvantage.

What is your opinion?

 

Posted on Saturday, January 9, 2010 at 07:45AM by Charlie