
Key to the Digital Kingdom: Secondary Use Ontologies and Other Metadata

In a prior blog post I explained a bit about what a semantic reasoner does. The presentation Knowledge Representation for the Semantic Web Part I: OWL 2, by Pascal Hitzler, Markus Krötzsch, and Sebastian Rudolph, provides additional insight into how these technologies will be employed.

Slide 39 of their presentation summarizes the reasoning tasks that go beyond simply editing and inferencing, which helps one grasp what semantic technologies are all about:

  • Explanation: reasoning task of providing axiom sets to explain a conclusion
    (important for editing and debugging)
  • Conjunctive querying: check entailment of complex query patterns
  • Modularisation: extract sub-ontologies that suffice for (dis)proving a certain conclusion
  • Repair: determine ways to repair inconsistencies (related to explanation)
  • Least Common Subsumer: assuming that class unions are not available, find the smallest class expression that subsumes two given classes
  • Abduction: given an observed conclusion, derive possible input facts that would lead to this conclusion
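To make one of these tasks concrete, here is a minimal sketch of the least common subsumer idea. It is a deliberate simplification: a real OWL reasoner works with arbitrary class expressions, while this sketch assumes only named classes arranged in a single-inheritance hierarchy, where the answer reduces to the most specific shared superclass. The class names and the PARENT table are hypothetical and are not taken from the presentation or from any taxonomy.

```python
# Hypothetical single-inheritance class hierarchy: each class maps to its parent.
PARENT = {
    "Cash": "CurrentAssets",
    "Inventory": "CurrentAssets",
    "CurrentAssets": "Assets",
    "Land": "NoncurrentAssets",
    "NoncurrentAssets": "Assets",
    "Assets": None,  # root of this toy hierarchy
}


def ancestors(cls):
    """Return cls followed by its superclasses, most specific first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PARENT[cls]
    return chain


def least_common_subsumer(a, b):
    """Return the most specific named class that subsumes both a and b."""
    superclasses_of_b = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in superclasses_of_b:
            return candidate
    return None  # no shared superclass in this hierarchy


print(least_common_subsumer("Cash", "Inventory"))
print(least_common_subsumer("Cash", "Land"))
```

Walking up from the first class and stopping at the first class that also sits above the second is enough here only because each class has a single parent; with multiple inheritance or complex class expressions the problem is considerably harder.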

Focusing on XBRL as simply a means for some company to provide information to some regulator misses the point of XBRL.  Companies provided information to regulators before XBRL even existed.  What has changed is that the information is structured, in contrast to a big blob of information which is not usable by computer processes.

Because the information is structured, it can be reused.  As David vun Kannon points out on his blog, financial information is "bushy". Bushy basically means that there is a lot of metadata, because there are many relations between the different pieces of information.

Some of these relations are expressed in the US GAAP Taxonomy.  If I had to guess, I would speculate that the ratio of the relations expressed in the US GAAP Taxonomy to the total relations which could be expressed is less than 10%.  Probably significantly less than 10%.  The exact number does not matter; what matters is that there are opportunities to express more relations.

That is part of what I am trying to do with the Financial Report Ontology: express more relations. I am making that information, that metadata, available free of charge.  The reason is that it is foundational to getting digital financial reports created correctly.  The Financial Report Ontology can also be leveraged by software to make working with digital financial reports easier, and I want that.  Basically, I see much of what is in the Financial Report Ontology as basic and necessary for getting digital financial reports created correctly.

One of the first uses of Financial Report Ontology metadata will be the first and fourth items in the list above: explanation and repair of digital financial reports. Many people are frustrated by an inability to reuse all that information provided by public companies to the SEC. Metadata such as "Assets = Liabilities and Equity" (i.e. balance sheets balance) and other metadata can be used to first expose and then correct errors.
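As a rough illustration of how a rule such as "Assets = Liabilities and Equity" could be used to expose an error, consider the sketch below. It is only a sketch under stated assumptions: the fact names, the Finding structure, and the check_balance_sheet function are hypothetical and are not part of the Financial Report Ontology or of any XBRL software. The reported facts are assumed to have already been extracted into a plain dictionary of numbers.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Explains which rule failed, which facts were involved, and by how much."""
    rule: str
    facts: dict
    difference: float


def check_balance_sheet(facts, tolerance=0.0):
    """Return None if the balance sheet balances, otherwise a Finding.

    `facts` is a hypothetical mapping of concept name to reported value.
    """
    assets = facts["Assets"]
    liabilities_and_equity = facts["LiabilitiesAndEquity"]
    difference = assets - liabilities_and_equity
    if abs(difference) <= tolerance:
        return None
    return Finding(
        rule="Assets = LiabilitiesAndEquity",
        facts={"Assets": assets, "LiabilitiesAndEquity": liabilities_and_equity},
        difference=difference,
    )


# Made-up numbers, for illustration only.
finding = check_balance_sheet({"Assets": 1000, "LiabilitiesAndEquity": 900})
if finding is not None:
    print(f"Rule violated: {finding.rule}, off by {finding.difference}")
```

The Finding plays the role of a very small "explanation": it names the rule and the facts that contradict it. A repair step would then have to decide which of those facts to correct, and that decision is where the domain knowledge comes in.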

But beyond the foundational Financial Report Ontology there are likely to be many, many secondary use ontologies which will not be free.  These secondary use ontologies will be knowledge bases of significant value.  Those who understand how to create those ontologies and have professional knowledge to put into them will have the proverbial key to the kingdom of our increasingly digital world.

It takes knowledge of two things to gain the required level of understanding: an understanding of the tools and an understanding of the domain knowledge.  If you don't have the domain knowledge, you will not understand the domain rules which might be expressible.  But if you don't understand the tools, you won't understand what the tools might have to offer or you won't understand how to express your domain knowledge using the tools.

The reason that I have been hammering and hammering those SEC XBRL financial filings is that it is a great way to understand the technology and tools.  As a CPA, I already have a good grasp of the financial reporting domain.  It has been an investment, and like any good investment it will eventually yield a good return.

Investing in and gaining a good understanding of the technologies and tools which will form our increasingly digital world does take effort, particularly these days.  Tools can be hard to use.  Understanding which path to follow can also be a bit of a challenge due to the misinformation which seems to be prevalent.  But not sorting through all this is becoming increasingly risky.  There is not much of a market for people who make buggy whips when there are no buggies.

One example of secondary use ontologies and metadata is something like the SEC's fraud-fighting RoboCop.  Thinking through the details of how something like that might work helps one see why secondary use ontologies and metadata will be useful. This is not something that technologists can create on their own.  They will need to work with domain experts who understand how to communicate with technologists.

And this situation does not just relate to the financial reporting domain. It will work similarly for other domains which go digital.

Posted on Sunday, September 15, 2013 at 12:17PM by Charlie
