Project to Convert XBRL Financial Information to RDF/OWL
Tuesday, July 2, 2013 at 07:22AM
Charlie in Becoming an XBRL Master Craftsman, XBRL and the Semantic Web, RDF/OWL

This blog post is part of a project of the Semantic XBRL group on LinkedIn. The goal of the project is to convert some XBRL formatted digital financial information into RDF/OWL. The intent is to use the proper financial report semantics to structure the RDF, enforce that structure to the extent possible using OWL, and then run useful SPARQL queries against that digital financial report information expressed in RDF/OWL.

This blog post is intended to be a resource for those working on that project: it helps them understand the financial report semantics, points to information which explains the process and steps, provides test cases and examples, etc.

NOTE!!! Because this is a work in progress things may not tie together as well as they need to.  For example, terminology is adjusted as new things are learned and examples may use a mixture of old and new terms.  The goal is to tune and get all these sorts of things in sync as part of this project.

Understanding the semantic representation

The first step in this process is to understand the semantics which we will be working with. The document Financial Report Semantics and Dynamics Theory summarizes those semantics. You can see those semantics implemented here within the XBRL Cloud Evidence Package. 

So that is a representation of this information in human readable form. All those pieces relate to this reference implementation of an SEC XBRL financial filing. From that page you can get to all the pieces of that digital financial report. (NOTE!!! Something else which is helpful is this HTML version of that digital financial report. HOWEVER, that HTML is an older example so the HTML will not tie exactly to the current reference implementation.)

Another view of the exact same information is an XML infoset of the fact table and the model structure.  PLEASE NOTE!!! The element names are not totally correct in these examples.  There are extra attributes which don't belong in the infoset.  There are pieces which SHOULD be in the infoset which are not there.  Remember, this is a work in progress.

The Financial Report Ontology is a greatly expanded set of financial report semantics. Again, this is a prototype. Pieces of this are shown in OWL, but this is NOT what the end product will look like. What you see today is the best that a CPA can do given a good understanding of financial reporting, a good understanding of XBRL, and a basic understanding of RDF/OWL. This ontology will evolve to what it needs to become.

Something which is helpful in understanding the overall representation which the Financial Report Ontology will express, or rather the best guess for now, is this VUE mind map. The point of this mind map is to provide something helpful to business users. The formal representation of the ontology will likely be in UML or OWL or whatever the technical people decide works best. But business users MUST be able to understand the ontology in their terms. Eventually, I believe that will be within working software applications. Until that software is built, some method to explain the moving parts of a financial report is necessary. This is my best attempt to communicate such information.

Further, we are NOT ARTICULATING 100% of this financial report ontology in this project. This project is focused on only the financial report level semantics at this phase. The goal of this phase is to convert only the XML infosets into RDF. That is all for this first phase. And so, we will be dealing with the report components, the facts, and the characteristics which describe those facts.
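
To make the phase 1 conversion a bit more concrete, here is a minimal Python sketch of turning a fact table infoset into RDF serialized as N-Triples. Please treat everything specific in it as an assumption: the element and attribute names (factTable, fact, characteristic), the http://example.org/frs# namespace, and the property names (partOf, hasValue, hasConcept, and so on) are invented for illustration and are NOT the names used in the actual XML infosets or in the Financial Report Ontology.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
BASE = "http://example.org/frs#"  # placeholder namespace, not the real ontology

# Hypothetical fact table infoset; the real infoset uses different names.
SAMPLE_INFOSET = """\
<factTable component="BalanceSheet">
  <fact id="f1" value="1000">
    <characteristic name="ReportingEntity" value="SAMP"/>
    <characteristic name="Period" value="2010-12-31"/>
    <characteristic name="Concept" value="us-gaap:Assets"/>
  </fact>
  <fact id="f2" value="400">
    <characteristic name="ReportingEntity" value="SAMP"/>
    <characteristic name="Period" value="2010-12-31"/>
    <characteristic name="Concept" value="us-gaap:Liabilities"/>
  </fact>
</factTable>
"""


def uri(local_name):
    """Wrap a local name in the placeholder namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Return a plain N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def infoset_to_ntriples(xml_text):
    """Convert the hypothetical fact table infoset into a list of N-Triples lines."""
    root = ET.fromstring(xml_text)
    component = uri(root.get("component"))
    lines = [f"{component} {RDF_TYPE} {uri('ReportComponent')} ."]
    for fact in root.findall("fact"):
        subject = uri("fact_" + fact.get("id"))
        lines.append(f"{subject} {RDF_TYPE} {uri('Fact')} .")
        lines.append(f"{subject} {uri('partOf')} {component} .")
        lines.append(f"{subject} {uri('hasValue')} {literal(fact.get('value'))} .")
        # Each characteristic becomes one property of the fact.
        for ch in fact.findall("characteristic"):
            predicate = uri("has" + ch.get("name"))
            lines.append(f"{subject} {predicate} {literal(ch.get('value'))} .")
    return lines


if __name__ == "__main__":
    for line in infoset_to_ntriples(SAMPLE_INFOSET):
        print(line)
```

The sketch keeps all values as plain literals. Whether concepts, entities, and periods should instead be resources with their own URIs is exactly the sort of decision this project needs to make.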

Implementation model

The reference implementation is of an SEC XBRL Financial Filing. Such filings make use of the US GAAP Taxonomy. The US GAAP Taxonomy uses specific structures and terminology. Those terms are not financial reporting related, nor are they totally XBRL related. They basically bridge a gap. This terminology uses terms such as Network, Table, Axis, Member, Line Items, Concept, and Abstract. You can see these terms explained here.
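
As a rough illustration of how these terms fit together, the following Python sketch builds a tiny model structure and prints it as an indented outline. The nesting shown (a Network holding a Table, the Table holding an Axis with Members plus Line Items, and the Line Items holding Abstracts and Concepts) is my understanding of the typical arrangement, and the labels are made up. The Node class and the outline function are hypothetical helpers, not part of any infoset or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One piece of a model structure, tagged with its category."""
    category: str  # Network, Table, Axis, Member, LineItems, Abstract, or Concept
    label: str
    children: list = field(default_factory=list)


def outline(node, depth=0):
    """Return the tree as indented lines, one per node, parents before children."""
    lines = ["  " * depth + f"[{node.category}] {node.label}"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Illustrative labels only; they do not come from the reference implementation.
balance_sheet = Node("Network", "Balance Sheet", [
    Node("Table", "Balance Sheet [Table]", [
        Node("Axis", "Legal Entity [Axis]", [
            Node("Member", "Consolidated Entity [Member]"),
        ]),
        Node("LineItems", "Balance Sheet [Line Items]", [
            Node("Abstract", "Assets [Abstract]", [
                Node("Concept", "Cash and Cash Equivalents"),
                Node("Concept", "Assets"),
            ]),
        ]),
    ]),
])

if __name__ == "__main__":
    print("\n".join(outline(balance_sheet)))
```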

If you look at the XML infosets, you will notice that they follow this representation within the fact table and model structure. That could be a mistake.

The XML infosets were generated by XBRL Cloud, which has implemented these infosets. The Arelle open source XBRL processor has an implementation of the model structure infoset, but not the fact table infoset at this time. What would be highly desirable is to have the Arelle open source XBRL processor output both XML infosets as well as RDF.

Prototype output

This blog post has some experimentation with serializing XBRL as RDF and running a SPARQL query to return some useful results.

When this RDF, which uses this OWL ontology, is loaded into Protege, these SPARQL queries do work.

For phase 1, I propose a goal of the following:

  1. RDF of the reference implementation fact table and model structure which loads into Protege correctly.
  2. OWL ontology which supports the RDF.
  3. SPARQL query which returns the model structure of a report component.
  4. SPARQL query which returns the fact table of a report component.
  5. SPARQL query which tests the model structure and fact table against the OWL ontology to make sure the RDF follows the semantics of a financial report (this relates to the relations between report components, facts, and characteristics as expressed by networks, tables, axes, members, concepts, and abstracts).

 

Article originally appeared on XBRL-based structured digital financial reporting (http://xbrl.squarespace.com/).
See website for complete article licensing information.