BLOG:  Digital Financial Reporting

This is a blog for information relating to digital financial reporting.  This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting.  This is my brainstorming platform.  This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.

Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.

Entries in RDF (8)

Experimentation with RDF, RDFS, OWL, and SPARQL

I have been doing some experimentation with RDF, RDFS, OWL, and SPARQL.  I have not been able to create a lot so far; the learning curve is rather steep. What I have created is interesting from a number of perspectives.

I am working with this randomly selected SEC XBRL financial filing. What I want to do is two things.  First, generate the model structure of the filing in RDF and second, validate the model structure against an OWL ontology (or RDFS) to see if that is possible.

To start, consider a few things.  This is the presentation linkbase of that filing.  That is expressed in XBRL using XLink.  It is VERY hard to work with this raw XBRL.  No problem, send the XBRL file to an XBRL processor and generate an easier to use XML infoset.  If you look at that XML infoset, you can start to understand the model.  It is way easier to use than the raw XBRL/XLink.  Next, I serialized that exact same information in RDF.

If you look at that you might say two things. First you might say, "Now wait a minute, that is harder to read than the XML infoset." And you would be right.  The second thing you might see if you looked at this is the remarkable parallel between XLink and RDF.

All three of those technical syntaxes say EXACTLY the same thing: the XBRL presentation linkbase expressed in XLink, the XML infoset in just raw XML, and the RDF.  EXACTLY the same thing.
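To make the parallel between XLink and RDF concrete, here is a minimal sketch in Python. It is not the code I used on the filing. The XML fragment is hand-written and hypothetical (the example.xsd file, the ex_ element names, and the example.com role are all made up); only the linkbase and XLink namespaces and the parent-child arcrole are the real XBRL ones. Each arc has a "from", an arcrole, and a "to", which reads naturally as subject, predicate, object.

```python
import xml.etree.ElementTree as ET

LINK = "http://www.xbrl.org/2003/linkbase"
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical, hand-written fragment shaped like a presentation linkbase.
SAMPLE = f"""
<link:presentationLink xmlns:link="{LINK}" xmlns:xlink="{XLINK}"
    xlink:type="extended" xlink:role="http://example.com/role/BalanceSheet">
  <link:loc xlink:type="locator" xlink:href="example.xsd#ex_Assets" xlink:label="Assets"/>
  <link:loc xlink:type="locator" xlink:href="example.xsd#ex_Cash" xlink:label="Cash"/>
  <link:loc xlink:type="locator" xlink:href="example.xsd#ex_Inventory" xlink:label="Inventory"/>
  <link:presentationArc xlink:type="arc"
      xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"
      xlink:from="Assets" xlink:to="Cash" order="1"/>
  <link:presentationArc xlink:type="arc"
      xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"
      xlink:from="Assets" xlink:to="Inventory" order="2"/>
</link:presentationLink>
"""


def arcs_to_triples(xml_text):
    """Read XLink arcs and return them as (subject, predicate, object) tuples."""
    root = ET.fromstring(xml_text)
    # Each locator label stands in for the report element it points at.
    targets = {
        loc.get(f"{{{XLINK}}}label"): loc.get(f"{{{XLINK}}}href")
        for loc in root.findall(f"{{{LINK}}}loc")
    }
    triples = []
    for arc in root.findall(f"{{{LINK}}}presentationArc"):
        subject = targets[arc.get(f"{{{XLINK}}}from")]
        predicate = arc.get(f"{{{XLINK}}}arcrole")
        obj = targets[arc.get(f"{{{XLINK}}}to")]
        triples.append((subject, predicate, obj))
    return triples


for triple in arcs_to_triples(SAMPLE):
    print(triple)
```

Notice that the sketch throws away the "order" attribute. A real conversion would need to carry that along somehow, which is one of the details I am still working through.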

However, there are HUGE differences between the three serializations.

  1. The XBRL presentation relations expressed in XLink can be validated.  BUT, it can be validated only to the extent that the XBRL processor understands the information.  All XBRL processors understand the XBRL syntax of course.  HOWEVER, what XBRL processors do NOT understand is how the presentation relations should be structured, whether those presentation relations are consistent with the XBRL calculation relations and XBRL definition relations.  Why? Well, because XBRL only has "parent-child" (http://www.xbrl.org/2003/arcrole/parent-child) type relations in the presentation linkbase. What does the parent-child relationship mean?  What relations are allowed?  You cannot express that in XBRL beyond "parent-child" and you therefore cannot validate that you are building the relations correctly or consistently across all linkbases with XBRL.
  2. The XML infoset relations are WAY clearer.  They are WAY easier for a human to read, WAY easier for an XML parser to work with, and ALL the information you want to work with is there.  (If you go back to the XBRL presentation relations you will note that you have to go grab information from the XSD file to have information about the report elements.)  If this format is so much better, then why doesn't XBRL use this format?  Well, because the XML infoset format is not extensible.  That is WHY XBRL used XLink.  But there is something else wrong with the XML infoset format.  You still cannot tell if the information expressed is (a) CORRECT or (b) CONSISTENT with the XBRL calculation relations and XBRL definition relations.  It is not hard to write a validator to perform the tests to see if the relations are CORRECT; but that is still work you have to do yourself.
  3. RDF (work in progress), I think, can solve the validation problem.  I say "I think" because I have not actually gotten this to work yet.  THAT is what I am trying to make work. (I did get this to validate per the W3C RDF validator.)

To do this, I first built a simple ontology using OWL.  This is the ontology.  If you look at this and criticize how bad this is right now, you are totally missing the point.  Yes, it is bad.  I don't understand how to best use OWL yet.  (If YOU do, please rewrite the OWL ontology, send it to me, and I won't have to spend the time figuring this stuff out. Please!)  Not helping things is the fact that if you think XBRL is flexible and hard to use, you should try RDF, RDFS, OWL, and SPARQL!!!  So why do I bother?  Well, (a) because it is far easier for me and other business people to use RDF, RDFS, and OWL than it is to learn to program all this stuff but more importantly (b) there are WAY, WAY, WAY more things that I want to be able to validate.

To fiddle around with the RDF I am using Protege.  (This is a web based version of Protege, works in Google Chrome, does not seem to work in Microsoft Internet Explorer)  Now, Protege is not a business user tool.  VERY hard to figure out.  But again, it is worth it.  Part of Protege is a SPARQL query tool. Here is my first SPARQL query (paste it into Protege, try it for yourself):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

PREFIX model: <http://www.xbrlsite.com/2013/FinancialReportOntology/ReportElement.xml#>

SELECT ?subject ?object WHERE { ?subject model:hasAxis ?object }
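If you do not have Protege handy, here is a rough Python sketch of what that query does. It is only an illustration: the triples are hypothetical (the ex: names and the hasLineItems property are invented for the example; only hasAxis comes from my query), and a real SPARQL engine does far more than this. The query simply asks for every subject and object that are connected by the hasAxis predicate.

```python
MODEL = "http://www.xbrlsite.com/2013/FinancialReportOntology/ReportElement.xml#"

# Hypothetical triples; the ex: names and hasLineItems are made up.
TRIPLES = [
    ("ex:BalanceSheetTable", MODEL + "hasAxis", "ex:LegalEntityAxis"),
    ("ex:BalanceSheetTable", MODEL + "hasLineItems", "ex:BalanceSheetLineItems"),
    ("ex:SegmentTable", MODEL + "hasAxis", "ex:SegmentAxis"),
]


def select_subject_object(triples, predicate):
    """Mimic: SELECT ?subject ?object WHERE { ?subject <predicate> ?object }"""
    return [(s, o) for (s, p, o) in triples if p == predicate]


rows = select_subject_object(TRIPLES, MODEL + "hasAxis")
for subject, obj in rows:
    print(subject, obj)
```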

Why am I doing all this? Folks, this is extremely powerful stuff! There is an absolute boatload of leverage which seems achievable from RDF, RDFS, OWL, and SPARQL. That is what the Financial Report Ontology is all about.  Other domains have created ontologies.  Economics. Biomedical. Others.

Frankly, I don't get all the details of this stuff yet.  But, then again, I did not get XBRL when I first started either.  But, hundreds if not thousands of hours trying to figure this stuff out is paying off. I can tell you this...while it might be a lot of work for accountants to understand this technology stuff; it would be WAY more work for technology people to grasp accounting.  Business users don't need to learn everything about the technology; just enough to communicate effectively to the technology people who do understand this stuff.

Comparing XML, XBRL and RDF: Initial Observations

I mentioned in another blog post that I was creating a prototype State Fact book and using that to compare and contrast XML, XBRL, and RDF. This expands on my brainstorming in another blog post where I was trying to figure out which of the three (XML, XBRL, RDF) was "best".  I expanded that to also include XHTML and iXBRL.

Most of the "raw data" for this comparison is located on this page, the State Fact Book Index.  On this page there are 8 different sets of information.  Each of those sets is made available in XHTML so you can easily see what the data looks like.  I also made each available in XML.  Three sets are available in XBRL (but those three sets contain 100% of the information for the other sets).  The first set also contains an iXBRL (inline XBRL) example and an RDF example. I may, or may not, build out RDF, XBRL, and iXBRL for each set but I really don't need to because I am seeing the things I need to see.

The Story of the Three Bears

Remember the Story of the Three Bears? I am seeing a similar story when I compare and contrast XML, XBRL, and RDF.  Let me leave XHTML and iXBRL for later.

This is what I am seeing: 

  • XML is general purpose.  You can use XML for many, many different types of information exchange needs. When you use XML though, you generally start from scratch unless you create some sort of framework to work within. I talk about how, for example, NIEM is building an XML framework in this blog post.  The framework helps you be more efficient and effective. That is a good thing.
  • RDF is powerful, can become complex, and is general purpose.  You can do anything with RDF also. RDF is for doing something different than XML.  RDF is for expressing semantic meaning, which is way more powerful than XML, which is a syntax.  RDF is also somewhat of a least common denominator.  If you wanted to get all the information in the world to be able to work together, you would use RDF to do that.  In fact, that is pretty much the goal of the Semantic Web. Like I said, RDF is extremely powerful and once you start getting into making use of that power, it can become complex.  RDF tools are not really usable by the typical business user.
  • XBRL seems like a compromise between XML and RDF. XBRL is a general purpose language also, but its "general purpose" does not mean everything; the focus of XBRL is on business reporting.  So, it is more of a general purpose business reporting tool.  XBRL can become complex like RDF because it is quite powerful. It is not powerful in the same way that RDF is powerful.  Again, you can do anything with RDF.  XBRL is for business reporting; it is optimized for that scope.  Software which works with XBRL at the syntax level is easier to use than RDF software, but in my view it will still be over the heads of the typical business user. But, it is possible for business users to use XBRL tools to build quite sophisticated solutions using XBRL.  It would be hard for me to believe that these same business users can figure out RDF tools.  Personally, I struggle with RDF and RDF tools.

Making XBRL Less General Makes it Easier to Use

Like I said, XBRL is a general purpose tool for business reporting.  Business reporting is a broad category and XBRL has features for many different business reporting use cases.  But each use case will not need to leverage all the power of XBRL in all areas; different domains and use cases will need different pieces of XBRL.  And making XBRL less general makes it easier to use.  This is where application profiles are helpful.

While the average business user would never be able to use an XBRL tool which operates at the XBRL syntax level, applications built for more specific purposes can be very usable by the average business user.  For example, from what I have heard business users tend to struggle with XBRL software for reporting to the US Securities and Exchange Commission in XBRL.  Extending the US GAAP Taxonomy can be challenging for business users. Building the XBRL instance can be challenging.  A lot of companies outsource the entire process to third parties, thus losing the real value that XBRL could provide.  For this reason, people see XBRL as a regulatory mandate, not for what it really is.  I anticipate that this situation will come to an end when software applications are built specifically for US GAAP financial reporting using XBRL.  That is all they will do; these will not be general XBRL tools.  But because they are more focused, they will be easier to use.

So, you can get what I see as an order of magnitude increase in usefulness by turning general XBRL into one or more specific XBRL profiles and building specific applications for each profile. This can happen for two reasons. First, some of these domains are quite large.  For example, US GAAP financial reporting is huge.  The SEC filers number about 20,000.  That is plenty large. Non-SEC filers who use US GAAP number somewhere between 8 million and 16 million private companies (these are the estimates I have heard).

The second reason is leverage.  XBRL is an XML framework.  For the same reason you would build an XML framework if you are making use of different forms of XML, you will get leverage when you use XBRL as a framework.  XBRL does have well defined boundaries.  Tools can stay inside those boundaries and provide good bases of functionality.  Complexity can be absorbed by applications: really smart business users or technical users who understand a business domain build profiles that other, less skilled business users can make use of. This is done by the profile constraining the base application.

Not Either/Or

RDF is also an XML framework.  There are many different RDF syntaxes, but there is one in XML.  But the fact of the matter is that the syntax does not even matter.  There are many things that business reporting needs from RDF which XBRL cannot provide and XBRL should never provide.  There are things which XBRL has today that will eventually be built for RDF, such as rules.  XBRL has business rules via XBRL Formula.  RDF is known to need this; there are things like RuleML, and the W3C has an initiative to create a rules language.

Semantic "Haves" and "Have nots"

I think the bottom line here is this: if you believe that Web 3.0, the Semantic Web, will exist (and from what I can see the evidence is mounting that it will) and that this will be very useful technology, then you will want to be someone who can use this new power.

XML isn't really a solution; it is more part of the problem because XML is really about syntax, not semantics. You can make XML "semantic" in many cases; that is what RDF is for.  RDF is quite powerful and there will be many very sophisticated things created using RDF for business users.  Most will not be created by business users themselves.

XBRL is a nice compromise.  It enables a business person to leverage these new semantic technologies in ways they choose.  It is somewhat like how the personal computer and the electronic spreadsheet empowered business users, setting them free from the IT department.

It is possible for a business user to use general XBRL. I can, I am a business user.  I know others who can. Most business users will interact with XBRL and harness the power of semantic technologies via application profiles for more specific use cases making things easier and therefore useful. These applications will likely combine XML, XBRL, RDF, and other technologies.

Semantic Web utopia may never be realized, I think it is a vision, something to be strived for.  But the Semantic Web is coming.  In fact, it is already here and ramping up.  Do you want to be a semantic have or semantic have not?

#######################

Second Coming of XML

 

Exchanging Business Information: XML, XBRL or RDF/OWL, Which is 'Best'?

This blog post summarizes several other blog posts.  It may seem rather stream of consciousness and along the lines of brainstorming; if it does, that is because that is what this blog post is.  I am summarizing this information to help myself understand it and learn to better communicate it to other business users. It is hard to say how many years of thinking have gone into this.  But I have to answer this question over and over and I wanted to understand the real answer for myself. The questions about which is 'better' are:

  • Is XBRL 'better' than XML?
  • Is RDF/OWL 'better' than XBRL?
  • What considerations go into deciding which syntax (XML, XBRL, and RDF/OWL are all syntaxes) is 'best'?

The first thing one needs to do is define the problem you are trying to solve.  In general, the problem which seems to need solving is getting business information out of one system, be it internal to your organization or external to your organization, and then automating the process of using that business information within another business system.

Helpful background information

Here is some information which provides helpful background in understanding the moving pieces of this issue. This may seem like a lot of stuff to know, and it is.  But if you want to understand the moving pieces and make the right choice, you do need to understand the moving pieces or have someone help you who does.  This is not about providing you with a two minute sound bite, this is about providing you with the information you need to truly understand the issues you need to consider.

  • Structured information, not unstructured information: Two points here.   First, I am talking about structured information.  Second, the world is moving toward structured information because computers cannot parse unstructured information reliably enough and it costs too much.  This video, How XBRL Works, helps you understand the difference between the two.
  • Structured for meaning, not structured for presentation: I am talking about information structured for meaning, not information structured for presentation.  Again, the How XBRL Works video helps you understand the difference.
  • Global standard, not point solutions: If information is structured you can always convert it into some other structure using some mapping process.  If everyone used their own structure, everyone would have to map to everyone else's structure.
  • Use XML:  There are many different data formats.  The world is standardizing on XML.
  • Many different forms of XML: There are many different forms of XML.
  • Many different forms of XBRL: There are many different forms of XBRL.
  • Many different forms of RDF: There are many different forms of RDF.
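The "global standard, not point solutions" item above is really just arithmetic, so here is a small Python sketch of it. The assumption (mine, for illustration) is that every mapping is one-directional, so converting between two parties takes two converters. With point-to-point mapping every party needs a converter to every other party; with a shared standard each party only needs one converter to the standard and one back.

```python
def point_to_point_converters(parties):
    """Each party writes a one-way converter to every other party."""
    return parties * (parties - 1)


def shared_standard_converters(parties):
    """Each party writes one converter to the standard and one from it."""
    return 2 * parties


for n in (3, 10, 100):
    print(n, point_to_point_converters(n), shared_standard_converters(n))
```

With 10 parties that is 90 converters versus 20, and the gap grows quickly as more parties join.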

XBRL builds upon XML

In a previous blog post I explained how XBRL builds on top of XML.  Let me summarize these points here; you can go to that blog post to drill into this information further.  This is also explained in my book XBRL for Dummies (page 33).

  • XBRL is XML
  • XBRL expresses semantics (meaning) in a standard format
  • XBRL allows content validation against the expressed meaning
  • XBRL separates concept definitions from the content model
  • XBRL can express multiple hierarchies of explicit relations
  • XBRL provides prescriptive extensibility
  • XBRL easily fits into relational databases
  • XBRL provides multidimensional models
  • XBRL enables "intelligent", metadata driven connections to information

XBRL's "Sweet Spot"

XBRL has a 'sweet spot'.  This sweet spot is discussed in detail in my book XBRL for Dummies (page 172); I summarize the points for you here.

  • Flexibility within rigid systems
  • Reconfigurable information
  • Rules engine-based validation
  • Clear communication and sharing of rich business-level semantics
  • Metadata-driven configuration, no IT involvement required
  • Zero tolerance for errors
  • Achieving agreement with exterior parties

XBRL, RDF/OWL, and the Semantic Web

When people talk about the Semantic Web, terms such as RDF and OWL come up as the information formats of the Semantic Web. If RDF/OWL are the formats of the Semantic Web, then it seems obvious that all information should be expressed in RDF/OWL.  Right?  Do away with all other information formats, move everything to RDF/OWL and life will be good.  That is the only way in which you can write "queries" on the information on the Web: if the information is all in the same format.

RDF/OWL has benefits beyond what XBRL can provide.  These benefits seem to be:

  • OWL has way more power to express semantic meaning than XBRL.  That is what OWL is for, expressing semantic meaning within an ontology.  XBRL is more in the "taxonomy" expression business than the "ontology" expression business. (To understand the difference between a dictionary, a classification system, a taxonomy, and an ontology, see this blog post.)
  • RDF/OWL are the W3C information formats for the Semantic Web.
  • RDF/OWL can express anything, XBRL is more focused on business information. 

Bottom line: XML, XBRL, RDF/OWL; Which is 'Best'?

From what I can tell, the answer to the question of whether to use XML, XBRL, or RDF/OWL is that it depends on what you are using it for.  What is crystal clear is that XML, XBRL, and RDF/OWL are syntaxes.  What is important to business users is semantics, not syntax.  Whatever syntax you choose, you should be able to convert it to any other syntax, be that an external exchange format or an internal storage format such as your relational database.  The semantics (meaning) must be the same in any business system or the information exchange simply will not work.

There are lots of obvious places where clearly XML is the way to go. It seems XML is perfect for specifying large, fixed documents such as DocBook, expressing Excel spreadsheets, XHTML, and such.  XML is also perfect for fixed transactions which rarely change.  What seems to be key here is "fixed".

For "ad hoc" projects, tightly controlled systems which are closed, XML will probably work fine.  When you start talking about enterprise class systems, lots of users, the need to scale, things which need to be rock solid, you need to have some sort of framework.  XML frameworks can be created. NIEM is such a framework (National Information Exchange Model).  The NIEM Introduction provides a very good explanation of why frameworks are important.  Basically, frameworks provide discipline and leverage.

XBRL is a framework.  It provides discipline and leverage.  A primary benefit of XBRL is XBRL Formula, the ability to model business rules in a global standard format.  Being able to express those business rules means that you can validate the semantics (not just the syntax) of information in a global standard way and exchange those business rules with others. XML cannot do this, and it probably never will be able to.  RDF/OWL cannot do this now, but the W3C seems to be working on this.

RDF/OWL offers a powerful tool to express complex semantics in a global standard way, far beyond the capabilities of XBRL.  RDF/OWL will be the least common denominator of the Web, the way to get different syntaxes to be able to work together.

It seems as though the answer to the question about which is better is that it depends on the system you are implementing really.  What is clear is that clear semantics are critical.  RDF/OWL can help in this regard.  If you cannot clearly express your information model in RDF/OWL, then your information model is broken.  If you can express your model in RDF/OWL, the least common denominator of the Semantic Web and a very powerful tool for expressing semantics, then it will not matter what syntax you use because you will be able to convert to any syntax and the RDF/OWL will document exactly how to do that.

A lot of these details are discussed in my book XBRL for Dummies. The book lays many of these things out so business readers can get their heads around them and understand the right questions to be asking the technical people who have to help them use XML, XBRL and RDF/OWL within their business systems.  The area of RDF/OWL is rather weak in the book, but the key concepts are there.  Watch my blog for more information should you need such information.

Many Different Forms of RDF

This is a series of posts where I am providing information relating to figuring out what the best data format to use is, and why. Basically, when is XML better, when is XBRL better, and when is RDF/OWL better.

I have posted a number of blog entries relating to RDF, OWL, and the Semantic Web which you can find here. I want to summarize what I have figured out with regard to RDF here.

RDF (Resource Description Framework) is one of the cornerstones of the Semantic Web. RDF can be used to document pretty much anything.  The core of RDF seems to be the subject-predicate-object relation, which it seems was used by Aristotle. This is what RDF looks like in one form, XML.  I am not going to explain RDF in any more detail; go look at the other blog posts for that.

What I do want to document is the forms of RDF:

  • Triples: There are lots of notations for subject-predicate-object relations. Here are some of those notations (syntax): N3, N-Triples, TriG, TriX, Turtle, RDF/XML, RDFa.  Each of these probably has its pros and cons.  The point here is that these are, for the most part, many different ways of doing the same thing.
  • RDF XML: So because I want to stick with XML, I will focus on RDF XML as "the" format for RDF for my purposes.  The format really does not matter, what matters is what I talk about below.  Using RDF alone is like using XML without a schema.  You can basically include anything, right or wrong.
  • RDFa: RDFa is an approach to embedding metadata into HTML web pages. Something similar to this is eRDF. RDFa and eRDF are similar to iXBRL.
  • RDF plus OWL: Web Ontology Language (called OWL) can be thought of as a schema for RDF, loosely similar to how XML Schema constrains XML.  But, OWL is much different in that it is used to constrain semantics, not syntax.  What this means is that RDF by itself seems somewhat useless really.  You have to both make sure you build your RDF relations correctly and you understand those relations.  That is what OWL seems to do. OWL defines a semantic model which both explains the RDF and constrains the RDF.
  • RDF and standard OWL ontologies: The next step in the spectrum is what I am calling standard OWL ontologies. It is one thing for someone to post an ontology to the web.  You could have hundreds or thousands of ontologies which express the same thing.  The ontologies could have different logical models and not even interoperate.  Compare that to having one agreed-to ontology for some specific model.
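To show what I mean by "many different ways of doing the same thing" in the list above, here is a small Python sketch that writes one and the same triple in two of those notations, N-Triples and Turtle. The example.com namespace and the "capital" property are hypothetical, and this only handles the simplest case where all three terms are URIs in one namespace; real serializers deal with literals, blank nodes, escaping, and much more.

```python
PREFIX = "ex"
BASE = "http://example.com/"  # hypothetical namespace for illustration

TRIPLE = (BASE + "Washington", BASE + "capital", BASE + "Olympia")


def as_ntriples(triple):
    """N-Triples: full URIs in angle brackets, one statement per line."""
    s, p, o = triple
    return f"<{s}> <{p}> <{o}> ."


def as_turtle(triple, prefix, base):
    """Turtle: declare a prefix once, then use short prefixed names."""
    short = [f"{prefix}:{term[len(base):]}" for term in triple]
    return f"@prefix {prefix}: <{base}> .\n" + " ".join(short) + " ."


print(as_ntriples(TRIPLE))
print(as_turtle(TRIPLE, PREFIX, BASE))
```

Both outputs state the same subject-predicate-object relation; only the notation differs.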

So, what the heck does all this mean?  Let me try and explain.  I will use a small data set which I have created to explain.  Browse through these different data sets which I found on the web.  I decided to grab 20 different sets of data. Imagine you had the following data sets:

  1. When states entered the union.
  2. State violent crime statistics.
  3. Miscellaneous population statistics by state.
  4. State capitals and largest cities.
  5. Population estimates by state. (This is the specific CSV file which will open in Excel.)
  6. Financial information by state. (This is the specific Excel file.)
  7. State areas.
  8. State symbols.
  9. State mottoes.
  10. State nicknames.
  11. Origin of state names.
  12. State GDP.
  13. State GDP per capita.
  14. State population density.
  15. State tax revenues.
  16. State unemployment rates.
  17. Gross state product per capita.
  18. State by most educated.
  19. State by health index.
  20. State by personal income.
  21. (Extra) Red, Blue, and Purple states

Suppose you wanted to use the data in one of those data sets, what would you do? Copy and paste into Excel most likely.  What if you wanted to use two of those data sets together?  No problem, just copy and paste both sets into Excel and put them together.  When you try and do something like this you run into problems such as the key value (i.e. in this case the state name probably) being different.  For example, this list uses the state abbreviation, not the state name.  Now, this is not a huge deal if you don't need this information on a timely basis, or if you have small sets of data like the 50 states, etc.
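Here is a minimal Python sketch of that key problem. One data set is keyed by state name and the other by state abbreviation, so you need a lookup table before the two can be put together. The capitals are real, but the index values are made-up placeholders, and I only include three states to keep it short.

```python
# Lookup table from abbreviation to state name (three states only, for brevity).
ABBREVIATIONS = {"WA": "Washington", "OR": "Oregon", "ID": "Idaho"}

# Data set keyed by state name.
capitals_by_name = {"Washington": "Olympia", "Oregon": "Salem", "Idaho": "Boise"}

# Data set keyed by abbreviation; the numbers are placeholders, not real data.
index_by_abbrev = {"WA": 1.0, "OR": 2.0, "ID": 3.0, "ZZ": 9.9}


def join_on_state(by_name, by_abbrev, abbreviations):
    """Combine two data sets after translating abbreviations to names."""
    joined = {}
    for abbrev, value in by_abbrev.items():
        name = abbreviations.get(abbrev)
        if name is None or name not in by_name:
            continue  # No matching key; a real process would report this.
        joined[name] = (by_name[name], value)
    return joined


combined = join_on_state(capitals_by_name, index_by_abbrev, ABBREVIATIONS)
for state, values in combined.items():
    print(state, values)
```

The "ZZ" row silently drops out, which is exactly the sort of thing that bites you when the data sets get larger than 50 states.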

So what if this information was in XML like this data set of state population. It would be pretty easy to write a simple Excel macro to go get the data. But what if each set of data used a different XML syntax?  See this blog post on different XML formats. OK, so not a huge problem, just write multiple import Excel macros, one for each XML file.  Right?  Well, that will get old.

OK, so what if everyone used the SAME XML format?  Say, RDF.  Well, then you could read the RDF by just pointing an application at the file, right?  Not quite.  What if the RDF used different logical models (or ontologies) to describe the data? If that happens, well, then you are back to mapping one file at a time, adjusting the multiple logical models or ontologies into one common model. You can do this, but it is a lot of work.
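A rough Python sketch of that mapping work is below. Two hypothetical sources describe the same thing, state population, using different predicate names, and a hand-written mapping table adjusts both to one common model. All of the names (the a:, b:, and common: prefixes) and the values are invented for illustration.

```python
# Two hypothetical sources using different predicates for the same idea.
SOURCE_A = [("state:Washington", "a:population", "100")]
SOURCE_B = [
    ("state:Oregon", "b:numberOfResidents", "200"),
    ("state:Oregon", "b:stateBird", "Western Meadowlark"),
]

# Hand-written mapping from each source's predicates to the common model.
PREDICATE_MAP = {
    "a:population": "common:population",
    "b:numberOfResidents": "common:population",
}


def to_common_model(triples, predicate_map):
    """Rewrite predicates to the common model; collect what cannot be mapped."""
    mapped, unmapped = [], []
    for subject, predicate, obj in triples:
        if predicate in predicate_map:
            mapped.append((subject, predicate_map[predicate], obj))
        else:
            unmapped.append((subject, predicate, obj))
    return mapped, unmapped


mapped, unmapped = to_common_model(SOURCE_A + SOURCE_B, PREDICATE_MAP)
print(mapped)
print(unmapped)
```

Every new source means another set of entries in the mapping table, which is the "lot of work" I am talking about.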

But what if there were another way?  What if you created one standard logical model, documented it using OWL, and then made every piece of data available in a common format?  Check out this Data-gov Wiki. Look at this web site, or wiki.  More specifically, look at this complete data set of RDF.  Per the web site, they have converted about 280 data sets into RDF.

OK, so what is the bottom line here with regard to RDF?  First, the Semantic Web is about making information on the web more readable to computers.  To do that, the best way is to have one data format (semantics and syntax).  Short of that, one can take the many different data formats and map them to one syntax. You have to be sure the semantics (the meaning) of the data is consistent.  Much of the data needs to work together. Most may never be used together, but some, like the state information I pointed out, will be used together. XML is a syntax that pretty much most people on the web are moving to, so RDF in XML makes sense.  You need OWL to articulate your ontology, or your model, so that people understand your model and so that data made available complies with that model.

But my next question is when should XBRL be used and when should RDF/OWL be used?

Another Step Toward Understanding OWL, RDF and How They Relate To XBRL

This is another step in the journey of understanding OWL, RDF, and how they relate to XBRL.  You can see other blog posts relating to this here.

OK, so I think I am getting more and more dialed in as to what OWL, RDF, and XBRL offer.  At this point I am not totally sure about this.  Consider this blog post brainstorming. I will use this post to get feedback from others who understand these things better than I do and then tune my perceptions.  So be sure to check back later to see where these perceptions end up.

RDF, OWL, and XBRL are all "graphs".  Or maybe I should say that they all share the same approach, graphs, to articulate information of some sort.  That approach was chosen because of the flexibility of graphs for expressing information.  You can do a lot with "subject-predicate-object" relations.  These "subject-predicate-object" relations were not invented by computer science.  Aristotle and Socrates, long ago, used these types of relations in philosophy it seems.

RDF is a "global standard way" of articulating subject-predicate-object relations.  It is a general tool.  It can articulate any subject-predicate-object relation.  It is very flexible.  It is very verbose.  There are lots of different syntaxes for expressing RDF; one syntax which seems to have been catching on is the XML format of RDF. (I found another useful primer on all this here.)

One issue with RDF is that you can express ANY subject-predicate-object relation, whether it is logical or not logical.  OWL is used to express constraints on the subject-predicate-object relations.  (It is also somewhat of a "short cut" approach to creating "common" relations it seems.  But, let's not focus on that.)  It seems as though OWL is sort of like a "schema" for the RDF if you understand XML Schema or database schemas.  Basically OWL expresses constraints on the RDF which software can use to determine if the RDF expressed "follows the rules" so to speak.  This is critically important just like a database schema is important or an XML schema is important.
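To show the sort of "follows the rules" check I have in mind, here is a minimal Python sketch. It is an assumption-laden simplification, not how OWL tools actually work: as far as I understand it, OWL reasoners tend to infer new facts from things like domain and range rather than reject data, so treat this as a plain closed-world check written in the spirit of a schema. The class names (Table, Axis, Concept), the ex: elements, and the rule itself are all hypothetical.

```python
# Hypothetical classification of report elements.
TYPES = {
    "ex:BalanceSheetTable": "Table",
    "ex:LegalEntityAxis": "Axis",
    "ex:Cash": "Concept",
}

# Hypothetical rule: hasAxis must go from a Table to an Axis.
CONSTRAINTS = {"model:hasAxis": ("Table", "Axis")}

TRIPLES = [
    ("ex:BalanceSheetTable", "model:hasAxis", "ex:LegalEntityAxis"),  # follows the rule
    ("ex:Cash", "model:hasAxis", "ex:LegalEntityAxis"),  # a Concept cannot have an Axis
]


def find_violations(triples, types, constraints):
    """Return a message for every triple that breaks a subject/object rule."""
    violations = []
    for subject, predicate, obj in triples:
        if predicate not in constraints:
            continue  # No rule for this predicate, so anything goes.
        domain, range_ = constraints[predicate]
        if types.get(subject) != domain:
            violations.append(
                f"{subject} is not a {domain}, so it cannot be the subject of {predicate}"
            )
        if types.get(obj) != range_:
            violations.append(
                f"{obj} is not a {range_}, so it cannot be the object of {predicate}"
            )
    return violations


for message in find_violations(TRIPLES, TYPES, CONSTRAINTS):
    print(message)
```

The point is that the rules live in data (the CONSTRAINTS table), not in the program, which is the part of the OWL idea that appeals to me as a business user.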

OWL is very, very "powerful".  It goes far, far beyond what a database schema can express or an XML schema can express.  OWL can be used to express semantic meaning, it seems like ANY semantic meaning which can be expressed can be expressed in OWL.

XBRL's architecture uses "graphs" to express many things.  Those that created XBRL did this because XBRL needed flexibility.  The approach of using graphs gives you the flexibility.  You have subject-predicate-object relations in XBRL.  Again, remember that pretty much anything can be expressed in this manner, so clearly XBRL should be expressible in this manner.  And in fact some technical people "expressed XBRL in OWL" (I say this loosely).  You can see those OWL ontologies here.  It is something only your mother could love.  Basically, what they did is run a style sheet over the XBRL schemas and converted them into OWL.  They took one syntax, XML Schema, and converted it into another syntax, OWL.  Not really that useful for business people.  Might have some sort of technical use, that will be seen later.

Now, XBRL took some additional "short cuts".  RDF is built to express anything.  XBRL is built to express business information, a subset of "anything".  Could RDF/OWL be used to express the XBRL syntax?  Yes.  Is this useful?  I think it does have some utility.  But, there are better uses for OWL than expressing a logical model of the XBRL syntax.

So, XBRL is a "short cut" to expressing business information in a form that computers can make use of.  XBRL is a general format.  No one uses "XBRL".  Everyone uses some subset of XBRL, some application profile. This is why the COREP taxonomy does not work with the US GAAP Taxonomy.  Every XBRL taxonomy or system has a different "application profile" because it uses a different architecture.

It seems to me that one thing OWL can be used for is to "express" or "document" those different application profiles.  Other things can be used to model an application profile of XBRL; UML, for example, could do this.  OWL could be a very, very valuable tool for documenting an application profile.  Today application profiles are either not documented at all or rather poorly documented in a Word document or PDF.  No computer can read those documents.  Computer programmers have to read the documents, extract information, and build applications to work with the different XBRL application profiles.  For example, here is the US GAAP Taxonomy architecture.  Here is the SEC test suite.  Here is the CEBS FINREP taxonomy architecture.

I really don't know if it is possible for an OWL ontology to be constructed and for software to automatically generate "tests" of the XBRL application profile.  That could be nice.  But, having a consistent way to document XBRL application profiles could be nice.  UML could be that way.  OWL could also be that way, it seems.

But there is another thing that OWL can be used for.  What I am seeing is that anything can be expressed in OWL.  Well...almost anything.  There is one big constraint.  What you want to express has to be logical.  If it is not logical, it cannot be expressed.  And, it seems, if it IS logical it CAN be expressed.  Additionally, you can see WHAT is expressed!  That is even more interesting and has more utility.

For example, OWL can be used to articulate where a taxonomy can be extended, what the information model can look like, what is allowed, what is not allowed, and so forth.  OWL basically can document how things work.  For example, you could use OWL to document how the US SEC XBRL filings "work".  By seeing that, you can determine things like whether it works the way you WANT it to work, or whether there is a better way.
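As a rough illustration of "where a taxonomy can be extended", here is a sketch of a machine-readable rule set and a check against it.  It is plain Python rather than OWL, and everything in it is an assumption made for the example: the `std` and `co` prefixes, the concept names, and the rule that extension concepts may only be added under certain parents.  It is not how the SEC or any real taxonomy states its extension rules.

```python
# Hypothetical profile: extension concepts may only be added under these parents.
profile = {
    "standard_prefix": "std",
    "extensible_parents": {"OtherAssetsAbstract", "OtherLiabilitiesAbstract"},
}

# Parent-child relations from an imagined filing, as (parent, child).
relations = [
    ("std:OtherAssetsAbstract", "co:DeferredWidgetCosts"),  # allowed extension
    ("std:AssetsAbstract", "co:MadeUpAssets"),              # extension in the wrong place
    ("std:AssetsAbstract", "std:AssetsCurrent"),            # not an extension at all
]


def disallowed_extensions(relations, profile):
    """Return relations that add an extension child under a non-extensible parent."""
    bad = []
    for parent, child in relations:
        child_prefix = child.split(":", 1)[0]
        parent_local = parent.split(":", 1)[1]
        is_extension = child_prefix != profile["standard_prefix"]
        if is_extension and parent_local not in profile["extensible_parents"]:
            bad.append((parent, child))
    return bad


print(disallowed_extensions(relations, profile))
```

The point is that the rule lives in data that both a person and a program can read, instead of in a Word document or PDF.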

Another way OWL might help XBRL is adding additional information to XBRL.  Now, XBRL can be used to add additional information.  For example, the definition linkbase of XBRL can be used to express "arcroles" and things which can be used to express meaning.  For example, the XBRL Dimensions specification did this.  But is XBRL really the best way to express additional information?  The XBRL way has its pros and cons.  The OWL way also has its pros and cons.  Does it really matter?  XBRL and OWL are only syntax.  The important thing is that what is expressed is logical, it is done in as standard a way as possible, and it works.

One final thing which I want to mention before I wrap up this post is my realization that RDF/OWL does seem to have one very significant thing "missing".  I don't know if I should really say it is missing; it is more that, to really make use of RDF/OWL information, you have to build domain specific software on top of it.  Like I said, RDF/OWL can be used to express anything.  Its biggest strength is also its biggest weakness.  Because it can be used to express anything, you have to write software which understands the specific subjects, the specific predicates, and the specific objects and does useful things with those relations.  There is one thing which I admit that I don't really grasp yet (at least one thing, could be more).  Software can be built which "learns" from the relations, building additional relations.  This can be useful, but I cannot grasp this right now.  This is in the realm of artificial intelligence.  Maybe this will work, maybe this will not.
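A very small example of software "building additional relations" is a transitive rule: if A is part of B and B is part of C, then A is part of C.  The sketch below applies that one rule over and over until nothing new turns up.  It is far simpler than what a real OWL reasoner does, and the `partOf` predicate and concept names are hypothetical, but it shows the basic mechanics of deriving relations nobody typed in.

```python
def infer_transitive(triples, predicate):
    """Forward-chain one rule: (a p b) and (b p c) implies (a p c).

    Repeats until no new relation appears, then returns only the
    relations that were inferred (not the ones given as input).
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for s, p, o in known if p == predicate]
        for a, b in links:
            for b2, c in links:
                if b == b2 and (a, predicate, c) not in known:
                    known.add((a, predicate, c))
                    changed = True
    return known - set(triples)


# Three stated facts; names are hypothetical.
facts = [
    ("Cash", "partOf", "CurrentAssets"),
    ("CurrentAssets", "partOf", "Assets"),
    ("Assets", "partOf", "BalanceSheet"),
]

for triple in sorted(infer_transitive(facts, "partOf")):
    print(triple)
```

From three stated relations the rule derives three more, for example that Cash is part of the BalanceSheet.  Note that the software still had to be told that `partOf` is transitive; that knowledge is exactly the kind of thing an ontology would carry.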

By contrast, an XBRL processor only has to understand XBRL.  An XBRL processor can easily convert XBRL into RDF/OWL.  For an RDF/OWL processor to understand the resulting syntax, and more importantly the SEMANTICS, as well as an XBRL processor does, you would basically have to rebuild the functionality which exists in an XBRL processor inside the RDF/OWL processor.  Why would you do that?  Besides, XBRL processors don't even work at a high enough level; they still deal mostly with syntax and not enough with semantics.  And I am not talking about things like XBRL Formula, which verifies whether the semantics are correct.  I am talking about DOING USEFUL THINGS with the expressed semantics.  No general XBRL tool does this at the level a business person would find particularly useful these days.  I would also point out that XBRL is ahead of RDF/OWL in regard to having a working method of validating semantics.  XBRL has XBRL Formula; the Semantic Web folks have something in the works (i.e. they do realize that this is important).

It seems that no matter what happens, XBRL is going to have to fit into the Semantic Web.  Getting XBRL into RDF/OWL is trivial for an XBRL processor.  Doing anything useful with the RDF/OWL is going to take more than what XBRL processors offer these days.  What an RDF/OWL engine could learn from, say, a data dump of something like the entire SEC XBRL filing database could be quite interesting.
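To show why the conversion step is the easy part, here is a sketch of turning presentation relations into triples.  It assumes an XBRL processor has already resolved the linkbase into a simple list of arcs; the dictionary layout with `from`, `to`, and `order` keys is a hypothetical output format, not any particular processor's API, and the concept names are just illustrative.  The parent-child arcrole URI is the one defined by the XBRL 2.1 specification.

```python
# Standard XBRL 2.1 arcrole for presentation (parent-child) relations.
PARENT_CHILD = "http://www.xbrl.org/2003/arcrole/parent-child"

# Hypothetical processor output: one dict per resolved presentation arc.
arcs = [
    {"from": "us-gaap:AssetsAbstract", "to": "us-gaap:Assets", "order": 2.0},
    {"from": "us-gaap:AssetsAbstract", "to": "us-gaap:AssetsCurrent", "order": 1.0},
]


def arcs_to_triples(arcs, arcrole=PARENT_CHILD):
    """Map each arc to (subject, predicate, object), in presentation order.

    The arc's 'from' becomes the subject, the arcrole becomes the
    predicate, and the arc's 'to' becomes the object.
    """
    ordered = sorted(arcs, key=lambda arc: arc["order"])
    return [(arc["from"], arcrole, arc["to"]) for arc in ordered]


for triple in arcs_to_triples(arcs):
    print(triple)
```

An arc already is a subject-predicate-object relation, so the mapping is nearly one to one.  What this sketch drops is the arc's own attributes, such as `order`; carrying those across would need extra triples, and deciding what they MEAN is the part that still needs software which understands XBRL.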

So that is what I seem to be seeing.  Not totally sure if I am correct on all of this or any of this.  The next step is to float these ideas by some people who grasp these things better than I do and see what they have to say.  Discussions with some of them have yielded what I have thus far (i.e. this blog entry).  But, there is still a ways to go.

One thing that I can say with a pretty high level of confidence is that if you live in this information age, you are a business person, and you don't understand what metadata is and what you can do with it, you are at a distinct disadvantage.

What is your opinion?

 

Posted on Saturday, January 9, 2010 at 07:45AM by Charlie