Understanding Expressive Power and Your Digital Future
You think XBRL is complicated? It is not XBRL that is complicated, it is the real world which is complicated.
Since the time of Aristotle, who built the first ontology, humans have been coming up with ways to describe the world. It was not information technology professionals who created the notion of an ontology; it was philosophers. And there was no need for computers to read these descriptions of the real world: computers did not exist in the time of Aristotle, the 4th century BC.
Today computers do exist and it is important for ontologies to be machine-readable. Arguably, the state-of-the-art for machine-readable representations of knowledge is the global standard W3C: RDF/OWL 2 DL + SAFE SWRL.
If you go to the W3C page for ontology, the first thing you might note is that the title of the page says "Vocabularies". I guess that the W3C feels the same as XBRL International, which considered the term ontology but went with taxonomy because it seemed less scary.
This is the definition of vocabularies provided by the W3C:
Vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. Vocabularies are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms.
(I don't want to go down the path of explaining all of these technologies. See the sections below "Semantic Web Technologies" and "Semantic Web Metadata" to take a closer look at these technologies.)
Ask yourself a question: Why is the W3C going through the trouble of creating all of this stuff?
Machines, if made to understand what you want, can do work for humans. To make a machine understand, you first have to express things so that the machine can simply read what you are expressing; once it can read it, additional work is needed for it to understand what you are expressing. All the W3C semantic web technology stuff is an effort to express as much as possible to get machines to do as much as possible.
The semantic web technologies the W3C is creating have a high level of expressive power. That means you can get machines to do a lot of stuff for you.
By contrast, take the CSV (Comma Separated Values) information format. It is not very expressive and therefore you cannot really do much with it. You can do some things, like transfer tables of information between Excel spreadsheets, but that is about it.
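To make the contrast a little more concrete, here is a minimal sketch in Python. It reads a tiny, made-up CSV table and then restates each row as explicit subject/predicate/object statements. The sample data and the predicate names (reportsConcept, forPeriod, hasValue) are assumptions invented for illustration; they are not taken from any standard vocabulary.

```python
import csv
import io

# Hypothetical sample data. The CSV itself only says "these strings sit in
# these columns"; it does not say what a column means or how rows relate.
CSV_TEXT = "concept,period,value\nAssets,2013,1000\nLiabilities,2013,600\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def to_statements(rows):
    """Restate each row as (subject, predicate, object) statements.

    The predicate names are illustrative assumptions, not a real vocabulary.
    """
    statements = []
    for i, row in enumerate(rows):
        fact = f"fact{i}"
        statements.append((fact, "reportsConcept", row["concept"]))
        statements.append((fact, "forPeriod", row["period"]))
        # The CSV gives us a string; the data type has to be supplied by us.
        statements.append((fact, "hasValue", int(row["value"])))
    return statements


rows = read_rows(CSV_TEXT)
statements = to_statements(rows)
```

Everything that makes the statements more useful than the table (what a column means, that the value is a number) had to be added by the programmer, because the CSV format has no place to express it.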
Financial reporting is complicated. As such, XBRL had to be closer to what the W3C is creating in terms of semantic web technologies than to the rather impotent CSV.
XBRL allows you to express terms or concepts, relations between terms/concepts, and constraints on using terms/concepts. When I say "XBRL", what I mean is XBRL 2.1, XBRL Dimensions, XBRL Formula, and a set of arcroles used to express relations which I have been trying to get included in the XBRL International Link Role Registry. That is what I mean when I use the term "XBRL".
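As a rough illustration of those three things (terms, relations, constraints), here is a small Python sketch. It is not XBRL syntax and not how an XBRL processor works; the data structures and the function name check_rollup are hypothetical. The relation label "summation-item" echoes the name of the XBRL 2.1 calculation arcrole, and the accounting equation is simplified so that Assets is treated as the sum of Liabilities and Equity.

```python
# Terms (concepts). Hypothetical, simplified names.
CONCEPTS = {"Assets", "Liabilities", "Equity"}

# Relations between terms, written as (parent, relation, child).
RELATIONS = [
    ("Assets", "summation-item", "Liabilities"),
    ("Assets", "summation-item", "Equity"),
]


def check_rollup(facts, total, relations):
    """Constraint: the reported total must equal the sum of its children.

    `facts` maps a concept name to its reported value.
    """
    children = [
        child
        for parent, relation, child in relations
        if parent == total and relation == "summation-item"
    ]
    return facts[total] == sum(facts[child] for child in children)


good_report = {"Assets": 1000, "Liabilities": 600, "Equity": 400}
bad_report = {"Assets": 1000, "Liabilities": 600, "Equity": 300}

good_ok = check_rollup(good_report, "Assets", RELATIONS)
bad_ok = check_rollup(bad_report, "Assets", RELATIONS)
```

The point of the sketch is only that the terms, the relations, and the rule are all stated as data that a machine can read and then check a report against.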
What I don't quite understand is which has better expressive power:
- XBRL or
- RDF/OWL 2 DL + SAFE SWRL.
I don't think this is an either-or type question or one against the other, RDF/OWL 2 DL + SAFE SWRL --or-- XBRL. What I want to understand is the gap in the expressive power between the two.
Why understand the gap? Because if you understand something's limitations, then you understand something's true power. This is what I do know:
- RDF/OWL 2 DL + SAFE SWRL does not do math. Mathematics was consciously left out of OWL 2 DL because parts of mathematics are not decidable.
- XBRL has XBRL Formula; so XBRL does do math. You can also limit how you use XBRL Formula to keep XBRL decidable.
- RDF/OWL 2 DL + SAFE SWRL is more flexible than XBRL. If you want to express anything and everything, it does not get much better than RDF/OWL 2 DL + SAFE SWRL. However, the trade-off is that RDF/OWL 2 DL + SAFE SWRL is harder to use because of that flexibility.
- XBRL is flexible, but not remotely as flexible as RDF/OWL 2 DL + SAFE SWRL. But, if you want to represent a business report or financial report, XBRL is specifically tuned for that task. Therefore, XBRL can be easier to use for that specific task because it is limited to that specific task.
- RDF/OWL 2 DL + SAFE SWRL could be used to express dimensional information, but you have to add additional functionality to make that happen. How do I know? The W3C created RDF Data Cube vocabulary (see below) to enable that functionality.
But the information above does not tell me the specific "gap" between RDF/OWL 2 DL + SAFE SWRL and XBRL. It provides a lot of good information, but I cannot point to specific limitations quite yet. I want to figure that out.
What it looks like to me is that XBRL has a potential advantage, a "sweet spot". That sweet spot is business reporting including financial reporting. You could recreate 100% of what XBRL offers in RDF/OWL 2 DL + SAFE SWRL perhaps. Or if you cannot, what you cannot express is "the gap" that I am looking for. THAT is the answer to my question.
What could be really, really interesting is the controlled natural language provided by Fluent. I don't believe that is itself a global standard. However, it can be converted into RDF/OWL 2 DL + SAFE SWRL which is a global standard. What if Fluent could also output XBRL syntax? In my view, that would be the perfect world. That would be the smoking gun which shows that syntax does not matter. It would also clearly delineate any expressive power differences between RDF/OWL 2 DL + SAFE SWRL and XBRL.
Business professionals and accounting professionals: how much of what I am talking about do you understand? While most business and accounting professionals should not even care about this stuff, those trying to get XBRL to work correctly to serve business reporting and financial reporting should. This information will tell you if XBRL is working the way you need XBRL to work.
What do you think the chances are that digital financial reporting will succeed in replacing current financial reporting practices? Another term for digital financial reporting is disclosure management. (See the end of this blog post for more information on disclosure management.)
While it is hard to understand exactly how all this digital business reporting and digital financial reporting will sort out, what is clear is that change is not only inevitable, it is imminent. Being on the wrong side of this equation will have consequences.
Semantic Web Technologies:
The W3C OWL 2 specification is a formal language for building formal vocabularies for specific problem domains so that information can be shared by a community of users. OWL 2 specifies the classes of things, the relations between classes of things, the properties of classes and relations, and can be used to express individuals which represent instances of things and relations.
The W3C RDF (Resource Description Framework) is a framework for describing information. RDF is about as flexible as you can get; it can be used to describe pretty much anything which is describable. OWL 2 is used to constrain information to make sure that what is described is consistent with some OWL 2 formal vocabulary of some problem domain.
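Here is a minimal Python sketch of that division of labor, under heavy simplifying assumptions. RDF-style statements are modeled as plain (subject, predicate, object) tuples, and a single OWL-style axiom (two classes declared disjoint) is used to check whether the described information is consistent. The ex: names and the function names are hypothetical, and a real OWL 2 reasoner does far more than this one check.

```python
# RDF-style statements as (subject, predicate, object) tuples.
# All ex: identifiers are made up for illustration.
TRIPLES = {
    ("ex:Acme", "rdf:type", "ex:Company"),
    ("ex:Acme", "ex:hasAuditor", "ex:Jane"),
    ("ex:Jane", "rdf:type", "ex:Person"),
}

# Vocabulary-level axiom: nothing can be both a Company and a Person.
DISJOINT_CLASSES = [("ex:Company", "ex:Person")]


def types_of(triples, individual):
    """Collect every class the individual is stated to belong to."""
    return {o for s, p, o in triples if s == individual and p == "rdf:type"}


def inconsistent_individuals(triples, disjoint_classes):
    """Return individuals typed with two classes that are declared disjoint."""
    typed = {s for s, p, _ in triples if p == "rdf:type"}
    bad = set()
    for individual in typed:
        classes = types_of(triples, individual)
        for a, b in disjoint_classes:
            if a in classes and b in classes:
                bad.add(individual)
    return bad


clean = inconsistent_individuals(TRIPLES, DISJOINT_CLASSES)
broken = inconsistent_individuals(
    TRIPLES | {("ex:Jane", "rdf:type", "ex:Company")}, DISJOINT_CLASSES
)
```

The triples alone will happily describe anything, including a contradiction; it is the vocabulary-level axiom that lets a machine notice that something is off.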
The W3C SWRL (Semantic Web Rule Language, a Member Submission to the W3C rather than a W3C Recommendation) is a language which expresses rules which OWL 2 does not have the power to express. This gets a little complicated, and this tutorial helps you understand the details; but basically OWL is limited and SWRL overcomes those limitations.
This is where things start to get really messy. OWL 2 and SWRL have limits. RuleML, while not a W3C standard, is a de facto standard for expressing rules. A business rule is basically a business requirement of some sort which can be implemented within software. W3C RIF (Rule Interchange Format) is a standard syntax for exchanging rules from one system to another system. SPIN (SPARQL Inferencing Notation) is a very low-level (i.e., SPARQL queries) approach to creating rules. It seems that you can create almost any rule using SPIN. However, SPIN is pretty low-level and not something that a business user would directly interact with.
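A commonly cited example of the kind of rule SWRL adds is "if x has a parent y, and y has a brother z, then x has an uncle z". The Python sketch below applies that one rule to a handful of made-up facts. It is not SWRL syntax and not a rule engine; the fact representation, the individuals, and the function name apply_uncle_rule are assumptions for illustration only.

```python
# Facts as (property, subject, object). The individuals are hypothetical.
FACTS = {
    ("hasParent", "Mary", "Tom"),
    ("hasBrother", "Tom", "Bob"),
    ("hasParent", "Ann", "Sue"),
}


def apply_uncle_rule(facts):
    """Apply: hasParent(x, y) and hasBrother(y, z) implies hasUncle(x, z).

    Returns only the newly inferred hasUncle facts.
    """
    inferred = set()
    for prop1, x, y in facts:
        if prop1 != "hasParent":
            continue
        for prop2, y2, z in facts:
            # The shared variable y is what links the two conditions.
            if prop2 == "hasBrother" and y2 == y:
                inferred.add(("hasUncle", x, z))
    return inferred


uncle_facts = apply_uncle_rule(FACTS)
```

The rule joins two properties through a shared variable, which is the sort of thing the text means when it says OWL on its own is limited.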
Semantic Web Metadata:
Here are some global standard sets of metadata, created mainly by the W3C:
- W3C organizations ontology which is used to define organizational structures.
- W3C time ontology which is used to define time related structures.
- W3C RDF Data Cube vocabulary which is used to publish multidimensional data. They also say that it can be used to represent spreadsheets and OLAP cubes.
- W3C SKOS (Simple Knowledge Organization System) which is a common language for linking and sharing knowledge systems.
- FOAF (Friend of a Friend), which is a common language for linking people and information. (FOAF comes from the FOAF project rather than the W3C.)
- Dublin Core which is best described as metadata which turns the entire internet into a "library card catalog".