BLOG: Digital Financial Reporting
This is a blog for information relating to digital financial reporting. This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting. This is my brainstorming platform. This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.
Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.
Entries in Semantic web (7)
Comparing XML, XBRL and RDF: Initial Observations
I mentioned in another blog post that I was creating a prototype State Fact Book and using that to compare and contrast XML, XBRL, and RDF. This expands on my brainstorming in another blog post where I was trying to figure out which of the three (XML, XBRL, RDF) was "best". I expanded that to also include XHTML and iXBRL.
Most of the "raw data" for this comparison is located on this page, the State Fact Book Index. On this page there are 8 different sets of information. Each of those sets is made available in XHTML so you can easily see what the data looks like. I also made each available in XML. Three sets are available in XBRL (but those three sets contain 100% of the information for the other sets). The first set also contains an iXBRL (inline XBRL) example and an RDF example. I may, or may not, build out RDF, XBRL, and iXBRL for each set, but I really don't need to because I am seeing the things I need to see.
The Story of the Three Bears
Remember the Story of the Three Bears? I am seeing a similar story when I compare and contrast XML, XBRL, and RDF. Let me leave XHTML and iXBRL for later.
This is what I am seeing:
- XML is general purpose. You can use XML for many, many different types of information exchange needs. When you use XML though, you generally start from scratch unless you create some sort of framework to work within. I talk about how, for example, NIEM is building an XML framework in this blog post. The framework helps you be more efficient and effective. That is a good thing.
- RDF is powerful, can become complex, and is general purpose. You can do anything with RDF also. RDF is for doing something different than XML. RDF is for expressing semantic meaning, which is way more powerful than XML, which is a syntax. RDF is also somewhat of a least common denominator. If you wanted to get all the information in the world to be able to work together, you would use RDF to do that. In fact, that is pretty much the goal of the Semantic Web. Like I said, RDF is extremely powerful and once you start getting into making use of that power, it can become complex. RDF tools are not really usable by the typical business user.
- XBRL seems like a compromise between XML and RDF. XBRL is a general purpose language also, but its "general purpose" does not mean everything; the focus of XBRL is on business reporting. So, it is more of a general purpose business reporting tool. XBRL can become complex like RDF because it is quite powerful. It is not powerful in the same way that RDF is powerful. Again, you can do anything with RDF. XBRL is for business reporting, it is optimized for that scope. Software which works with XBRL at the syntax level is easier to use than RDF software, but in my view it will still be over the heads of the typical business user. But it is possible for business users to use XBRL tools to build quite sophisticated solutions using XBRL. It would be hard for me to believe that these same business users can figure out RDF tools. Personally, I struggle with RDF and RDF tools.
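To make the "syntax versus meaning" distinction a bit more concrete, here is a minimal sketch in Python. It takes one made-up state fact written as plain XML and restates it as the kind of subject-predicate-object statements RDF is built on. The element names and values are assumptions for illustration only; they are not the actual State Fact Book files, and this is not a real RDF toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical state fact in plain XML (element names are invented for this sketch).
XML_DOC = """
<states>
  <state name="Washington">
    <capital>Olympia</capital>
    <population>6724540</population>
  </state>
</states>
"""

def xml_to_triples(xml_text):
    """Restate each child element of a <state> as a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for state in root.findall("state"):
        subject = state.get("name")
        for child in state:
            # The tag name plays the role of the predicate, the text is the object.
            triples.append((subject, child.tag, child.text))
    return triples

triples = xml_to_triples(XML_DOC)
for triple in triples:
    print(triple)
```

The XML nesting is only one of many ways to write this down. The triples carry the same information without depending on that particular nesting.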
Making XBRL Less General Makes it Easier to Use
Like I said, XBRL is a general purpose tool for business reporting. Business reporting is a broad category and XBRL has features for many different business reporting use cases. But each use case will not need to leverage all the power of XBRL in all areas; different domains and use cases will need different pieces of XBRL. And making XBRL less general makes it easier to use. This is where application profiles are helpful.
While the average business user would never be able to use an XBRL tool which operates at the XBRL syntax level, applications built for more specific purposes can be very usable by the average business user. For example, from what I have heard, business users tend to struggle with XBRL software for reporting to the US Securities and Exchange Commission in XBRL. Extending the US GAAP Taxonomy can be challenging for business users. Building the XBRL instance can be challenging. A lot of companies outsource the entire process to third parties, thus losing the real value that XBRL could provide. For this reason, people see XBRL as a regulatory mandate, not for what it really is. I anticipate that this situation will come to an end when software applications are built specifically for US GAAP financial reporting using XBRL. That is all they will do; these will not be general XBRL tools. But because they are more focused, they will be easier to use.
So, you can get what I see as an order of magnitude increase in usefulness by turning general XBRL into one or more specific XBRL profiles and building specific applications for each profile. This can happen for two reasons. First, some of these domains are quite large. For example, US GAAP financial reporting is huge. The SEC filers number about 20,000. That is plenty large. Non-SEC filers who use US GAAP number somewhere between 8 million and 16 million private companies (these are the estimates I have heard).
The second reason is leverage. XBRL is an XML framework. For the same reason you would build an XML framework if you are making use of different forms of XML, you will get leverage when you use XBRL as a framework. XBRL does have well defined boundaries. Tools can stay inside those boundaries and provide a good base of functionality. Complexity can be absorbed by applications: really smart business users or technical users who understand a business domain can build profiles that other, less skilled business users can make use of. This is done by the profile constraining the base application.
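One way to picture a profile constraining the base application is as a simple allow-list. The sketch below is only that, a picture. The feature names and the profile are hypothetical and are not taken from the XBRL specification or from any regulator's filing rules.

```python
# Hypothetical feature names for "general" XBRL; invented for this sketch.
GENERAL_FEATURES = {"dimensions", "tuples", "footnotes", "extensions", "formulas"}

# A hypothetical application profile, modeled as the subset of features it allows.
EXAMPLE_PROFILE = {"dimensions", "footnotes", "extensions", "formulas"}

def profile_violations(used_features, profile):
    """Return the features a report uses that the profile does not allow."""
    return sorted(set(used_features) - set(profile))

# A report that stays inside the profile has no violations.
print(profile_violations({"dimensions", "footnotes"}, EXAMPLE_PROFILE))

# A report that uses a feature outside the profile is flagged.
print(profile_violations({"dimensions", "tuples"}, EXAMPLE_PROFILE))
```

An application built for one profile only has to handle what the profile allows, which is where the ease of use comes from.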
Not Either/Or
RDF is also an XML framework. There are many different RDF syntaxes, but there is one in XML. But the fact of the matter is that the syntax does not even matter. There are many things that business reporting needs from RDF which XBRL cannot provide and XBRL should never provide. There are things which XBRL has today that will eventually be built for RDF, such as rules. XBRL has business rules via XBRL Formula. RDF is known to need this: there are efforts like RuleML, and the W3C has an initiative to create a rules language.
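To give a feel for what a business rule is, here is a minimal sketch in plain Python. It is not XBRL Formula syntax. It just checks the familiar accounting relation that assets equal liabilities plus equity against a small set of reported values. The concept names and amounts are made up for illustration.

```python
def balance_sheet_balances(facts, tolerance=0):
    """Check the rule Assets = Liabilities + Equity within a tolerance."""
    expected_assets = facts["Liabilities"] + facts["Equity"]
    return abs(facts["Assets"] - expected_assets) <= tolerance

# Hypothetical reported values (whole currency units).
good_report = {"Assets": 1000, "Liabilities": 600, "Equity": 400}
bad_report = {"Assets": 1000, "Liabilities": 600, "Equity": 300}

print(balance_sheet_balances(good_report))
print(balance_sheet_balances(bad_report))
```

The value of having rules in the standard is that every piece of software runs the same check and gets the same answer.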
Semantic "Haves" and "Have nots"
I think the bottom line here is this: if you believe that Web 3.0, the Semantic Web, will exist (and from what I can see the evidence is mounting that it will, and that it will be very useful technology), then you will want to be someone who can use this new power.
XML isn't really a solution, it is more part of the problem because XML is really about syntax, not semantics. You can make XML "semantic" in many cases; that is what RDF is for. RDF is quite powerful and there will be many very sophisticated things created using RDF for business users. Most will not be created by business users themselves.
XBRL is a nice compromise. It enables a business person to leverage these new semantic technologies in ways they choose. It is somewhat like how the personal computer and the electronic spreadsheet empowered business users, setting them free from the IT department.
It is possible for a business user to use general XBRL. I can, I am a business user. I know others who can. Most business users will interact with XBRL and harness the power of semantic technologies via application profiles for more specific use cases making things easier and therefore useful. These applications will likely combine XML, XBRL, RDF, and other technologies.
Semantic Web utopia may never be realized, I think it is a vision, something to be strived for. But the Semantic Web is coming. In fact, it is already here and ramping up. Do you want to be a semantic have or semantic have not?
#######################
Making your XBRL Unambiguous: Clues from the Semantic Web
In order for your XBRL information to work on the Semantic Web or within your internal semantic web, or in any computer system for that matter, your data and metadata need to be unambiguous.
Before I get started here, I want to explain a few terms to business people. Business people need a working knowledge of these terms in order to understand what is important to making your systems work, to making your XBRL unambiguous.
Why is this Important?
You may have heard terms like "metadata" and "semantic web". But what do these terms mean and how do they relate to you? In his book Pull, David Siegel explains these two important terms and how they will change the Web. While the terms are defined in the book, what provides you the understanding are the countless examples of what having a "semantic web" will mean to you.
For anyone who lived through the beginning of the Web, to say there was hype surrounding the notion of how the Web would change life as we know it on planet earth is an understatement. However, you have to admit that a lot of things have changed. Just because there is hype does not mean that the Web is "empty", nor is it the case that "the Semantic Web" is empty. In fact as I understand it, the Semantic Web was Sir Tim Berners-Lee's vision of what the Web needs to be, the Web as we know it today is just an interim step in that direction.
Metadata
It has been my experience that technical people like to complicate the notion of "metadata". Perhaps they like to keep things mysterious. You can go search the Web for a definition, in fact here is an explanation of metadata on Wikipedia. I even hear techies use the term "meta-metadata"!
So what is metadata? Metadata is just data. It is just at a different level from what you normally think of as data. Metadata, like data, describes something. That is it. What is more important is to understand why metadata or data is important. Computers are not magical things. They can do magical things, but all this is enabled by the data and metadata which is provided by and linked together by humans. For example, if you have a list of files on your computer, you can only sort them in ways for which you have information about those files, the "data" or "metadata" about a file, such as the date you saved the file or the name of the file or the type of file. The more data or metadata you have, the more a computer can do with data.
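The file example can be shown in a few lines of Python. This is just a sketch: it creates two throwaway files in a temporary folder, collects the metadata the operating system keeps about them, and sorts by whichever piece of metadata you ask for. The file names and the helper names are made up.

```python
import tempfile
from pathlib import Path

def file_metadata(path):
    """Collect a few pieces of metadata the operating system keeps about a file."""
    stat = path.stat()
    return {
        "name": path.name,
        "type": path.suffix,
        "size": stat.st_size,
        "modified": stat.st_mtime,
    }

def sort_files(directory, by):
    """Sort the files in a directory by one piece of metadata (a key of file_metadata)."""
    records = [file_metadata(p) for p in Path(directory).iterdir() if p.is_file()]
    return sorted(records, key=lambda record: record[by])

with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "a.csv").write_text("x" * 100)  # larger file
    (Path(folder) / "b.txt").write_text("x")        # smaller file
    by_name = [r["name"] for r in sort_files(folder, "name")]
    by_size = [r["name"] for r in sort_files(folder, "size")]

print(by_name)
print(by_size)
```

If the metadata is not there (say, nobody recorded who the author of each file is), there is simply no way to sort by it.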
Semantic Web
Metadata and data are the foundation of the Semantic Web. David Siegel gives a very simple explanation of the Semantic Web by posing two simple questions:
- Is it unambiguous?
- Is it on the Web?
So now we have a number of other terms floating around here: semantic and unambiguous. If you are a glutton for punishment, you can go here and read about semantic. Fundamentally, semantic is about unambiguous meaning.
When I say Semantic Web, you can look at the scope of the "web" in a number of different ways. It could be the "Semantic Web", meaning open to the public on the Web; it could mean a "semantic web" available only within your organization; or it could just be some smaller subset of users in some sort of closed system, not open to everyone. The type of web makes no difference.
As David Siegel explains in his book,
Data that is semantic means exactly the same thing to any system or person who uses it.
That is the key to making data usable to a computer. You need to be unambiguous. This is not to say that everyone has to interpret the information in the same way, this is about consistency in the meaning of the data and metadata.
Making XBRL Unambiguous
So, if you are creating XBRL you want to be unambiguous. This does not mean unambiguous to you, it means unambiguous in general, to everyone. There are ways to test to see if your information is unambiguous. (As a business person you don't actually have to do these things yourself; you can ask the technical people implementing the system if they have done any of these.)
- Try to express it in RDF (Resource Description Framework)/OWL (Web Ontology Language). If you can, and it makes sense, then it is unambiguous.
- Try and express your information model in UML (Unified Modeling Language) and see if it makes sense. Again, if the UML model makes sense, then the data and metadata will likely make sense.
- Try and use the data. If the system works, then the data and metadata are unambiguous and logical. If not, then something needs to be fixed.
If you have not tested your systems there is a high probability that your system is not providing information which is unambiguous. It is the testing which helps you achieve this desired goal and makes your system usable.
This is particularly true when your system allows extensibility, where those submitting information can make any sort of adjustment to an XBRL taxonomy. Those extending your XBRL taxonomy have to be given guidance to create those extensions as you had anticipated and consistently with one another.
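As a very rough illustration of what one such test might look like, the sketch below flags any term that different filers have used with more than one meaning. This is a crude stand-in for real testing, the terms and meanings are invented, and it assumes each meaning has already been written down as comparable text.

```python
from collections import defaultdict

def ambiguous_terms(definitions):
    """Given (term, meaning) pairs, return the terms that carry more than one distinct meaning."""
    meanings = defaultdict(set)
    for term, meaning in definitions:
        meanings[term].add(meaning)
    return sorted(term for term, found in meanings.items() if len(found) > 1)

# Hypothetical definitions gathered from two filers' taxonomy extensions.
definitions = [
    ("Revenue", "sales net of returns"),
    ("Revenue", "gross sales"),
    ("Assets", "total assets"),
    ("Assets", "total assets"),
]

print(ambiguous_terms(definitions))
```

A term that shows up in that list means two different things to two different parties, which is exactly what guidance for extensions is supposed to prevent.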
This testing is best done during the time when you are architecting your XBRL taxonomy and other aspects of your system, not when your system is live. Waiting until your system goes live to do this testing may mean the need to re-architect your system.
Pull: The Power of the Semantic Web to Transform Your Business
I thought the readers of my blog would be very interested in the following book:
Pull: The Power of the Semantic Web to Transform Your Business
The book, written by David Siegel, explains the Semantic Web in business terms. The book has an entire chapter on XBRL. There are 12 reviews, 11 are "5 star" and 1 is "4 star". From one of the reviews:
"I was looking for something that explains the Semantic Web more from a strategic rather then a technical perspective. This book really helped me to understand how the Semantic Web can be applied. There are numerous real-live examples. From shipping products to health-care, tax, real estate, financial data (XBRL), search & security - everywhere you will find examples of businesses that already use or transition to the Semantic-Web."
The book has a companion web site which can be found here. One of the very interesting things about the web site is a series of "tours" of the Semantic Web. Entrepreneurs, managers, investors, and standardistas each have their own tour.
Like I said in a previous post, business professionals who don't understand what metadata is are at a disadvantage. The Semantic Web is important. It is strategic.
I want to point something out. XBRL has been referred to as "One of the more successful Semantic Web metadata formats." XBRL started 10 years ago. The US SEC and many other regulators around the world are "priming the Semantic Web pump", so to speak. You may not want to become an expert in XBRL or the Semantic Web. But it seems to me that at a minimum the "risk" that something is going on here is worth the price of the book and a few hours of reading it to see for yourself what is going down.
Continuing to Fiddle with RDF/OWL, Seeing Some Patterns
I am continuing to fiddle around with RDF/OWL. I updated my little index page which summarizes my brainstorming. I added an RSS feed to enable grabbing all of the ontologies with an application. In doing that a question that popped into my mind is whether the "RSS feed" really should also be an ontology or in RDF rather than in RSS. So, I did some checking. I don't know that I got this correct, but I created an ontology of ontologies. I will get to the correct approach, but if you look at this you should have the same questions that I had.
Another thing I tried to do is model all the pieces that I am seeing, which I did in this little box diagram. There are many things that I am seeing from that diagram which I will get to later in other posts. One primary thing I see is something which I touched on before, but now want to expand on a little.
There seems to be a "spectrum" of approaches to implementing a system for exchanging information:
- Something: Some approach, proprietary pieces and standard pieces, does not even have to be XML, it could be JSON based or CSV files, whatever.
- XML + Something: You could use XML and the old approach of a DTD, and a bunch of other proprietary or standard stuff.
- XML + XML Schema + Something: You could use XML and XML Schema plus some other proprietary and standard stuff.
- XBRL + Something: You could use XBRL and some other proprietary and/or standard stuff.
- XML-based Language + Something: You could use some other XML language and then add some other proprietary and/or standard stuff.
- RDF + Something: You could use RDF alone and then add some other proprietary and/or standard stuff.
- OWL/RDF + Something: You could use OWL/RDF and then some other proprietary and/or standard stuff.
This analysis may seem odd, but I see two things. First, each solution needs "something" more. The second is that the something can be a combination of standard and proprietary stuff. That "something" has a cost associated with it.
The truth is you probably have a multitude of systems which exchange information, maybe even one from every option in the spectrum. Well, that is what the Semantic Web is about, solving the problem of multiple formats and using the information from all the different systems as one big set of information. That grand vision is what RDF and OWL are for, the standard format.
So, what is that "something"? Think about it. Knowing that is really the $64,000 question. Knowing the answer helps one choose between the different options. What, you don't exchange business information with anyone? Really.
I will look at breaking down that "something" in later blog posts. Stay tuned...
A Contract for Meaning
I make it my business to understand how certain things work. What I have discovered is that if you understand how something works, then you have a better understanding of what you can do with it and what it will not work for. Back when the Web was growing in popularity (early and mid 90's), I took a year off from work to study the Web because I thought it was important and I did not have the time to make the investment that I felt I wanted to make while having a day job. That investment paid huge dividends for me in my career as a CPA.
Web 3.0, referred to as the Semantic Web, is a similar deal. There is a difference this time; I don't have to take a year off. I actually get paid to understand these sorts of things now. What I want to understand is exactly how XBRL fits into the Semantic Web.
Probably for five or more years I have been trying to truly grasp what is meant by "the Semantic Web" and how the Semantic Web will impact CPAs, internal and external financial reporting, and business in general. Plus I wanted to know how XBRL will fit into the Semantic Web. I cannot say that I am totally clear on the impact that the Semantic Web will have on business, but I am getting an idea. What is becoming clear is how and why the Semantic Web will work and XBRL's role.
Resources to Understand the Semantic Web
I have read a number of books on the Semantic Web. Two books in particular stand out, providing the most useful information. These two books are:
- Semantic Web for Dummies: This book provides a concise summary of key information in terms that a business reader can really relate to. It really focuses on the issue of exchanging business information between business systems.
- Programming the Semantic Web: Don't be scared off by the word 'programming' in this book. The book has an excellent big picture summary of the problems the Semantic Web is trying to solve and how it will actually solve those problems. It talks a lot about data modeling and has an excellent section titled A Contract for Meaning which is itself worth the price of the book. You can skip the programming related sections in the book.
A Contract for Meaning
Business users should be able to relate to the notion of a business contract. What does a contract provide? It documents understanding between parties for a certain business transaction. Having no contract at all can lead to misunderstandings and misinterpretations. Not always. A well written contract has fewer misunderstandings and misinterpretations, let's call those "issues". Those issues cost you time and money. If you don't agree, then you have to get expensive lawyers involved to sort things out. Not good, for anyone except the lawyers.
Misunderstandings and misinterpretations (i.e. issues) can come up with business information transactions also. Except that when the issues occur you don't call the lawyers, you call the IT department to solve the issue or more often you may just throw a person at the problem, that is called rekeying. Both the IT department and those people that do the rekeying cost time and money.
How do you know if you got the business information exchange contract correct? The system works. That is the ultimate test. You get consistent actions based on a set of data, the results you get are expected. People take this for granted, in fact I think that people SHOULD take this for granted, at least the business users should. But what makes it work?
What makes this work is that contract for meaning. How do you write that contract? Who writes the contract? One of the things that I learned about in business is that if you are doing an important contract then you want your lawyer to write the document. That way, you have better control of what goes into that business contract, the subtleties. When you are creating an information exchange contract, would these better be written by business people or IT people? If the IT people create that information exchange contract, how do you know that they got it right? Business people need to be able to understand this stuff.
XBRL Taxonomies: A Contract for Meaning
One form of writing an information exchange contract is an XBRL taxonomy. How many business people actually understand XBRL taxonomies? I mean really understand them, not look at them and glean a little information. Are you sure the XBRL taxonomy works the way you want it to work? How can you tell if it does?
XBRL and the Semantic Web
There are a lot of forms in which information is stored: relational databases, multidimensional databases, columnar databases like spreadsheets, hierarchical databases, graph databases, object databases, and so forth. Each type of database has its pros and cons, none is perfect, certain types are better in certain situations.
XML is a serialization or representation of information. XBRL, an XML language, is likewise a representation of information. XML is a web technology for exchanging information on the Web. It is not the only one, there are others.
Web of Information
Let's say you want to query information but the information is in two (or more) different types of databases in two different businesses or government agencies. That is what the Semantic Web is about, hooking all that information together. The Semantic Web is about having some least common denominator that lets the databases work together to answer your questions.
How is that done? That is what RDF (Resource Description Framework) does. RDF uses a technique which was also used by Aristotle (384 BC to 322 BC), where information is broken down into fundamental building blocks of "subject-predicate-object" relations. Aristotle's logic was used in philosophy long before computers even existed. These relations are expressed in the form of very flexible graphs. (Note that XBRL networks are also graphs.)
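Here is a minimal sketch of that idea in Python, using plain tuples rather than a real RDF library. Two hypothetical sources each contribute a few subject-predicate-object statements, the statements are pooled into one graph, and a question is answered that neither source could answer alone. The source names and predicate names are assumptions for illustration.

```python
# Statements from two hypothetical sources, each as (subject, predicate, object).
agency_a = [("Washington", "capital", "Olympia")]
agency_b = [("Olympia", "locatedIn", "Thurston County")]

# Pooling the statements gives one graph, no matter where they came from.
graph = set(agency_a) | set(agency_b)

def match(graph, s=None, p=None, o=None):
    """Return all triples matching the given parts; None means 'anything'."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Question: which county is the capital of Washington located in?
capital = match(graph, s="Washington", p="capital")[0][2]
county = match(graph, s=capital, p="locatedIn")[0][2]
print(capital, "is in", county)
```

This only works because both sources mean the same thing by "Olympia", which is the whole point about being unambiguous.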
Basically these subject-predicate-object graphs can be used to express any information. They are quite flexible. An ontology provides a precise meaning of the relations which can or must exist within these graphs, it is a way of controlling the graphs. These models are expressed using OWL (Web Ontology Language). Ontologies express formal rules. This is the contract, a contract for meaning. The purpose is to make it so different software performing specific actions get the same result. Fundamentally what this means is that if you have the right ontology then things work as expected, if you don't have the right ontology then things don't work. It is really as simple as that.
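A toy version of "the ontology controls the graph" might look like the sketch below. It says which predicates are allowed and what kind of thing may appear on each side, then reports statements that break those rules. To be clear about the assumptions: this treats the rules as hard constraints on a closed set of known things, which is simpler than what OWL itself specifies, and all the names are made up.

```python
# Hypothetical "contract": predicate -> (kind of subject, kind of object).
ONTOLOGY = {
    "capital": ("State", "City"),
    "locatedIn": ("City", "County"),
}

# Hypothetical declarations of what kind of thing each name is.
KINDS = {"Washington": "State", "Olympia": "City", "Thurston County": "County"}

def validate(triples, ontology, kinds):
    """Return a list of (triple, problem) for statements that break the contract."""
    problems = []
    for s, p, o in triples:
        if p not in ontology:
            problems.append(((s, p, o), "unknown predicate"))
            continue
        subject_kind, object_kind = ontology[p]
        if kinds.get(s) != subject_kind:
            problems.append(((s, p, o), f"subject is not a {subject_kind}"))
        elif kinds.get(o) != object_kind:
            problems.append(((s, p, o), f"object is not a {object_kind}"))
    return problems

triples = [
    ("Washington", "capital", "Olympia"),   # follows the contract
    ("Olympia", "capital", "Washington"),   # subject and object reversed
    ("Washington", "nickname", "Evergreen"),  # predicate not in the contract
]

for triple, problem in validate(triples, ONTOLOGY, KINDS):
    print(triple, "->", problem)
```

Two pieces of software given the same contract and the same statements will flag the same problems, and that consistency is what the contract is for.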
An XBRL Taxonomy is an Ontology
They are called taxonomies in XBRL, but they can really be a list, classification, taxonomy, or ontology depending what meaning you put in them. (To understand the difference see this blog post.) Frankly, the syntax (XBRL, OWL, whatever) is not really relevant. What is relevant is the meaning that is expressed and whether that meaning can be consistently interpreted by software applications. An XBRL taxonomy and the related XBRL instance could be expressed using RDF and OWL and the business meaning of either syntax should be exactly the same.
Why this is Important to Business People
Remember what I said about the information contract earlier? The part about the IT department getting involved and the rekeying. That means time and money, something that business people can relate to. OWL, RDF, graphs, and subject-predicate-object relations may be harder to relate to.
Building the right information models and understanding those models and how they work is important to business people. Partly because it helps you get the right information model. Another reason is because understanding how these sorts of things work and why they work can mean that you can use these tools strategically and/or tactically to make things better, faster, and cheaper.
XBRL is only one data format which will be part of the Semantic Web. Note the capital letters: like the Web, that means the public network. You will also have your internal semantic web or webs. Your internal webs will likely interact with the external public Semantic Web.
Everyone is a ".com"
I never really understood when people referred to a company as a ".com". I think that people realize that every organization is a ".com" and needs to leverage the power of the Web, whether you are public, private, government, not for profit, or some other type of organization. It can mean losing sales if your company shows up on page 5 of a Google search, as compared to page 1. Those who use semantic information can gain advantages over those who do not, be that information in XBRL or some other format.
If you thought that XBRL was only a regulatory mandate from the SEC or some other regulator, think again. Just as people eventually recognized that everyone was a ".com", they will likewise recognize that fitting into the Semantic Web has its advantages. The book Pull: The Power of the Semantic Web to Transform Your Business helps explain this. You can find more information here.
XBRL is part of this, in fact some refer to XBRL as one of the most successful Semantic Web metadata formats.