BLOG: Digital Financial Reporting
This is a blog for information relating to digital financial reporting. It is for innovators and early adopters who are ushering in a new era of digital financial reporting.
Much of the information contained in this blog is summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the three documents on this digital financial reporting page.
The following is a post I made to an XBRL news list which walks you through, step-by-step, extracting financial facts from SEC XBRL financial filings from public companies. The explaination provides a "happy path" which eliminates details to make the explaination easier. Extracting any fact works in this same manner, it just involves additional details. Use this prototype to work through this process if you desire. You can turn the debugger on and step through the process as it unfolds. Basically, extracting other information is a variation of this same process.
STEP 1: You need to know WHERE the information is (i.e. what companies, what filings).
The first thing you need to do is have a list of companies or know the list of companies you want to pull information for and which filings contain that information. So XBRL does not deal with this issue at all and is, in my view, a flaw in XBRL because you cannot express this and therefore every software tool consuming information works differently.
In the SEC's case, they provide an RSS feed thich contains lists of reoprting entities and the submissions of those entities. You can find that RSS feed here:
If you go into one of those RSS feeds (look at the XML, THAT is what a computer application will read; you can VIEW this via a browser).
And so, if you notice my Excel application; it eliminates that step by providing a workbook which contains all this information.
STEP 2: What information do you want?
The next thing is that you need to know WHAT information you are looking for. Because the US GAAP Taxonomy uses multiple XBRL elements to express the exact same financial reporting concept, you are forced to "map" the XBRL concepts to what you are looking for.
For example, the US GAAP Taxonomy uses the elements "us-gaap:LiabilitiesAndStockholdersEquity" and "us-gaap:LiabilitiesAndPartnersCapital" to represent the one financial reporting concept "LiabilitiesAndEquity" which I am using within my system. This is a pretty basic example, "Revenues" is a much more complex example as you pointed out. But, the issue is EXACTLY the same: you need to understand WHAT to look for in the SEC XBRL financial filing.
I did this for 51 concepts that I am looking for, see the VBA code. I did this the WRONG way in my code, basically hard coding it. I have subsequently moved the "tree" of IF statements into a database.
STEP 3: Which period, which piece of the entity, which reporting scenario?
The next thing you need to do is for each concept that you want, you have to get the correct fact within the correct context. For example, you have "current period" and "prior period" and "year to date" and "quarter to date". You also have "consolidated entity" and different subsidiaries or business segements or geographic areas and so forth.
The SEC did a VERY, VERY nice thing. They made it EASY to identify the "current period" and the root reporting entity. So you can leverage the way the SEC says the information needs to be structured and the reported concept "dei:DocumentPeriodEndDate" (which is basically the balance sheet date of the current period). From that you can find the YEAR TO DATE period of the income statement. And because of the way the "root entity" or consolidated entity is consistently articulated because of the SEC rules, yuo can leverage that consistency to get the "consolidated" information.
What may be confusing you here is that I am "taking the happy path" through the information. I ONLY want (a) the CURRENT period, (b) the CONSOLIDATED entity, (c) of the ROOT entity. (Clearly if you want EVERY period and EVERY entity breakdown and EVERY scenario this is more complicated, but the same general ideas apply.)
STEP 4: Impute missing information
Because filers inconsistently report facts, I impute or "plug" missing information based on existing information and known relationships. For example, not all filers report "us-gaap:Assets". If a filer, say, has only one asset, "us-gaap:Cash", many leave out the concept "us-gaap:AssetsCurrent", "us-gaap:AssetsNoncurrent" and "us-gaap:Assets". However, I know that for 100% of all companies who report a classified balance sheet that "Assets = CurrentAssets + NoncurrentAssets". That is universally true, that relation NEVER changes and the fact that 98+ percent of filers FOLLOW that rule PROVES that the rule is true. The other 2% have confirmed errors in their filing or do strange things which are beyond the capabilities of my algorithm to handle.
STEP 5: TEST, TEST, TEST
To be SURE that I am getting the correct information I run the information through a battery of 21 tests which are universally true about balance sheets, income statements, and cash flow statements. Again, these tests are played out in the 7160 SEC filings that I am looking at (10-K only, not the other forms). There is SOME variability that I have dealt with such as a classified versus an unclassified balance sheet and a single-step or multi-step income statement.
STEP 6: Populate Excel
That is it, put the result in Excel for each of the concepts that you want.
Again, like I have said, this is "the happy path" to a small set of information. Getting at any OTHER information is some variation on the steps above: what entity, what filing, what fact, what part of the entity, what period, what scenario. There are sometimes other characteristics; but these ideas work for EVERYTHING. I can PROVE that, it is a fact.
YES, the more you deal with the other variations, the more complicated the computer code necessary. Things can be made easier by improving the US GAAP Taxonomy and eliminating duplicate instances of the same information, better organizing the taxonomy to enable the ability to more easily map the concepts used to the information you want. The system can also be improved by better inbound validation by the SEC to eliminate errors in the information. The system can be improved by minimizing the need to imply things because a filer did not explicitly make it available. But, these issues do exist now so they need to be dealt with.
All of the above is reflected in the VBA code shown in this prototype and in the working Excel application which you can run, turn your debugger on if you want, and WATCH the entire process unfold in front of your own eyes:
Does this get you to where you want to be? If not, tell me what I am missing and I will attempt to fill in those gaps.
Much of this stuff can be simplified by using a web service or someone who sorts this out for you. Data aggregators do this including the mappings, that is HOW they get that information. There is no magic, only hard work. That is the "high end". Other web serivces such as the XBRL Cloud web service don't provide the "final" interface, but they do all the "stuff" necessary to get you the information you want, you build your own interface.
The GOOD news is that you can now get the financial information of thousands of companies in minutes. Think of how long this process took three years ago when this task was either (a) 100% manual or (b) the parsing necessary was so complicated and cost prohibitive that only big companies could afford to do it. Maybe this is not perfect, but it works and is useful. In time, things will get even easier!
Accounting research just got a whole lot easier because of XBRL. You should start seeing commercial quality software which will enable accountants and other business users to perform tasks with far less effort than in the past in the near future.
How do I know? Well, because I built a working prototype of such software. The three videos below will walk you through what that software does. I have been maintaining a tool which I used to analyze SEC XBRL financial filings. What I could do was really interesting, but very limited due to my programming skills.
But now, I reconfigured much of my tool to use the XBRL Cloud Web Service which (a) replaces all of the infosets which I was working with and (b) provides independent renderings of EVERY SEC XBRL financial report component.
Way back in 1999 when we first started down the XBRL trail, I had created prototypes which showed how XBRL could be used to compare financial information across financial reports. I compared that to the very popular AICPA product Accounting Trends and Techniques. I have tried over, and over, and over to make this work how I wanted it to work. Well folks, this dream is now reality!
As I see it, everything else is about "the details." Does my prototype work perfectly? No, pretty much every aspect of this needs improvement. Is the data public companies are providing perfect? No, lots of room for improvement there also. But fundamentally, this stuff works and if you look closely you can get a glimmer of what financial reporting will look like in the future. Digital financial reporting does have its advantages.
Check out the videos and screen shots below and decide for yourself if XBRL might be useful to accountants after all:
- General overview: Provides a general walk through of the "Financial Reporting Trends" working prototype accountant's analysis tool.
- Closer Look at Filtering Entities and Disclosures: Focuses on showing you how you can filter the long list of entities into different helpful groupings. Also, shows how disclosures are organized in various ways to make working with them easier.
- Closer Look at Viewing Financial Report Information: Focuses on showing you what information you have to work with once you get to the component you want to have a look at within a reporting entity's financial report.
- Closer Look at Fundamental Accounting Information Extract and Working With Components: Focuses on showing you how a set of 51 fundamental accounting concepts are extracted from SEC XBRL financial filings and how a set of 21 tests are used to make sure the relations between those concepts are tested to be sure the information retrieved is correct. This Excel version of this prototype lets you try out extracting fundamental accounting information from SEC XBRL financial filings. Looking at the code helps you grasp how to do it.
- Screen shots: This is a set of screen shots to let you take a closer look at what my working prototype does.
And yes, all this stuff is in one application. The application was built using Microsoft Access which is a rapid prototyping tool. As I said above, a lot of the heavy lifting is done using the XBRL Cloud Web Service. My tool was NOT built to be a product. You could probably look at the tool I created and find 50 different useful products. Literally.
I am not trying to win any awards with the video, just trying to quickly show what the working prototype does and learn better how to communicate this information so keep that in the back of your mind. The prototype is helpful in showing what XBRL can enable and more importantly what it takes to make digital financial reporting work.
A commercial tool which lets you do some of what I can do right now is Calcbench. Note their tools for auditors. Tools for M&A and benchmarking. Tools for academia. Others will likely create other useful tools for accounting research.
My first pass yielded an information accuracy rate of 95%. The next pass pushed that up to 96.6%. The next realized 97.3%. Focusing only on commercial and industrial companies, basically creating sub categories but focusing on that primary category, yielded 98.4% accuracy. These incremental improvements were realized by tuning my algorithm and adding additional metadata used by that algorithm.
All this work ultimately resulted in a 98% overall accuracy rate for the extraction of fact values for 51 fundamental accounting concepts proven by the 21 relations between those facts expressed within SEC XBRL financial reports. This graphic provides a summary of these results:
All this work also resulted in a more precise understanding of why SEC XBRL financial filings to not pass the 21 tests which I have specified and ultimately in an inability to reuse information from an SEC XBRL financial filing using automated processes. That is the goal: robust, reliable, predictable automated reuse of the information.
And while I do want to be able to reuse the SEC XBRL financial information, the over-arching goal is an understanding of how to build systems which make use of XBRL-based information which are robust, reliable, predictable and the information within the system can be reused dependably. Without that, what good is the system?
If you have not tried it already, experiment with the Excel-based prototype for extracting information using my algorithm for grabbing SEC XBRL financial information. Also, you can see the algorithm which I am using.
The following is a summary of the specific issues which I have encountered which cause information reuse issues:
- Discovery of the root reporting entity for a small number of filers (about 58)
- Significant variability in the concept used by SEC filers to report revenues.
- Lack of clear totals for the sub categories which make up operating income (loss).
- Lack of clear totals for distinguishing between nonoperating income (loss) and interest and debt expense.
- Filers crossing categories of fundamental concepts (for example, including the total of one category within another category).
- Inappropriate extension of these high-level concepts.
By narrowing the focus of my analysis to commercial and industrial companies only (i.e. excluding banking and savings entities, insurance companies, brokers/dealers, real estate) and dropping two tests which I just cannot get get a high pass rate because of the variability within SEC filings; I was able to achieve a 98.4% accuracy rate. Basically, 4,048 of 7,160 filings fit this criteria (57% of total filings). In addition to that narrower set of 4,048 filings passing 98.4% of the tests, a total of 1,113 filings pass 100% of the tests.
Now, narrowing the testing to commercial and industrial companies seems very reasonable. All I am saying by this is that if the total of 7,160 reporting entities were broken into more specific groups of reporting entities and targeting my information extraction algorithm to specific groups of reporting entities the algorithm will be more accurate. Clearly it is easy to grasp why this is true. For example, banking and savings entities report their revenues in different ways than commercial and industrial companies.
However, I don't know whether I should just be discarding two tests which I know should work. The two tests are the following:
- Gross Profit = Revenues - Cost Of Revenue (IS1)
- Operating Income (Loss) = Gross Profit - Operating Expenses + Other Operating Income
The reason that SEC filings don't pass these tests is not that they don't follow the spirit of these equations. Rather, they don't pass because I cannot find the facts which report that fundamental concept. For example, Revenues is found for only 5,640 of the filers which is 79% of the total. That is because of the variability of the US GAAP Taxonomy concepts used to express this fact. I speculate that I can probably improve the success rate if I focused on trying to find more of the concepts used. But, that may not be true.
Also, if I eliminate developing stage companies and companies which have going concern qualifications, then I get 97.3% accuracy for the overall set of 7,160.
Hey software developers! Want to create a nice little business? Build tools which help accountants do digital financial reporting, such as SEC XBRL financial reports. There is no need to even understand XBRL. XBRL Cloud has a nice little web service which hides XBRL in the background. There are other XBRL processors out there such as Arelle (open source), Semansys, ReportingStandards, CoreFilings, and others. Pick your poison.
Me, I used XBRL Cloud to create this working prototype tool using Excel. Granted, I am not a programmer (i.e. that is why I am encouraging you to consider building tools). But take a look at what I was able to do in a matter of about 10 minutes using my tool: compare the balance sheets of the most current 10-K (on the LEFT side) of a reporting entity with the prior 10-K (on the RIGHT side). Here is the result:
- Apple: Note how they went from an extension concept to a US GAAP taxonomy concept for PPE.
- Boeing: They also went from an extension to a US GAAP taxonomy concept, a few other interesting things to notice.
- Home Depot: Moved the detail of PPE off the balance sheet, likely to the disclosures. Tuned their inventory concept.
- IBM: Went from using an explicit [Table] to using an implied table.
- Intel: Looks like they have no accrued taxes in the current period.
- McDonalds: Changed their network identifier for some reason.
- Microsoft: A few changes.
- Walt Disney: Interesting line item, Television Costs.
Easy to use, nothing really technical about this, most of XBRL is hidden in the background. Accountants can relate to this.
Grab my Excel prototype, it works and all the code is there. Clearly my code is not the best (not much error checking, nothing sophisticated, perhaps not efficient). If you even just improve this working prototype to make it more stable, make it work for ANY SEC filing, make the comparison work to compare a pre-submission filing, or even better; compare the RENDERING (i.e. the model structure AND the facts not at the syntax level but rather at the SEMANTIC level) you will have created a very, very useful product.
Accountants need these niche tools for specific tasks like this. Sure, business report creation tools are important but we need utility tools like this also. And this comparison utility is just one idea; I have hundreds of others.
Digital financial reporting is just getting started, the window of opportunity is open. For some things, sure, you need to understand XBRL. For others, it is far more important to understand the accounting aspects. Don't understand accounting? Ask an accountant.
If you create anything interesting please let me know.