BLOG: Digital Financial Reporting
This is a blog for information relating to digital financial reporting. This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting. This is my brainstorming platform. This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.
Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.
Entries from March 1, 2013 - March 31, 2013
Summary of SEC XBRL Financial Filing Verification/Validation Results
The following is a summary of verification/validation results for a set of 7,199 SEC XBRL financial filings. All filings are 10-Ks for the fiscal year 2012 filed with the SEC between March 1, 2012 and February 28, 2013.
Here is some information helpful in understanding these validation/verification results:
- XBRL technical syntax. This information comes from the XBRL Cloud EDGAR Dashboard. Basically 99.9% of all SEC XBRL financial filings in the set of filings I looked at were valid per the XBRL technical syntax specification. What is a little odd is that there were 8 filings which were not valid. That means there are some XBRL interoperability issues: the XBRL validator(s) used by the SEC for inbound validation did not detect these 8 filings as invalid against the XBRL specification, while XBRL Cloud believes that they contain XBRL technical syntax errors. I validated each one against another XBRL processor and, sure enough, that processor also reported errors in each of the 8 filings. As such, I concur with XBRL Cloud's assessment. (I also posted a message to the XBRL Specification Working Group informing them of these errors. Hopefully a conformance suite test will be added for each of the offending errors, which would improve consistency between XBRL processors.)
- EDGAR Filer Manual (EFM) automatable rules. Again, this information comes from XBRL Cloud. Why would XBRL Cloud's EFM validation differ from SEC EFM validation? Mainly because of different interpretations of the rules by software vendors (i.e. the ones with the errors), the SEC, and XBRL Cloud. It would be very helpful if all software vendors and the SEC reported the same results. This is getting better, but still has a way to go given that almost 20% of all filings have EFM rule violations. And these are only the automatable rules.
- US GAAP Taxonomy Architecture Model Structure. These rules relate to the relationships between [Table]s, [Axis], [Member]s, [Line Items], [Abstract]s, and Concepts within an XBRL taxonomy. They also relate to a degree to differences (i.e. ambiguity) between presentation, calculation, and definition relations. These relations are explained in the US GAAP Taxonomy Architecture (see section 4.5, Implementation of Tables). Of all the 10-K filers, 98% follow these rules.
- US GAAP Domain Level Rules. These are a small set of rules (i.e. there are many more) which every SEC XBRL financial filing must follow: assets must be reported, liabilities and equity must be reported, assets = liabilities and equity, equity must be reported, net income (loss) must be reported, and net cash flow must be reported. Why are these rules true? Well, given that 96% of all filers follow these rules, and that every filing I looked at which did NOT follow a rule clearly should have, it is pretty safe to say that all filers should pass these rules. That is what "domain level" means: US GAAP specifies these sorts of things. Ever hear of the accounting equation: Assets = Liabilities + Equity?
- Balance sheet roll up rules missing. Balance sheets have "assets" and "liabilities and equity". Balance sheets also balance. "Assets" adds up, or "rolls up". Same for "Liabilities and equity". 93% of all filers provide business rules in the form of XBRL calculations which are used to prove that the roll ups are working correctly. 7% did not. If the rules are provided, then automated processes can be used to prove that the line items of the balance sheet roll up. If they are not provided, you can only resort to manual effort to make sure things add up. A clue as to which process might work best: automated processes save time and money.
- Income statement roll up rules missing. As with the balance sheet, the income statement rolls up. It foots. Part of the reason why there are so many missing XBRL calculation relations is that filers put numbers in their XBRL filings backwards because they try to get the polarity (positive or negative) to match the presentation. I speculate that they give up in frustration or believe that they just cannot get them to add up correctly. A clue that you CAN make these add up is that about 80% of all filers seem to be able to provide these XBRL calculation rules.
- Cash flow statement roll up rules missing. Again, same deal for the cash flow statement; the line items add up and filers seem to be having problems with the polarity.
This is just a small set of the automatable rules which one can test for. Why would you do that, seems like more work? Well, it is actually less work. Automating the testing of correct modeling, computations, and other rules and relations makes it easier to prove to yourself that you did a good job creating your SEC XBRL financial filing.
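To make the idea of automated rules concrete, here is a minimal sketch in Python of the domain level rules and a roll up check. This is only an illustration, not the XBRL Cloud implementation and not how any XBRL processor does it. The fact keys (such as "assets" and "net_cash_flow") and the amounts are made up for the example; a real filing would use taxonomy concept names, and the weights would come from the XBRL calculation relations in the filing.

```python
from decimal import Decimal

# Hypothetical fact keys for this sketch; real filings use taxonomy concept names.
REQUIRED = ["assets", "liabilities_and_equity", "equity", "net_income", "net_cash_flow"]


def domain_rule_failures(facts):
    """Return a list describing each domain level rule the facts fail."""
    failures = [f"{name} not reported" for name in REQUIRED if name not in facts]
    if "assets" in facts and "liabilities_and_equity" in facts:
        if facts["assets"] != facts["liabilities_and_equity"]:
            failures.append("assets != liabilities_and_equity")
    return failures


def rolls_up(facts, total, weighted_parts):
    """True if the weighted parts sum to the reported total.

    weighted_parts maps a fact key to +1 or -1, playing the role of the
    weight on an XBRL calculation relation.
    """
    computed = sum((facts[key] * weight for key, weight in weighted_parts.items()), Decimal(0))
    return computed == facts[total]


# Made-up amounts, for illustration only.
facts = {
    "assets": Decimal("1000"),
    "current_assets": Decimal("400"),
    "noncurrent_assets": Decimal("600"),
    "liabilities_and_equity": Decimal("1000"),
    "equity": Decimal("300"),
    "net_income": Decimal("50"),
    "net_cash_flow": Decimal("-20"),
}
print(domain_rule_failures(facts))
print(rolls_up(facts, "assets", {"current_assets": 1, "noncurrent_assets": 1}))
```

The point of the sketch is that once the rules are written down, checking them costs nothing; without the rules, someone has to add up the columns by hand.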
As the set of automated rules grows, the quality of SEC XBRL financial filings will improve. Why? Click here.




Correlation Between Quality and Software Shown by Missing Balance Sheet Rollups?
I believe that I am seeing my first definitive correlation between the quality of an SEC XBRL financial filing and the software used to create that filing.
What I did was look at all 7,199 SEC XBRL 10-K filings which I have been analyzing to find how many filers provide XBRL calculations for their balance sheets. All SEC filers provide balance sheets. All balance sheets balance; that is "the accounting equation". The XBRL Cloud EDGAR Dashboard shows that this is true (see the column Domain Level Rules).
Well, balance sheets also foot. Assets foot. Liabilities and equity foot. And so I ran a test against all 7,199 SEC XBRL financial filings, the 10-Ks in my set, and this is what I found:
- Of the 7,199; a total of 6,679 (92.8%) do provide XBRL calculations for the assets and liabilities & equity roll ups.
- Of the 7,199; a total of 520 (7.2%) do not.
But wait, this is the interesting part. There seems to be a pattern in the software used to create SEC XBRL financial filings. This is a breakdown of the same information by software used to create the SEC XBRL financial filing: (This Excel spreadsheet has the detailed results.)
There are two important things that I see in this data. First, notice all the software vendors with zero missing XBRL calculations relations. That means consistency. Nothing falls through the cracks.
The second thing to notice is the high rates for a number of software vendors as compared to the rest of the pack. Reach your own conclusions about what this means. What it means to me is this: Why would there be any statistical correlation between whether an SEC filer needs business rules expressed and the software used to create the SEC XBRL financial filing?
This is, I believe, the first clear evidence that I have seen which shows a correlation between software and SEC XBRL financial filing quality. I speculate that there will be more such correlations revealing themselves in the future as people dig into the filing information more.
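For anyone who wants to reproduce this kind of breakdown, here is a rough sketch in Python of how the rate of missing balance sheet calculations could be computed per software product. This is not how I produced the numbers above. The function name, the input shape, and the "Vendor A" and "Vendor B" rows are all made up for illustration; the real results are in the spreadsheet.

```python
from collections import defaultdict


def missing_rate_by_software(filings):
    """filings: iterable of (software_name, has_balance_sheet_calculations).

    Returns a dict mapping each software name to the fraction of its
    filings that are missing balance sheet calculations.
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for software, has_calculations in filings:
        totals[software] += 1
        if not has_calculations:
            missing[software] += 1
    return {software: missing[software] / totals[software] for software in totals}


# Made-up sample rows, for illustration only.
sample = [
    ("Vendor A", True),
    ("Vendor A", True),
    ("Vendor B", True),
    ("Vendor B", False),
]
rates = missing_rate_by_software(sample)
print(rates)

# The overall rates quoted in this post, recomputed from the counts.
overall_missing = round(520 / 7199 * 100, 1)
overall_provided = round(6679 / 7199 * 100, 1)
print(overall_provided, overall_missing)
```

A vendor with a rate of zero is one where nothing falls through the cracks; a vendor with a rate well above the overall 7.2% is worth a closer look.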
What do you think?
What I think is that this is another clue that a financial report quality model will emerge.




Prototype Grabs Expanded Set of Reported Facts from SEC XBRL Financial Filings
I cherry picked from a set of 7,199 SEC XBRL financial filings, all of which are 10-Ks filed for fiscal years ended in 2012 (approximately). You can get to this data set here in HTML, or you can download the data in Excel here.
My data set includes 2,213 filings, about 31% of the total. The criteria which I set for picking the filings were as follows: (these criteria were arbitrarily set by me to improve the chance of all the numeric relations both working and PROVING to myself that they work using automated processes)
- The filer had to have core financial integrity correct meaning that I had to be able to find assets, liabilities and equity, equity, net income, and net cash flows in their filing.
- The filer is a commercial and industrial company, meaning that they report current assets and current liabilities.
- The filer reported total liabilities.
- The filer reported cash flows from operations, investing, and financing activities.
- The balance sheet balances.
- On the balance sheet, liabilities + commitments and contingencies + temporary equity + equity = assets.
- On the cash flow statement, operating cash flows + investing cash flows + financing cash flows + cash flows from discontinued operations + exchange gains/losses = net cash flow.
- On the income statement, income from continuing operations before taxes - income taxes + income/loss from discontinued operations + extraordinary items = net income.
The 2,213 filings that you see on the HTML page and in the Excel spreadsheet made that cut. There are links to both the SEC filing and to the XBRL Cloud Free Viewer so you can go look at the financial statements if you want.
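The three numeric relations in the criteria above can be expressed as a small filter. What follows is a sketch in Python under stated assumptions, not the Microsoft Access queries I actually used. The fact keys are hypothetical names chosen for the example, the amounts are made up, and treating unreported optional items (such as temporary equity or extraordinary items) as zero is my assumption for the sketch.

```python
# Hypothetical fact keys chosen for this sketch; they are not taxonomy concepts.
REQUIRED = (
    "assets", "liabilities", "equity",
    "operating_cash_flow", "investing_cash_flow", "financing_cash_flow",
    "net_cash_flow", "income_before_taxes", "income_taxes", "net_income",
)


def passes_numeric_relations(f, tolerance=0):
    """True if the required facts are reported and all three relations hold."""
    if any(key not in f for key in REQUIRED):
        return False

    def opt(key):
        # Items a filer may not report are treated as zero (an assumption).
        return f.get(key, 0)

    balance_sheet = (f["liabilities"] + opt("commitments_and_contingencies")
                     + opt("temporary_equity") + f["equity"])
    cash_flow = (f["operating_cash_flow"] + f["investing_cash_flow"]
                 + f["financing_cash_flow"] + opt("discontinued_cash_flow")
                 + opt("exchange_gains_losses"))
    income = (f["income_before_taxes"] - f["income_taxes"]
              + opt("discontinued_operations") + opt("extraordinary_items"))
    return (abs(balance_sheet - f["assets"]) <= tolerance
            and abs(cash_flow - f["net_cash_flow"]) <= tolerance
            and abs(income - f["net_income"]) <= tolerance)


# Made-up amounts, for illustration only.
filing = {
    "assets": 1000, "liabilities": 600, "temporary_equity": 50, "equity": 350,
    "operating_cash_flow": 120, "investing_cash_flow": -80,
    "financing_cash_flow": -30, "exchange_gains_losses": -2, "net_cash_flow": 8,
    "income_before_taxes": 100, "income_taxes": 30,
    "discontinued_operations": -5, "net_income": 65,
}
print(passes_numeric_relations(filing))
```

The tolerance parameter is there because a real data set may contain rounding differences; whether to allow any is a judgment call.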
What does this mean? Well, I am simply experimenting so I don't know that there are any definitive conclusions; but I think that the fact that I can pull all this information from those filings is a positive thing.
Granted, I cherry picked the SEC filers. This is the "happy path" through the data. I am avoiding filings which have errors or which have odd reporting situations. I am sure I can get more filings which pass all the rules; for example, if I look at filers who report but don't use a classified balance sheet, I speculate I would get a longer list. (Well, in fact, just to check I ran this query and I got 3,272 filers which do not have classified balance sheets but do pass all the other rules; that is 46% of all filers.)
One area where I am not able to effectively pull information from is the income statement above the line "income from continuing operations before taxes". I will look into that next perhaps. If I can do that I will be able to get high level financial information from the balance sheet, income statement, and cash flow statement. Again, I am not using an XBRL processor. My tool is simply a Microsoft Access database.
Play with the data. If you find anything interesting let me know. If you have any ideas for experiments to run, let me know that also.




Finding Revenues in SEC XBRL Financial Filings
Looking at how one would find the reported fact which represents total revenues within an SEC XBRL financial filing helps you understand finding reported facts more generally.
I took a look at how each of the 30 reporting entities which make up the DOW reported revenues and this is what I found. Note that I am looking at the 10-K reported for fiscal years ended in 2012 generally.
Of the 30 reporting entities which make up the Dow, revenues was very easy to determine using automated processes for 27 of those entities. These are the exceptions:
- American Express was the only member of the DOW which did not report one total amount which represented total revenues. If you look at the American Express income statement you will see that AMEX reported total non-interest revenues and total interest income, but did not provide a total for the two. For contrast, look at the JP Morgan Chase income statement; they had basically the same situation as AMEX but JP Morgan Chase DID provide total revenues. It is much harder to be sure that you got the correct revenues fact if the reporting entity does not provide total revenues using a concept such as "us-gaap:Revenues" or "us-gaap:SalesRevenueGoodsNet". For contrast, notice how consistently easy it is to find reported facts such as Assets, Liabilities and Equity, Equity, and Net Cash Flow.
- Du Pont was one of two reporting entities which created an extension concept to express total revenues. If you look at the Du Pont income statement you will see that they created the extension concept "dd:TotalNetSalesAndOtherIncomeNet". There is no better way to make discovering a high level concept such as revenues impossible than by creating an extension concept. Again for contrast, General Electric had virtually the same reporting situation and did not find the need to create an extension concept. I did not look into whether the extension by Du Pont was appropriate or not.
- EXXON was the other reporting entity which created an extension concept for total revenues. If you look at EXXON's income statement you will see that they reported the line item "Total revenues and other income" using the extension concept "xom:TotalRevenuesAndOtherIncome". Again, I point out that General Electric did not find this need.
With regard to Du Pont and EXXON either one of two things must be true: (a) there is a concept missing from the US GAAP Taxonomy which should exist which they could have used or, (b) Du Pont and EXXON could have made a better concept selection choice.
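Here is a minimal sketch in Python of the kind of lookup I am describing, to show why extension concepts get in the way. It is an illustration only, not my actual extraction process. The candidate list contains just the two concepts mentioned above (a real list would need more), the function names are made up, and the amounts are placeholders rather than reported values.

```python
# Only the two concepts named in this post; a real list would be longer.
STANDARD_REVENUE_CONCEPTS = ["us-gaap:Revenues", "us-gaap:SalesRevenueGoodsNet"]


def find_revenues(facts):
    """Return (concept, value) for the first standard revenues concept found,
    or None if the filing reports none of them."""
    for concept in STANDARD_REVENUE_CONCEPTS:
        if concept in facts:
            return concept, facts[concept]
    return None


def extension_concepts(facts, standard_prefix="us-gaap:"):
    """List the concepts which do not come from the US GAAP Taxonomy."""
    return sorted(c for c in facts if not c.startswith(standard_prefix))


# Placeholder amounts, not reported values.
standard_filer = {"us-gaap:SalesRevenueGoodsNet": 500}
extension_filer = {"xom:TotalRevenuesAndOtherIncome": 1}

print(find_revenues(standard_filer))
print(find_revenues(extension_filer))
print(extension_concepts(extension_filer))
```

For the filer using a standard concept the lookup just works. For the filer using an extension concept the lookup comes back empty, and a human has to go figure out which extension concept, if any, means "total revenues".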
Perhaps EXXON chose to create an extension concept because it does something that the US GAAP Taxonomy does not seem to provide for. EXXON also created an extension concept for the line item "Sales and other operating revenues" (xom:SalesAndOtherOperatingRevenueIncludingSalesBasedTaxes). Then EXXON deducts "sales-based taxes" in the expense section. Comparing this with a few other reporting entities in the same industry:
- Chevron does not use this approach.
- Phillips does not either.
- Marathon DOES use this same approach for excise taxes, however they do NOT create an extension concept for their revenues line item (EXXON did), but Marathon, like EXXON, does create an extension concept for total revenues "mpc:TotalRevenuesAndOtherIncome".
- Hess excludes excise taxes from revenues and uses the same approach as Chevron and Phillips.
Based on the facts that I see, it appears to me that EXXON could be right to create an extension concept for total revenues and a concept is missing from the US GAAP Taxonomy for this type of situation. The expenses concept appears to exist. If the revenues concept existed it would be easy enough to write an algorithm to net these two numbers and create comparable revenues numbers for all reporting entities, even if they use different approaches.
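The netting algorithm really would be simple. Here is a sketch in Python, assuming both pieces (revenues including sales-based taxes, and the sales-based taxes themselves) could be reliably identified in the filing. The function name and the amounts are made up for illustration.

```python
def comparable_revenues(revenues, sales_based_taxes_in_revenues=0):
    """Net any sales-based taxes out of reported revenues.

    Filers which already exclude these taxes from revenues pass nothing
    for the second argument, so their revenues are returned unchanged.
    """
    return revenues - sales_based_taxes_in_revenues


# Made-up amounts for two filers with the same underlying sales.
gross_filer = comparable_revenues(1_100, sales_based_taxes_in_revenues=100)
net_filer = comparable_revenues(1_000)
print(gross_filer, net_filer)
```

The hard part is not the arithmetic; it is knowing which approach a filer used, and that is exactly what a well chosen concept would tell you.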
EXXON and Marathon are using different ways to express the same accounting approach using XBRL. Both of these companies might have better explained why they are doing this in the documentation for the concepts added to their taxonomy.
What this does show is the power of XBRL to handle variety such as this. This is how US GAAP works (allowing variety such as this), this is how XBRL was designed, and this flexibility can work. US GAAP is not a random free-for-all, but it does allow for different approaches, all of which are allowed. US GAAP is flexible. Financial analysts understand all this sort of stuff and make adjustments to their financial models for such different accounting approaches.
The US GAAP Taxonomy, SEC filings, and data extraction algorithms will all evolve. Eventually information extraction will work, it will work predictably, and automated information reuse will become a reality; not just for revenues. This revenues example is just that, an example. This type of situation exists for many other financial report line items and disclosures.
These types of situations just need to be uncovered and dealt with. Also, this example points out that not all extension is bad. Extension can point out missing concepts from the US GAAP Taxonomy. An extension such as that provided by EXXON can make important differences stand out. What is best? (a) automating information extraction without knowing this difference or (b) forcing those trying to extract information to recognize this difference and properly adjust for it?
What do you think?




Two Prototypes Using SEC XBRL
(Note that I updated the S&P 500 information to correct an issue where I was not finding "Equity" for a handful of reporting entities. This correction is reflected in the current data set.)
As part of some other things that I am doing and some experimentation I have created two prototypes of using SEC XBRL information provided in 10-Ks. I am only trying to use very basic financial information (assets, equity, revenues, net income, net cash flow), but I think these are helpful in seeing the possibilities.
Here are links to the two prototypes:
Information was extracted from SEC XBRL financial filings without the help of an XBRL processor. What I am trying to do is see how reliably I can extract very basic financial information. Places where I am having extraction issues are clearly indicated. What is NOT indicated as well is where I am pulling the wrong information. For example, it is hard to know for sure if I am getting the "revenues" numbers correctly because of the way filers put this fact in their SEC XBRL filings. I know that revenues for American Express Company is incorrect because they break out non-interest and interest revenues and do not provide a total (i.e. most companies provide a total for revenues using a common set of concepts).
Uncommon uses of [Axis] cause other issues. For example, while most filers use the legal entity [Axis] and indicate the legal entity as either the consolidated entity or parent holding company; this SEC filer does something radically different, (a) they use the name of their company as the value of the legal entity [Axis], but complicating things even more they (b) do not make this the default dimension. This sort of inconsistency makes using the data much more complicated and increases the risk of picking the wrong information to use. This filer does something slightly different. Personally, I see these sorts of inconsistencies as unnecessary, and they clearly increase the risk of automating the reuse of the information.
The first prototype; the summary information for the Dow 30, Fortune 100, and S&P 500; shows the error rate to be fairly low for these key pieces of information. For the Dow, there are 150 pieces of extracted information (30 companies times 5 data points) with only one occasion where I could not find the fact which I was looking for. General Electric chose to muck up the works by providing an extension concept for "net cash flow". I am NOT saying that all the numbers are 100% correct. That is a lot of work to test and I am not to that point yet. But, finding things which seem to work is a very good first step to achieving the XBRL vision of reusing the information.
Likewise, the error rates for the Fortune 100 and S&P 500 are fairly low. I calculate a 0.6% error rate for the Fortune 100 and a 1.9% error rate for the S&P 500. Not bad, but again; any error rate of more than 0% will yield a less than satisfactory result. There were some other issues relating to the SEC RSS feed which showed themselves from trying to use the S&P 500 information. A number of filings don't seem to show up in the SEC RSS feed. Not sure why, but this is sure annoying. Also, I have some duplication of some companies. I have not yet gotten to the bottom of how that was caused, still working on that.
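For clarity, this is what I mean by "error rate": facts I could not find divided by facts I looked for. A tiny sketch in Python using the Dow numbers given above (30 companies, 5 data points each, 1 fact not found); the function name is made up for the example.

```python
def error_rate(companies, data_points_per_company, not_found):
    """Fraction of the facts looked for which could not be found."""
    total = companies * data_points_per_company
    return not_found / total


# Dow 30: 30 companies times 5 data points, one fact not found.
dow_rate = error_rate(30, 5, 1)
print(f"{dow_rate:.1%}")
```

Note that this only counts facts which could not be found. It says nothing about facts which were found but are the wrong fact, which is the harder problem.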
The raw data is provided in Excel. Fiddle with it. If you find anything interesting please let me know.
The second prototype shows even more possibilities. The S&P 500 Additional information links to a number of other web pages creating a nice mashup. Most of the information which I used came from the Wikipedia list of S&P 500 companies web page. What I had to do though was manually put the SEC CIK number on the Wikipedia list in order to cross reference the information which I had with the information Wikipedia had. The reason is (a) the Wikipedia information did not provide the CIK number which was the key I had to use and (b) the SEC filings did not provide the company ticker symbol for every company nor did they provide the exchange on which the stock was traded.
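Once both lists carry the CIK number, the cross reference itself is mechanical. Here is a sketch in Python of joining two lists on CIK. This is an illustration, not the process I used. The row layout, the field names, and the CIK values and tickers are made up; the one real detail is that CIK numbers show up both with and without leading zeros, so the sketch pads them to 10 digits before comparing.

```python
def normalize_cik(cik):
    """CIKs appear with or without leading zeros; pad to 10 digits."""
    return str(int(cik)).zfill(10)


def join_on_cik(sec_rows, other_rows):
    """Attach fields from other_rows to the sec_rows which share a CIK.

    Returns (joined, unmatched): merged rows, and the sec_rows for which
    no match was found.
    """
    lookup = {normalize_cik(row["cik"]): row for row in other_rows}
    joined, unmatched = [], []
    for row in sec_rows:
        key = normalize_cik(row["cik"])
        match = lookup.get(key)
        if match is None:
            unmatched.append(row)
        else:
            joined.append({**match, **row, "cik": key})
    return joined, unmatched


# Made-up rows, for illustration only.
sec_rows = [{"cik": "0000000001", "assets": 1000}, {"cik": "2", "assets": 500}]
wiki_rows = [{"cik": 1, "ticker": "AAA", "sector": "Energy"}]

joined, unmatched = join_on_cik(sec_rows, wiki_rows)
print(joined)
print(unmatched)
```

The unmatched list matters as much as the joined list: it is where missing filings and missing identifiers show up.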
Metadata like this CIK number is critical for putting lists of things together. Another piece which I added (and I am not done yet) is the auditor. That is not provided anywhere in the XBRL. I had to go read the HTML page where the name of the auditor does exist. Perhaps the audit report will eventually be expressed using XBRL and then the auditor will be easy to grab. It would be way, way easier to use this information if the SEC required it in the SEC XBRL financial filing. For example, I have all sorts of interesting information about the generator software used to create the SEC XBRL filing. That is provided by software vendors and can be gleaned from the XBRL (it is in a comment).
This stuff is going to be so useful (and cool!) when it works correctly. By looking at this sort of prototype it is easier to see the gaps between what exists and where we will end up.



