
Accuracy Rate 96.6% for Automated Reuse of Select SEC XBRL Information

SEC XBRL financial reports are 96.6% accurate for a select set of 51 fundamental accounting concepts.

In my prior post I stated that I had an accuracy rate of 95% for SEC XBRL financial information for a select set of 51 information points that I was going after. These 51 information points are fundamental accounting concepts which every financial filing has, whether or not they are explicitly reported. For example, if comprehensive income is not reported, that means that comprehensive income is the same as net income and that other comprehensive income must be zero. Why? Because:

Comprehensive income = net income + other comprehensive income
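The imputation rule described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm actually used for these tests: the function name `impute_comprehensive_income`, its parameter names, and the tolerance are all hypothetical.

```python
from typing import Dict, Optional


def impute_comprehensive_income(
    net_income: Optional[float] = None,
    oci: Optional[float] = None,
    comprehensive_income: Optional[float] = None,
    tolerance: float = 0.5,
) -> Dict[str, object]:
    """Fill in a missing value using CI = NI + OCI (hypothetical helper)."""
    ni, ci = net_income, comprehensive_income

    # Neither OCI nor comprehensive income is reported: OCI must be zero,
    # so comprehensive income equals net income.
    if oci is None and ci is None:
        oci = 0.0

    if ni is None:
        if ci is None or oci is None:
            raise ValueError("not enough reported values to impute net income")
        ni = ci - oci
    elif oci is None:
        oci = ci - ni
    elif ci is None:
        ci = ni + oci

    return {
        "net_income": ni,
        "other_comprehensive_income": oci,
        "comprehensive_income": ci,
        # True when the three values agree with the relation.
        "consistent": abs(ci - (ni + oci)) <= tolerance,
    }


# A filer that reports only net income.
print(impute_comprehensive_income(net_income=100.0))
```

When all three values are reported, nothing is imputed and the `consistent` flag simply says whether the filing satisfies the relation.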

These fundamental accounting concepts have very precise definitions and very precise relations which are not changeable. Accountants have no discretion here. These are the rules. Check any intermediate accounting textbook or the accounting rules. But even better proof is that over 90% of SEC XBRL financial filings follow these rules. Here is my proof:

If you look at that graphic you will see that for 20 of the 21 tests that I have, about 90% or more of SEC XBRL financial filings pass the test. Only one test, IS3, falls below that threshold. More about that later.

If you do a little calculation which you can follow on that graphic, you can see that I am now up to 96.6% accuracy.
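The exact calculation is on the graphic and is not reproduced in the text. One plausible reading, offered here only as an assumption, is that the overall accuracy is the number of passed checks divided by the number of checks run, summed over all tests. The sketch below shows that arithmetic; the function name, the test id `BS_example`, and the tallies are made up for illustration and are not the real results.

```python
from typing import Dict, Tuple


def overall_pass_rate(tallies: Dict[str, Tuple[int, int]]) -> float:
    """Return passed checks divided by total checks across all tests.

    `tallies` maps a test id to (filings passing, filings tested).
    """
    passed = sum(p for p, _ in tallies.values())
    total = sum(t for _, t in tallies.values())
    if total == 0:
        raise ValueError("no checks were run")
    return passed / total


# Made-up tallies for illustration only; these are NOT the real results.
example_tallies = {"BS_example": (98, 100), "IS3": (80, 100)}
print(f"{overall_pass_rate(example_tallies):.1%}")
```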

Further, I now understand exactly why information from an SEC XBRL financial filing is incorrect. I can take the information from any SEC XBRL financial filing, look at that filing in the XBRL Cloud Viewer, and see why the filing is not passing the test. I use that information to either (a) adjust my algorithm, (b) adjust my test, or (c) confirm that there is, in fact, an error.

Here is a screenshot of the interface I created to perform this work:

Here is a link to the Adobe filing within the XBRL Cloud viewer.

Now I will be the first to admit that I have taken the "happy path" through this information. I am looking at only 10-K filings, not all the various forms reporting entities file. I am looking at only the current balance sheet date and the year-to-date information, not all the information for all periods provided in a filing. I am not looking at all the different entity breakdowns provided, only at the consolidated entity or the parent holding company which is the "root" reporting entity. I am ignoring the minority of reporting entities which don't provide a balance sheet, but rather provide a statement of net assets. I am not considering certain areas where accountants do have discretion, such as including the exchange gains (losses) in the roll forward of cash (which a handful of reporting entities do, and which is allowed per US GAAP) rather than within net cash flow like most reporting entities do. Yes, a happy path. But all those other things are just additional use cases of the exact same fundamental approach I am using.
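To make the "happy path" restrictions concrete, here is a minimal sketch of such a filter. The `Fact` record and its field names are hypothetical, and the filter simplifies matters by treating "current balance sheet date and year-to-date" as a single period-end date and "root reporting entity" as a fact with no dimensions; a real implementation would need to handle durations and entity discovery more carefully.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fact:
    """Hypothetical, simplified view of one reported XBRL fact."""
    concept: str
    value: float
    form_type: str                      # e.g. "10-K" or "10-Q"
    period_end: str                     # ISO date, e.g. "2012-12-31"
    dimensions: Tuple[str, ...] = ()    # empty for the root reporting entity


def happy_path(facts: List[Fact], balance_sheet_date: str) -> List[Fact]:
    """Keep only 10-K facts for the current period with no dimensions."""
    return [
        f for f in facts
        if f.form_type == "10-K"
        and f.period_end == balance_sheet_date
        and not f.dimensions
    ]


facts = [
    Fact("Assets", 500.0, "10-K", "2012-12-31"),
    Fact("Assets", 450.0, "10-K", "2011-12-31"),                   # prior period
    Fact("Assets", 120.0, "10-K", "2012-12-31", ("SegmentA",)),    # breakdown
    Fact("Assets", 480.0, "10-Q", "2012-12-31"),                   # other form
]
print(happy_path(facts, "2012-12-31"))
```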

The relations for the balance sheet and cash flow statement are very straightforward and even non-accountants can grasp that those probably make sense: "Assets = Current assets + Noncurrent assets" is easy to understand. Even the income statement is not that challenging for most of its areas.
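Relations like this lend themselves to a table-driven check. The sketch below is a minimal illustration, not the actual test suite: the test ids, the concept keys, and the rounding tolerance are placeholders, and a missing value is simply counted as a failure rather than imputed.

```python
from typing import Dict, List, Tuple

# (test id, total concept, component concepts). The ids and concept keys are
# illustrative placeholders, not the names used in the actual tests.
RELATIONS: List[Tuple[str, str, List[str]]] = [
    ("BS_example", "Assets", ["CurrentAssets", "NoncurrentAssets"]),
    ("IS_example", "ComprehensiveIncome",
     ["NetIncome", "OtherComprehensiveIncome"]),
]


def failed_relations(facts: Dict[str, float], tolerance: float = 0.5) -> List[str]:
    """Return the ids of relations that the given facts do not satisfy."""
    failures = []
    for test_id, total, parts in RELATIONS:
        # In this sketch a missing value counts as a failure.
        if any(name not in facts for name in [total, *parts]):
            failures.append(test_id)
            continue
        if abs(facts[total] - sum(facts[p] for p in parts)) > tolerance:
            failures.append(test_id)
    return failures


filing = {
    "Assets": 500.0, "CurrentAssets": 200.0, "NoncurrentAssets": 300.0,
    "ComprehensiveIncome": 95.0, "NetIncome": 100.0,
    "OtherComprehensiveIncome": 0.0,
}
print(failed_relations(filing))
```

Keeping the relations in a data table rather than in code makes it easy to add a test or adjust one after looking at a failing filing.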

The one place where I am having difficulty computing the fundamental accounting relations correctly with my current technique is all the stuff which is included within "Operating Income (Loss)". There are exactly five reasons for this difficulty:

  1. Discovery of the root reporting entity for a small number of filers (about 54).
  2. Significant variability in the concept used by SEC filers to report revenues.
  3. Lack of clear totals for the subcategories which make up operating income (loss).
  4. Filers crossing categories of fundamental concepts (for example, including the total of one category within another category).
  5. Inappropriate extension of these high-level concepts.
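Reason 2 in the list above is commonly handled with an ordered list of candidate concepts, taking the first one a filer actually reports. The following is a minimal sketch of that idea under stated assumptions: the function name is hypothetical, and the candidate names are examples of revenue elements that, to my knowledge, appeared in US GAAP taxonomies of this period; the list actually used for these tests is not given in this post.

```python
from typing import Dict, Optional, Sequence, Tuple

# Example candidates only, in priority order. This is an assumed list,
# not the one used for the tests described in this post.
REVENUE_CANDIDATES: Tuple[str, ...] = (
    "Revenues",
    "SalesRevenueNet",
    "SalesRevenueGoodsNet",
    "SalesRevenueServicesNet",
)


def find_revenues(
    facts: Dict[str, float],
    candidates: Sequence[str] = REVENUE_CANDIDATES,
) -> Optional[Tuple[str, float]]:
    """Return (concept, value) for the first candidate the filer reports."""
    for name in candidates:
        if name in facts:
            return name, facts[name]
    return None  # no known revenue concept; may be a filer extension


print(find_revenues({"SalesRevenueNet": 1200.0, "NetIncomeLoss": 90.0}))
print(find_revenues({"NetIncomeLoss": 90.0}))
```

A `None` result is a useful signal in its own right: it flags filings worth inspecting by hand, for instance to check whether the filer extended a high-level concept (reason 5).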

The list of SEC filers who pass all of my tests has grown from 294 to 584. That is less than 1% of all filers. Not that great. However, it is a start and it proves that it is in fact possible to pass all my tests.

My next step is to get this entire process repeated by someone else to see if they get the same results that I obtained. Also, I would like to figure out how solid that number of 96.6% really is. If it is correct, the day of "reuse" of all this financial information has arrived or will arrive very soon. Understanding what information is correct actually serves two purposes: (a) knowing that the information that you are reusing is correct and (b) knowing what error needs to be fixed to make a filer's information correct and reusable.

There are many, many other things which I have learned during the process of figuring this stuff out.  Stay tuned to my blog for additional information.
