I have updated my Excel tool and expanded the set of reporting styles it covers. The tool extracts information from XBRL-based financial reports submitted to the SEC. The set of tools below covers 50% of all public companies that file with the SEC, and it accurately extracts fundamental high-level financial information from the 3,015 covered economic entities.
In my commercially available metadata set, coverage is higher, 90% or more, as explained in this document. But only about 87.9% report all of their high-level financial information correctly.
In this set of 3,015 public company financial reports, all the information is consistent with expectation and FOUR different reporting styles are covered. The proof is in the pudding. If you REALLY want to understand what it takes to get these reports correct and to extract information from them correctly, these Excel tools will help you understand that:
This step-by-step explanation helps you understand how the Excel extraction tool works. Note that while I am showing the extraction of primary financial report information, this same scheme applies to the disclosures also. And, while this Excel extraction tool only pulls information for the current reporting period, this same scheme can be used to get prior period information in a report also.
Each Excel spreadsheet has the information preloaded. But if you press the button on the "Compare" spreadsheet, the application will re-extract information directly from the public company's XBRL-based financial report.
My goal is to provide similar spreadsheets for a handful of other reporting styles and to synchronize the business rules in this application 100% with the commercial quality tool metadata used by XBRL Cloud and Pesseract. I have already solved one of the two problems I am having. The commercial tools use declarative business rules that are provided via the XBRL technical syntax. You will note that my business rules (mappings, impute rules, consistency check rules) are hard-coded. That is not good, but because I am not a very good programmer, it is the best that I can do. What I was able to achieve is to generate the VBA code by reading the XBRL-based information. For example, here are the mapping rules for the SPEC6 reporting style. I am now auto-generating the VBA code for the mapping rules from those XBRL definition relations.
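To make the difference between hard-coded and declarative rules concrete, here is a minimal sketch in Python (not the VBA used by the Excel tool). The rule tables, the concept choices, and the function names are all assumptions made up for illustration; they are not the actual SPEC6 metadata or the commercial tools' rule format. The point is only that the three kinds of rules (mappings, impute rules, consistency checks) can live in data, with one small generic engine applying them.

```python
# Hypothetical rule tables: in a declarative setup these would be read from
# XBRL files instead of being typed in here. Names are illustrative only.
MAPPINGS = {  # fundamental concept -> reported concepts that can supply it
    "Assets": ["us-gaap:Assets"],
    "Liabilities": ["us-gaap:Liabilities"],
    "Equity": ["us-gaap:StockholdersEquity"],
    "LiabilitiesAndEquity": ["us-gaap:LiabilitiesAndStockholdersEquity"],
}
# (target, signed terms): a missing target is derived as the signed sum.
IMPUTE_RULES = [
    ("Liabilities", [(+1, "LiabilitiesAndEquity"), (-1, "Equity")]),
]
# (total, signed terms): the total is expected to equal the signed sum.
CONSISTENCY_RULES = [
    ("Assets", [(+1, "LiabilitiesAndEquity")]),
    ("LiabilitiesAndEquity", [(+1, "Liabilities"), (+1, "Equity")]),
]


def signed_sum(facts, terms):
    return sum(sign * facts[name] for sign, name in terms)


def extract(reported):
    """Apply the mapping rules: take the first reported concept found."""
    facts = {}
    for concept, candidates in MAPPINGS.items():
        for name in candidates:
            if name in reported:
                facts[concept] = reported[name]
                break
    return facts


def impute(facts):
    """Fill in missing concepts until no impute rule can fire any more."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for target, terms in IMPUTE_RULES:
            if target not in facts and all(n in facts for _, n in terms):
                facts[target] = signed_sum(facts, terms)
                changed = True
    return facts


def check(facts):
    """Evaluate each consistency rule whose inputs are all available."""
    results = {}
    for total, terms in CONSISTENCY_RULES:
        if total in facts and all(n in facts for _, n in terms):
            results[total] = facts[total] == signed_sum(facts, terms)
    return results


# Made-up report that does not report total liabilities directly.
reported = {
    "us-gaap:Assets": 1000,
    "us-gaap:StockholdersEquity": 400,
    "us-gaap:LiabilitiesAndStockholdersEquity": 1000,
}
facts = impute(extract(reported))
results = check(facts)
```

With rules held as data like this, adding a reporting style means adding rows to the tables rather than writing new procedures, which is roughly the property the declarative approach is after.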
I am also going to reorganize the XBRL files that provide this information. I have a lot of unnecessary duplication right now. That duplication resulted from the fact that I really did not know exactly how this process of creating the metadata would turn out when I started. I know now, so I am going to refactor the way I physically represent the information so that it is easier to debug and maintain.
One very important thing that I am observing is the difference between a "top down" and a "bottom up" approach to creating XBRL-based taxonomies. I am still trying to figure out exactly how to articulate this. I don't know that "top down" and "bottom up" are the right terms. Other terms are "publisher focus" as contrasted with "consumption focus". Another is "restrict then loosen" as contrasted with "slack then restrict".
When you take an existing reporting scheme and decide how you will make it work using XBRL, there are lots of things that need to be considered, and it helps to understand the principles at work. Theoretically, the best way to implement is to create a model that is as restrictive as possible, then incrementally loosen it to allow more within the system. That keeps everything under control. The alternative approach, where you provide a very loose model and then incrementally restrict it to make it tighter and tighter, looks very "sloppy" because you initially get a lot of errors.
While I am noticing that I am doing 80% (maybe more) additional work to overcome things that were not included, but should have been included, in XBRL-based financial reporting, I really cannot see any way to make this work other than how the SEC is approaching it. The "loose model" that the SEC started with was smart. Why? Because if a model as restrictive as possible had been used initially, public companies would likely have rioted.
Those incremental restrictions are emerging slowly but surely. Things like my fundamental accounting concept relations continuity cross checks, the disclosure mechanics rules, the reporting checklist rules, and the XBRL US Data Quality Committee rules are incremental restrictions that will improve information quality and therefore make the information more useful.
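The "loose, then incrementally restrict" idea can be sketched in a few lines of Python. Everything here is hypothetical: the two rules and the three tiny reports are invented for illustration and are not SEC, Data Quality Committee, or fundamental accounting concept rules. The sketch just shows the same set of reports being checked against a rule set that grows one rule at a time.

```python
# Hypothetical rules; each takes a report (a dict of facts) and returns
# True when the report satisfies the rule.
def assets_reported(report):
    return "Assets" in report


def balance_sheet_balances(report):
    return report.get("Assets") == report.get("LiabilitiesAndEquity")


# Made-up reports, identified by their position in the list.
REPORTS = [
    {"Assets": 100, "LiabilitiesAndEquity": 100},  # satisfies both rules
    {"LiabilitiesAndEquity": 50},                  # Assets missing
    {"Assets": 70, "LiabilitiesAndEquity": 80},    # does not balance
]


def failing(reports, rules):
    """Return the positions of reports that break at least one rule."""
    return [i for i, r in enumerate(reports)
            if not all(rule(r) for rule in rules)]


# The model starts loose (no rules) and is tightened one rule at a time.
STAGES = [
    [],
    [assets_reported],
    [assets_reported, balance_sheet_balances],
]
flagged = [failing(REPORTS, rules) for rules in STAGES]
```

Under the loose model nothing is flagged; each added rule exposes reports that were always inconsistent but were never checked. That is why this approach looks "sloppy" at first even though the underlying information is not getting worse.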