BLOG:  Digital Financial Reporting

This is a blog for information relating to digital financial reporting.  This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting.  This is my brain storming platform.  This is where I think out loud (i.e. publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.

Much of the information contained in this blog is synthasized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.

Entries from February 2, 2020 - February 8, 2020

Future Directions for Semantic Systems

In his paper, Future Directions for Semantic Systems, John Sowa points out the following:

No one can predict what innovations will be discovered in the future, but commercially successful systems must satisfy two criteria: first, they must solve problems for which a large number of people need solutions; second, they must have automated and semi-automated methods for acquiring, analyzing, and organizing the required knowledge.

Hamed Mousavi and I created a working proof-of-concept that is a semantic system.  We call the software application Pesseract. You can download the software (requires Windows), fire it up, and give it a spin.  We created a bunch of demo scripts that you can walk through.

If you cannot or don't want to download the application, you can watch a set of videos that show the semantic system in action.

To get the semantic system to work, we had to create the "semantics".  Those semantics are in the form of information related to several financial reporting schemes represented in XBRL.  There is at least one commercially available software application that also uses those semantics, XBRL Cloud. There is at least one open source software application that uses these same ideas, XBRLQuery.com.

We have taken information from these three and other implementations to create a high-level logical conceptualization of a business report. Keep in mind that a financial report is a type of business report.  There have been several iterations of this logical conceptualization.  OMG's Standard Business Report Model (SBRM) is the most current version of that logical conceptualization.

All of this is proven to work.  I can also explain how it works. The document Special Theory of Machine-based Automated Communication of Semantic Information of Financial Statements explains how it all works.

The Digital Financial Reporting Manifesto explains why all this is useful.  While all of my metadata works and allows for these ideas to be explored and tested, the metadata is not sufficient to be a complete financial reporting scheme.  A lot more metadata is necessary to provide a complete system.

Whether there is enough existing metadata to begin enabling machine learning based approaches to help create new metadata is unclear.  I don't think that there is quite enough yet to semi-automated metadata creation.  But, I could be wrong.

The next step is to turn the working proof-of-concept into one or more products.

Posted on Thursday, February 6, 2020 at 10:36AM by Registered CommenterCharlie in | CommentsPost a Comment | EmailEmail | PrintPrint

Storing Financial Information in a Database

Most people that look at XBRL-based financial reports and then want to store XBRL-based financial information from those reports in some sort of database make the fundamental mistake of "storing XBRL".  This is, as contrast to, storing financial facts or information that is helpful in using those financial facts in some sort of "fact database".

This blog post provides information that helps you not repeat this sort of mistake.

First, have a look at the UML model of a business report created by SBRM (keeping in mind that a financial report is a type of business report). Note that if you want to work with XBRL-based financial reports, you are going to want to understand the OMG Standard Business Report Model (SBRM).

Second, while you can store properly modeled financial information within a relational database very effectively; relational databases might not always be the best storage alternative.  For example, TigerGraph and ArangoDB are a graph database that seems to be a rather interesting information storage alternative.  That may, or may not, be an appropriate data storage scheme.

Years ago, I helped a company named 28msec put all of the XBRL-based financial reports submitted to the SEC into a NoSQL database.  They really studied XBRL and frankly, I learned a tremendous amount from this experience.  After this project was done, Ghislain Fourny who is a data scientist and database expert wrote this paper, Cell Stores, and the book, The XBRL Book: Simple, Technical, Precise.  If you want to put XBRL-based financial information into a database; I highly recommend that you read both that paper and the book.  That offers clues about the best information storage formats.

In the Cell Stores paper, Ghislain makes an analogy of financial information to the different states of matter:

Cells are the primary citizens and exist in different forms, which can be explained with an analogy to the state of matter: as a gas for efficient storage, as a solid for efficient retrieval, and as a liquid for efficient interaction with the business users.

Now, I personally do not understand how to store information the most effectively in a database.  But I can tell you the general use cases that business professionals have.  You can distill those use cases down into three query patterns: 

  1. Specific information: Getting some set of information from one specific report. For example, "Give me the most current balance sheet of Microsoft" or "Give me 'assets' for Microsoft as of December 31, 2019." (Example 1 Microsoft 10-K | Example 2 Microsoft 10-K |Example 3 XBRL Cloud | Example 4 SEC RSS Feed)
  2. Period comparison: Comparing some set of information across a set of reports of a specific economic entity across periods.  For example, "Give me a comparison of the balance sheet of Microsoft for the last 10 years." (Example 1 Cash Flow | Example 2 Comprehensive Income)
  3. Entity comparison: Comparing some set of information across a set of reports across a number of economic entities for some specific period.  For example, "Give me a comparison of the balance sheets of every economic entity in the DOW 30 for December 31, 2019." (Example 1 Dow | Example 2 Airlines)

That is really about it.  The "magic" is to filter all that XBRL-based information to get exactly what you are looking for.  To do that, the information needs to be stored approprately to allow for effective information retrieval.  And so, you want to store the information such that it can easily be reconfigured into the three types of queries above at a minimum. 

There are a lot of details involved in that process.  Key is storing meaning, not syntax.  What is critically important to understand is that financial reports ARE NOT FORMS!!!.  Does that make the work involved harder?  Absolutely it does.

But, the task for all three common query use cases can be distilled into three specific sets of information (XML infoset examples provided for each information set) that "convert" the "gas" into a "solid" per the analogy above: 

  1. What structure? Do you want the balance sheet, the income statement, the revenue recognition policy, or the long-term debt maturities disclosure?  Each set of information is essentially some "structure".
  2. What facts?  Do you want information for fiscal year 2019, fiscal year 2018, year-to-date, quarter-to-date, actual, budgeted, etc?
  3. What rules?  What are the mathematical relationships between the facts in the structures?  If you want to render this information in human-readable form, what are the rendering rules that should be used?

All of this is explained in detail in the document Fact Sets. A financial report is not one big thing, it is a system of lots of little things.  That system is logical.  That system has logical patterns. A pattern is any form of correlation between the states of elements within a system. Software engineers can leverage patterns.

XBRL Cloud does a lot of this in their Edgar Report Information database.  Watch this video to get a general idea of the information you can extract for any piece of any XBRL-based financial report.  What XBRL Cloud has is not perfect, but it is pretty close and is available via an easy to use REST API.

Here is specific information for the 2017 Microsoft 10-K. In the ZIP file are Excel spreadsheets showing the fact sets discovered by XBRL Cloud which does a pretty good job and Pesseract which does an excellent job.  Also included are the XML Infosets that were generated by each application.  You can look at a human readable rendering of the 2017 10-K of Microsoft here.

When represented correctly, XBRL presentation relations, XBRL calculation relations, XBRL definition relations, and XBRL formula become information that a query can use to extract facts.

For example, the Microsoft 2017 10-K contains 128 networks, 128 hypercubes, and 194 different "blocks" of information. A "block" and a "fact set" are synonyms.  Facts are fine to work with, when you are after individual facts such as "Assets" or "Net Income (Loss)".  But if you want to work with a specific disclosure, you need some specific set of facts.  XBRL presentation, calculation, and definition relations are helpful in extracting that information and XBRL formulas and other rules are helpful in making sure you put the pieces together correctly.  (here is similar information for Microsoft's 2016 10-K report)

The facts within an XBRL instance are organized per the rules of the XBRL technical specification.  But, you can reorganize those facts to make the information easier to work with.  You can specify how to reorganize those facts using global standard XBRL!

Now, again, it is unclear to me how to best store this sort of information in a database or what type of database is best.  But what is crystal clear is that if you want to work with that information, you want to work with it logically and therefore it should be stored logically.  Packing a bunch of XBRL into a database will be unsatisfying.

Here is some information about how 28msec queries worked which uses my model of the meaning of XBRL-based financial reports which is similar to how XBRL Cloud implemented a fact table and how Pesseract implemented a fact table and consistent with where SBRM is going:

 

 

More information coming soon.

Posted on Sunday, February 2, 2020 at 08:30AM by Registered CommenterCharlie in | CommentsPost a Comment | EmailEmail | PrintPrint