Blog: Digital Financial Reporting (using XBRL) - XBRL-based structured digital financial reporting

BLOG: Digital Financial Reporting

This is a blog for information relating to digital financial reporting. This blog is basically my "lab notebook" for experimenting and learning about XBRL-based digital financial reporting. This is my brainstorming platform. This is where I think out loud (i.e., publicly) about digital financial reporting. This information is for innovators and early adopters who are ushering in a new era of accounting, reporting, auditing, and analysis in a digital environment.

Much of the information contained in this blog is synthesized, summarized, condensed, better organized and articulated in my book XBRL for Dummies and in the chapters of Intelligent XBRL-based Digital Financial Reporting. If you have any questions, feel free to contact me.

Entries from July 1, 2015 - July 31, 2015

Understanding the Utility of a Reasoner or Inference Engine

A reasoner is software that is able to infer logical consequences from a set of asserted facts. Every reasoner uses some sort of logic. For example, first-order predicate logic is a type of logic. Every reasoner works with some set of axioms. An axiom describes some logical fact. The capabilities of a reasoner depend on the expressiveness of the kind of logic that the reasoner uses and the axioms provided for the reasoner and logic to work against.

Reasoners are sometimes referred to as inference engines because, while reasoners work with asserted facts as stated above, they can also use the rules of logic to deduce theorems. Theorems are indirectly deduced facts: deductions that can be proven by constructing a chain of reasoning from the axioms. Basically, a reasoner and an inference engine are the same thing.
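
To make the idea of deducing theorems from asserted facts concrete, here is a minimal sketch in Python of a forward-chaining reasoner. The fact names and rules are hypothetical and exist only for illustration; a real reasoner would support a far more expressive logic.

```python
# Hypothetical asserted facts, chosen only for illustration.
facts = {"has_assets", "has_liabilities_and_equity"}

# Each rule is (set of premises, conclusion). These rules are assumptions too.
rules = [
    ({"has_assets", "has_liabilities_and_equity"}, "has_balance_sheet"),
    ({"has_balance_sheet"}, "is_financial_report"),
]

def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# Theorems are the facts that were deduced rather than asserted.
theorems = forward_chain(facts, rules) - facts
print(sorted(theorems))  # ['has_balance_sheet', 'is_financial_report']
```

The second theorem depends on the first, which is the "chain of reasoning" at work: the reasoner only reaches it after deducing an intermediate fact.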

A rules engine is also a reasoner. Another name for a reasoner or inference engine or rules engine is semantic reasoner.

An XBRL Formula Processor is basically a reasoner. Did you realize that? I will get back to that in a moment.

Clearly a human's capacity to apply logic is greater than a computer's capacity to apply logic. In fact, computers are machines and really can't think or apply logic. All that a computer can do is mimic or simulate or emulate a human's ability to think. Some computer programs that mimic human thought or perform some task for humans are called expert systems. Every expert system uses a reasoner to figure out what that system needs to do for the human and how to do it.

I pointed out that care has to be taken in order to express facts in a form that is safe, reliable, predictable, and repeatable. There are four catastrophic problems that a computer can run up against:

Undecidability (i.e. must be decidable)
Infinite loops (i.e. must eliminate possibility of cycles)
Unbounded structures or pieces (i.e. must have known set of structures)
Unspecific or imprecise logic (i.e. things like fuzzy logic are not allowed in this type of system)
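
Of these four problems, the possibility of cycles is the easiest to check mechanically. The following Python sketch detects a directed cycle in a set of parent-child relations using a depth-first search. The concept names and the graph layout (a dictionary mapping each node to its children) are assumptions made for illustration, not how any particular XBRL processor stores relations.

```python
def has_directed_cycle(graph):
    """Return True if the directed graph (node -> list of children) has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for child in graph.get(node, []):
            state = color.get(child, WHITE)
            if state == GRAY:  # reached a node already on the current path
                return True
            if state == WHITE and visit(child):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in graph)

# Hypothetical relations: a well-formed tree, then the same tree with a bad link.
good = {"Assets": ["CurrentAssets", "NoncurrentAssets"], "CurrentAssets": ["Cash"]}
bad = dict(good, Cash=["Assets"])

print(has_directed_cycle(good))  # False
print(has_directed_cycle(bad))   # True
```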

Correctly balancing the expressiveness of a logic and the safety, reliability and predictability of a piece of software to return useful information takes conscious, skillful effort and execution. Years of experimentation in the area of expert systems and artificial intelligence have yielded invaluable information in achieving this balance.

First-order predicate logic is a formal way of expressing logic in a manner that is machine-readable.

While first-order predicate logic is expressive and powerful in performing work, it is not decidable and other problems can occur.

PROLOG is an attempt to address issues with first-order logic. In creating PROLOG, the problems of decidability and cycles were partially addressed by limiting the first-order predicate logic statements that can be used to Horn clauses. But even PROLOG had issues, and so further restrictions were placed on first-order logic expressed using Horn clauses, and Datalog was created.

DATALOG is a restricted subset of PROLOG. DATALOG is described as a query language based on logic. People are combining relational databases and DATALOG and creating what they call "deductive databases". Datomic is one such database. It seems that DATALOG is a de-facto standard deductive query language. (Here is more information on DATALOG.)
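
To give a feel for why a restricted language like DATALOG is safe, here is a rough Python sketch of the bottom-up way a Datalog-style query can be evaluated. It mimics the two rules "rolls_up_to(X, Y) :- part_of(X, Y)" and "rolls_up_to(X, Z) :- part_of(X, Y), rolls_up_to(Y, Z)". The relation names and facts are hypothetical. Because the rules can only combine the finite set of constants already present, the loop must reach a fixpoint and stop.

```python
# Hypothetical facts: (part, whole) pairs.
part_of = {
    ("Cash", "CurrentAssets"),
    ("Receivables", "CurrentAssets"),
    ("CurrentAssets", "Assets"),
}

def rolls_up_to(part_of):
    """Derive every (part, whole) pair implied by the facts, directly or indirectly."""
    closure = set(part_of)  # rule 1: every direct relation holds
    while True:
        # rule 2: join part_of(X, Y) with rolls_up_to(Y, Z) to get rolls_up_to(X, Z)
        new = {(x, z) for (x, y) in part_of for (y2, z) in closure if y == y2}
        if new <= closure:  # nothing new was derived: fixpoint reached
            return closure
        closure |= new

derived = rolls_up_to(part_of)
print(("Cash", "Assets") in derived)  # True
```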

The semantic web folks seem to have had a similar evolution. They started with OWL FULL or older versions of OWL and then created limitations to deal with the problem of decidability. State-of-the-art semantic web technologies such as OWL 2 DL have been limited to solve the problem of decidability by limiting the logic to SROIQ description logic which is decidable.

OWL 2 DL has a boatload of reasoners. What I don't understand is the relative expressive power of an OWL 2 reasoner and something like DATALOG.

However, SROIQ description logic does not support expressing mathematical relations. The reason is, some math is not decidable. Eventually they will fix that most likely.

Back to XBRL Formula Processors. An XBRL Formula Processor is generally seen as something that validates XBRL instance facts. Says so right here in the XBRL Formula 1.0 Specification, see the Abstract section. But it is becoming pretty clear to me that what an XBRL Formula processor really is, is a business report reasoning engine. Or rather, that is what it SHOULD be in my opinion.

XBRL Formula has some distinct advantages over something like OWL 2 DL. The first advantage is that XBRL Formula does math. The second thing is that XBRL Formula has an understanding of XBRL Dimensions. That means that not only can XBRL Formula do math, it also supports a dimensional model.
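
As an illustration of what "doing math over a dimensional model" means, here is a small Python sketch. It is not XBRL Formula syntax; the fact layout, the dimension names, and the values are all assumptions. It checks the rule Assets = Liabilities + Equity separately for every combination of period and segment.

```python
from collections import defaultdict

# Hypothetical facts: a concept, two dimensions (period, segment), and a value.
facts = [
    {"concept": "Assets", "period": "2014-12-31", "segment": "Consolidated", "value": 1000},
    {"concept": "Liabilities", "period": "2014-12-31", "segment": "Consolidated", "value": 600},
    {"concept": "Equity", "period": "2014-12-31", "segment": "Consolidated", "value": 400},
    {"concept": "Assets", "period": "2014-12-31", "segment": "Europe", "value": 300},
    {"concept": "Liabilities", "period": "2014-12-31", "segment": "Europe", "value": 200},
    {"concept": "Equity", "period": "2014-12-31", "segment": "Europe", "value": 90},
]

def check_accounting_equation(facts):
    """Return {(period, segment): True/False} for Assets = Liabilities + Equity."""
    by_context = defaultdict(dict)
    for fact in facts:
        context = (fact["period"], fact["segment"])
        by_context[context][fact["concept"]] = fact["value"]
    results = {}
    for context, values in by_context.items():
        # Only evaluate the rule where all three concepts were reported.
        if {"Assets", "Liabilities", "Equity"} <= set(values):
            results[context] = values["Assets"] == values["Liabilities"] + values["Equity"]
    return results

results = check_accounting_equation(facts)
print(results[("2014-12-31", "Consolidated")])  # True
print(results[("2014-12-31", "Europe")])        # False
```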

However, there are several deficiencies in XBRL Formula processors:

XBRL Formula processors do not support process chaining. Supporting chaining was discussed but they decided not to do it. PROLOG and DATALOG support chaining. Not sure if OWL 2 DL supports chaining.
XBRL Formula processors do not understand and use the "general-special" or "essence-alias" standard XBRL arcroles. Basically, XBRL Formula processors don't understand class relations.
XBRL Formula processors are focused on XBRL instances; they don't provide much functionality for working with XBRL taxonomy information.
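
To show what understanding class relations would buy you, here is a hedged Python sketch. The concept names and the dictionary of general-special relations are hypothetical, and this is not how an XBRL Formula processor works today; it only illustrates a rule written against a general concept also picking up facts reported with a more specialized concept.

```python
# Hypothetical general-special relations: special concept -> its general concept.
GENERAL_OF = {
    "ZipCode": "PostalCode",
    "PostalCode": "AddressComponent",
}

def is_kind_of(concept, general):
    """True if `concept` is `general` or a (possibly indirect) specialization of it."""
    seen = set()  # guards against a cycle in the relations
    while concept is not None and concept not in seen:
        if concept == general:
            return True
        seen.add(concept)
        concept = GENERAL_OF.get(concept)
    return False

# Hypothetical reported facts: (concept, value).
reported = [("ZipCode", "98101"), ("TelephoneNumber", "555-0100")]

# A rule written against the general concept also finds the specialized fact.
postal_codes = [value for concept, value in reported if is_kind_of(concept, "PostalCode")]
print(postal_codes)  # ['98101']
```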

My personal opinion is that the world would be a better place if something existed that had the combined functionality of something like DATALOG and an XBRL Formula Processor; if that combined piece of software struck the correct balance between expressive power and safety/reliability/predictability (i.e. it avoided those four logical catastrophes); and if there was a layer built on top that helped business professionals work with all this stuff effectively and successfully.

Per the law of conservation of complexity and the idea of irreducible complexity, not until this business report reasoner exists can XBRL ever really be usable by the average business professional. But imagine if such software did exist. Any business professional could build their own little or even big expert system inexpensively.

Posted on Thursday, July 30, 2015 at 07:37AM by Charlie in Becoming an XBRL Master Craftsman


Vision of a Semantic Spreadsheet Getting Clearer

This is the definition of a spreadsheet provided by Wikipedia:

A spreadsheet is an interactive computer application program for organization, analysis and storage of data in tabular form. Spreadsheets developed as computerized simulations of paper accounting worksheets. The program operates on data represented as cells of an array, organized in rows and columns. Each cell of the array is a model–view–controller element that may contain either numeric or text data, or the results of formulas that automatically calculate and display a value based on the contents of other cells.

A spreadsheet is essentially a domain-specific programming language. What? A spreadsheet is a programming language??? There are essentially two fundamental pieces to a spreadsheet: (a) the model and (b) the spreadsheet language, and maybe a macro language, that can be used to relate one cell with another cell. The model is expressed via a modeling language which expresses the rules that outline the structure of a spreadsheet. That language states things like: a workbook is made up of spreadsheets, and a spreadsheet is made up of rows and columns which intersect to form cells. The macro language is used for expressing relations between cells and for manipulating the values of cells or even the structure of the spreadsheets, columns, rows, and cells of a workbook or even a set of workbooks.

There are three key things about spreadsheets that one should be aware of:

Note the statement "data in tabular form" in the Wikipedia definition of a spreadsheet.
Note that "workbook" and "spreadsheet" and "column" and "row" and "cell" are presentation oriented terms and structures.
Note that the programming language or macro language specifically understands what a workbook is, what a spreadsheet is, what a column is, what a row is, and what a cell is. The programming language also has general features such as "if...then" statements, "case" statements, and other such common programming functionality.

This is my definition of a semantic spreadsheet:

A semantic spreadsheet is an interactive computer application program for organization, analysis and storage of multi-dimensional information. Semantic spreadsheets developed as computerized simulations of sets of paper accounting worksheets. The program operates on information represented as cells of an array, which can be visualized in rows and columns of something similar to a dynamic pivot table. Each cell of the array is a model–view–controller element that may contain either numeric or text information.

Unlike a spreadsheet, which is connected presentationally via the rows, columns, and cells of a sheet that don't have names but rather labels such as "Row 1" or "Column B", semantic spreadsheets are connected together via the meaning and logic of the information itself.

Unlike a spreadsheet, whose cells are manipulated by a programming paradigm that is generally procedural in nature, a semantic spreadsheet is described, and verified to be represented correctly against that description, using a logic-based programming language. PROLOG is an example of one such logic-based language. Procedural and other types of programming paradigms can still be used to manipulate a semantic spreadsheet; but rather than interacting with the row numbers and column letters of the spreadsheet, programs interact with the meaning of the information.
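
Here is a minimal Python sketch of what "interacting with the meaning of the information" could look like, as opposed to asking for cell B7. The characteristic names (concept, period, entity) and the values are hypothetical, and this is only one possible way to model it.

```python
# Hypothetical facts: each cell is keyed by its characteristics, not by row/column.
facts = {
    frozenset({("concept", "Assets"), ("period", "2014-12-31"), ("entity", "SampleCo")}): 1000,
    frozenset({("concept", "Assets"), ("period", "2015-12-31"), ("entity", "SampleCo")}): 1100,
    frozenset({("concept", "Equity"), ("period", "2014-12-31"), ("entity", "SampleCo")}): 400,
}

def lookup(facts, **characteristics):
    """Return the values of all facts that have every requested characteristic."""
    wanted = set(characteristics.items())
    return [value for key, value in facts.items() if wanted <= key]

# Ask for information by what it means.
print(lookup(facts, concept="Assets", period="2014-12-31"))  # [1000]
```

Leaving out a characteristic widens the question: asking only for concept="Assets" returns the value for every period, which is roughly what pivoting a table does.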

The best semantic spreadsheets support the import and export from/to global-standard information exchange formats such as XBRL or OWL 2 DL. Support for global-standard formats enables the exchange of information between different semantic spreadsheet implementations.

Semantic spreadsheets allow for the use of OLAP-based information but do not require the use of OLAP. Semantic spreadsheets overcome many of the problems of OLAP and problems of presentation-oriented electronic spreadsheets.

While semantic spreadsheets are very powerful and in the class of software deemed to be expert systems, semantic spreadsheets are also very easy to use for three specific reasons:

Semantic spreadsheets are business domain specific tools rather than general purpose tools.
Business users making use of semantic spreadsheets interact using business domain terms familiar to their business domain.
Semantic spreadsheets strike an optimal balance between expressive power, reasoning capacity, and the reliability/predictability demanded for many business use cases.

Functionality is achieved by burying most knowledge engineering principles deep within software platforms and software applications used by business professionals (see the law of conservation of complexity). What business professionals lose in terms of the flexibility to solve any problem using general purpose tools, they gain in ease of use through both the absorbing of complexity within software and generous doses of the 80/20 rule.

Enterprise-class software extends the sound base established by global-standard semantic spreadsheets, enabling business use cases that have additional needs to both leverage the solid foundation and extend that foundation to meet those additional needs.

A digital financial report is a specific type of semantic spreadsheet and follows the same architecture; however, the metadata is specific to the financial reporting scheme used by the economic entity creating the financial report.

The first semantic spreadsheet tool was created by _____(insert company here)_____ .

Posted on Monday, July 27, 2015 at 12:15PM by


Brainstorming Idea of Logical Catastrophes or Failure Points

I have mentioned the notion of "decidability" when I did a blog post related to description logic. When I discussed the notion of decidability with others, many times they seemed to be lumping other things in with decidability.

And so, I am tuning and I think improving my ability to express what I am trying to say. This is an improved attempt to summarize and synthesize these ideas.

There appear to be four "logical catastrophes" or "failure points" that the type of business system that I am working to create and many other similar types of business systems MUST NEVER HAVE. These characteristics are so catastrophic to the system they must never exist. Besides, these characteristics never exist in the reality that the system is trying to represent in machine-readable form.

This is a summary of these four logical catastrophes or "failure points" which must never exist:

Undecidability: "I don't know" or "unknown" is NOT an option as an answer to any question. A big part of this is making the closed world assumption rather than the open world assumption. What is interesting is that XBRL 1.0, and I am pretty sure XBRL 2.0, allowed for explicitly stating whether the closed world assumption was being made. Also, relational databases make the closed world assumption. On the other hand, many of the "anyone can say anything about anything" folks working to build the semantic web take the open world assumption by default. This is not to say that one assumption is right and the other is wrong. It is to point out that one assumption works one way and the other works another way, and that business systems generally need to be decidable and would make the closed world assumption. And this is not an "either/or" type question. All one needs to do is be explicit and not make others guess. Digital financial reporting needs to make the closed world assumption and therefore be decidable for the reasons explained here. Why? For the same reason that OWL 2 DL was restricted to a decidable logic, even though OWL 2 DL itself makes the open world assumption.
Infinite loops: It is not hard to understand the problems caused by getting into an infinite loop from which a system can never escape. The reality represented by the business systems that I want to create doesn't have infinite loops. Therefore, there is no need to express something as an infinite loop. Another term, from graph theory or network theory, that helps to understand loops is a cycle. This stuff hurts my head to think about, but the basics of what a business professional needs to know is the difference between a directed cycle and an undirected cycle. Basically, never use directed cycles; they cause potentially infinite loops. Again, what is interesting is that XBRL consciously provided a means to eliminate directed cycles from ever appearing in an XBRL taxonomy. I don't think OWL 2 DL has this ability.
Unbounded pieces; unbounded sets: First-order logic, also known as first-order predicate logic, can only work on finite systems. An infinite system can never be explained successfully using first-order logic. The pieces that make up XBRL are: fact, characteristic, parenthetical explanation (XBRL footnote), network, hypercube ([Table]), dimension ([Axis]), member, primary item ([Line Items]), abstract, and concept. EVERYTHING in XBRL is one of those things. You can add any number of those things. You cannot invent new things and arbitrarily add them to a system. While the XBRL Technical Specification does allow for the addition of new things, most systems created have some bounded set of pieces. Further, every set is likewise bounded. There is a specific, countable number of facts in an XBRL instance; always. This blog post and this blog post help you see that the pieces that make up XBRL are well bounded. Likewise, every set of such pieces is finite.
Unspecific logic: It is not expected that the business system, at the level of describing the things in the system, be able to support "fuzzy logic" or "probabilistic reasoning" or other such stuff. Now, when you use the information from the system, you can do whatever you want. But describing what is in the system and what is not is not a "probability"; it is a fact, and the answer is that it is there or it is not there. There is no in between, and the answer is not a statistical probability. For example, the answer to "What is the value of assets?" is a number, not a probability.
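
The difference between the closed world assumption and the open world assumption, discussed under undecidability above, can be sketched in a few lines of Python. The fact tuples are hypothetical; the point is only that under the closed world assumption every question gets a definite yes or no, while under the open world assumption a missing fact yields "unknown".

```python
# Hypothetical set of facts the system knows about: (entity, concept, period).
known = {("SampleCo", "Assets", "2014-12-31")}

def closed_world(fact):
    """Anything not known to be true is false: the answer is always True or False."""
    return fact in known

def open_world(fact):
    """Anything not known to be true is merely unknown, represented here as None."""
    return True if fact in known else None

missing = ("SampleCo", "Goodwill", "2014-12-31")
print(closed_world(missing))  # False
print(open_world(missing))    # None
```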

That is my best attempt at describing the requirements of the type of business system that digital financial reporting needs to be. This is not a personal preference; this is about science. This is the only type of system that will work the way professional accountants need the system to work. There are a lot of other systems which would have similar requirements. And this is not to say that other systems cannot have different requirements.

And so the question is this: What logic or calculus should be used to represent such a system? OWL 2 DL might work, but OWL 2 DL does not support mathematical computations because some such computations cause a system not to be decidable.

XBRL can do this. There are exactly ZERO catastrophic failures if you read the entire set of XBRL-based financial filings which have been submitted to the SEC. Not one. While the closed world assumption is not explicitly stated, it is assumed. No infinite loops. A bounded set of pieces can be used to construct an XBRL-based report. A bounded, finite number of pieces exist for each XBRL-based report. Fuzzy logic was not used to create the reports, the creation rules are specific.

Now, within the XBRL-based reports there are mistakes in the articulation of meaning. Many inconsistencies such as that. But, that is not remotely close to a catastrophic failure. That is a detail. Those inconsistencies are being detected and corrected.

I would be very interested in the thoughts of others who are knowledgeable about how to make such a system work. I am not 100% sure that I am describing this correctly. But I am 100% certain about what I am trying to achieve. What I really don't understand is exactly the best way to achieve it. I have some ideas, but I don't know the answer yet. So please let me know what you think.

Posted on Saturday, July 25, 2015 at 07:35AM by Charlie in Becoming an XBRL Master Craftsman


Deming: A Theory of a System for Educators and Managers

The video, A Theory of a System for Educators and Managers, discusses Dr. W. Edwards Deming's view of systems in general by looking at the education system. If you don't know who Deming is, he is the guy that the U.S. automotive industry ignored but the Japanese automotive industry did not ignore, allowing the Japanese automotive industry to overtake the U.S.

I don't understand the relation between Deming's ideas and Six Sigma, but they seem very related.

These ideas were developed to improve production processes. But they are just as applicable to services and even to information systems.

Cooperation and collaboration are key to systems:

Working together is the main contribution to systemic thinking as opposed to working apart separately.

Posted on Friday, July 24, 2015 at 08:50AM by Charlie in Becoming an XBRL Master Craftsman


Understanding the Importance of PROLOG to Digital Financial Reporting

There are different programming paradigms that exist and can be used: imperative, declarative, object-oriented, functional, symbolic, and logic programming.

PROLOG (download here) is a general purpose logic programming language that is also declarative. PROLOG is based on first-order logic. The syntax of PROLOG is derived from Horn clauses, which are a subset of first-order logic. Because PROLOG is declarative, program logic is expressed as facts, relations, and rules. Questions are asked and then answers are provided based on the facts, relations, and rules.
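
For readers who have never seen this style of programming, here is a rough Python imitation of asking a question against a set of facts. It is not PROLOG and it leaves out rules entirely to stay short; the fact names are hypothetical. As in PROLOG, a term starting with an uppercase letter is treated as a variable, so the pattern below corresponds roughly to the PROLOG query "part_of(X, current_assets)."

```python
# Hypothetical facts, each written as a tuple: (relation, argument, argument).
facts = {
    ("part_of", "cash", "current_assets"),
    ("part_of", "receivables", "current_assets"),
    ("part_of", "current_assets", "assets"),
}

def query(pattern, facts):
    """Yield one dict of variable bindings for each fact that matches the pattern."""
    for fact in sorted(facts):
        if len(fact) != len(pattern):
            continue
        bindings = {}
        for p, f in zip(pattern, fact):
            if p[:1].isupper():
                # A variable: bind it, or check it agrees with an earlier binding.
                if bindings.setdefault(p, f) != f:
                    break
            elif p != f:
                break  # a constant that does not match
        else:
            yield bindings

# "What is part of current assets?"
answers = list(query(("part_of", "X", "current_assets"), facts))
print(answers)  # [{'X': 'cash'}, {'X': 'receivables'}]
```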

The name PROLOG is derived from the phrase "PROgramming in LOGic". Academics in the computer science community consider PROLOG an important language because it is about executable logic. Academics in philosophy or logic like to teach PROLOG for the same reason. Academics in business... Well, I have an MBA with an emphasis in information systems and they never mentioned PROLOG in any classes I took, but that was 30 years ago.

So, most business professionals have probably never heard of PROLOG. However, if you have read Knowledge Engineering Basics for Accounting Professionals, you might be like me and get the sense that PROLOG might be very, very important.

The article, Prolog Under the Hood: An Honest Look, examines the inner workings of PROLOG. In the first paragraph of the conclusion of that article the author states:

In the AI [artificial intelligence] industry we have created many neat programming tools, none of which have taken off as we hoped. Quite possibly it is because the languages try too hard to be something they are not.

There is a lot to that statement. First off, it is unfortunate that the author chose to use the term "AI" or artificial intelligence. It seems that artificial intelligence and expert systems tend to get grouped together. This is unfortunate in my view. Personally, I don't like the "artificial" part of artificial intelligence. The term expert system resonates with me well though. Expert systems are "neat" ideas.

As for "...because the languages try too hard to be something they are not": I completely agree with that statement. Computers cannot think. They are machines and machines, at least today's machines, cannot think. The best that machines can do is mimic some things that humans can do. Machines cannot mimic intuition and creativity. But they can mimic rudimentary tasks, if given the right knowledge in the right form.

That form is machine-readable metadata. Facts, relations, rules. That is what a machine uses to mimic tasks humans perform, successfully automating those tasks. But if the metadata is nonsense, then the results of the machine automated tasks will likewise be nonsense.

PROLOG is a tool for making sure the important machine-readable metadata is not nonsense. That is the first step in automating work. So, am I saying that every accountant needs to go and learn PROLOG? No. Not at all. PROLOG or things like PROLOG such as the Fluent Editor will be buried deeply within the bowels of software. While the law of conservation of complexity states that complexity can never be removed from a system, the complexity certainly can be moved.

Using the techniques that I have outlined in my document Understanding Blocks, Slots, Templates and Exemplars, together with the "neat" functionality provided by PROLOG and other such tools for making sure that facts, relations, and rules are described correctly in taxonomies and ontologies and that something like a digital financial report is consistent with those descriptions, machines will effectively mimic humans and perform useful work.

The alternative? Nonsense. Describing the facts, relations, and rules is a basic function of the system. If the facts, relations, and rules are not described correctly then inconsistencies result. It really is that straightforward.

But if the information in something like a digital financial report is both correct and consistent with what people believe is a standard machine-readable representation of a financial report, machines can do incredibly useful work for humans because the information does not contain nonsense.

Tools like PROLOG and the Fluent Editor minimize and hopefully eliminate nonsense.

XBRL US and others have created what they call the "Data Quality Committee". Data quality is the "end". What they are really doing is creating machine-readable rules necessary to reduce the nonsense. Those rules are the "means" to the "end", data quality. Think of what the Data Quality Committee creates as being an extension of the US GAAP XBRL Taxonomy, adding important missing machine-readable business rules.

Ask yourself a question. How do you know that the machine-readable rules are correct? Tools like PROLOG can help.

If you are ambitious, a great way to understand what I am talking about is to fiddle around with PROLOG. This blog post, Try Logic Programming: A Gentle Introduction to PROLOG, points you to a free downloadable version of a PROLOG implementation, tutorials, examples, and other useful information.

Learn PROLOG Now! has other very useful information.

Really anxious to get started with PROLOG? Try this online version of PROLOG.

Of course, all this raises the question, "Why would you need PROLOG? Why can't an XBRL processor and/or XBRL Formula processor handle all these details?" PROLOG is basically a general purpose reasoner. Why isn't there a business report reasoner or, even better, a financial report reasoner?

Stay tuned!

Posted on Thursday, July 23, 2015 at 06:31AM by Charlie
