Understanding the Importance of Description Logic
Thursday, January 8, 2015 at 12:37PM
Charlie in Becoming an XBRL Master Craftsman

An important objective when sharing or exchanging information, particularly within a distributed system, is to do so without disputes as to the precise meaning of the information. A lack of discipline, rigor, or formality in expressing precise meaning soon leads to arguments about what the information means.

A knowledgebase contains models (entities and relations between entities) which are common to all information that could possibly exist within that knowledgebase. The expressive power of the representation format determines how much meaning (generally the many types of relations between entities) a knowledgebase can hold.

As expressive power increases, computational complexity increases and reasoning problems can result in unforeseen complexity-caused blowups. Expressive power should be useful-yet-harmless.

The goal is to properly balance the system with carefully chosen constructors and axioms so that typical applications get the reliable and efficient reasoning support they require, including support for knowledge description constraints and for knowledge entry or quality control constraints.

The best balance between expressiveness and complexity of reasoning depends on the intended application.

An extreme case is a knowledgebase that is not satisfied by any interpretation. Such a knowledgebase is called unsatisfiable or inconsistent. In such vacuous cases, where no satisfying interpretation (model) exists, every statement and its negation follow from the knowledgebase. Such information clearly has no utility.

A separate problem is undecidability. A reasoning problem is undecidable when no procedure is guaranteed to terminate with a correct answer for every input. Avoiding both inconsistent knowledgebases and undecidable representation languages during information representation is therefore prudent. A conclusion should always be reachable as to the interpretation of a result from information within a knowledgebase. Reasoning should always be decidable.

"Decidable" means that a procedure exists which always terminates and correctly determines, for example, whether at least one interpretation satisfies the information in the knowledgebase. If a representation of information is not decidable, software cannot be relied upon to detect contradictions or ambiguity in the represented information. If any ambiguity exists, a meaningful exchange of information between the creator of the information and the consumer of the information has not occurred.

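The idea of a knowledgebase that no interpretation satisfies can be sketched with a brute-force check. This is only an illustration under simplifying assumptions: the atoms `is_asset` and `is_liability` and the three axioms are hypothetical, and the knowledgebase is reduced to propositional statements about a single reported fact. Because the set of interpretations is finite, the check always terminates.

```python
from itertools import product

# Hypothetical propositional atoms about one reported fact (illustrative only).
ATOMS = ("is_asset", "is_liability")


def disjoint(interpretation):
    """Axiom: nothing is both an asset and a liability."""
    return not (interpretation["is_asset"] and interpretation["is_liability"])


def asserted_asset(interpretation):
    """Axiom: the fact is asserted to be an asset."""
    return interpretation["is_asset"]


def asserted_liability(interpretation):
    """Axiom: the fact is asserted to be a liability."""
    return interpretation["is_liability"]


def satisfying_interpretations(axioms, atoms=ATOMS):
    """Return every truth assignment over the atoms that satisfies all axioms."""
    models = []
    for values in product([False, True], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if all(axiom(interpretation) for axiom in axioms):
            models.append(interpretation)
    return models


def is_consistent(axioms, atoms=ATOMS):
    """A knowledgebase is consistent if at least one interpretation satisfies it."""
    return len(satisfying_interpretations(axioms, atoms)) > 0


consistent_kb = [disjoint, asserted_asset]
inconsistent_kb = [disjoint, asserted_asset, asserted_liability]

print(is_consistent(consistent_kb))
print(is_consistent(inconsistent_kb))
```

The second knowledgebase asserts that the same fact is both an asset and a liability while also declaring the two disjoint, so no interpretation can satisfy it. Real description logic reasoners use far more efficient methods than enumerating interpretations.
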
There are two perspectives which can be adopted when evaluating information in a knowledgebase: the open world assumption and the closed world assumption. In the open world assumption a statement cannot be assumed false on the basis of a failure to prove the statement. On a World Wide Web scale this is a useful assumption; however, a consequence is that some questions cannot be answered definitively: a statement that can be neither proven nor disproven remains unknown. In the closed world assumption the opposite stance is taken: a statement that cannot be proven true is assumed to be false; a consequence is that every question receives a definite answer. In other applications this is the most appropriate approach.
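
The difference between the two assumptions can be sketched as two query functions over the same facts. This is a minimal illustration, assuming the usual reading in which a closed-world query treats anything not recorded as false, while an open-world query reports it as unknown. The company and concept names are hypothetical.

```python
from typing import Optional, Tuple

Statement = Tuple[str, str, str]

# Hypothetical recorded facts; the names are illustrative only.
KNOWN_TRUE = {("AcmeCorp", "reports", "Assets")}
KNOWN_FALSE = {("AcmeCorp", "reports", "Goodwill")}


def closed_world(statement: Statement) -> bool:
    """Closed world: whatever is not recorded as true is taken to be false."""
    return statement in KNOWN_TRUE


def open_world(statement: Statement) -> Optional[bool]:
    """Open world: True or False only when recorded, otherwise None (unknown)."""
    if statement in KNOWN_TRUE:
        return True
    if statement in KNOWN_FALSE:
        return False
    return None


question = ("AcmeCorp", "reports", "Inventory")
print(closed_world(question))
print(open_world(question))
```

For the unrecorded statement about Inventory, the closed-world query answers False while the open-world query answers None, meaning that nothing can be concluded either way.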

The only way a meaningful exchange of information can occur is through the prior existence of agreed-upon technical syntax rules, domain semantics rules, and workflow/process rules.

First-order logic can be used to express a theory which fully and categorically describes structures of a finite domain (problem domain). No first-order theory has the strength to uniquely describe a structure with an infinite domain. Therefore, all problem domains must be made finite; boundaries must be established.

There are two key parts of first-order logic. First, the syntax, which determines which collections of symbols are legal expressions in first-order logic. Second, the semantics, which determine the meaning behind these legal syntax expressions.

Description logics (DL) are a family of formal knowledge representation languages, most of which are decidable fragments of first-order logic. There are many varieties of description logics, each with specific characteristics which are necessary for specific types of applications. SROIQ is a description logic variety for which reasoning is decidable.

SROIQ description logic is a syntax.  OWL 2 DL is a syntax. XBRL is a syntax. The semantics of SROIQ description logic and OWL 2 DL were consciously made consistent.  The XBRL syntax is not consistent with SROIQ description logic and OWL 2 DL semantics in terms of expressive power and therefore semantics.

XBRL's ability to express semantics should be made consistent with OWL 2 DL and SROIQ description logic's ability to express semantics.

I am not sure that I have all of this correct.  I am trying to get it correct, to dial all of this information in.  I know that I am not there yet.

This is a summary of the key points from above:

For the past 50 years or so, people have been trying to figure out how to use computers to perform useful tasks or work. Things were created and these things matured independently in many cases. Even before computers existed, philosophers, mathematicians, and biologists tried to create ways to describe and classify things using tools such as first-order logic (theories, axioms, theorems).

When computers were invented, other things were invented to make these sorts of things "machine readable" or readable by computers: description logics, UML, RDF, OWL, relational database schemas, XBRL, object-oriented programming, and so forth. These tools evolved and continue to evolve.

Some of the problems these things solved were different; some were the same.

Along came the internet, which both pushed things along more quickly, because there were more things a computer could do, and focused people on standard approaches to accomplishing tasks and performing work.

OWL originally had only one version and it was hard to get that one version to meet every different use case. Ultimately OWL was "stratified" into several versions. One version does everything the original version of OWL did (OWL Full). The description logic people provided their input to get an OWL variant that did what they needed (OWL 2 DL). OWL 2 DL and SROIQ description logic provide the same semantic tools and were made to be able to do specific things. The same things that OWL 2 DL and SROIQ description logic provide appear to be exactly what digital financial reporting, and therefore XBRL, need.

OWL 2 DL and SROIQ Description Logic were consciously and thoughtfully made consistent.  Why would it be the case that XBRL should not be consistent with OWL 2 DL and SROIQ Description Logic? What sort of representations that are required for other representation schemes are not necessary for representing a digital financial report?

It is perhaps true that XBRL does not need to express everything.  XBRL only needs to represent what would be contained in a business report which would include a financial report (i.e. a financial report is a type of business report).

XBRL only needs to represent certain specific things related to financial reporting and business reporting. It seems that XBRL could leverage its specialized use case and provide a constrained set of higher-level building blocks to fulfill the known patterns of functionality required by financial reporting.

Blockly and Scratch (watch the video) are examples of what I mean by building blocks. Why would software vendors enabling business professionals not do this? Higher-level building blocks will make it easier for business professionals to make use of sophisticated functionality. There is NO WAY business professionals will ever learn low-level tools such as OWL 2 DL, SROIQ description logic, or XBRL+DL (description logic) syntax. Those low-level tools are far too complex for business professionals.

For example, business professionals should not need to deal with hard-to-understand things such as symmetric and inverse and complement. Rather, building blocks could be created that enable business professionals to put things together which have the proper low-level structures, but achieve this using higher-level building blocks. Building blocks could likely be effective for creating common patterns that represent 80% of the tasks a business user would perform. Perhaps 18% could then be created using wizards which guide a business professional through the creation process. In the rare 2% of business use cases, business professionals could rely on IT professionals to satisfy their needs.
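
To make the low-level notions concrete, here is a minimal sketch of inverse, symmetric, and complement over plain Python sets, followed by one hypothetical higher-level building block that hides them. The function `whole_part_block`, the relation names `hasPart` and `isPartOf`, and the sample concepts are assumptions made for illustration; they do not come from OWL 2 DL, SROIQ, or XBRL.

```python
def inverse(relation):
    """Inverse of a relation: every (a, b) pair becomes (b, a)."""
    return {(b, a) for (a, b) in relation}


def symmetric_closure(relation):
    """Smallest symmetric relation containing the given pairs."""
    return relation | inverse(relation)


def is_symmetric(relation):
    """A relation is symmetric when it equals its own inverse."""
    return relation == inverse(relation)


def complement(members, universe):
    """Everything in the universe that is not in the given class."""
    return universe - members


def whole_part_block(whole, parts):
    """Hypothetical building block: the user only lists the parts of a whole.

    The block derives both low-level relations, so the user never has to
    state the inverse relation by hand.
    """
    has_part = {(whole, part) for part in parts}
    return {"hasPart": has_part, "isPartOf": inverse(has_part)}


block = whole_part_block("BalanceSheet", ["Assets", "LiabilitiesAndEquity"])
related_party = symmetric_closure({("CompanyA", "CompanyB")})
universe = {"Assets", "Liabilities", "Equity"}
not_assets = complement({"Assets"}, universe)

print(sorted(block["isPartOf"]))
print(is_symmetric(related_party))
print(sorted(not_assets))
```

The business professional would only supply the whole and its parts; the inverse relation is produced by the block rather than typed in by hand.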

Why would this building block-type of approach be worse than requiring a business professional to construct 100% of what they need using low-level tools which use the OWL 2 DL-type or SROIQ description logic-type or XBRL+DL-type low-level semantics and complex, harder-to-use low-level constructors?

Learn more about description logic here:

If you go to this analysis of the disclosures of Fortune 100 public companies you will notice GREEN and YELLOW cells.  GREEN is consistent with the model.  YELLOW is inconsistent.  The goal is to get everything GREEN.  A big part of the YELLOW relates to lack of coordination related to the classes and relations between classes.

This DRAFT comparison shows what is necessary to synchronize XBRL with OWL 2 DL and SROIQ description logic.

This DRAFT set of XBRL arcrole definitions shows an example of what is necessary to make this synchronization. Clearly this would need to be world class. Then software vendors would understand what they need to implement in their software to truly support what business professionals need from digital financial reporting.

Article originally appeared on Intelligent XBRL-based structured digital financial reporting using US GAAP and IFRS (http://xbrl.squarespace.com/).
See website for complete article licensing information.