Understanding the Importance of Description Logic

Thursday, January 8, 2015 at 12:37PM

Charlie in Becoming an XBRL Master Craftsman

An important objective when sharing or exchanging information, particularly within a distributed system, is to share or exchange that information without disputes as to the precise meaning of the information. A lack of discipline and rigor or a lack of formality in expressing precise meaning soon leads to arguments as to information meaning.

A knowledgebase contains models (entities and relations between entities) which are common to all information that could possibly exist within that knowledgebase. The expressive power of the representation format determines how much meaning (generally the many types of relations between entities) a knowledgebase can hold.

As expressive power increases, computational complexity increases and reasoning problems can result in unforeseen complexity-caused blowups. Expressive power should be useful-yet-harmless.
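To give a feel for the kind of blowup meant here, below is a minimal sketch that checks a small set of rules by brute force, trying every possible true/false assignment. It uses plain propositional logic rather than a description logic, and the atom names and rules are invented purely for illustration. The point is only that the number of assignments to try doubles with every atomic statement added.

```python
from itertools import product

# Hypothetical atomic statements, invented for illustration only.
ATOMS = ["is_asset", "is_liability"]

# Each rule takes an interpretation (a dict of atom -> bool) and returns a bool.
RULES = [
    lambda i: i["is_asset"],                              # the item is an asset
    lambda i: not (i["is_asset"] and i["is_liability"]),  # nothing is both
]


def count_interpretations(atoms):
    """Every atom can be true or false, so there are 2**n assignments to try."""
    return 2 ** len(atoms)


def satisfying_interpretations(atoms, rules):
    """Enumerate every truth assignment and keep those that satisfy all rules."""
    models = []
    for values in product([True, False], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if all(rule(interpretation) for rule in rules):
            models.append(interpretation)
    return models


print(count_interpretations(ATOMS))                   # 4
print(len(satisfying_interpretations(ATOMS, RULES)))  # 1
print(count_interpretations(range(30)))               # 1073741824
```

With two atoms there are only four assignments to check. With thirty there are already more than a billion, which is why the choice of constructors matters so much.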

The goal is to properly balance the system with carefully chosen constructors and axioms such that typical applications get reliable and efficient reasoning support, which would include knowledge description constraints and knowledge entering (quality control) constraints.

The best balance between expressiveness and complexity of reasoning depends on the intended application.

An extreme case is when a knowledgebase is not satisfied by any interpretation. Such a knowledgebase is considered unsatisfiable or inconsistent. In such vacuous cases, where there are no satisfying interpretations, the knowledgebase contradicts itself and any conclusion whatsoever can be drawn from it. Such information clearly has no utility.

A related but distinct term is undecidable. A reasoning problem is undecidable when no procedure exists that is guaranteed to terminate with an answer for every knowledgebase expressed in the language. Avoiding both situations during information representation is prudent: a knowledgebase should be consistent, and a conclusion should always be reached as to the interpretation of a result from information within a knowledgebase. A result should always be decidable.

"Decidable" means that a reasoning procedure exists which always terminates with a definite answer to questions such as whether the knowledgebase is satisfied by at least one interpretation. If a representation of information is inconsistent, or if no conclusion can be reached about it, then the represented information is ambiguous. If any ambiguity exists, a meaningful exchange of information between the creator of the information and the consumer of information has not occurred.

There are two perspectives which can be adopted when evaluating information in a knowledgebase: **open world assumption** and **closed world assumption**. In the open world assumption a statement cannot be assumed false on the basis of a failure to prove the statement. On a World Wide Web scale this is a useful assumption; however, a consequence of this is that it may not be possible to reach a conclusion (the answer is "unknown"). In the closed world assumption the opposite stance is taken: a statement that cannot be proven true is assumed to be false; a consequence of this is that a conclusion is always reached. In other applications this is the most appropriate approach.
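The difference between the two assumptions can be sketched in a few lines. This is only an illustration: the facts, the entity name "Acme", and the function names are all made up, and a real reasoner does far more than look things up in a set.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical facts, invented for illustration only.
# Each fact is a (subject, relation, object) triple.
KNOWN_TRUE = {("Acme", "reports", "Assets")}
KNOWN_FALSE = {("Acme", "reports", "Goodwill")}


def ask_closed_world(fact):
    """Closed world: anything not known to be true is treated as false."""
    return Answer.TRUE if fact in KNOWN_TRUE else Answer.FALSE


def ask_open_world(fact):
    """Open world: a fact is false only if it is explicitly known to be false."""
    if fact in KNOWN_TRUE:
        return Answer.TRUE
    if fact in KNOWN_FALSE:
        return Answer.FALSE
    return Answer.UNKNOWN


query = ("Acme", "reports", "Inventory")  # nothing is recorded about this
print(ask_closed_world(query).value)  # false
print(ask_open_world(query).value)    # unknown
```

The closed world version always answers true or false. The open world version can come back with "unknown", which is the result I am arguing a digital financial reporting system should not produce.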

The only way a meaningful exchange of information can occur is the prior existence of agreed upon technical syntax rules, domain semantics rules, and workflow/process rules.

First-order logic can be used to express a theory which fully and categorically describes structures of a finite domain (problem domain). No first-order theory has the strength to categorically describe a structure with an infinite domain. Therefore, all problem domains must be made finite; boundaries must be established.

There are two key parts of first-order logic. First, the syntax, which determines which collections of symbols are legal expressions in first-order logic. Second, the semantics, which determine the meaning behind these legal syntax expressions.

Description logics (DL) are a family of formal knowledge representation languages based on first-order logic. There are many varieties of description logics, each with specific characteristics which are necessary for specific types of applications. *SROIQ* is a description logic variety which is always decidable.

*SROIQ* description logic is a syntax. OWL 2 DL is a syntax. XBRL is a syntax. The semantics of *SROIQ* description logic and OWL 2 DL were consciously made consistent. The XBRL syntax is not consistent with *SROIQ* description logic and OWL 2 DL semantics in terms of expressive power and therefore semantics.

XBRL's ability to express semantics should be made consistent with OWL 2 DL and SROIQ description logic's ability to express semantics.

I am not sure that I have all of this correct. I am trying to get it correct, to dial all of this information in. I know that I am not there yet.

This is a summary of the key points from above:

- **Meaningful information exchange**: The only way a meaningful exchange of information can occur is the prior existence of agreed upon technical syntax rules, domain semantics rules, and workflow/process rules.
- **Describe and constrain are two sides of the same coin**: Rules both describe and constrain. Knowledge *description* constraints describe information and knowledge *entering* constraints control the quality of the information as it enters a system. Description and quality control are two sides of the same coin.
- **Expressing domain semantics rules**: First-order logic can be used to express a theory which fully and categorically describes structures (entities and relations between entities) of a finite domain (problem domain).
- **Semantics of a financial report**: *Financial Report Semantics and Dynamics Theory* is a theory that describes the finite semantics of a financial report in human readable terms which an accounting professional can generally understand. It is desirable to represent the axioms and theorems of this theory in machine readable terms such that machines can verify the internal consistency of the theory, automated machine-based processes can determine if a digital financial report is represented consistently with the theory, and software programs can leverage the machine readable information to assist a business user making use of a digital financial report.
- **Decidable**: Decidable means that a conclusion can always be reached: a reasoning procedure always terminates with a definite answer. The knowledgebase should also be consistent, meaning that it is satisfied by at least one interpretation. Part of achieving a "decidable" state is to use the closed world assumption (CWA) as opposed to the open world assumption (OWA). It is not acceptable/appropriate for a result to be "undecided" in the system of digital financial reporting.
- **Axiom**: An axiom is a premise so evident as to be accepted as true without controversy.
- **Theorem**: A theorem is a statement that has been proven on the basis of previously established theorems or generally accepted axioms.
- **Theory**: A theory is an analytical tool for understanding, explaining, describing, and making predictions about a given subject matter or domain. A theory is described using axioms and theorems (entities and relations between entities) and is checked against the real world. A theory can be expressed in natural language which is understood only by humans. A theory can also be expressed in machine readable terms such as description logic, OWL 2 DL, or XBRL.
- **Expressive power**: A syntax can be used to express the semantics of a theory in machine-readable form. The expressive power of the syntax determines the language's ability to express the axioms of a theory.
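The "two sides of the same coin" point can be shown with a small sketch in which one rule object carries both a human readable description and a machine checkable test. Everything here is an assumption made for illustration: the `Rule` class, the `check_report` function, and the use of the accounting equation as a stand-in for an axiom of the theory. It is not taken from the *Financial Report Semantics and Dynamics Theory* itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    description: str                         # describes the information
    holds: Callable[[Dict[str, int]], bool]  # constrains the information


# Assumption: the accounting equation stands in for an axiom of the theory.
BALANCE = Rule(
    description="Assets = Liabilities + Equity",
    holds=lambda facts: facts["Assets"] == facts["Liabilities"] + facts["Equity"],
)


def check_report(facts: Dict[str, int], rules: List[Rule]) -> List[str]:
    """Return the description of every rule that the reported facts violate."""
    return [rule.description for rule in rules if not rule.holds(facts)]


# Hypothetical reported values, invented for illustration.
good_report = {"Assets": 100, "Liabilities": 60, "Equity": 40}
bad_report = {"Assets": 100, "Liabilities": 60, "Equity": 30}

print(check_report(good_report, [BALANCE]))  # []
print(check_report(bad_report, [BALANCE]))   # ['Assets = Liabilities + Equity']
```

The same rule that tells a reader what a balance sheet must look like is also the quality control check applied when a report enters the system.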

For the past 50 years or so, people have been trying to figure out how to use computers to perform useful tasks or work. Things were created and these things matured independently in many cases. Even before computers existed, philosophers, mathematicians, and biologists tried to create ways to describe things and classify things using tools such as first-order logic (theories, axioms, theorems).

When computers were invented other things were invented to make these sorts of things "machine readable" or readable by computers: description logics, UML, RDF, OWL, relational database schemas, XBRL, object oriented programming, and so forth. These tools evolved and continue to evolve.

Some of the problems these things solved were sometimes different; some of the problems were the same.

Along came the internet which both pushed things along more quickly because there were more things a computer could do and focused people on standard approaches to accomplishing tasks and performing work.

OWL originally had only one version and it was hard to get that one version to meet every different use case. Ultimately OWL was "stratified" into several versions. One version does everything the original version of OWL did (OWL Full). The Description Logic people provided their input to get an OWL variant that did what they needed (OWL 2 DL). OWL 2 DL and *SROIQ* Description Logic provide the same semantic tools and were made to be able to do specific things. The same things that OWL 2 DL and *SROIQ* description logic provide appear to be exactly what digital financial reporting and therefore XBRL need.

OWL 2 DL and *SROIQ* Description Logic were consciously and thoughtfully made consistent. Why would it be the case that XBRL should not be consistent with OWL 2 DL and *SROIQ* Description Logic? What sort of representations that are required for other representation schemes are not necessary for representing a digital financial report?

It is perhaps true that XBRL does not need to express everything. XBRL only needs to represent what would be contained in a business report which would include a financial report (i.e. a financial report is a type of business report).

XBRL only needs to represent certain specific things related to financial reporting and business reporting. It seems that XBRL could leverage its specialized use case and provide a constrained set of higher-level building blocks to fulfill the known patterns of functionality required by financial reporting.

Blockly and Scratch (watch the video) are examples of what I mean by building blocks. Why would software vendors enabling business professionals not do this? Higher-level building blocks will make it easier for business professionals to make use of sophisticated functionality. There is NO WAY business professionals will ever learn low-level tools such as OWL 2 DL, *SROIQ* Description Logic, or XBRL+DL (description logic) syntax. Those low-level tools are far too complex for business professionals.

For example, business professionals should not need to deal with hard-to-understand concepts such as symmetric, inverse, and complement. Rather, building blocks could be created that enable business professionals to put things together which have the proper low-level structures, but achieve this using higher-level building blocks. Building blocks could likely be effective for creating common patterns that represent 80% of the tasks a business user would perform. Perhaps 18% could then be created using wizards which guide a business professional through the creation process. In the rare 2% of business use cases, business professionals could rely on IT professionals to satisfy their needs.
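As a rough sketch of what I mean, the building blocks below take plain names from the user and emit the low-level "inverse" and "symmetric" axioms behind the scenes. The block names, the `Axiom` class, the relation names, and the `derive` function are all hypothetical. This is not how any particular XBRL or OWL tool works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    kind: str        # "symmetric" or "inverse"
    relation: str
    other: str = ""  # only used by "inverse" axioms


# Hypothetical building blocks: what a business user would actually touch.
def whole_part_block(whole_to_part, part_to_whole):
    """The user names a relation and its reverse; the block emits the inverse axiom."""
    return [Axiom("inverse", whole_to_part, part_to_whole)]


def mutual_block(relation):
    """The user says a relation works both ways; the block emits the symmetric axiom."""
    return [Axiom("symmetric", relation)]


def derive(facts, axioms):
    """Apply the axioms repeatedly until no new (subject, relation, object) facts appear."""
    derived = set(facts)
    while True:
        new_facts = set()
        for subject, relation, obj in derived:
            for axiom in axioms:
                if axiom.kind == "symmetric" and relation == axiom.relation:
                    new_facts.add((obj, relation, subject))
                elif axiom.kind == "inverse" and relation == axiom.relation:
                    new_facts.add((obj, axiom.other, subject))
                elif axiom.kind == "inverse" and relation == axiom.other:
                    new_facts.add((obj, axiom.relation, subject))
        if new_facts <= derived:
            return derived
        derived |= new_facts


axioms = whole_part_block("hasPart", "partOf") + mutual_block("relatedTo")
facts = {("Report", "hasPart", "Section"), ("Section", "relatedTo", "Note")}
closure = derive(facts, axioms)
print(("Section", "partOf", "Report") in closure)   # True
print(("Note", "relatedTo", "Section") in closure)  # True
```

The business user only ever sees "whole/part" and "works both ways". The words "inverse" and "symmetric" stay inside the software.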

Why would this building block-type of approach be worse than requiring a business professional to construct 100% of what they need using low-level tools which use OWL 2 DL-type, *SROIQ* Description Logic-type, or XBRL+DL-type semantics and complex, harder-to-use low-level constructors?

Learn more about description logic here:

- **Description logic in a nutshell**: 17 page PowerPoint-type brief overview of description logic.
- **A Description Logic Primer**: A little technical but a good 17 page introduction to description logic.
- *An Introduction to Description Logics*: More technical 40 page introduction to description logic.
- **Description Logic**: Wiki page.
- **From SHIQ to RDF and OWL: Making of a Web Ontology Language**: Paper which describes the impact that description logic had on OWL 2 DL and why OWL 2 DL needed to exist, 31 pages.
- *The Even More Irresistible SROIQ*: 11 pages, very technical formal proof which shows why *SROIQ* is the right variant of description logic (i.e. decidable fragment of first-order logic).

If you go to this analysis of the disclosures of Fortune 100 public companies you will notice GREEN and YELLOW cells. GREEN is consistent with the model. YELLOW is inconsistent. The goal is to get everything GREEN. A big part of the YELLOW relates to lack of coordination related to the classes and relations between classes.

This DRAFT set of XBRL arcrole definitions shows an example of what is necessary to achieve this synchronization. Clearly this would need to be world class. Then software vendors would understand what they need to implement in their software to truly support what business professionals need from digital financial reporting.

Article originally appeared on Intelligent XBRL-based structured digital financial reporting using US GAAP and IFRS (http://xbrl.squarespace.com/).

See website for complete article licensing information.