Overcoming "Metadata Ignorance", Achieving Semantic Interoperability
The terms "semantics" and "metadata" are increasingly showing up in initiatives which are attempting to properly position governmental organizations and private companies in our new "digital economy". Two good examples of this are the W3C Government Linked Data Working Group and the Asset Description Metadata Schema initiative in Europe.
The Asset Description Metadata Schema folks have a great document, Towards Open Government Metadata, which provides some very nice definitions of semantics and metadata.
Paraphrasing from that document, they explain that semantic interoperability is an essential precondition for open, flexible information exchange. Not the only precondition, but an essential precondition. This is consistent with what the HL7 folks say, pointing out that technical, semantic, and process interoperability are all important.
That document further describes why their "Semantic Interoperability Assets" (i.e. metadata) are important to achieving semantic interoperability:
"… the meaning of data elements and the relationship between them. It includes developing vocabulary to describe data exchanges, and ensures that data elements are understood in the same way by communicating parties."
This promotional video walks you through why semantic interoperability and appropriate metadata are essential ingredients for effective business to business information exchange.
Another part of that paper which is of particular value in understanding the term metadata is what they call the five levels of maturity for metadata management:
- Level 1: Metadata Ignorance – Metadata is not documented, mainly because administrators are not aware of its importance.
- Level 2: Scattered or Closed Metadata – Metadata may be partially documented but a) not in a centralised and structured way or b) it is not available and accessible under an open license framework, in other words as "Open Metadata" for developers to share and reuse.
- Level 3: Open Metadata for Humans – Metadata is documented and becomes available as "Open Metadata" for reuse, but is not systematically published in a reusable format, e.g. may only be available in .pdf or .doc documents.
- Level 4: Open Reusable Metadata – Metadata is centrally managed, and published as "Open Metadata", in a machine readable format and/or an API is provided for computers to access, query and reuse the available metadata repositories, catalogues, libraries, etc.
- Level 5: Linked Open Metadata – Semantic Assets are documented using linked data principles and are managed by advanced Metadata Management Systems.
When you build your XBRL-based metadata to achieve the semantic interoperability described above, and with it business system to business system information exchange, keep these ideas in the back of your mind.
One thing that is becoming increasingly unclear is how XBRL best fits into linked data initiatives such as the two above. This seems to be the spectrum of options:
- Everyone should ditch their technical syntax and use the XBRL technical syntax. XBRL zealots push for this. Yeah, do you really think that is going to happen? I doubt it, nor does it need to.
- XBRL ditches its technical syntax and moves to RDF like the two groups above and others. That is what the Government Linked Data folks are calling for, see their list of the ingredients of high quality linked data; they say everything needs to be in RDF.
- All these groups agree on the business semantics and all the metadata is made available which is necessary to convert from one technical syntax such as XBRL to any other technical syntax such as RDF/OWL.
Technical syntax is important, but less important than agreement on semantics. Or maybe I am saying this incorrectly. Perhaps it is not about which is more important, technical syntax or semantics; it seems that there are multiple "layers", all of which are necessary to achieve effective interoperability. If you have all the meaning, nothing prevents conversion from one technical syntax to another.
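To make the point about conversion a bit more concrete, here is a minimal Python sketch. It assumes a single reported fact whose meaning has already been agreed on, and it writes that one fact out in three syntaxes. Everything in it is hypothetical and for illustration only: the example.org identifiers, the field names, and the simplified XML element are my own inventions, not real XBRL and not anything defined by the initiatives mentioned above.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, syntax-neutral description of one reported fact.
# The field names and example.org identifiers are illustrative assumptions.
FACT = {
    "concept": "http://example.org/taxonomy#Revenue",
    "entity": "http://example.org/entity/ACME",
    "period": "2011-12-31",
    "unit": "USD",
    "value": "1000000",
}

CONTEXT_KEYS = ("concept", "entity", "period", "unit")


def to_xml(fact):
    """Serialize the fact as a simplified XML element (not valid XBRL)."""
    elem = ET.Element("fact", {key: fact[key] for key in CONTEXT_KEYS})
    elem.text = fact["value"]
    return ET.tostring(elem, encoding="unicode")


def from_xml(text):
    """Recover the syntax-neutral fact from the simplified XML element."""
    elem = ET.fromstring(text)
    fact = dict(elem.attrib)
    fact["value"] = elem.text
    return fact


def to_json(fact):
    """Serialize the same fact as JSON."""
    return json.dumps(fact, sort_keys=True)


def to_ntriples(fact, vocab="http://example.org/vocab#"):
    """Serialize the same fact as N-Triples about one blank node."""
    subject = "_:fact1"
    lines = []
    for key in ("concept", "entity"):  # values that are identifiers
        lines.append(f"{subject} <{vocab}{key}> <{fact[key]}> .")
    for key in ("period", "unit", "value"):  # values that are plain literals
        lines.append(f'{subject} <{vocab}{key}> "{fact[key]}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_xml(FACT))
    print(to_json(FACT))
    print(to_ntriples(FACT))
```

The serializers are the easy part. The work is in agreeing on what "concept", "entity", "period", and "unit" mean. Once that is settled, each additional syntax is a few lines of code, and the fact survives a round trip through any of them.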
It seems as though some people are even questioning the XML syntax as the base for all other technical syntaxes. For example, JSON seems to be a more compact syntax than XML with some distinct advantages. Seems to me that the important thing is whether the technical syntax works correctly over HTTP and whether the syntax can be used globally. XML is global.