Understanding Database/Query Options (Part 3)
Continuing where Understanding Database/Query Options (Part 2) left off...
Looking at all the moving pieces in parts 1 and 2, it seems to me that these are the things business users might care about. Or rather, business users may not care about some of these things, but whoever pays the bills cares about all of them. All of these need to be in balance:
- Easy for business user to use (intuitive): Something should be EASY to use as opposed to HARD to use.
- Query power and query sophistication: Queries should be POWERFUL rather than UNSOPHISTICATED. (The more you can do, the better, as long as what you can do is useful to you.)
- Performance, query speed: Performance should be FAST rather than SLOW.
- Expressive power: The expressiveness of the system should be EXPRESSIVE as compared to INEXPRESSIVE. (The more you can do, the better, as long as what you can do is useful to you.)
- System flexibility, agility: A system should be FLEXIBLE as compared to INFLEXIBLE. (Flexibility should be judged by where the user needs the flexibility. Flexibility in the wrong places causes a system to be harder to use than necessary. Unnecessary options are a bug, not a feature.)
- System scalability: A system might need to SCALE rather than NOT SCALE.
- Global standard: A system might be better if it is more STANDARD than PROPRIETARY.
- Cost effective: A system could be either EXPENSIVE or INEXPENSIVE.
- Maintainability: A system could be either HARD TO MAINTAIN or EASY TO MAINTAIN.
It has been my observation that many people's expectations are at an inappropriate level. You have probably heard the saying, "Ignorance is bliss." Not knowing something is often more "comfortable" than knowing it. For example, it has been my experience that a lot of people overuse Microsoft Excel for things that are much, much more easily achieved using Microsoft Access.
Not knowing Access does not make Excel better than Access. What it means is that someone who understands both Access and Excel can be more effective and efficient than someone who knows only Excel, all other things considered. Frankly, I see people doing extremely foolish things in Excel.
Basing your expectations on what you currently know can be dangerous and limiting. For example, people who understand only SQL, have never used XQuery, and still say SQL is better are being foolish. Why would an organization like the W3C go to the trouble of creating XQuery if SQL could do everything that XQuery does?
People tend to agree that data comes in different sorts of structures. The following is a summary of the spectrum of data or information structures:
- Unstructured Data: Unstructured data follows no specified data model or structure. The truth is that even unstructured information has some structure. Computers can ONLY deal with information which has been structured. For example, text tends to exist in a file. That file is a structure. Text files have mechanisms for structuring even unstructured information, such as line feeds and blank lines.
- Semi-structured Data: Semi-structured data is a cross between unstructured and structured data. It has some structure, but lacks a strict data model.
- Structured Data: Structured data, or highly structured data, is data or information that conforms to some rigid data model. For example, data in a relational database is structured data.
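To make the spectrum above a little more concrete, here is a minimal sketch that holds roughly the same information in all three forms. The company names, numbers, and the `financials` table are made up for illustration; only the standard `json` and `sqlite3` modules are used.

```python
import json
import sqlite3

# Unstructured: free text. The only structure is characters and line breaks.
note = "Acme Corp reported revenue of 1200 for fiscal year 2023."

# Semi-structured: self-describing keys, but no enforced schema.
# The second record carries an extra field and nothing objects.
records = json.loads("""
[
  {"company": "Acme Corp", "year": 2023, "revenue": 1200},
  {"company": "Globex", "year": 2023, "revenue": 950, "note": "restated"}
]
""")

# Structured: a relational table with a fixed set of typed columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials ("
    "company TEXT NOT NULL, year INTEGER NOT NULL, revenue INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO financials (company, year, revenue) VALUES (?, ?, ?)",
    [(r["company"], r["year"], r["revenue"]) for r in records],
)

# The rigid model is what makes a query like this straightforward.
total_revenue = conn.execute(
    "SELECT SUM(revenue) FROM financials WHERE year = 2023"
).fetchone()[0]
```

Notice that the extra "note" field fits into the semi-structured form without any fuss, but it would need a schema change before the table could hold it. That is the trade-off in a nutshell.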
What is the right approach? Well, that depends. Not everything in life fits into neat little boxes. On the other hand, life is not totally random either. There are advantages to information which can be formally structured. There are also advantages to unstructured information. Each, likewise, has disadvantages. The trick is to pick the appropriate approach for the specific situation.
People tend to agree that information can be structured for presentation or for meaning.
- Structured for presentation: HTML, word processing documents, PDF, and believe it or not even electronic spreadsheets are structured for presentation. For example, the sections of a document, a paragraph, a sentence, or the format of a word are all presentation structures. The workbooks, spreadsheets, columns, rows, and cells of a spreadsheet are presentation structures.
- Structured for meaning: XBRL is something that is structured for meaning, not presentation. There is presentation-oriented information within the structured meaning, such as the XBRL presentation linkbase, which communicates the ordering of report elements.
This video, How XBRL Works, walks you through the difference between information structured for presentation and information structured for meaning.
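Here is a rough sketch of that difference in plain Python. This is not XBRL syntax; the grid, the `facts` list, and the field names `concept`, `period`, `unit`, and `value` are all hypothetical and only meant to show the idea.

```python
# Structured for presentation: a grid of cells, like a spreadsheet.
# The numbers are formatted text, and their meaning depends on position.
sheet = [
    ["",           "2023",  "2022"],
    ["Revenue",    "1,200", "1,100"],
    ["Net income", "300",   "250"],
]

# To find 2023 revenue you have to know the layout: row 1, column 1.
revenue_2023_text = sheet[1][1]

# Structured for meaning: every value carries its own context,
# so nothing depends on where it happens to be displayed.
facts = [
    {"concept": "Revenue",   "period": 2023, "unit": "USD", "value": 1200},
    {"concept": "Revenue",   "period": 2022, "unit": "USD", "value": 1100},
    {"concept": "NetIncome", "period": 2023, "unit": "USD", "value": 300},
    {"concept": "NetIncome", "period": 2022, "unit": "USD", "value": 250},
]


def lookup(concept, period):
    """Return the value reported for a concept in a given period."""
    return next(
        f["value"]
        for f in facts
        if f["concept"] == concept and f["period"] == period
    )


revenue_2023 = lookup("Revenue", 2023)
```

If someone inserts a row or swaps the columns in the grid, `sheet[1][1]` quietly returns something else. The `lookup` call does not care how the facts are ordered.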
People tend to agree that information is composed of "things" and that those "things" can be related. There are different ways to express those "things" (sometimes called "entities") and the relations between the things.
- Subject-predicate-object: The approach used by RDF is subject-predicate-object. One description expands on this a bit: Subject (start) - Predicate (connector) - Object (end). The subject and object are both things and the predicate is how the things are related. Things can be related in many different ways.
- Entity-attribute-value: This approach is very similar to subject-predicate-object.
- Entity-relationship model: An entity-relationship model is a systematic way of describing and defining a business process. It seems that subject-predicate-object and entity-attribute-value are approaches to implementing an entity-relationship model.
- Set theory: Another approach to describing things and relations between things.
- Conceptual model or domain model: A conceptual model seems to rest on the general notion that a model is anything used in any way to represent anything else. This is a great description of what a conceptual model is: "Conceptual modeling is the activity of formally describing some aspects of the physical and social world around us for the purposes of understanding and communication."
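The first two approaches in the list above are easier to compare side by side. The following is a minimal sketch using plain Python tuples rather than any real RDF library; the entity names, predicates, and helper functions are invented for illustration.

```python
# Subject-predicate-object: each triple relates one thing to another thing.
triples = [
    ("AcmeCorp",   "hasSubsidiary", "AcmeEurope"),
    ("AcmeCorp",   "reportsIn",     "USD"),
    ("AcmeEurope", "reportsIn",     "EUR"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Entity-attribute-value: the same three-part shape, but the third
# part is usually read as a property value rather than another thing.
eav_rows = [
    ("AcmeCorp",   "currency",  "USD"),
    ("AcmeCorp",   "employees", 500),
    ("AcmeEurope", "currency",  "EUR"),
]


def entity_as_dict(entity):
    """Collect all attribute/value pairs recorded for one entity."""
    return {a: v for e, a, v in eav_rows if e == entity}


subsidiaries = objects_of("AcmeCorp", "hasSubsidiary")
acme = entity_as_dict("AcmeCorp")
```

Both lists are also just sets of tuples, which hints at how the set theory view describes the same things and relations.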
The "things" that you are working with are commonly grouped into three different "levels":
- Conceptual model: High-level description of how something in the real world works.
- Logical model: Lower-level description of the conceptual model.
- Physical model: Details of how the logical model is implemented using some technology.
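As a small illustration of the three levels, here is one possible sketch. The conceptual model is a plain sentence, the logical model is expressed as Python dataclasses, and the physical model is a pair of SQLite tables. The `Company` and `Report` entities and their fields are assumptions made up for this example.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual model: "A company files reports. Each report belongs
# to exactly one company and covers one fiscal year."


# Logical model: entities, attributes, and the relationship between
# them, without committing to any particular storage technology.
@dataclass(frozen=True)
class Company:
    company_id: int
    name: str


@dataclass(frozen=True)
class Report:
    report_id: int
    company_id: int  # the one company this report belongs to
    fiscal_year: int


# Physical model: how the logical model is implemented in SQLite.
DDL = """
CREATE TABLE company (
    company_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE report (
    report_id   INTEGER PRIMARY KEY,
    company_id  INTEGER NOT NULL REFERENCES company(company_id),
    fiscal_year INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

acme = Company(1, "Acme Corp")
report = Report(10, acme.company_id, 2023)
conn.execute("INSERT INTO company VALUES (?, ?)", (acme.company_id, acme.name))
conn.execute(
    "INSERT INTO report VALUES (?, ?, ?)",
    (report.report_id, report.company_id, report.fiscal_year),
)

row = conn.execute(
    "SELECT c.name, r.fiscal_year "
    "FROM report r JOIN company c ON c.company_id = r.company_id"
).fetchone()
```

The same logical model could just as easily be implemented physically as XML documents or RDF triples. Only the bottom level would change.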