Continuing on from where I left off in Understanding Database/Query Options (Part 1)...
People tend to agree that there are different types of databases. Other terms you will see for this are DBMS (database management system) or database model. Keep in mind that there are databases and there are modeling approaches used by databases, and these are different things. For example, a relational database can use a multidimensional approach to representing information within that database.
- RDBMS: Relational database management system, which is a database based on the relational model or set theory. The relational model is a two dimensional structure: rows and columns. (Note that you can use a multidimensional structure in a relational database.)
- Hierarchical database: A hierarchical database management system is a system which follows the hierarchical model.
- Object database: An object database is a database management system in which information is represented in the form of objects (it follows the object model), similar in approach to how objects are used in object-oriented programming.
- Network database: A network database is a database management system which follows the network model.
- Multidimensional database: A multidimensional database or multidimensional engine is a system which is fundamentally built to work using the multidimensional model. (That is, it is not a relational database which is then structured to mimic the multidimensional model (explained here); it inherently uses the multidimensional model. This video explains the difference.)
- NoSQL database: A NoSQL (not only SQL) database provides a system which is based on an open data structure (e.g. tree, graph, key-value, document) which is generally something other than tabular (although nothing really prohibits the use of table-like structures). Basically, a NoSQL database is very flexible and you have to manage the structure yourself. (For more information on NoSQL databases see here, here, here, here, here)
- Triplestore: A triplestore or RDF triplestore is a purpose-built database for the storage and retrieval of triples, such as RDF, which is a graph of subject-predicate-object relations.
- Flat file database: A flat file database is a system where in essence one or more files are used to store data.
- Linked data: Linked data is basically seeing the entire internet as a database. So the "system" is the internet itself.
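To make a few of these categories a bit more concrete, here is a minimal sketch in Python that stores the same made-up fact three ways: as a row in a relational table, as a document under a key (one NoSQL style), and as subject-predicate-object triples. The table name, keys, and values are assumptions for illustration only, and plain Python dictionaries and sets stand in for a real NoSQL database or triplestore.

```python
import sqlite3

# Relational: the structure (columns) is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, revenue INTEGER)"
)
conn.execute("INSERT INTO company VALUES (1, 'Acme', 1000)")
relational_row = conn.execute(
    "SELECT name, revenue FROM company WHERE id = 1"
).fetchone()

# Key-value / document style: each value carries its own structure,
# and you manage that structure yourself.
document_store = {
    "company:1": {"name": "Acme", "revenue": 1000, "tags": ["sample"]},
}
document = document_store["company:1"]

# Triple style: every statement is (subject, predicate, object).
triples = {
    ("company:1", "name", "Acme"),
    ("company:1", "revenue", 1000),
}

def objects_of(subject, predicate):
    """Return all objects stated for a given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]
```

The information is the same in all three cases; what changes is who is responsible for the structure and how you ask for the data back.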
People tend to agree that the categories of processing of information can be broken down into at least two groups: (this presentation compares and contrasts OLTP and OLAP)
- OLTP: Online Transaction Processing, transaction oriented where the system responds immediately to user requests.
- OLAP: Online Analytical Processing, an approach to answering multi-dimensional analytical (MDA) queries swiftly.
OLTP is what relational databases did when they were first created. OLAP was invented as an approach to making analysis of information, many times from OLTP systems data, easier. OLAP uses the notion of "cubes".
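A rough way to see the difference, sketched with Python's built-in sqlite3 module: the OLTP-style work is a small transaction that records individual sales, and the OLAP-style work is a query that summarizes many rows along a dimension. The sales table and its figures are invented for this example, and a real OLAP engine does far more than a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount INTEGER)")

# OLTP-style: short transactions that record individual events.
# Using the connection as a context manager commits on success
# and rolls back if an exception is raised.
with conn:
    conn.execute("INSERT INTO sales VALUES (?, ?, ?)", ("East", 2013, 100))
    conn.execute("INSERT INTO sales VALUES (?, ?, ?)", ("East", 2014, 150))
    conn.execute("INSERT INTO sales VALUES (?, ?, ?)", ("West", 2013, 200))

# OLAP-style: an analytical question that aggregates over many rows
# along one dimension (region).
totals_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

Here the same table serves both purposes, which is roughly the situation OLAP tools were invented to improve on.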
OLAP can be broken down into categories also:
- ROLAP: Relational OLAP which stores information in a relational database.
- MOLAP: Multidimensional OLAP which stores information in a multidimensional array rather than a relational database.
- HOLAP: Hybrid OLAP which combines the best of ROLAP and the best of MOLAP, allowing a tradeoff of the advantages of each.
- NOLAP: The term NOLAP means NoSQL Analytical Processing or NoSQL OLAP. NOLAP uses NoSQL and XBRL to express cubes which do what OLAP does but have fewer constraints.
- RDF Data Cube: The W3C published a data cube model. That cube model states, "The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations."
- SDMX: The RDF data cube references this SDMX cube model as mentioned above.
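The ROLAP versus MOLAP distinction is easier to see side by side. Below is a minimal sketch, assuming a tiny two-dimensional cube (region by year) with invented numbers: the ROLAP-style version keeps facts as rows and scans them, while the MOLAP-style version keeps one array cell per coordinate and looks it up by position. The function names are hypothetical and this is not how any particular product implements either approach.

```python
# Dimensions of a tiny cube: region x year.
regions = ["East", "West"]
years = [2013, 2014]

# ROLAP-style: facts are rows, as they would be in a relational table.
fact_rows = [
    ("East", 2013, 100),
    ("East", 2014, 150),
    ("West", 2013, 200),
]

def rolap_lookup(region, year):
    """Answer a cell query by scanning and filtering the rows."""
    return sum(a for (r, y, a) in fact_rows if r == region and y == year)

# MOLAP-style: a dense array with one cell per (region, year) coordinate.
cube = [[0 for _ in years] for _ in regions]
for region, year, amount in fact_rows:
    cube[regions.index(region)][years.index(year)] += amount

def molap_lookup(region, year):
    """Answer a cell query by indexing directly into the array."""
    return cube[regions.index(region)][years.index(year)]

def molap_rollup_by_region(region):
    """Roll up across the year dimension for one region."""
    return sum(cube[regions.index(region)])
```

Note that the array holds a cell for West/2014 even though no fact exists for it, which hints at the storage tradeoff that HOLAP tries to balance.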
Ran across a few things which I want to investigate further relating to semantic model:
Most of what I have pointed out so far in Part 1 and Part 2 is fact which few people tend to dispute. Some people might add something to a list, change a term, or maybe even remove something from the lists that I have provided. When you have to choose something from a list, then things change. There is an interesting phenomenon that I have noticed when one tries to select something from the list. Specific software vendors always have "the best solution". Always! It is unreal. I don't think I have ever run across a software vendor who says, "Yeah, sounds like you need a multidimensional database based on what you are describing and we sell a relational database type system..." You ever have that experience?
Hadoop is a set of tools for working with "big data". Hadoop is open source (an Apache project written in Java) and typically runs on Linux; the direction of Hadoop is not determined by any single software vendor. This video explains Hadoop. This also explains Hadoop.
Gotta walk the dog again...Guess there will be a Part 3.