What goes around comes around

Reading “A co-Relational Model of Data for Large Shared Data Banks” (a title inspired by Codd‘s “A Relational Model of Data for Large Shared Data Banks“) reminded me of M. Stonebraker’s and J. Hellerstein’s “What goes around comes around” which is one of the coolest papers one can read if one wants to know how life was before the Relational Model (that is before SQL for those who do not love theory). It is also a cool reading for the NoSQL crowd because while you might think this is all new, it’s actually a bit of a return to the past.

It seems that this co-relational model tries to answer my favorite question directed to NoSQLers: “Where is the math?” It does so by using category theory. However:

“Obviously, the precise formalization of SQL and noSQL as categories is outside the scope of this article”

Although my category-fu is non-existent I could not help not remembering Pauli‘s response to Heisenberg’s claims that they (Pauli and Heisenberg) had found a unified theory but the technical details were missing:

This is to show the world that I can paint like Titian.

Pauli's Titian

Only technical details are missing

It will be really nice if this duality is formally proven for I had a hunch that a relationship existed, only I could not really determine it.

[Pauli’s painting from here]

Greenspun’s Tenth Rule and variations

For those who have not heard Greenspun’s Tenth Rule, it states that:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

By the way, Greenspun‘s rules 1 to 9 do not exist.

Seven months ago, during a discussion about Prolog, I asked Ozan S. Yigit to reformulate Greenspun’s tenth rule for Prolog. Oz replied:

Any sufficiently complicated modern program contains a buggy, informal implementation of prolog that casual observers confuse with lisp.

Just hours earlier I was basically a listener in a discussion that involved NoSQL. While clearly I am not a NoSQL advocate, I am no hater either, but what I heard lead me to the following reformulation of Greenspun’s rule, this time involving the relational model:

Those who blindly adopt #NoSQL will discover a variation of Greenspun’s tenth rule

I am sure that many other variations exist. In fact the Wikipedia page on Greenspun’s Tenth Rule contains a Prolog variation similar to Ozan’s and an Erlang version. So if you know of (or can make up) any other, please post it here (or somewhere).

What I like about the NoSQL crowd

Although I am not a big fan of the NoSQL movement (mostly because many of its advocates use arguments I do not agree with) there are a few things that I like about the NoSQL crowd and I want to write them down*. Most of what follows stems from discussions through the years with our DBA and some friends who are members of the “Greek Database Mafia“.

For more than two decades the dominance of the relational model (even though no commercial system fully implemented it) was undisputed. Nobody ever got fired for choosing a commercial RDBMS for an application, where instead one would look suspicious if one dared to propose something different. This situation is no different than what Rob Pike described in “Systems Software Research is Irrelevant” for Operating Systems:

  • For example it took 10+ years for the R-Tree to enter the commercial systems, although it was solving a real problem. In the meantime if you were lucky and your system offered extensibility you could write it on your own.
  • No matter how novel the system, for it not to be marginalized it had to have an “SQL layer”. No SQL queries, no sales. Provide an SQL layer and all innovation of the product stays unused.
  • Ones proposal for an RDBMS purchase had to be among three or four commercial products. Anything else would likely be considered “a hacker’s choice” because “We make money! We cannot go that way!”

You could say that databases outside academic research had come to a halt. You don’t believe me? Just ask Yannis Ioannidis who uses to say that “Databases are dead in the most emphatic way when he wants to stir things up in a conversation.

And this is what I like about the NoSQL crowd (== implementers, advocates and integrators) . They do not care about established standards. They are not afraid to experiment in a “real environment”. Some of them may focus on a single problem and solve it well. Others may aim at a wider range of problems. But no system is stopped from being developed and deployed because it not “SQL compilant” or not relational. And even though some of these solutions resemble CODASYLo, once again there is action in the field.

But please people, stop marketing them as a “one solution fits all”. For we will again end up in a stagnation era, just like when everyone was storing stuff in an RDBMS for lack and fear of better suited solutions. They do not invalidate relational systems. They fill in the gap left by them.

[*] – As I had promised.

[†] – By the way, did you know that GROUP BY works outside of the “relational box”?

[‡] – I have heard him say this while giving a speech.

[o] – “one usually gets a low-level record-at-a-time DBMS interface“, says Mike Stonebraker.