NoSQL, Lucene and Solr

The other day, Michael Coté asked me where Apache Lucene and Solr fit in with the NoSQL movement (having heard about the Guardian’s use of Solr). My reply: I haven’t used SQL in any significant way since I started using Lucene in 2004, and I started my career doing Oracle DBA work way back when. We just didn’t have a fun name for it “back in the day”.

All kidding aside and at the risk of jumping on the buzzword bandwagon, let’s take a look at Wikipedia’s definition of NoSQL:

NoSQL (Not only SQL) is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases. These data stores may not require fixed table schemas, usually avoid join operations and typically scale horizontally.

Now let’s apply that definition to Lucene (we’ll get to Solr in a moment):

  1. NoSQL: Check.  Ironically, many people have layered SQL on top of Lucene.  Guess we should also argue for inclusion in the No-NoSQL movement (as Coté suggests)!
  2. Loosely defined class of non-relational data stores that break with a long history of relational databases: Check.  Once again, ironically, Lucene covers both sides of the aisle here.
  3. No fixed schemas: Been there, done that, bought the t-shirt.  Then again, Lucene supports fixed schemas too.
  4. Avoid joins: Check.  Denormalization frees your mind (at least in many cases).  That said, you can do joins in Lucene with some work.
  5. Scales horizontally: Yes and no.  Lucene scales quite well, but you’re going to have to do some work to make it so.  Enter Solr.
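
To make items 3 and 4 a bit more concrete, here is a minimal sketch in plain Python of a schemaless, denormalized document store backed by a toy inverted index. This is not Lucene's API: the `TinyIndex` class, its methods and the sample fields are all hypothetical, made up purely to illustrate the idea.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: a hypothetical stand-in, not Lucene's API."""

    def __init__(self):
        self.docs = []                      # stored documents, by position
        self.postings = defaultdict(set)    # (field, term) -> set of doc ids

    def add(self, doc):
        """Index a document; any dict of fields is accepted (no fixed schema)."""
        doc_id = len(self.docs)
        self.docs.append(doc)
        for field, value in doc.items():
            for term in str(value).lower().split():
                self.postings[(field, term)].add(doc_id)
        return doc_id

    def search(self, field, term):
        """Return the stored documents whose field contains the term."""
        ids = sorted(self.postings.get((field, term.lower()), set()))
        return [self.docs[i] for i in ids]


index = TinyIndex()
# Denormalized: the author's name is copied into each article instead of
# living in a separate table that would need a join at query time.
index.add({"title": "NoSQL and search", "author_name": "Ada Example"})
# A different set of fields in the same index: no schema change required.
index.add({"title": "Scaling search", "author_name": "Ada Example", "tags": "solr zookeeper"})

hits = index.search("author_name", "ada")
print([d["title"] for d in hits])
```

The second document carries a `tags` field the first one lacks, and finding everything by an author needs no join because the author's name travels with each document.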

Since Solr is “Lucene Best Practices” all wrapped up in an easy-to-use server, it covers items 1-4 with no problem.  And, get this, it also scales horizontally in terms of both data size and query volume.  Plus, with the recent addition of Apache ZooKeeper to Solr (aka Solr Cloud), scaling has never been easier.  At the end of the day, you get all the benefits of NoSQL (under an eventually consistent model), plus built-in features like free-text search, faceting, spell checking, similar item search, hit highlighting and a whole host of other things that have been proven out in thousands of installations around the world.
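
For a rough feel of what scaling horizontally by data size involves, here is a small Python sketch of the general scatter-gather pattern: documents are spread across shards, a query goes to every shard, and the per-shard top results are merged. It is an illustration under my own assumptions, not Solr's actual implementation; the function names, the hash-based routing and the term-count scoring are all hypothetical.

```python
import heapq
import zlib


def shard_for(doc_id, num_shards):
    """Pick a shard by hashing the document id (stable across runs, unlike hash())."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def index_docs(docs, num_shards):
    """Spread documents over shards; each shard holds only part of the data."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc["id"], num_shards)].append(doc)
    return shards


def search_shard(shard, term, k):
    """Score one shard locally: the score is just the term's count in the text."""
    scored = [(doc["text"].lower().split().count(term), doc["id"]) for doc in shard]
    return heapq.nlargest(k, [s for s in scored if s[0] > 0])


def search(shards, term, k=2):
    """Scatter the query to every shard, then merge the per-shard top-k lists."""
    partial = [hit for shard in shards for hit in search_shard(shard, term, k)]
    return heapq.nlargest(k, partial)


docs = [
    {"id": "a", "text": "solr solr lucene"},
    {"id": "b", "text": "lucene"},
    {"id": "c", "text": "solr zookeeper"},
    {"id": "d", "text": "solr solr solr"},
]
shards = index_docs(docs, num_shards=3)
print(search(shards, "solr"))
```

Because every shard returns its own top k, the merged top k is the same no matter how the documents happen to be distributed. Scaling for query volume is a separate concern (typically handled by replicating shards), which this sketch leaves out.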

NoSQL never looked so good.
