Blog, Open Source, SearchHub, Technical Article

Solr Dev Diary: Solr and Near Real-Time Search

by Mark Miller
April 9, 2011

Solr’s UpdateHandler has gotten a little crusty. Many of the implementation details are there due to old, tired, and removed requirements and functions. For those that do not know, documents that you add to Solr are actually put into the index by the UpdateHandler.

There are two details about the current UpdateHandler implementation that are particularly limiting.

First, Solr uses it’s own lock’s on top of Lucene, adding a courser, unnecessary layer of locking on top of the IndexWriter. These locks had a reason to exist once upon a time, but really, they no longer do. There is no reason to block additional document adds while performing a commit, but currently this is what Solr does. Removing these locks will reduce complexity and maintenance costs by allowing us to ‘mostly’ just use Lucene’s locking. Solr will also more easily simply inherit improvements from Lucene in this area.

Second, because of historical requirements, Solr will close and open a new IndexWriter on every commit. This means that every commit waits for all background Index merging threads to finish merging. This can be a non insignificant amount of time – and during this time you cannot add any documents to the index. You also cannot see the documents that have just been added to the index until the merges and commit are complete. Really, the UpdateHandler should simply commit and open a new SolrIndexSearcher – with the background threads happily merging *in the background*.

There are a few other things that bug me as well.

Well I’m going to fix them all now. Time to remove the crust and introduce Lucene near-real-time support to Solr. You should be able to open a new view on recently added content with Solr in a fraction of the time possible right now. It’s not right that you have to juggle SolrCore’s to attempt near real time index updates – it’s time to make things easier. Time to makes things faster.

And when Lucene finishes it’s real-time support and stops IndexWriter flushes from blocking document additions, Solr will be even more ready to take advantage where it can. There will still be more to do – not everything Solr does is yet per segment, and replication is not currently very near-real-time friendly – but we will keeping moving things in the right direction.

I’m tackling these changes here: https://issues.apache.org/jira/browse/SOLR-2193

– Mark

About Mark Miller

LEARN MORE

Contact us today to learn how Lucidworks can help your team create powerful search and discovery applications for your customers and employees.

Fusion Platform Overview

Fusion Platform Pricing

AI Hub

Lucidworks Features and capabilities (all Included)

Product Discovery

Searchandising

Site Search

Workplace Search

Ingest Data and Capture Signals

Employee Search Experience

Customer Service and Case Resolution

AI and Large Language Models

Solutions

Commerce

Customer Service

Knowledge Management

Industries

Retail

Government and Public Sector

Healthcare

B2B Commerce and Distribution

B2B Manufacturing

Financial Services

EXPLORE OUR CONTENT

Ebooks & Reports

Blog

Videos

Press

Resources

About Lucidworks

Documentation

Careers

LucidAcademy

Contact Us

Technical Support

Solr Dev Diary: Solr and Near Real-Time Search

About Mark Miller

LEARN MORE

Fusion Platform Overview

Fusion Platform Pricing

AI Hub

Lucidworks Features and capabilities (all Included)

Product Discovery

Searchandising

Site Search

Workplace Search

Ingest Data and Capture Signals

Employee Search Experience

Customer Service and Case Resolution

AI and Large Language Models

Solutions

Commerce

Customer Service

Knowledge Management

Industries

Retail

Government and Public Sector

Healthcare

B2B Commerce and Distribution

B2B Manufacturing

Financial Services

EXPLORE OUR CONTENT

Ebooks & Reports

Blog

Videos

Press

Resources

About Lucidworks

Documentation

Careers

LucidAcademy

Contact Us

Technical Support

About Mark Miller

Related Articles

LEARN MORE