The Lucene community has recently decided to merge the development of two of its sub-projects: Lucene/Java and Lucene/Solr. Both code bases now sit under the same trunk in svn, and Solr runs straight off the latest Lucene code at all times. This is only a merge of development, though. Release artifacts will remain separate: Lucene will remain a core search engine Java library, and Solr will remain a search server built on top of Lucene. From a user perspective, things will be much the same as they were, just better.

So what is with the merge?

Because of the way things worked in the past, even with many overlapping committers, many features that could benefit Lucene have been placed in Solr. They arguably “belonged” in Lucene, but for development reasons it benefited Solr to keep certain features contributed by Solr devs under Solr’s control. Moving some of this code to Lucene would mean that some Solr committers would no longer have access to it: a Solr committer who wrote and committed the code might lose the ability to maintain it without the assistance of a Lucene committer. And if Solr wanted to be sure to run off a stable, released version of Lucene, Solr’s release could be tied to Lucene’s latest release whenever some of this code needed to be updated. With Solr planning to update Lucene libs less frequently (due to the complexities of releasing with a development version of Lucene), there would be long waits for bug fixes to become available in Solr trunk.

All in all, there would be both pluses and minuses to refactoring Solr code into Lucene without the merge, but the majority felt the minuses outweighed the pluses. Past attempts at this kind of refactoring failed and resulted in similar but diverging code in both code bases. With many committers overlapping both projects, this was a very odd situation: fix a bug in one place, then go and look for the same bug in similar but different code in another place, perhaps only being able to commit in one of the two spots.

With merged dev, there is now a single set of committers across both projects. Everyone in both communities can now drive releases, so when Solr releases, Lucene will also release, easing concerns about releasing Solr on a development version of Lucene. Solr will now always be on the latest trunk version of Lucene, and code can be easily shared between projects: Lucene will likely benefit from Analyzers and QueryParsers that were only available to Solr users in the past. Lucene will also benefit from greater test coverage, since you can now make a single change in Lucene and run the tests for both projects, getting immediate feedback on the change by testing an application that uses the Lucene libraries extensively. Both projects will also gain from a wider development community, as this change will foster more cross-pollination between Lucene and Solr devs (now just Lucene/Solr devs).

All in all, I think this merge is going to be a big boon for both projects. A tremendous amount of work has already been done to get Solr working with the latest Lucene APIs and to allow for a seamless development experience with Lucene/Solr as a single code base (the Lucene/Solr tests are also ridiculously faster than they were!). Look for some really fantastic releases from Lucene/Solr in the future.

About Mark Miller
