Guest Blog Post
Many Lucene implementations, of varying vintages, date back to a time when Solr lacked core capabilities, which could then be had only by building from scratch all the underlying services the Lucene search libraries need. Happily, Solr today offers a very complete implementation of Lucene functionality, with some small but meaningful exceptions.
Does moving from Lucene to Solr make sense for you? Here are some thoughts on the advantages of doing so.
- Using Standards by Default
As a server application running in Jetty, Tomcat, or any other servlet container, Solr easily installs, integrates, and runs in existing production environments.
- Makes Lucene Best-Practices ready to use
From caching filters, queries, or documents, through spell-checking, to warming searchers in the background, Solr offers an enormous set of search features that would otherwise take considerable Lucene experience and expertise to build yourself. Solr lets you benefit from these low-level developments immediately, simplifying the creation and development of your search environment.
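To give a sense of the plumbing involved, here is a minimal sketch of the kind of least-recently-used cache a raw Lucene application might end up writing by hand. The class name `QueryResultCache` and the use of query strings as keys are assumptions made for illustration; this is not Solr's own cache implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical hand-rolled LRU cache keyed by query string; illustration only, not Solr code. */
class QueryResultCache<V> extends LinkedHashMap<String, V> {
    private final int maxEntries;

    QueryResultCache(int maxEntries) {
        // accessOrder=true makes iteration order follow recency of access
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Evict the least recently used entry once the cache grows past its limit
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        QueryResultCache<int[]> cache = new QueryResultCache<>(2);
        cache.put("title:lucene", new int[] {1, 4, 7}); // made-up document ids
        cache.put("title:solr", new int[] {2, 3});
        cache.get("title:lucene");                      // touch: now most recently used
        cache.put("title:jetty", new int[] {9});        // evicts "title:solr"
        System.out.println(cache.keySet());             // prints [title:lucene, title:jetty]
    }
}
```

Even this toy version leaves out thread safety, invalidation when the index changes, and warming, which is the sort of work Solr already takes on for you.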
- Underlying services are built-in
As a server application running in Jetty or Tomcat, Solr easily installs and runs, and its underlying services are handled within the server. File operations, memory management, I/O configuration, and many more platform capabilities you would have to write yourself to build and deploy a Lucene application are already taken care of, simplifying creation and management of your search environment.
- Service Oriented Search
Solr’s RESTful APIs and its XML-driven implementation simplify configuration, operation, and search application development. With a rich array of client libraries, from the standard SolrJ for Java to clients for Python and many other languages, along with JSON output, the base of programming skills needed to build the search application is much narrower.
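Because search is served over HTTP, a client only has to assemble a URL. The sketch below builds a query URL using nothing but the Java standard library. The host, port, and `/solr/select` path are assumptions about a local installation and will differ by Solr version and setup; `q`, `rows`, and `wt` are standard Solr request parameters. The class name `SolrQueryUrl` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that assembles a Solr select URL; the endpoint is an assumed local install. */
class SolrQueryUrl {

    static String build(String selectEndpoint, String query, int rows) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);                     // the search query
        params.put("rows", Integer.toString(rows)); // maximum number of results
        params.put("wt", "json");                   // ask for a JSON response

        StringBuilder url = new StringBuilder(selectEndpoint).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Assumed local endpoint; adjust to your own installation
        System.out.println(build("http://localhost:8983/solr/select", "title:lucene search", 10));
    }
}
```

Any language with an HTTP client can issue the same request, which is why the skills needed on the client side are so modest. In a Java application, SolrJ would normally take the place of hand-built URLs like this one.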
- Re-use some of your Lucene Libraries and indexes
Because Lucene is at the core of Solr, your Lucene implementation can re-use many of the same libraries; just plug the details for your handlers or analyzers into Solr’s solrconfig.xml and start testing. Likely as not, you’ll find you can move much of your code as is from Lucene, or even use your existing indexes with Solr directly.
- Lower maintenance and revision costs
Many of the low-level, high-control advantages of implementing directly with Lucene are negated if anything changes in your environment. As a server, Solr insulates you from much of that, and helps remove the temptation to hard-code optimistically or to skip abstractions.
As large data sets grow in scope and distribution, search services will necessarily rely on a much higher level of abstraction, not only above data and I/O, but also to allow more elastic distribution of search resources and operations, i.e., shards and inserts/updates. If your business could reach a point where hardware scales by moving into some kind of cloud environment, you will benefit from Solr's forthcoming cloud-enabling capabilities, including distributed node management, relevancy calculations across multiple collections, etc.
It’s important to note that there are many good reasons not to migrate to Solr from Lucene, whether they have to do with the cost of a new abstraction model in your general application implementation, or with no real need for serving search over HTTP. With the merger of the Lucene and Solr development projects, you won’t be shortchanged on any of the underlying functionality. But at the same time, the stronger functional affinity between the two means you’ll have to give careful thought to your long-term deployment goals in order to pick the right one.
Simon Willnauer is a Lucene committer and search application developer living in Berlin.