The Lucene Stack is a convenient paradigm for talking about the libraries and applications organized around the Lucene core library that make development faster and easier for search application developers.

The Lucene Stack is a solution stack designed to solve common search and text analysis problems. Centered around Apache Lucene and Solr, the stack brings together many components that can be used for content extraction, indexing and search.


The Lucene Stack provides a suite of tools for solving common search problems such as faceting, content extraction, crawling and database connectivity. While many of the tools can be used standalone, combining them or using them through Solr makes development even easier.

Each layer of the stack provides key functionality to build out a powerful search-based application. These capabilities are outlined below:

  • Java (TM) – The ubiquitous, platform-independent compiler and virtual machine.
  • Application Infrastructure – The environment the application is deployed into, e.g. Apache Tomcat, Jetty, JBoss, a standalone process or others.
  • Connectors and Crawlers – Tools like Droids and JDBC are responsible for acquiring the content to be added to the application.
  • Apache Lucene Java – The well-regarded, high-performance core search library used in production by countless applications.
  • Apache Tika – An easy-to-use content extraction framework that makes working with MS Office (TM), Adobe PDF (TM) and many other file formats a snap.
  • Apache Solr – A scalable, enterprise-ready search server that combines Lucene and Tika with powerful features like faceting, replication, easy installation and much more.
  • Application – The driving force for it all: your ideas, inspiration and know-how. You provide the ideas, the Lucene Stack provides the capabilities to accelerate those ideas!
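To make the layering a little more concrete, here is a toy sketch in plain Java of how content might flow through the stack: a connector supplies raw documents, an extraction step reduces them to plain text, an index maps terms to documents, and a search call returns hits along with facet counts. None of this is the Lucene, Tika or Solr API. The names `StackSketch`, `Doc`, `TinyIndex`, `extractText` and `buildSampleIndex` are hypothetical and exist only for illustration, and each real component does far more than its stand-in here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StackSketch {

    /** Hypothetical document: an id, a category used as a facet, and raw content. */
    static final class Doc {
        final int id;
        final String category;
        final String raw;

        Doc(int id, String category, String raw) {
            this.id = id;
            this.category = category;
            this.raw = raw;
        }
    }

    /** Stand-in for the content extraction layer: replaces anything tag-like with a space. */
    static String extractText(String raw) {
        return raw.replaceAll("<[^>]*>", " ").trim();
    }

    /** Stand-in for the search library layer: a tiny in-memory inverted index. */
    static final class TinyIndex {
        private final Map<String, List<Doc>> postings = new HashMap<>();

        /** Extracts text, lowercases it, splits on non-word characters and records each term. */
        void add(Doc doc) {
            for (String term : extractText(doc.raw).toLowerCase().split("\\W+")) {
                if (term.isEmpty()) {
                    continue;
                }
                List<Doc> docs = postings.computeIfAbsent(term, k -> new ArrayList<>());
                if (!docs.contains(doc)) {
                    docs.add(doc);
                }
            }
        }

        /** Returns the documents containing the given single term (case-insensitive). */
        List<Doc> search(String term) {
            return postings.getOrDefault(term.toLowerCase(), new ArrayList<>());
        }

        /** Stand-in for a search server feature: counts hits per category (faceting). */
        Map<String, Integer> facetCounts(String term) {
            Map<String, Integer> counts = new TreeMap<>();
            for (Doc d : search(term)) {
                counts.merge(d.category, 1, Integer::sum);
            }
            return counts;
        }
    }

    /** Stand-in for the connector layer: a few hard-coded documents. */
    static TinyIndex buildSampleIndex() {
        TinyIndex index = new TinyIndex();
        index.add(new Doc(1, "manual", "<p>Lucene is a search library</p>"));
        index.add(new Doc(2, "manual", "<h1>Solr</h1> is a search server"));
        index.add(new Doc(3, "blog", "<b>Tika</b> extracts text"));
        index.add(new Doc(4, "blog", "Search tips for Solr"));
        return index;
    }

    /** Stand-in for the application layer: issues a query and prints the results. */
    public static void main(String[] args) {
        TinyIndex index = buildSampleIndex();
        System.out.println(index.search("search").size()); // prints 3
        System.out.println(index.facetCounts("search"));   // prints {blog=1, manual=2}
    }
}
```

In a real application each stand-in would be replaced by the corresponding layer: a crawler or JDBC connector instead of the hard-coded documents, Tika instead of the regular expression, and Lucene or Solr instead of the hash map.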

Additionally, for those who need tools for classification, extraction and clustering, the Apache Mahout project is an early-stage Lucene subproject aimed at bringing commercially friendly, scalable machine learning capabilities to the Lucene Stack. Also note that the Lucene project contains several ports of the core Lucene library, including .NET, PyLucene, C and C++. More than likely, the Lucene Stack and its affiliated tools can meet the needs of most search and text-based applications.