You’re using Solr, or some other Lucene-based search solution… or you should be, and will be! Either way, you are (or will be) building on top of a top-notch search library, Apache Lucene.

Solr makes using Lucene easier – you can index a variety of data sources pretty much out of the box, and you can easily integrate features such as faceting, highlighting, and spellchecking – all without writing Java code. And if that’s all you need and it works solidly for you, awesome! You can stop reading now and attend one of our other excellent training courses that fits your needs. But if you are a tinkerer and want to know what makes Solr shine, or if you need some new or improved feature, read on…

Deeper down, Lucene is cranking – analyzing, buffering, and indexing your documents, merging segments, parsing queries, caching data structures, rapidly hopping around an inverted index, computing scores, navigating finite state machines, and much more.
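
To make the “inverted index” idea concrete, here is a minimal sketch in plain Java. It is a toy, not Lucene’s implementation: the class name `TinyInvertedIndex`, its methods, and the lowercase-and-split “analysis” are all assumptions made for illustration, and real Lucene adds segments, scoring, compression, and much more on top.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, for illustration): term -> sorted document ids. */
final class TinyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Stand-in for analysis: lowercase, then split on anything that is not a-z or 0-9. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Indexes a document by recording its id under every term it contains. */
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents that contain every query term (AND semantics). */
    List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : analyze(query)) {
            TreeSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy so the index itself is not modified
            } else {
                result.retainAll(docs);         // intersect with this term's postings
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Lucene in Action");
        index.add(2, "Solr makes using Lucene easier");
        index.add(3, "Query parsing in Solr");
        System.out.println(index.search("lucene"));       // prints [1, 2]
        System.out.println(index.search("solr lucene"));  // prints [2]
    }
}
```

The point of the structure is that a search never scans documents: it looks up each term’s posting list and intersects them, which is the “hopping around” mentioned above in miniature.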

So how do you go about learning Lucene in more depth? You can start with our “Understanding Lucene” DZone refcard. And let’s not forget Lucene in Action, as it’s the most polished, detailed, and well-crafted documentation available on the Lucene library. And of course there’s the incredibly vibrant and helpful Lucene open source community. Those resources will serve you well, but there’s no substitute for live, interactive, personal training to get you up to speed fast with best practices.

I’m in the process of overhauling our Lucene training course, which I’ll personally be delivering at Lucene EuroCon 2011 in Barcelona next month. This new and improved course takes an activity-based approach to learning and using Lucene’s API, beginning with the common tasks in building solutions using Lucene, whether you’re building directly to Lucene’s API or writing custom components for Solr.

One area that I’m particularly jazzed about teaching is “query parsing”, the process of taking a user’s (or machine’s) search request and turning it into the appropriate underlying Lucene Query object instance. Many folks developing with Lucene are familiar with Lucene’s QueryParser. But did you know there are a couple of other query parsers with special powers? There’s the surround query parser, enabling sophisticated proximity SpanQuery clauses. And there’s the mysterious “XML query parser” (don’t let the ugly-sounding name dissuade you) that slots dynamic query parameters, such as those coming from an “advanced search” request, into a tree-structured query template. There’s some more insight into the world of Lucene query parsers in an “Exploring Query Parsers” blog post.
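
If “turning a search request into a query object” sounds abstract, the following sketch shows the shape of the task in plain Java. It is deliberately not Lucene’s QueryParser: `ToyQueryParser`, `Clause`, and the tiny `+`/`-`/`field:term` syntax are assumptions invented for this illustration, and only the `MUST`/`SHOULD`/`MUST_NOT` names are borrowed from Lucene’s `BooleanClause.Occur`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy query parser (hypothetical): turns "+title:lucene -solr parsing" into clause objects. */
final class ToyQueryParser {
    /** Names borrowed from Lucene's BooleanClause.Occur; everything else here is made up. */
    enum Occur { MUST, SHOULD, MUST_NOT }

    /** One parsed clause: whether it is required, optional, or prohibited, plus field and term. */
    static final class Clause {
        final Occur occur;
        final String field;
        final String term;

        Clause(Occur occur, String field, String term) {
            this.occur = occur;
            this.field = field;
            this.term = term;
        }

        @Override
        public String toString() {
            String prefix = occur == Occur.MUST ? "+" : occur == Occur.MUST_NOT ? "-" : "";
            return prefix + field + ":" + term;
        }
    }

    private final String defaultField;

    ToyQueryParser(String defaultField) {
        this.defaultField = defaultField;
    }

    /** Parses whitespace-separated clauses; terms without "field:" go to the default field. */
    List<Clause> parse(String input) {
        List<Clause> clauses = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            Occur occur = Occur.SHOULD;
            if (token.startsWith("+")) {
                occur = Occur.MUST;
                token = token.substring(1);
            } else if (token.startsWith("-")) {
                occur = Occur.MUST_NOT;
                token = token.substring(1);
            }
            int colon = token.indexOf(':');
            String field = colon > 0 ? token.substring(0, colon) : defaultField;
            String term = colon > 0 ? token.substring(colon + 1) : token;
            if (term.isEmpty()) {
                continue;  // skip blank input and dangling "+", "-", or "field:"
            }
            clauses.add(new Clause(occur, field, term.toLowerCase(Locale.ROOT)));
        }
        return clauses;
    }

    public static void main(String[] args) {
        ToyQueryParser parser = new ToyQueryParser("body");
        // prints [+title:lucene, -body:solr, body:parsing]
        System.out.println(parser.parse("+title:Lucene -solr parsing"));
    }
}
```

The real parsers do the same job with far richer syntax (phrases, ranges, proximity, nested boolean groups) and run terms through an analyzer, but the output is the same kind of thing: a structured query object rather than a string.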

What about all the activity in the Lucene contrib modules in the Lucene 3.x releases? Here’s some of the goodness: better Unicode handling with the ICU tokenizers and filters, improved stemming, and many other analysis improvements; field grouping/collapsing; and block join/query for handling particular parent/child relationships.

Come learn the latest about the amazing Lucene library at Lucene EuroCon!  You, your boss, and your projects will all be glad you did.

About Erik Hatcher
