Learn Lucene… deeper

by Erik Hatcher
September 12, 2011

You’re using Solr, or some other Lucene-based search solution, … or you should and will be! You are (or will be) building your solutions on top of a top-notch search library, Apache Lucene.

Solr makes using Lucene easier – you can index a variety of data sources pretty much out of the box, and you can integrate features such as faceting, highlighting, and spellchecking – all without writing Java code. And if that’s all you need and it works solidly for you, awesome! You can stop reading now and attend one of our other excellent training courses that fits your needs. But if you are a tinkerer and want to know what makes Solr shine, or if you need some new or improved feature, read on…

Deeper down, Lucene is cranking – analyzing, buffering, and indexing your documents, merging segments, parsing queries, caching data structures, rapidly hopping around an inverted index, computing scores, navigating finite state machines, and much more.
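To make “inverted index” a little more concrete, here is a toy sketch in plain Java. It is not how Lucene stores or searches its index; the class name TinyInvertedIndex and its methods are made up for illustration, and the “analysis” is nothing more than lowercasing and splitting on non-alphanumeric characters. It only shows the core idea: map each term to the documents that contain it, then answer a query by looking terms up rather than scanning documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only: this is NOT how Lucene stores or searches its index.
final class TinyInvertedIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();
    private int nextDocId = 0;

    // Stand-in for analysis: lowercase, then split on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Indexes one document and returns the id assigned to it.
    int add(String text) {
        int docId = nextDocId++;
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of documents containing every query term (a simple AND, no scoring).
    SortedSet<Integer> search(String query) {
        SortedSet<Integer> result = null;
        for (String term : analyze(query)) {
            SortedSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lucene in Action");               // doc 0
        index.add("Solr makes using Lucene easier"); // doc 1
        index.add("Exploring Query Parsers");        // doc 2
        System.out.println(index.search("lucene"));      // prints [0, 1]
        System.out.println(index.search("lucene solr")); // prints [1]
    }
}
```

Everything else in the list above (segments, merging, scoring, caching) is what Lucene layers around this basic term-to-documents lookup, and none of it is modeled here.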

So how do you go about learning Lucene deeper? You can start with our “Understanding Lucene” DZone refcard. And let’s not forget Lucene in Action, as it’s the most polished, detailed, and well-crafted documentation available on the Lucene library. And of course there’s the incredibly vibrant and helpful Lucene open source community. Those resources will serve you well, but there’s no substitute for live, interactive, personal training to get you up to speed fast with best practices.

I’m in the process of overhauling our Lucene training course, which I’ll personally be delivering at Lucene EuroCon 2011 in Barcelona next month. This new and improved course takes an activity-based approach to learning and using Lucene’s API, beginning with the common tasks in building solutions using Lucene, whether you’re building directly to Lucene’s API or you’re writing custom components for Solr.

One area that I’m particularly jazzed about teaching is “query parsing”, the process of taking a user’s (or machine’s) search request and turning it into the appropriate underlying Lucene Query object instance. Many folks developing with Lucene are familiar with Lucene’s QueryParser. But did you know there are a couple of other query parsers with special powers? There’s the surround query parser, enabling sophisticated proximity SpanQuery clauses. And there’s the mysterious “XML query parser” (don’t let the ugly-sounding name dissuade you) that slots dynamic query parameters, such as those coming from an “advanced search” request, into a tree-structured query template. There’s some more insight into the world of Lucene query parsers in an “Exploring Query Parsers” blog post.
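As a rough feel for what “turning a search request into a query object” means, here is a toy sketch in plain Java. It is a hypothetical mini-parser, not Lucene’s QueryParser: the names TinyQueryParser and Clause are invented for this example, and it understands only whitespace-separated terms with an optional leading + or - and an optional field: prefix, a small subset of the classic query syntax. The Occur names are borrowed from Lucene’s BooleanClause.Occur for familiarity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy illustration only: a hypothetical mini-parser, NOT Lucene's QueryParser.
final class TinyQueryParser {
    enum Occur { MUST, SHOULD, MUST_NOT }

    // One parsed clause: whether it is required, optional, or prohibited, plus field and term.
    static final class Clause {
        final Occur occur;
        final String field;
        final String term;

        Clause(Occur occur, String field, String term) {
            this.occur = occur;
            this.field = field;
            this.term = term;
        }

        @Override
        public String toString() {
            String prefix = occur == Occur.MUST ? "+" : occur == Occur.MUST_NOT ? "-" : "";
            return prefix + field + ":" + term;
        }
    }

    // Field used when a term carries no "field:" prefix.
    private final String defaultField;

    TinyQueryParser(String defaultField) {
        this.defaultField = defaultField;
    }

    // Splits on whitespace and turns each token into a Clause; terms are lowercased.
    List<Clause> parse(String input) {
        List<Clause> clauses = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            Occur occur = Occur.SHOULD;
            if (token.startsWith("+")) {
                occur = Occur.MUST;
                token = token.substring(1);
            } else if (token.startsWith("-")) {
                occur = Occur.MUST_NOT;
                token = token.substring(1);
            }
            int colon = token.indexOf(':');
            String field = colon > 0 ? token.substring(0, colon) : defaultField;
            String term = colon > 0 ? token.substring(colon + 1) : token;
            if (!term.isEmpty()) {
                clauses.add(new Clause(occur, field, term.toLowerCase(Locale.ROOT)));
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        TinyQueryParser parser = new TinyQueryParser("body");
        // prints [+title:lucene, -body:solr, body:parsers]
        System.out.println(parser.parse("+title:Lucene -solr parsers"));
    }
}
```

A real query parser also has to deal with phrases, grouping, ranges, wildcards, and running terms through an analyzer, which is a good part of why the different Lucene query parsers are worth studying.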

What about all the Lucene contrib module activity in the Lucene 3.x releases? Here’s a bit of the goodness: better Unicode handling with the ICU tokenizers and filters, improved stemming, and many other analysis improvements; field grouping/collapsing; and block join/query for handling particular parent/child relationships.

Come learn the latest about the amazing Lucene library at Lucene EuroCon! You, your boss, and your projects will all be glad you did.
