The once and future history of open source and enterprise search
Lucid Imagination founder Marc Krellenstein kicked off the Lucene Revolution yesterday with a keynote address covering the history of search. Here are the slides, followed by some highlights:
Much as we might think of search technology as a 21st-century internet thing, it dates back to when IBM was sued by the US government. By the early days of the Internet, search engines (Lycos, Infoseek, Excite, and Alta Vista) began to accelerate the virtuous cycle of requirements and innovation. Marc describes the evolution of the technology from centralized to distributed indexes; among the early players, only Lucene, Google, and Fast had distributed architectures in their initial releases.
There’s no denying the influence of Google on search, and its PageRank algorithm, a popularity-based authority metric, represented a breakthrough in search precision. Along with speed, it helped Google reach its now-familiar market position. But contrary to the public perception of the internet search experience as the be-all and end-all, it is essential to understand the differences between “generic” public internet search and enterprise search. One important, little-understood virtue of Google is that it is actually a combination of multiple search applications and techniques, tailored to a particular set of user behaviors; the magic is not in one technique or another, but in their deliberate combination.
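To make the idea of a popularity-based authority metric concrete, here is a minimal sketch of the simplified textbook form of PageRank, computed by repeated iteration over a tiny link graph. It is not taken from the keynote and is not Google's production system; the function name, the four-page graph, the iteration count, and the damping factor of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping page -> list of pages it links to.

    Illustrative only: names and defaults are assumptions, not a real API.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: pages a, b, and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links, c, ends up with the highest score, which is the "popularity as authority" intuition in miniature.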
Marc continues with a review of search fundamentals, precision and recall, and discusses what makes for high scores on each with an emphasis on enterprise requirements, resources, and methodologies. With the stage thus set, the focus turns to the emergence of Lucene and Solr.
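For readers new to the two measures, precision is the fraction of returned results that are relevant, and recall is the fraction of all relevant documents that were returned. A minimal sketch of the standard definitions follows; the function name and the sample document IDs are hypothetical and not drawn from the talk.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document IDs the search engine returned
    relevant:  document IDs a human judged relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were returned

    # Guard against division by zero for empty result or judgment sets.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: four results returned, three documents actually relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four results are relevant, and two of the three relevant documents were found. The two measures usually pull against each other: returning more results tends to raise recall while lowering precision.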
He makes the case that Lucene and Solr are industrial-strength search technology, as good as Google, if not better, in terms of precision, performance, and relevance. He attributes this to the advantages of open source as a development model.
That said, as good as Lucene and Solr are, they are not perfect. Issues include a focus on core functionality that leaves gaps in certain areas of enterprise requirements. The good news is that these gaps can be addressed in a number of ways: community, consultants, commercialization, and an enterprise’s internal resources. Marc concluded his remarks with an overview of the competitive landscape, along with some thoughts on the future.
You can download the slides here.
Cross-posted with Lucene Revolution Blog. Tony Barreca is a guest blogger. This is one of a series of presentation summaries from the conference.