Apache Lucene Eurocon Barcelona 2011 came to a successful conclusion this past week, nearly doubling in size over the prior year and featuring over 40 presentations that reflected the breadth, depth, and ambition of open source Apache Lucene / Solr search. Slides can be viewed here, as well as on Slideshare.net.
Nearly 300 attendees converged on one of Europe’s most beautiful cities for two days packed with use cases, implementation discussions, roadmap challenges, hard-won best practices, cutting-edge innovation, and more. A sampling:
- Lucid Imagination Chief Scientist Grant Ingersoll talked about how Search is driving the era of Big Data by evolving from a matchup of queries, documents and content relationships to a more nuanced, analytics-enabled stack that leverages user interactions for improved results.
- Hortonworks CEO Eric Baldeschwieler delivered a comprehensive history of Hadoop and its impact at Yahoo! and beyond, and set forth the challenges of transforming the leading data scaling platform.
- Twitter Search Infrastructure Technical Lead Michael Busch pushed past the frontiers of real-time search, recounting how Twitter uses its Lucene search infrastructure to harness 230 million tweets per day and 2 billion search queries, with tweets indexed in under 10 seconds and an average query response time under 50 milliseconds.
Cutting-edge use cases:
- Ivan Provalov of Cengage presented a multilingual search relevance assessment methodology with Solr/Lucene, spanning languages from Spanish to Chinese
- Aaron Binns of the Internet Archive walked through searching and archiving 1.3 billion web documents across 2,272 collections
- Charlie Hull of Flax told how Reed Specialist Recruitment dumped its legacy job search environment, moving over 3,000 users who search hundreds of millions of resumes to a Solr/Lucene search app with sub-second faceted response running on a single server
- Giovanni Fernandez-Kincade of Etsy.com showed how they scaled Solr to deliver 500 queries per second across 10 million listings
Real World Best Practices:
- Tyler Tate of TwigKit set forth strategies for designing mobile search (hint: on the small screen, precision trumps recall)
- Rafal Kuc of Solr.pl presented the basics of using Solr’s ‘explain’ function to make sense of Lucene relevance rankings and put boosting to work for you
- Trovit’s Marc Sturlese, Barcelona hometown search hero, presented how global classified advertising leader Trovit harnessed Hadoop HDFS and Solr to get the best of search and big data
16 Apache Lucene Committers looking ahead:
- Lucid Imagination’s Mark Miller and Robert Muir laid out new features in Solr 4 and Lucene 4, respectively
- PMC Chair Simon Willnauer questioned where open source search technology goes next
- Lucid Imagination’s Andrzej Bialecki took the covers off Lucene’s new portable index format and its impact on lowering the cost of moving search applications to more cutting-edge platforms
- Finite State Automata expert Dawid Weiss dared the audience to rethink the power of randomized testing where no test has gone before
Of course, there’s much more: see the full listing of the agenda here, browse the slides, or visit Slideshare.net. You can see the e-guide summarizing the program here. Videos of the sessions, as well as slides and video of the Lightning Talks, are slated for publication over the next week or two; we’ll update periodically as they become available.