As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today’s pick is Anirudha Jadhav’s session on going beyond the conventional constraints of Solr.

The goal of the presentation is to delve into the implementation of Solr, with a focus on how to optimize Solr for big data search. Solr implementations are frequently limited to ingest rates of 5k-7k documents per second in similar use cases. I conducted several experiments to increase the ingest rate as well as the throughput of Solr, and achieved a 5x increase in performance, or north of 25k documents per second. Typically, optimizations are limited by the available network bandwidth. I used three key metrics to benchmark the performance of my Solr implementation: time triggers, document size triggers and document count triggers. The talk will delve into how I optimized the search engine, and how my peers can coax similar performance out of Solr. This is intended to be an in-depth description of the high-frequency search implementation, followed by Q&A with the audience. All implementations described here are based on the latest SolrCloud multi-datacenter setups.

Anirudha Jadhav is a big data search expert, and has architected and deployed arguably one of the world’s largest Lucene-based search deployments, tipping the scale at a little over 86 billion documents for Bloomberg LP. He has deep expertise in building financial applications, high-frequency trading and search applications, as well as in solving complex search and ranking problems. In his free time, he also enjoys scuba diving, off-road treks with his 18th century British Army motorbike, building tri-copters and underwater photography. Anirudha earned his master’s degree in Computer Science from the Courant Institute of Mathematical Sciences, New York University.

