Date: Thursday, July 18, 2013
Time: 10:00am Pacific Time
Register to get the recorded webinar.

Over the past several months, Solr has reached a critical milestone: it can now scale out elastically to handle indexes reaching into the hundreds of millions of documents. At Dachis Group, we’ve scaled our largest Solr 4 index to nearly 900M documents, and it is still growing. As our index grows, so does our need to manage this growth.

In practice, it’s common for indexes to continue to grow as organizations acquire new data. Over time, even the best-designed Solr cluster will reach a point where individual shards are too large to maintain query performance. In this webinar, you’ll learn about new features in Solr that help manage large-scale clusters. Specifically, we’ll cover data partitioning and shard splitting.

Partitioning helps you organize subsets of data based on values contained in your documents, such as a date or customer ID. We’ll see how to use custom hashing to route documents to specific shards during indexing. Shard splitting allows you to split a large shard into two smaller shards to increase parallelism during query execution.
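
To make these two ideas concrete, here is a minimal sketch in Python. It is not Solr's implementation: the CRC32 hash is a stand-in, the helper names (`make_shards`, `route`, `split_shard`) are hypothetical, and it assumes document IDs of the form `customer!doc`, where only the prefix before the `!` decides the shard. Each shard owns a contiguous range of a 32-bit hash space, and splitting a shard cuts its range in half.

```python
import zlib

HASH_SPACE = 2 ** 32  # assumed 32-bit hash space, divided among shards


def hash32(value: str) -> int:
    """Stand-in hash function (CRC32); chosen only for illustration."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


def make_shards(count: int) -> list:
    """Divide the hash space into `count` contiguous (low, high) ranges."""
    step = HASH_SPACE // count
    return [
        (i * step, HASH_SPACE - 1 if i == count - 1 else (i + 1) * step - 1)
        for i in range(count)
    ]


def route(doc_id: str, shards: list) -> int:
    """Return the index of the shard whose range contains the key's hash.

    Only the prefix before '!' is hashed, so every document that shares
    a prefix (for example, a customer ID) lands on the same shard.
    """
    shard_key = doc_id.split("!", 1)[0]
    h = hash32(shard_key)
    for index, (low, high) in enumerate(shards):
        if low <= h <= high:
            return index
    raise ValueError("hash value not covered by any shard range")


def split_shard(shards: list, index: int) -> list:
    """Replace one shard's range with two halves covering the same range."""
    low, high = shards[index]
    mid = low + (high - low) // 2
    return shards[:index] + [(low, mid), (mid + 1, high)] + shards[index + 1:]


if __name__ == "__main__":
    shards = make_shards(4)
    target = route("acme!doc-1", shards)
    print("acme documents go to shard", target)
    shards = split_shard(shards, target)
    print("after the split there are", len(shards), "shards")
```

In this sketch, a split leaves the other shards' ranges untouched, and every document from the original shard falls into exactly one of the two halves. That is why a split can add query parallelism without re-routing the rest of the index.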

Attendees will come away from this presentation with a real-world use case that proves Solr 4 is elastically scalable, stable, and production-ready.

Featured Presenter:

Timothy Potter is an independent consultant specializing in search and big data technologies. Previously at the Dachis Group, Tim led the design and development of a number of mission-critical big data analytics projects using Solr, Hadoop, Pig, Hive, Mahout, and Storm. He was the chief architect, developer, and operations engineer for DG’s large-scale Solr 4 implementation, including a 1TB index containing ~900M documents. Tim is the co-author of Solr in Action from Manning Publications and has worked extensively with Lucene and Solr technologies. Follow him on Twitter @thelabdude.
