Fall is in the air, at least for many of us here at Lucidworks, and the harvest has been plentiful for Lucidworks Big Data.  We’ve added a number of new capabilities to the beta as we work towards GA.  The improvements range across the stack, and all of them aim to bring deeper capabilities to our search, discovery and analytics platform and to respond to a growing number of customer requests.  Without further ado, the improvements are:

  1. Apache Hive Support: A number of customers asked for SQL-like interfaces to their structured content, so we’ve integrated Hive into our platform.  
  2. Classification-As-A-Service (CAAS): A multi-tenant, replicated classification service (using Stochastic Gradient Descent from Mahout for starters), designed to scale out and be fault-tolerant.  We’ve started by serving pre-trained models via our REST APIs, and support for training, testing and experimenting with different techniques and models will follow in the not-so-distant future.
  3. Tighter integration between HBase and Lucidworks Search, which connects offline analysis algorithms more closely with search. Previously, we had two different pathways for content to come into the system: one for bulk loading and one for near real-time.  These have been consolidated into a single approach that services both bulk loading and near real-time needs.
  4. More Deployment Options: Hosted and On-Premise. On-premise now supports both virtual and physical hardware deployments. 
  5. Additional metrics about document usage in the system.

Moreover, on the business front we’ve added a number of new customers with a range of use cases, including e-discovery, fraud analysis, large-scale archiving/search/analysis, bioinformatics (medical research, genetic analysis, etc.) and more. It’s not too late to get into the beta, either.  Simply fill out the beta application and let us know your use case and we’ll get the ball rolling.