As virtualization and cloud computing buzz louder, Lucene/Solr open source search is adding a vibe of its own, most recently with the announcement of our strategic partnership with ISYS technologies. A couple of weeks ago, Business Week wrote up how cloud computing will change business and, in between discussions of VMware and Amazon’s EC2, tucked in a reference to Xoopit, “[a startup that] has built a specialized search engine capable of finding bits of information scattered among e-mail systems, sales management programs, blogs, and online news sites.” The cool, not-so-secret part? Xoopit is built with Lucene and delivers hosted search services; I met Bijan Marashi, Xoopit’s CEO, at the San Francisco Bay Area Lucene/Solr Meetup a few weeks back. Cloud-based apps that don’t have search yet can get it with Xoopit, and their mail-search service means you don’t have to wonder what folder you put that email in.
A key attribute of virtualization is what you don’t have to deal with: who cares what disk drives or device drivers EC2 uses? Despite the glories of Unix and Linux, countless sysadmins have marched to their virtual deaths trying to solve low-level interface problems that only subtract value. Sure, that zippy new drive could be speedy, but trying to match drivers and firmware levels in an array of disks? (A guy I worked with who ran large-scale database performance benchmarks would run without a file system for huge databases, because he had memorized the names of hundreds of disks and the data each one held, which, owing to the nature of benchmarking, did not change over time. Mercy!) I want storage service, not an exercise in storage anatomy. With virtualization, ta-da! VMware? Same difference (though doing VMware doesn’t mean you’re doing cloud, as Robert Scheier points out in InfoWorld). A big part of what makes it useful is all the things you can stop caring about.
When it comes to finding that file, or that email, or that record, the ISYS File Readers do just that: they read the content. Sure, there are plenty of us (and we know who we are) who have labored mightily over fonts, footers, and formats. The bad news is, great fonts are not what makes great content. Liberating content from its gilded cage of format is a useful proposition, so much so that many commercial enterprise search vendors charge a bundle for their “connectors”. Device drivers are useful, too; but what really unlocks the value is the service they enable, and the applications built on that service. Combining Lucene/Solr with ISYS File Readers creates a powerful service that can overcome the underlying variations and present content and information as a uniform resource.
Put another way, each format and data storage type has its own unstandardized set of structures and interfaces. Search with Lucene/Solr virtualizes the data and creates a standard set of interfaces for operating on it. Once you process your content, in any of dozens of different formats and containers, with the ISYS File Readers, it no longer matters where it is or what format it is in, any more than it matters what device driver is on the disk drive in that file system that runs that database somewhere in the EC2 cloud.
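For the code-minded, here is a toy sketch of that idea in Python. It is not the ISYS File Readers API and not Lucene/Solr; every name in it (`READERS`, `extract`, `TinyIndex`) is made up for illustration, and the three "formats" are deliberately trivial. The point is only the shape: one small reader per format, all returning plain text, and a single search interface that never needs to know which reader was used.

```python
import json
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and discards the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def read_plain(raw):
    # Plain text needs no conversion.
    return raw


def read_html(raw):
    parser = _TextOnly()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.parts)


def read_json(raw):
    # Assumed convention for this sketch: a flat JSON object whose values are indexed.
    return " ".join(str(value) for value in json.loads(raw).values())


# Hypothetical registry: one reader per format, every reader returns plain text.
READERS = {"txt": read_plain, "html": read_html, "json": read_json}


def extract(name, raw):
    """Pick a reader by file extension and return the content as plain text."""
    extension = name.rsplit(".", 1)[-1].lower()
    return READERS[extension](raw)


class TinyIndex:
    """A minimal inverted index: term -> set of document names."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, name, raw):
        for term in re.findall(r"\w+", extract(name, raw).lower()):
            self.postings[term].add(name)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


index = TinyIndex()
index.add("memo.txt", "Quarterly budget review")
index.add("page.html", "<html><body><h1>Budget</h1><p>draft</p></body></html>")
index.add("record.json", '{"title": "budget forecast", "owner": "ops"}')

# One query, three formats; the caller never sees the difference.
hits = index.search("budget")
print(hits)
```

Adding a new format in this sketch means adding one entry to the registry; nothing on the search side changes, which is the "stop caring about it" property described above.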