While there are still tens of thousands of people experiencing disruption through eruption, I want to return some positive associations to the term. Disruption is a good thing when it unlocks value and innovation. At the recent NoSQL EU conference in London, our friends at The Guardian talked about how Lucene/Solr is doing just that, as reported by James Governor of RedMonk:
SQL has worked well for the paper. “SQL is great. we can do cool stuff with that. at scale.” “Searching one tag is ok, but what about two? What does it do to the database?” “Related content” was 40% of the Guardian’s app load so… the team used a search engine instead. The search engine approach – using Apache Solr – worked well, but scale issues were still likely to become a problem. “Willison suggested the Guardian stuck a massive memcached in front instead”. It worked.
But what about throwing more resource at Oracle instead? “We wanted to avoid Oracle RAC because its really expensive, but we want to scale out”. [Oracle RAC is the database giant’s clustering technology.] The Guardian’s Business Drivers: Linked data, social networks — there is all sorts of information out there. We need to engage with them. We can’t just broadcast the news…The Guardian’s editor called for the organisation to Mutualise the News. What happens with API access, which drives for example, tag proliferation, which dramatically increases load on the database.
“Apache Solr is like a database, it works like one for us.”
I was re-introduced to Clayton Christensen’s marvelous work in his book “The Innovator’s Solution” by Zack Urlocker, ex-VP of Products for MySQL, who recently blogged on the four rules of disruption for business. For search, the rules align nicely with the kind of innovative adoption of Lucene/Solr open source in places like the Guardian.
- There’s a proven market with large incumbents: this demonstrates that customers are willing to pay money to solve this problem. IDC sizes the search market at $2 billion and growing through the recession, with Autonomy at one end and Google Search Appliance at the other. If you expand the market definition to databases, well, Larry has a really nice boat.
- There are underserved customers whose needs are not being met by the incumbents: They may be receptive to a “good enough” product that is easy to access. Whatever the imperfections, the accessibility and adoption of Lucene/Solr show there is demand for “good enough”.
- The incumbents cannot profitably meet the needs of this market: Ideally, their entry into this market would hurt their core business. This is interesting in that, with the exception of Oracle and its side-effect acquisition of MySQL, none of the other major data management players has a business model that can go after the economics of open source. A Gartner study on the costs of BI offers stark evidence, putting the cost ratio for Business Intelligence at ~7x: $150K for open source BI from Jaspersoft and Pentaho, a cool million bucks for commercial BI from SAP, IBM, and Oracle. (Hasso Plattner, CEO emeritus of SAP, also has a really nice boat. I don’t think Sam Palmisano does.)
- To disrupt the market, you need to disrupt all the players, not just some of them: If there are other players, you need to disrupt all of them. And here we come back to the Guardian’s take on Oracle vs. Solr. When we look at who Lucid customers used before they made the move to Lucene/Solr, none of the incumbents is missing. Microsoft FAST, Google’s GSA, Autonomy/Verity, Endeca, Coveo, Attivio, Exalead, you name it: their customers are all looking hard at Lucene/Solr.
But to me, the most exciting part of the Lucene/Solr disruption is not the economics, or even the new ways to solve old problems: it’s the new problems that are emerging. The virtuous cycle of disruption emerges when the new solutions make the leap to tackle new problems. There’s more on this, of course, in Prague on May 18-21 at the forthcoming Apache Lucene EuroCon, featuring talks by both The Guardian and Zack Urlocker. Hope to see you there.