It wasn’t the biggest lesson learned from Alberto Mijares’s talk on Day 2 of Lucene Revolution, but the notion that funding issues can lead to a new and successful business model was uplifting, at the very least.

When Mijares’s company, Canoo Engineering AG, met with Swiss newspaper publisher and media group Axel Springer, they all agreed that what Axel Springer needed was to keep readers on the sites of their most popular newspapers longer, and to drive traffic from those papers to their less-well-known brands. They also agreed that providing “related articles” was the way to do that.

What they didn’t agree on was how to pay for the development of such a service.

Still, Canoo soldiered on. Choosing Lucene/Solr as the tool was easy; Lucene’s “More Like This” function would be perfect for finding articles similar to the current page.

Well, almost perfect. Canoo discovered that while More Like This does work out of the box, without semantics the results weren’t good enough for their purposes. Furthermore, at the moment, it’s designed to work well in English, and the “financial language” of Switzerland is German.

They solved the problem with a combination of strategies. The first was to use WMTrans, Canoo’s language tool and the foundation behind the linguistic sites Leo.de and Canoo.net, to perform linguistic analysis. They then used Lucene’s analysis pipeline to add semantics to the data using external sources such as Wikipedia in order to get better results from More Like This.
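
For readers who haven’t used it, the basic idea behind a “more like this” query can be sketched in a few lines of Python. This is a toy illustration, not Lucene’s implementation and not Canoo’s system: the function names, the sample articles, and the simple TF-IDF weighting are all assumptions made for the example. It picks the most distinctive terms of the current article and ranks the other articles by how many of those terms they share.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into words (hypothetical stand-in for a real analyzer)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def related_articles(articles, source_id, max_terms=5, top_n=3):
    """Rank other articles by overlap with the source's most distinctive terms."""
    tokens = {aid: tokenize(text) for aid, text in articles.items()}
    n_docs = len(tokens)
    # Document frequency: in how many articles each term appears.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))

    def idf(term):
        # Rare terms get a higher weight than terms found everywhere.
        return math.log(n_docs / doc_freq[term]) + 1.0

    # Weight the source's terms by term frequency times IDF, keep the best few.
    source_tf = Counter(tokens[source_id])
    weights = {t: tf * idf(t) for t, tf in source_tf.items()}
    query = sorted(weights, key=lambda t: (-weights[t], t))[:max_terms]

    # Score every other article by the weights of the query terms it contains.
    scores = {}
    for aid, toks in tokens.items():
        if aid == source_id:
            continue
        present = set(toks)
        score = sum(weights[t] for t in query if t in present)
        if score > 0:
            scores[aid] = score
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_n]


if __name__ == "__main__":
    # Made-up sample articles, purely for illustration.
    sample = {
        "a": "bank raises interest rates as franc gains",
        "b": "central bank holds interest rates steady",
        "c": "football club wins cup final",
        "d": "franc gains against euro",
    }
    print(related_articles(sample, "a", max_terms=10))
```

Note that the sketch matches on surface words only, with no stop-word removal, stemming, or compound splitting. That is exactly the kind of gap the linguistic analysis and semantic enrichment described above are meant to close, especially for German.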

But the “funding problem” remained. Finally, they hit on a solution: they’d provide “related articles”, but using the Software as a Service model. This way, they could make the service available to other companies, and what was a one-off project became an ongoing business. Fittingly, the search project turned from an implementation into a business opportunity, with Canoo now seeking new customers who are “more like this”.

Because of the semantics, not all companies will be able to take advantage of Canoo’s service; their system requires documents to be of a significant size for the analysis to be accurate, though they’re working on lowering that minimum. But by providing their application as a service, Canoo has expanded their market considerably. Canoo was able to both (a) have confidence in an open source solution that would get them close to what they needed and (b) get commercial-grade support for that product (disclosure: Canoo is a client of Lucid Imagination, a sponsor of Lucene Revolution). As a result, they were willing to take a chance on a project even though the potential client wasn’t, leading to not one project, but an ongoing business.

And that lesson I was talking about? Have faith in open source, and have faith in yourself; it can definitely pay off in the long run.

Cross-posted with Lucene Revolution Blog. Nicholas Chase is a guest blogger. This is one of a series of presentation summaries from the conference.