When it comes to new ways of searching old things, it’s hard to beat the museum and library space. Close observers of the search space, of course, have seen this insight pop up on our site, whether in presentations from Lucene Revolution, in some of the really compelling scalability analysis at the Hathi Trust blog (I think of these guys as the ones who put the “b” in “billions” for Solr/Lucene scalability), or in the webcast this past April featuring pragmatic search tips from the library team at Stanford.
Skeptics might claim that the public sector’s embrace of open source is driven by a public domain sensibility. But this is not a matter of sentiment: it’s about pragmatic innovation. The Smithsonian Institution’s Collections Search Center (incidentally, the winner of the “Best Re-Purposing of Descriptive Data” category in the ArchivesNext Best Archives on the Web awards contest, according to the American Historical Society) is a case in point, as described recently in their blog:
In implementing this Collections Search Center, the Smithsonian reviewed a number of commercial and open-source products. The functional requirements included the support of faceted metadata searching, Boolean / simple search logic, synonym/stemming matching, proximity matching, customizable relevance ranking, and highlighting display capability. This system will need to support a wide range of documents and objects from libraries, archives, and museums (LAMs). In the end, the Smithsonian selected the open-source Lucene/Solr indexing software for the project.
The Lucene/Solr search engine has offered the Smithsonian a flexible and scalable indexing environment to support the fast growing online collections served in the new search center.
Simply put, the Smithsonian had a meaningfully complex search problem, weighed commercial search engines against open source ones, and chose Solr/Lucene as the best alternative.
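To make that requirements list a little more concrete, here is a minimal sketch of how several of those features (Boolean logic, proximity matching, relevance boosts, highlighting, and facets) might be combined in a single Solr request. It is not the Smithsonian's actual configuration: the host, the core name "collections", and the field names (title, description, object_type, unit) are all made up for illustration. The request parameters themselves (q, defType, qf, hl, hl.fl, facet, facet.field) are standard Solr ones.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_search_url(base_url, terms, phrase=None, slop=3,
                     facet_fields=("object_type", "unit"),
                     highlight_field="description"):
    """Build a Solr /select URL; all field names here are hypothetical."""
    # Boolean logic: every term is required.
    query = " AND ".join(terms)
    if phrase:
        # Proximity matching: the phrase's words may be up to `slop` positions apart.
        query += f' AND "{phrase}"~{slop}'
    params = [
        ("q", query),
        ("defType", "edismax"),                   # query parser that supports field boosts
        ("qf", f"title^2 {highlight_field}"),     # relevance: weight title matches double
        ("hl", "true"),                           # highlighting on
        ("hl.fl", highlight_field),               # field to pull highlighted snippets from
        ("facet", "true"),                        # faceting on
    ]
    # One facet.field parameter per facet; Solr accepts the key repeated.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr core, used only to show the shape of the request.
    url = build_search_url("http://localhost:8983/solr/collections",
                           ["quilt", "cotton"], phrase="civil war")
    for key, values in parse_qs(urlsplit(url).query).items():
        print(key, values)
```

Stemming and synonym matching do not show up in the request at all. In Solr those are normally handled by the analyzers configured on each field in the schema, which is part of why a single query can stay this small.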
If you’d like to know more, I’d recommend an excellent presentation by Chien-hsien Wang of the Smithsonian, delivered last month at Lucene Revolution.