Search plays a prominent and much-discussed role in national security, whether it’s finding the needle in a haystack in social media or comparing spellings of the names of different ‘persons of interest’ across various communications. But search has no less vital a role to play in matters of domestic security, down to the local level.
The most recent post on Steve Arnold’s Beyond Search trains the searchlight on Ron Mayer of Forensic Logic, a firm that specializes in technology for law enforcement, and specifically in applying Lucene/Solr open source search.
In many ways, the problems facing search in law enforcement are close analogs to use cases from other domains. Here are a couple of excerpts from the interview:
There are many factors that contribute to how relevant a search is to a law enforcement user. Obviously traditional text-search factors like keyword density, and exact phrase matches matter. How long ago an incident occurred is important (a recent similar crime is more interesting than a long-ago similar crime). And location is important too. Most police officers are likely to be more interested in crimes that happen in their jurisdiction or neighboring ones. … The quality of the data makes things interesting too. Victims often have vague descriptions of offenders, and suspects lie. We try to program our system so that a search for “a tall thin teen male” will match an incident mentioning “a 6’3″ 150lb 17 year old boy.”
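To make the interplay of those factors concrete, here is a minimal sketch of how a text score, recency, and distance might be blended into one ranking value. It is an illustration only, not Forensic Logic’s method and not a Solr API: the `Incident` fields, the `blended_score` and `rank` functions, the 90-day half-life, and the 25 km distance scale are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    incident_id: str
    text_score: float   # assumed: relevance already computed by the text engine
    age_days: float     # days since the incident occurred
    distance_km: float  # distance from the searching officer's jurisdiction


def blended_score(incident, half_life_days=90.0, distance_scale_km=25.0):
    """Multiply the text score by recency and proximity factors in (0, 1].

    Both decay constants are illustrative assumptions, not tuned values.
    """
    # Recency halves every `half_life_days`.
    recency = 0.5 ** (incident.age_days / half_life_days)
    # Proximity falls off smoothly; it is 0.5 at `distance_scale_km`.
    proximity = 1.0 / (1.0 + incident.distance_km / distance_scale_km)
    return incident.text_score * recency * proximity


def rank(incidents):
    """Return incidents ordered from most to least relevant."""
    return sorted(incidents, key=blended_score, reverse=True)


examples = [
    Incident("old-local", text_score=1.2, age_days=720, distance_km=2),
    Incident("recent-far", text_score=1.2, age_days=7, distance_km=400),
    Incident("recent-local", text_score=1.0, age_days=7, distance_km=2),
]

for incident in rank(examples):
    print(incident.incident_id, round(blended_score(incident), 4))
```

A multiplicative blend means a strong text match cannot fully rescue a very old or very distant incident, which fits the preference described in the excerpt. An additive blend with weights would be an equally reasonable choice to try.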
For those of you who ever had to think twice about visiting New York City in the 1980s owing to the risk of mugging:
There’s been a steady emergence of information technology in law enforcement, such as in New York City’s CompStat. …We’ve had meetings with the NYPD’s CompStat group, and they have inspired a number of features in our software including powering the CompStat reports for some of our customers. One of the biggest issues in law enforcement data today is bringing together data from different sources and making sense of it. These sources could be from different systems within a single agency like records management and CAD (Computer Aided Dispatch) systems and internal agency email lists – or groups of cities sharing data with each other – or federal agencies sharing data with state and local agencies. … The place where Solr/Lucene’s flexibility really shined for us is in our product that brings structured, semi-structured, and totally unstructured data together.
At the same time, with public safety on the line, the challenge of quickly delivering tightly grouped results at the top of a results list is all the more acute.
Want to learn more? Ron Mayer will speak on Highly Relevant Search Result Ranking for Law Enforcement at Lucene Revolution 2011, May 25-26 in San Francisco.