A few more notes on this subject since the Webinar a couple of weeks ago:

Steve Arnold of Beyond Search asks in a blog post:

…the notion of integrating log files is a good one but I wondered how long it takes to suck big log files, determine deltas, and then update the indexes.

We’ve offered some of the information from the Webinar in a case study we’ve posted about our work with Boomi:

The logging-and-searching service is characterized by frequent commits to make the data available for search: one every 5 seconds or every 10,000 transaction messages. … Between two and ten million log transactions are generated daily, and each may trigger two or more Solr entries. Boomi maintains a rolling 30-day record of log entries.
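A commit cadence like the one Boomi describes (every 5 seconds or every 10,000 documents, whichever comes first) maps naturally onto Solr's autoCommit settings in solrconfig.xml. This is an illustrative sketch, not Boomi's actual configuration; only the two threshold values come from the case study:

```xml
<!-- solrconfig.xml: commit whenever 10,000 docs accumulate or 5 seconds elapse -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>5000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```

Letting Solr manage the commit policy this way keeps the indexing clients simple: they just stream documents, and freshness is bounded by whichever threshold trips first.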

Not to be outdone, there’s some interesting new input on using Solr for this kind of application from Symplicity, an integrator that builds government and university applications and whose Solr credentials include fbo.gov, a General Services Administration site for searching business opportunities within the Federal government:

For a while we used a commercial solution to centralize and search our logs, but they wanted to charge us tens of thousands of dollars for just one gigabyte/day more of indexed data. So I said forget it, I’ll write my own solution!

We already use Solr for some of our other backend search systems, so I came up with the idea of indexing all of our logs into Solr. I wrote a daemon in Perl that listens on the syslog port, and pointed every single system’s syslog at this one server. From there, the daemon parses each message into fields, such as date/time, host, program, pid, and text, and writes the result to a Solr indexing server. I then wrote a cool JavaScript/Ajax web front end for Solr searching, and bam: real-time searching of all of our syslogs from a web interface, for no cost!
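The field extraction step Symplicity describes is the interesting part of that daemon. Their implementation was in Perl; the sketch below is a hypothetical Python equivalent of just the parsing stage for classic BSD-syslog (RFC 3164) lines, producing the kind of field set they mention (date/time, host, program, pid, text). The socket listener and the HTTP POST to Solr's /update handler are omitted:

```python
import re

# Hypothetical parser for BSD-syslog (RFC 3164) lines, e.g.:
#   <34>Oct 11 22:14:15 web01 sshd[4721]: Failed password for root
SYSLOG_RE = re.compile(
    r'^<(?P<pri>\d{1,3})>'                            # priority value
    r'(?P<timestamp>\w{3} [ \d]\d \d\d:\d\d:\d\d) '   # e.g. Oct 11 22:14:15
    r'(?P<host>\S+) '                                 # originating host
    r'(?P<program>[^\[:]+)'                           # program name
    r'(?:\[(?P<pid>\d+)\])?: '                        # optional [pid]
    r'(?P<text>.*)$'                                  # free-text message
)

def parse_syslog(line):
    """Split a raw syslog line into the fields a Solr log schema might use."""
    m = SYSLOG_RE.match(line)
    if not m:
        return None  # malformed line; a real daemon might index it as raw text
    doc = m.groupdict()
    doc['pri'] = int(doc['pri'])
    if doc['pid'] is not None:
        doc['pid'] = int(doc['pid'])
    return doc

if __name__ == '__main__':
    sample = '<34>Oct 11 22:14:15 web01 sshd[4721]: Failed password for root'
    print(parse_syslog(sample))
```

Each resulting dict maps directly onto a Solr document, so the daemon can batch them up and POST them to the update handler, relying on a commit policy like the one above to make them searchable.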