Taming Text is released!

Manning has just published an exciting new book with the catchy title Taming Text, by Grant S. Ingersoll (a fellow Apache Lucene committer), Thomas S. Morton, and Andrew L. Farris.

Text processing has become vital for businesses that want to remain competitive in this digital age, as the amount of unstructured online content keeps growing exponentially. Yet text is also messy, which makes processing it a challenging science: the complexities and nuances of human language don’t follow a few simple, easily codified rules and are still not fully understood today.

The book describes search techniques, including tokenization, indexing, suggestions, and spelling correction. It also covers fuzzy string matching, named entity extraction (people, places, things), clustering, classification, tagging, and a question answering system (think Jeopardy).
