Every search query from your customer is an opportunity to better understand their needs and preferences. Maximizing business value in a digital environment where search queries are the primary view into your users’ needs quickly reveals that search is a machine learning problem.

You may be asking yourself: How do I make the search functionality of my site, app, or startup smarter? How much machine learning is really needed? What kind of infrastructure is required to support all of this ML?

This year, Jake Mannix, Software Architect, Search Relevance, at Salesforce, presented “Table Stakes ML for Smart Search” at our annual Search and AI Conference, Activate. During his talk, Mannix covered machine learning table stakes, smart search features, and the architectural components needed to create the optimal search functionality for your users. Watch his talk below to learn more about machine learning in search.


Machine Learning for Smart Search

When looking to maximize business value in a search-driven flow, you have to be able to predict, as accurately as possible, what your users will do after each query. That’s called robust search relevance. “Search relevance is a complex data science problem built on detailed instrumentation to learn predictive models based on your users, their questions, and the potential answers,” says Mannix.

Make It Conversational

For search to be smart, it must be conversational. A session is almost never just one search query: there will be a handful of follow-up queries, and each of those may itself need to be reformulated.

For example, a user might first search for a “white, black tie coat” and then rephrase the request as a “white dinner jacket for a formal black tie event.” Mannix explained that conversational search doesn’t suffer from short-term memory loss. Your users are trying to teach you what they mean, and search should continue to get more precise with every interaction.
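One way to picture search that remembers earlier queries is a small session object that keeps terms from previous queries around at a reduced weight. The sketch below is only an illustration under our own assumptions: the `SearchSession` class, the `decay` parameter, and the whitespace tokenization are hypothetical and are not taken from Mannix’s talk.

```python
from collections import defaultdict
from typing import List


class SearchSession:
    """Hypothetical session memory: earlier query terms fade but are not forgotten."""

    def __init__(self, decay: float = 0.5):
        self.decay = decay  # assumed factor applied to older terms on each new query
        self.term_weights = defaultdict(float)

    def add_query(self, query: str) -> None:
        # Fade everything remembered from earlier queries in this session.
        for term in list(self.term_weights):
            self.term_weights[term] *= self.decay
        # Terms in the newest query get full weight.
        for term in set(query.lower().replace(",", " ").split()):
            self.term_weights[term] = 1.0

    def context_terms(self, min_weight: float = 0.25) -> List[str]:
        # Terms still strong enough to influence the next search, strongest first.
        kept = [(w, t) for t, w in self.term_weights.items() if w >= min_weight]
        return [t for w, t in sorted(kept, key=lambda item: (-item[0], item[1]))]


session = SearchSession(decay=0.5)
session.add_query("white, black tie coat")
session.add_query("white dinner jacket for a formal black tie event")
print(session.context_terms())
```

Here “coat” from the first query is still remembered at half weight after the reformulation, while the terms the user repeated stay at full weight. A production system would feed signals like these into a learned ranking model rather than a hand-set decay.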

Make It Natural

The days of simply relying on stop words, fancy tokenizers, stemmers, and lemmatizers are over. Natural language search is able to take a user’s request (whether spoken or typed into a search box or chatbot), break it into parts of speech, figure out what they’re looking for (and what they aren’t), and turn it into the query that gets submitted to a database or search system in order to return the best results.

Users expect to be able to use more complex and natural queries such as, “How to open an account if I have one already?” or “Maria’s top open opportunities in Seattle with stage prospecting that were modified this month?” Natural search allows users to ask questions as if they were speaking to a sales associate and still get the most relevant information, without having to use the “right” keywords.

Make It Personal

Personalization is that “you-know-me” feeling while you’re scrolling Netflix and the recommendations are spot on. Or when you visit a website and the chatbot proactively asks if you need support with a recently purchased item. Mannix explained that personalization is about capturing recent and not-so-recent signals across every user channel to deliver connected experiences that drive revenue.
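To make “recent and not-so-recent signals” a little more concrete, here is a minimal sketch of one common way to blend them: weight each past interaction by an exponential decay so newer signals count more but older ones still contribute. The function name, the `(category, age_in_days)` signal format, and the 30-day half-life are assumptions chosen for illustration, not details from the talk.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def affinity_scores(
    signals: Iterable[Tuple[str, float]],
    half_life_days: float = 30.0,
) -> Dict[str, float]:
    """Sum recency-weighted signals per category.

    Each signal is (category, age_in_days). A signal loses half of its
    weight every `half_life_days` (an assumed, tunable value).
    """
    scores: Dict[str, float] = defaultdict(float)
    for category, age_days in signals:
        scores[category] += 0.5 ** (age_days / half_life_days)
    return dict(scores)


# Hypothetical interaction history gathered across channels.
history = [("jackets", 0.0), ("jackets", 30.0), ("shoes", 30.0)]
print(affinity_scores(history))
```

With this history, the user’s interest in jackets outweighs shoes because one of the jacket signals is fresh. In practice the decay rate would be learned or tuned from data rather than fixed by hand.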

Machine learning that is able to fine-tune search relevancy is also a major time-saver for your employees. Keeping track of a decade’s worth of manually created rules creates a messy web for search teams. Tuning searches manually not only takes a long time to do in the first place, but once you’ve put in a manual curation rule, you have to maintain it as time goes on. Machine learning adjusts search results in real time with little human intervention so teams can focus on adding value.

Note: Necessary Architectural Components

When building out these features for your search engine, keep these architectural components in mind.

  • Data pipelines: Treat relevance feedback from your UI tier as a first-class citizen in your data landscape, right up there with the find and add-to-index APIs.
  • Compute: Train your models somewhere scalable and secure.
  • ML data storage: A document store which stores the uninverted form of the original documents.
  • Model serving: Inference should be outside of Lucene and Solr to get low latency, independently scalable predictions.
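
As a rough illustration of the data pipeline point, the sketch below treats relevance feedback as structured events and aggregates them into a click-through rate per query and document, the kind of signal a relevance model could later train on. The `FeedbackEvent` fields and the two event types are hypothetical; a real pipeline would capture far richer instrumentation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class FeedbackEvent:
    """Hypothetical relevance-feedback record emitted by the UI tier."""
    query: str
    doc_id: str
    event_type: str  # assumed values: "impression" or "click"


def click_through_rates(
    events: Iterable[FeedbackEvent],
) -> Dict[Tuple[str, str], float]:
    """Return clicks / impressions for every (query, doc_id) that was shown."""
    impressions: Counter = Counter()
    clicks: Counter = Counter()
    for event in events:
        key = (event.query, event.doc_id)
        if event.event_type == "impression":
            impressions[key] += 1
        elif event.event_type == "click":
            clicks[key] += 1
    # Only pairs with at least one impression are reported, so no division by zero.
    return {key: clicks[key] / shown for key, shown in impressions.items()}


events = [
    FeedbackEvent("dinner jacket", "doc-1", "impression"),
    FeedbackEvent("dinner jacket", "doc-1", "click"),
    FeedbackEvent("dinner jacket", "doc-1", "impression"),
    FeedbackEvent("dinner jacket", "doc-2", "impression"),
]
print(click_through_rates(events))
```

Keeping events in a simple, explicit schema like this makes it easier to send them to scalable training compute and to serve the resulting model separately from the search engine.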

These machine learning table stakes will be foundational for your smart search strategy. Get your search engines a “seat at the table” for your users’ highly divided attention and watch the full presentation at Activate 2020 here.