What Is Semantic Search?
Explore the challenges semantic search solves and where it could be helpful for your brand.
A colleague recently said, “I have the impression that different people mean different things when they talk about semantic search. What do we at Lucidworks mean when we say semantic search?”
The simplest definition of ‘semantic search’ is searching by meaning. In the context of digital commerce, my perspective is that semantic search refers to a set of techniques for finding products by meaning, as opposed to lexical search, which finds products by matching words and their variants.
Others may argue that the meaning of semantic search depends on a particular technique, such as an ontology, a knowledge graph, or a semantic vector space. The inconsistency in using the phrase ‘semantic search’ is not surprising given how rapidly the techniques for understanding meaning have evolved over the past 15 years. Let’s first consider that history, which helps ground our current use of the phrase.
A Quick History of Semantic Search
Consulting the Wayback Machine for the year 2007, the Wikipedia entry about semantic search began with the following:
Semantic search attempts to augment and improve traditional research searches by leveraging XML and RDF data from semantic networks to disambiguate semantic search queries and web text to increase the relevancy of results.
There was a clear focus on the semantic web and linked data. By 2009 the Wikipedia entry had been changed to include a reference to ontologies and the semantic web:
Other authors primarily regard semantic search as a set of techniques for retrieving knowledge from richly structured data sources like ontologies on the semantic web.
In 2010 the first sentence was changed to include the concepts of searcher intent and contextual meaning:
Semantic search seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms appearing in the searchable dataspace, whether on the web or within a closed system, to generate more relevant results.
By 2019 the first sentence had been simplified:
Semantic search denotes search with meaning, as distinguished from lexical search, where the search engine looks for literal matches of the query words or variants without understanding the query’s overall meaning.
We can see the evolution of consensus on the meaning of semantic search from a focus on ontologies, RDF, and the semantic web, to the more general “search with meaning.” Google’s approach to search evolved over the same period to focus more on meaning, introducing the Google Knowledge Graph (“things not strings”) in 2012, conversational search in 2013, RankBrain (ML-based ranking) in 2015, and BERT and “neural matching” in 2019.
Rather than taking a stand on whether or not semantic search has to include a knowledge graph or a particular type of ML model, I think it’s more helpful to focus on the effectiveness of a set of semantic search techniques in solving specific problems.
Each Semantic Search Technique Solves Specific Problems
Even though we say that lexical search is about finding by matching words and their variants, and semantic search is about finding by matching meaning, both approaches have the same goal in ecommerce solutions: product discovery that matches the shopper’s intent. In other words, the goal is to respond to a query with products relevant to the task or interest implicit in the query.
I often hear “understanding query intent” discussed as the ultimate goal of search… However, the searcher’s query often reflects just one part of a goal. If a DIY shopper on an auto parts ecommerce site searches for shop rags, are they just thinking about shop rags? The site can offer more relevant products once the reason for needing shop rags is understood – what job the DIY shopper has in mind. Are they doing an oil change or some other messy job? Maybe it makes sense also to include other cleanup products.
And when a shopper searches for organic lemonade on a grocery ecommerce site, why not include products relevant to an interest in organic juices and snacks?
There are ecommerce solutions that do an excellent job of recommending products related to a goal. Still, the recommendations usually appear in the page’s “you may also like” section. The idea of focusing with high precision on query intent and then relegating other goal-relevant products to recommendation zones seems counterintuitive to me.
Lexical search can be tuned to achieve a more goal-oriented relevance. However, that tuning usually involves query-specific rules that require constant curation to keep up with changing product assortments, shopping trends, and seasons. There are machine learning approaches to suggest such rules, but the suggestions are often of mixed quality and require vetting by the ecommerce team before being deployed – another type of curation.
Let’s focus on two specific semantic search techniques: semantic vector search for better recall based on a goal-oriented perspective of relevance and semantic query parsing for better precision when queries include specifications such as dimensions and price range.
Semantic Vector Search
Semantic vector search is a deep learning approach in which a model learns from shopper behavior to encode products and queries in a shared vector space – sort of like the way groceries are organized in aisles and shelves in a physical store. The organic lemonade is next to the organic orange juice. The flour is next to other common baking ingredients. The grocery store staff adjusts how products are shelved based on shopper behavior. This semantic search technique continues to learn over time as product assortments and shopper behavior change.
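The shared vector space can be illustrated with a minimal sketch. The three-dimensional vectors below are made up by hand for illustration; a real model would learn vectors with many more dimensions from shopper behavior. The names PRODUCT_VECTORS and nearest_products are hypothetical, not part of any product API.

```python
import math

# Hand-made toy embeddings (an assumption for illustration only).
# Products that shoppers treat as related sit close together.
PRODUCT_VECTORS = {
    "organic lemonade": [0.9, 0.8, 0.1],
    "organic orange juice": [0.8, 0.9, 0.1],
    "all-purpose flour": [0.1, 0.2, 0.9],
    "baking powder": [0.1, 0.1, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_products(query_vector, k=2):
    """Return the k product names whose vectors are closest to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in PRODUCT_VECTORS.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Assumed encoding of the query "organic lemonade" in the same space.
query_vector = [0.88, 0.82, 0.1]
print(nearest_products(query_vector))
```

In this toy space the query lands in the "organic juice aisle," so the lemonade and the orange juice rank ahead of the baking ingredients.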
Semantic Vector Search Enables More Intuitive Relevance
Semantic vector search is much better than lexical search at predicting relevance based on what shoppers tend to buy given a specific query, and it does so without curation by merchandisers and search managers. The shopper searches for organic lemonade; they get to see organic orange juice and a variety of organic juices for kids following the organic lemonade. Furthermore, it can accomplish this for queries it hasn’t seen before.
Semantic Vector Search Slashes Zero Results
Fixing zero-results queries is an ornery burden for search managers. It’s another curation task without end, often involving a double-digit percentage of searches. Search managers focus on the top occurring zero-results queries, meaning that many long-tail queries are not addressed – missed sales and money left on the table.
Semantic vector search produces far fewer zero-results outcomes without curation. When organic lemonade is out of stock, the shopper still sees organic orange juice and a variety pack of organic juices for kids. Likewise, if I search for “pumpernickel crackers” and no such product exists, I’m served a mix of similar products. One of the world’s top five retailers deployed semantic vector search and decreased null results by 91% compared to the previous year, translating into hundreds of millions in sales.
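One possible way to picture the effect on zero results is a lexical search that requires every query word to match, with vector search as the fallback when nothing matches. This is only a sketch under stated assumptions: the catalog, the hand-made vectors, and the fallback design are all hypothetical, and a production system may blend the two approaches differently.

```python
import math

# Toy catalog with hand-made vectors (assumptions for illustration).
CATALOG = {
    "rye crackers": [0.9, 0.7, 0.1],
    "pumpernickel bread": [0.7, 0.9, 0.1],
    "organic lemonade": [0.1, 0.1, 0.9],
}


def lexical_search(query):
    """Strict lexical match: every query word must appear in the title."""
    terms = query.lower().split()
    return [
        title for title in CATALOG
        if all(term in title.split() for term in terms)
    ]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query_vector, k=2):
    """Rank catalog items by similarity to the query vector."""
    ranked = sorted(
        CATALOG,
        key=lambda title: cosine_similarity(query_vector, CATALOG[title]),
        reverse=True,
    )
    return ranked[:k]


def search(query, query_vector, k=2):
    """Use lexical hits when there are any; otherwise fall back to vectors."""
    hits = lexical_search(query)
    return hits if hits else vector_search(query_vector, k)


# No product title contains both words, so lexical search returns nothing.
# The query vector is an assumed encoding of "pumpernickel crackers".
print(lexical_search("pumpernickel crackers"))
print(search("pumpernickel crackers", [0.85, 0.75, 0.1]))
```

The lexical step comes back empty, while the fallback still surfaces the rye crackers and the pumpernickel bread.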
Start With Specific Targets
Semantic search delivers a more intuitive implementation of relevance, so why not send all queries to semantic search?
Ecommerce companies have invested years of effort in tuning lexical search. Many ecommerce search queries are handled by lexical search without constant curation. We can think of this as relevance equity.
I recommend starting with queries performing poorly based on KPIs such as AOV and CTR. The risk of damaging relevance equity is lower for these queries, and the opportunity to improve KPIs and save the time of merchandisers and search managers is higher.
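As a rough sketch of this targeting, queries could be routed by click-through rate, sending only well-observed, poorly performing queries to semantic search. The thresholds, the QueryStats fields, and the choose_engine function are hypothetical; real criteria would likely combine several KPIs, including AOV.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Aggregated behavior for one query (hypothetical fields)."""
    query: str
    searches: int
    clicks: int


def click_through_rate(stats):
    """Clicks per search; 0.0 when the query has never been searched."""
    return stats.clicks / stats.searches if stats.searches else 0.0


def choose_engine(stats, ctr_threshold=0.10, min_searches=100):
    """Route a query to semantic search only when there is enough data
    to show that lexical search is underperforming for it."""
    enough_data = stats.searches >= min_searches
    if enough_data and click_through_rate(stats) < ctr_threshold:
        return "semantic"
    return "lexical"


print(choose_engine(QueryStats("shop rags", searches=500, clicks=20)))
print(choose_engine(QueryStats("motor oil", searches=1000, clicks=400)))
```

Keeping queries with too little data on lexical search is a conservative choice made for this sketch; it protects relevance equity at the cost of not helping rare long-tail queries.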
Semantic Query Parsing
Another semantic search technique is semantic query parsing, a type of word sense disambiguation (WSD). WSD research has been ongoing for decades, and various machine learning techniques continue improving and competing for state-of-the-art status. “Knowledge-based” approaches to WSD utilize an ontology or knowledge graph (or both). In ecommerce, a knowledge graph can be derived from product search data and shopper behavior. There is often some level of curation involved in maintaining the knowledge graph.
The goal of semantic query parsing in ecommerce solutions is to identify mentions of concepts in a query and then offer product discovery relevant to those concepts. The concepts could be named entities, such as brands, designers, and manufacturers, or specifications, such as size, color, and price range.
Specifications can be further classified into negotiable and non-negotiable categories. This negotiable/non-negotiable classification is specific to each ecommerce solution vertical and sometimes specific to individual companies. If I search for womens maroon pumps size 9, the gender and size are probably not negotiable. (Size might be if a specific shoe is known to fit small or large.) On the other hand, the mention of the color maroon may be negotiable.
Semantic Query Parsing Improves Precision
Semantic query parsing can be used to route queries for concept-specific processing. Different concepts can be associated with different models or logic for normalizing concept mentions.
Non-negotiable specifications such as size can be used to filter search results.
But what if the specification is ambiguous, as in the query mens jeans 30? Does 30 refer to the inseam or the waist? Once semantic query parsing determines that a query mentions a size, the size part can be tagged, and the query can be routed to a “size resolver.”
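The tagging and routing steps can be sketched with simple lookup tables and a regular expression. This is a deliberately simplified, rule-based stand-in, not a machine-learned WSD model; the vocabularies, parse_query, and resolve_size are hypothetical names introduced for illustration.

```python
import re

# Tiny hand-written vocabularies (assumptions for illustration).
GENDERS = {"mens": "men", "womens": "women"}
COLORS = {"maroon", "black", "red"}
NUMBER = re.compile(r"^\d+(\.\d+)?$")


def parse_query(query):
    """Tag gender, color, and size mentions; keep the rest as plain terms."""
    tokens = query.lower().split()
    parsed = {"gender": None, "color": None, "size": None, "terms": []}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in GENDERS:
            parsed["gender"] = GENDERS[token]
        elif token in COLORS:
            parsed["color"] = token
        elif token == "size" and i + 1 < len(tokens) and NUMBER.match(tokens[i + 1]):
            parsed["size"] = tokens[i + 1]
            i += 1  # skip the number that follows the word "size"
        elif NUMBER.match(token):
            parsed["size"] = token
        else:
            parsed["terms"].append(token)
        i += 1
    return parsed


def resolve_size(parsed):
    """Hypothetical size resolver: list the attributes a size could mean."""
    if parsed["size"] is None:
        return None
    if "jeans" in parsed["terms"]:
        # A bare number on jeans is ambiguous, so keep both readings.
        return {"value": parsed["size"], "candidates": ["waist", "inseam"]}
    return {"value": parsed["size"], "candidates": ["size"]}


print(parse_query("womens maroon pumps size 9"))
print(resolve_size(parse_query("mens jeans 30")))
```

A real resolver might use shopper behavior to pick between waist and inseam; here the ambiguity is simply surfaced so downstream logic can decide whether to filter on one attribute or both.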
Semantic Query Parsing Facilitates Concept-Specific Models
Negotiable specifications can be used in various ways depending on the concept. For example, a query that mentions a color could be routed to a color encoder trained on shopper behavior. Which colors convert when the query is for maroon pumps? A color encoder might learn that shoppers see rust and burgundy pumps as relevant to a query for maroon pumps.
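A count-based sketch can stand in for the idea of a color encoder. Instead of a trained model, it looks at which product colors were purchased after queries that mentioned a given color and keeps the colors with a meaningful share. The counts, the threshold, and the relevant_colors function are all invented for illustration.

```python
# Hypothetical purchase counts by product color, keyed by the color
# mentioned in the query (invented numbers, not real data).
CONVERSIONS = {
    "maroon": {"maroon": 120, "burgundy": 60, "rust": 30, "navy": 5},
}


def relevant_colors(query_color, min_share=0.10):
    """Return colors whose share of purchases meets the threshold,
    ordered from most to least purchased."""
    counts = CONVERSIONS.get(query_color, {})
    total = sum(counts.values())
    if total == 0:
        # No behavior data: fall back to the color as typed.
        return [query_color]
    shares = {color: n / total for color, n in counts.items()}
    ranked = sorted(shares, key=shares.get, reverse=True)
    return [color for color in ranked if shares[color] >= min_share]


print(relevant_colors("maroon"))
```

With these invented counts, burgundy and rust clear the threshold alongside maroon, while navy does not.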
Some queries include mentions of vague concepts such as inexpensive. Inexpensive kids snow boots and inexpensive adult snow boots likely imply two different price ranges. A price range model might learn to predict price ranges in the context of a query based on shopper behavior.
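A very simple stand-in for a price range model is to read “inexpensive” as the lowest quartile of prices that shoppers actually paid in a category. The purchase prices below are invented, and the quartile rule is an assumption chosen for illustration, not a description of any production model.

```python
import statistics

# Invented purchase prices per category (assumption for illustration).
PURCHASE_PRICES = {
    "kids snow boots": [25, 30, 35, 40, 45, 50, 60, 70],
    "adult snow boots": [60, 80, 90, 100, 120, 140, 160, 200],
}


def inexpensive_price_ceiling(category):
    """Upper price bound for 'inexpensive': the first quartile of
    prices paid in the category."""
    prices = PURCHASE_PRICES[category]
    first_quartile, _, _ = statistics.quantiles(prices, n=4)
    return first_quartile


for category in PURCHASE_PRICES:
    print(category, inexpensive_price_ceiling(category))
```

The same word produces a lower ceiling for kids snow boots than for adult snow boots, because the bound is derived from each category's own price distribution.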
What Are Semantic Search Tools?
We can wield semantic search techniques, such as semantic vector search and semantic query parsing, to enhance user experience by going beyond traditional lexical search methods. Semantic vector search, powered by deep learning, predicts relevance based on user behavior, offering more intuitive product discovery and reducing zero-results queries. Semantic query parsing identifies concepts in queries, allowing for precise results based on specifications like size and color. By incorporating these semantic search tools, businesses and other organizations can significantly improve search relevance and user satisfaction across various industries.
Why Is Semantic Search Important for Your Business?
Understanding semantic search is vital for boosting conversions and revenue. A strong grasp of semantic search ensures an efficient and trustworthy digital shopping experience. The website’s streamlined shopping process can leave a lasting impression, potentially encouraging customer return. For enterprise businesses and B2B distributors with extensive product catalogs, implementing solutions like Lucidworks enhances search relevance and personalization, leading to rapid revenue growth and ROI.
Semantic Search at Lucidworks
Our strategy is about consumer goals, not consumer queries. We’re building on the foundation of our existing semantic vector search capability, initially adding semantic query parsing capabilities to act as a set of “precision guardrails” for vector search and ultimately as a tool for routing queries to concept-specific models.
Interested in learning more about how we use semantic search for product discovery? Check out our recent webinar, “The Case for Semantic-Based Approaches to Product Discovery.” To learn how to leverage semantic search for your business or organization, visit our platform overview page.