Question answering is often thought of as a subset of the search paradigm and is not usually referenced in terms of digital commerce use cases like product discovery. When shoppers search for products on a website, we say they are ‘querying.’ Another word for querying is ‘questioning.’ The products returned from a customer query can be thought of as the answers. It’s the same idea.
Lucidworks Smart Answers is touted as a question answering engine, but with the customer search experience in mind, we decided to explore using it to power a semantic search pipeline for product discovery. By training Smart Answers models with a different combination of parameters and a different kind of training data, we have found that this semantic approach to product discovery is an additional way to improve the customer experience.
Smart Answers for Product Discovery
This approach isn’t mutually exclusive with traditional Fusion search. Rather than relying solely on Solr’s keyword matching and BM25 scoring, the deep learning methods behind Smart Answers work in conjunction with these Fusion/Solr features to help ensure the best products are retrieved for a given customer query.
At a high level, there are four technical steps in this approach:
- Use shopper queries and the product clicked on from the search results as training data pairs
- Use titles and/or descriptions of products to learn the vocabulary of the product catalog and train custom embeddings or leverage pre-trained word embeddings
- Encode or vectorize all product titles/descriptions with the trained deep learning Smart Answers model
- At query time, encode the incoming queries into the vector space of the encoded product catalog and retrieve the closest products in the vector space
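The last two steps above can be sketched in a few lines of Python. This is a toy illustration, not the Smart Answers API: the `encode` function is a hypothetical stand-in that counts character trigrams, where a real pipeline would call the trained deep learning model, and the catalog, SKU names, and `retrieve` helper are made up for the example. It only shows the shape of the idea: encode the catalog once, encode each query the same way, and return the nearest products by cosine similarity.

```python
import math
from collections import Counter


def encode(text):
    """Hypothetical stand-in for the trained model: character-trigram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, catalog_vectors, k=2):
    """Encode the query and return the ids of the k closest products."""
    query_vector = encode(query)
    scored = [(cosine(query_vector, vec), pid) for pid, vec in catalog_vectors.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]


# Made-up catalog; product titles are encoded once, ahead of query time.
catalog = {
    "sku-1": "black compact umbrella",
    "sku-2": "red running shoes",
    "sku-3": "leather wallet",
}
catalog_vectors = {pid: encode(title) for pid, title in catalog.items()}

print(retrieve("black umbrela", catalog_vectors, k=1))
print(retrieve("shose", catalog_vectors, k=1))
```

Even with this crude encoder, a misspelled query lands near the right product because the two share most of their trigrams. A learned model goes further by placing queries and products that share meaning, not just characters, close together.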
How Semantic Search Works
Before we get too deep in the weeds, let’s do a refresher on semantic search. Semantic search is the process of interpreting the intent and contextual meaning of queries and content. Unlike traditional keyword search, which only matches documents explicitly containing those keywords, semantic search typically supports full natural language queries with a more human level of understanding. It’s the kind of search that powers a really great chatbot.
Generally speaking, shopper queries include few terms. Think about it: “black umbrella” is shorter than “if I unwrapped the umbrella can I still return it.” With just a few search terms in each query, traditional keyword-based search often cannot extract the contextual concepts needed to improve the quality of returned results. Applying this vector-based approach and semantic pipeline to product discovery, on the other hand, allows the right concepts to be extracted and applied.
Smart Answers Enriches Queries to Provide the Most Relevant Results
When a customer types something in the search bar, the query is encoded into vector space, and Smart Answers automatically enriches the search using everything it observes nearby in that space: relevant synonyms, corrected misspellings, and phrase matches. Traditionally, these enrichments are rules that merchandisers create and manage by hand. For example, a rule could be that anytime someone types ‘shose’ it should be corrected to ‘shoes’.
Smart Answers and Fusion reduce the need for merchandisers to constantly create new misspelling, synonym, and phrase rules to try to capture all the different ways a customer might query for an item. One of our missions at Lucidworks is to create an increasingly ruleless experience for merchandisers, and this Smart Answers approach to product discovery automatically applies these types of rules to improve the customer experience and save merchandisers the hassle.
Learning from the Persistent Customer to Improve Product Discovery
In practice, we have chosen to start by applying this methodology to zero-result searches. We create training data from shopper query-product pairs, and we also draw on the search sessions of the ‘persistent customer.’ We look at sessions where a customer received zero results for a particular query and then look at the next product that customer added to their cart to create another set of query-product pairs. Curating training data in this manner allows Smart Answers to bridge customer vernacular with that of the product catalog.
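As a rough sketch of how those ‘persistent customer’ pairs might be mined from session logs, consider the Python below. The event schema (`session_id`, `type`, `query`, `num_results`, `product_id`) and the function name are assumptions made for illustration, not a Fusion signals format. The sketch also assumes events arrive in time order and that every zero-result query in a session is paired with the next product added to the cart in that session.

```python
from collections import defaultdict


def zero_result_pairs(events):
    """Pair each zero-result query with the next add-to-cart in the same session.

    `events` is assumed to be a time-ordered list of dicts (hypothetical schema).
    """
    pending = defaultdict(list)  # session_id -> zero-result queries awaiting a product
    pairs = []
    for event in events:
        session = event["session_id"]
        if event["type"] == "search" and event["num_results"] == 0:
            pending[session].append(event["query"])
        elif event["type"] == "add_to_cart":
            # Searches that returned results do not clear pending queries:
            # the persistent customer usually rephrases before adding to cart.
            for query in pending.pop(session, []):
                pairs.append((query, event["product_id"]))
    return pairs


# Made-up session log for illustration.
events = [
    {"session_id": "a", "type": "search", "query": "brolly", "num_results": 0},
    {"session_id": "a", "type": "search", "query": "umbrella", "num_results": 12},
    {"session_id": "b", "type": "search", "query": "shose", "num_results": 0},
    {"session_id": "a", "type": "add_to_cart", "product_id": "sku-1"},
    {"session_id": "b", "type": "add_to_cart", "product_id": "sku-2"},
    {"session_id": "a", "type": "add_to_cart", "product_id": "sku-3"},
]
pairs = zero_result_pairs(events)
print(pairs)
```

Only the first add-to-cart after a zero-result query is paired with it; later cart additions in the same session are ignored unless another zero-result query precedes them.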
While we continue to refine the exact tunings and configurations for this approach, zero-result searches offer a very low-risk, very high-reward place to apply it: there is little harm in experimenting with a new methodology in an area where the conversion rate is zero. With this Smart Answers approach applied, that rate has nowhere to go but up. Even a slight percentage jump can potentially translate into hundreds of thousands to millions in additional sales and revenue.
Right Answer, Right Place, Right Time
Depending on the commerce business vertical, zero-result searches traditionally account for 5-10% of all searches; in some cases we have seen this number rise as high as 40%. On the high end, that means that two out of every five searches end up on a “no results” page. For high-traffic websites, that is millions of queries every day, which means millions of missed opportunities for revenue. At Lucidworks, we are turning these 0% conversion rate searches into millions of opportunities to create additional revenue at 0% downside risk.