Using Deep Learning and Customized Solr Components to Improve Search Relevancy at Target

by Mike Airhart
September 11, 2019

As one of the world’s largest retailers, Target can’t afford slow or off-base search results. As it adds products (and their data), how does Target maintain and improve speed and accuracy at the same time? It uses a combination of deep learning models and custom Solr components to deliver highly accurate search results quickly and at scale: imagine searching two million product SKUs with a response time of 250 ms.

The Technology

Strong machine learning models are the key to scalable performance. Target built models on product title, category, type, and description — as it happens, the key search attributes that you’ll need when querying Solr.

Neural networks use models to identify different search intents and attributes.

It’s a challenge for applications to accurately and fully understand what users want. What, for example, does “c9 running shoes for boys” mean? This is a classification problem: the classifier determines that C9 Champion is the brand and that male is the gender. Retailers can develop a classification framework that, for each product attribute, generates a model able to classify any query accurately.
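As a rough illustration of how predicted attributes might feed a Solr request, the sketch below turns confident predictions into `fq` filter queries. The field names (`brand`, `gender`, `color`), the probabilities, and the `min_probability` cutoff are assumptions made for this example; the article does not describe Target's actual schema or custom components.

```python
from urllib.parse import urlencode

def build_solr_params(query, predicted_attributes, min_probability=0.8):
    """Build Solr request parameters, adding one fq per confident prediction.

    predicted_attributes maps a (hypothetical) field name to a
    (value, probability) pair produced by a query classifier.
    """
    params = [("q", query), ("defType", "edismax")]
    for field, (value, probability) in predicted_attributes.items():
        if probability >= min_probability:
            # Escape backslashes and quotes so the value stays a valid phrase.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            params.append(("fq", f'{field}:"{escaped}"'))
    return params

# Hypothetical classifier output for the example query.
predicted = {
    "brand": ("C9 Champion", 0.97),
    "gender": ("male", 0.91),
    "color": ("blue", 0.12),  # too uncertain, so no filter is added
}
params = build_solr_params("c9 running shoes for boys", predicted)
query_string = urlencode(params)
print(query_string)
```

Low-probability predictions are skipped rather than applied as filters, since a wrong filter would hide relevant products entirely.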

To classify the query, Target:

Gathers abundant training data from user search queries and user behavior, and from product attributes
Trains machine-learned models for each attribute with prepared lists of query/attribute value pairs, for example: shoes/athletic shoes, shoes/sneakers
Outputs a list of attribute values that are predicted to be related to the original request, with probabilities for each value
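The steps above can be sketched with a toy model. This example swaps in a tiny naive Bayes classifier for the neural networks the article describes, purely to show the shape of the inputs (query/attribute value pairs) and outputs (attribute values with probabilities). The training pairs are made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical query/attribute-value pairs for one attribute (product type).
TRAINING_PAIRS = [
    ("running shoes for boys", "athletic shoes"),
    ("trail running shoes", "athletic shoes"),
    ("white canvas sneakers", "sneakers"),
    ("kids slip on sneakers", "sneakers"),
]

def train(pairs):
    """Count tokens per attribute value (toy naive Bayes, not a neural network)."""
    token_counts = defaultdict(Counter)
    label_counts = Counter()
    for query, label in pairs:
        label_counts[label] += 1
        token_counts[label].update(query.lower().split())
    vocab = {tok for counts in token_counts.values() for tok in counts}
    return token_counts, label_counts, vocab

def predict(query, model):
    """Return (attribute value, probability) pairs, most likely first."""
    token_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in query.lower().split():
            # Laplace smoothing keeps unseen tokens from zeroing the score.
            score += math.log((token_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    # Normalize the log scores into probabilities that sum to 1.
    peak = max(log_scores.values())
    exp = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(exp.values())
    return sorted(((label, v / norm) for label, v in exp.items()),
                  key=lambda pair: -pair[1])

model = train(TRAINING_PAIRS)
predictions = predict("c9 running shoes for boys", model)
for value, probability in predictions:
    print(f"{value}: {probability:.3f}")
```

In a setup like the one described, one such model would exist per attribute (brand, gender, product type, and so on), each trained on its own list of pairs.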

Evaluation

After training the neural models, Target evaluates them with these metrics:

Precision: the number of correctly predicted attribute values divided by the total number of predictions the classifier makes for a trial query. The higher the precision, the more accurate the predictions are.
Recall: the number of correctly predicted attribute values divided by the total number of attribute values for that query in the test set. The higher the recall, the more of those attribute values the predictions cover.
Top-N accuracy: For a query, if any of the top N predictions is relevant, the query scores 1; otherwise it scores 0. The scores are then averaged over all queries.
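These three definitions translate directly into code. The sketch below is a minimal version that assumes predictions arrive as a ranked list of attribute values and the test set supplies the relevant values per query; the sample data is invented for illustration.

```python
def precision(predicted, relevant):
    """Correctly predicted values divided by the number of predictions."""
    predicted, relevant = set(predicted), set(relevant)
    return len(predicted & relevant) / len(predicted) if predicted else 0.0

def recall(predicted, relevant):
    """Correctly predicted values divided by the number of relevant values."""
    predicted, relevant = set(predicted), set(relevant)
    return len(predicted & relevant) / len(relevant) if relevant else 0.0

def top_n_hit(ranked, relevant, n):
    """1 if any of the top n ranked predictions is relevant, else 0."""
    return 1 if any(value in relevant for value in ranked[:n]) else 0

def top_n_accuracy(examples, n):
    """Average the per-query hits over (ranked predictions, relevant) pairs."""
    hits = [top_n_hit(ranked, relevant, n) for ranked, relevant in examples]
    return sum(hits) / len(hits) if hits else 0.0

# Hypothetical ranked predictions and test-set labels for two queries.
examples = [
    (["athletic shoes", "sneakers", "sandals"], {"athletic shoes", "running shoes"}),
    (["sandals", "boots", "sneakers"], {"slippers"}),
]
ranked, relevant = examples[0]
print(precision(ranked, relevant), recall(ranked, relevant))
print(top_n_accuracy(examples, n=3))
```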

Target controls recall and precision through a combination of:

Category/attribute classification to relate items within a set,
Filtering for specificity within a set,
Customized elevation to promote the most popular items, and
Precision components to filter out product SKUs based on a threshold.

Precision and recall typically trade off against each other: raising precision tends to lower recall, and vice versa. Retailers therefore need to find the right balance between the two, along with a target for top-5 accuracy, measured as the proportion of queries for which at least one correct attribute value is in the top five predictions.
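To see how a threshold shifts this balance, the sketch below filters a handful of SKUs at several cutoffs and reports precision and recall for each. The SKU identifiers, scores, and relevance labels are hypothetical, and the filter is only a stand-in for whatever Target's precision components actually do.

```python
# Hypothetical (SKU, relevance score, is actually relevant) triples for one query.
SCORED_SKUS = [
    ("sku-001", 0.95, True),
    ("sku-002", 0.80, False),
    ("sku-003", 0.60, True),
    ("sku-004", 0.30, True),
    ("sku-005", 0.10, False),
]

def precision_recall_at(threshold, scored):
    """Keep SKUs scoring at or above the threshold, then measure both metrics."""
    kept = [is_relevant for _, score, is_relevant in scored if score >= threshold]
    total_relevant = sum(1 for _, _, is_relevant in scored if is_relevant)
    correct = sum(kept)
    precision = correct / len(kept) if kept else 0.0
    recall = correct / total_relevant if total_relevant else 0.0
    return precision, recall

# Sweep from a permissive cutoff to a strict one.
sweep = {t: precision_recall_at(t, SCORED_SKUS) for t in (0.05, 0.5, 0.9)}
for threshold, (p, r) in sweep.items():
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this sample, tightening the threshold raises precision while recall falls, which is the balance a retailer has to tune.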

For Target, classifiers can achieve precision and recall above 90 percent, and top-5 accuracy above 96 percent. With a proper classification pipeline, a new model can be generated automatically for any attribute within 18 hours. Retraining happens daily.

By using state-of-the-art neural network techniques, in conjunction with customized Solr components, Target has improved its search relevancy by more than 20 percent — increasing sales and decreasing user time and cost.
