A great solution to one problem doesn't guarantee it can solve another. This isn't earth-shattering logic, but in the world of search the concept is highly relevant, and it isn't always considered.
Machine learning techniques solve some of the most basic problems of ecommerce search (e.g. the old problem where the most relevant stuff is at the bottom of the page), but different platforms handle certain problems well and other problems rather poorly. Vendor A is good at this, Vendor B is good at that, Vendor C plays better over here, Vendor D shines over there. If search relevance has been figured out, then why is quality so inconsistent across vendors?
Putting the Focus on Models
Models, the data available for training them, and the ingenuity behind the model-training methodology are at the heart of this. Each vendor in the space takes its own proprietary approach to training its machine learning models. Makes sense. To a significant extent, the assumptions baked into those models and training approaches predetermine their customers' outcomes. Lucidworks' advantage is that Fusion ships with models and training approaches, but they can be overridden with customer models, first-party data, and novel training approaches. That puts far more power in the hands of the ecommerce customer, while still offering best-in-class guidance and pre-trained models in the absence of data science sophistication.
If we label all of that something like, “ML Strategy for Product Discovery,” then I believe that ML Strategy for Product Discovery becomes at least an essential component (if not the essential component) in the intellectual property arsenal of any company–brand, manufacturer, retailer, or distributor–that sells products online.
I cringe when I hear about companies doing under $10 billion in online revenue building their own search infrastructure. What a waste of money when you could invest that money in something more interesting and ultimately more strategically valuable, like a brilliant data science team to create a proprietary modeling methodology internally.
Retailers with a large number of products, departments, stores, and multi-channel touch points are treated with the requisite sophistication by Lucidworks and the models we provide in Fusion. In other words, Fusion can make onsite search much more profitable by focusing on the harder problems that boost conversion, AOV, brand loyalty, and so forth.
Enter Query Routing
Query routing starts with the idea that there can be multiple opinions about the meaning of a query, or even about the meaning of just part of a query. In browse, relevance is relatively black and white. When categorizing a customer query, however, results split into relevant and irrelevant.
In the context of search, things get trickier. People use specific words and phrases to find what they are looking for, so we can think of the search index as divided into relevant results and irrelevant results. It's actually a little more complicated than that, because the user can query for anything. For example, a user may search for an "iron-free multivitamin," but because the query contains the word "iron," a lexical match may return vitamins that include iron, even though that runs counter to the intent of the query.
The context of returned queries can be grouped into four categories:
- True positives are the things that the search engine returned that should be returned
- False positives are the things that the search engine returned that are incorrect
- False negatives are the things that should have been returned that were not
- True negatives are the things that were not returned that were correctly omitted
The goal of query routing is to reduce both false positives and false negatives in answering a query. Effective query routing not only minimizes the potential for returning irrelevant information, but also reduces search bounce and the consumer's impression that you don't carry what they're looking for. All boats rise.
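The four categories above map directly onto the standard precision and recall metrics used to evaluate a search engine. A minimal sketch, using made-up document IDs and relevance judgments rather than real Fusion data:

```python
def precision_recall(returned, relevant):
    """Compute precision and recall from sets of document IDs."""
    tp = len(returned & relevant)   # true positives: returned and should be
    fp = len(returned - relevant)   # false positives: returned but shouldn't be
    fn = len(relevant - returned)   # false negatives: missed but should be returned
    precision = tp / (tp + fp) if returned else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

returned = {"doc1", "doc2", "doc3"}   # what the engine returned
relevant = {"doc2", "doc3", "doc4"}   # what should have been returned
p, r = precision_recall(returned, relevant)
# 2 true positives, 1 false positive, 1 false negative
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.67, recall=0.67
```

Reducing false positives raises precision; reducing false negatives raises recall. Query routing aims to improve both at once by picking the model best suited to each query.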
The Role of Data, Signals, and Semantic Search
Just as vendors differ in their data science approaches, commerce companies differ in their data, and data that solves certain queries well can be poor at solving others. A shopper comes up with a perfectly phrased query on one commerce site, gets exactly the results they're looking for, then uses that same query on another commerce site and ends up with no results found.
So how do we solve that problem? We've taken the approach of interpreting queries using a novel methodology called semantic vector search. Using an AI-driven approach, the machine learning models gain intelligence from collected consumer signals. Those signals train the model to organize products and queries based on their similarity to each other, relative to the rest of the information you have. It uses your customer's goal or intent to retrieve additional products that may relate to the original query.
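A toy illustration of the idea, not Fusion's actual models: products and queries are embedded as vectors, and retrieval ranks products by cosine similarity to the query vector. Real systems learn these embeddings from shopper signals; here the vectors and their dimensions are hand-made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical 3-dimensional embeddings; dimensions might loosely encode
# "vitamin-ness", "contains iron", and "supplement form".
products = {
    "iron-free multivitamin": [0.9, 0.0, 0.8],
    "multivitamin with iron": [0.9, 0.9, 0.8],
    "iron skillet":           [0.1, 0.9, 0.0],
}
query_vec = [0.9, 0.1, 0.8]  # stand-in embedding of "iron-free multivitamin"

ranked = sorted(products, key=lambda p: cosine(products[p], query_vec), reverse=True)
print(ranked[0])  # iron-free multivitamin
```

Because similarity is computed over meaning-bearing vectors rather than raw tokens, the iron-free product outranks the one containing iron, where a purely lexical match on the word "iron" could have done the opposite.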
Routing queries can be used for concept-specific processing. Those concepts can be linked with different models or logic for normalizing concept mentions. More ambiguous queries can be routed through models that have been trained on behaviors presented by shoppers.
How Fusion Enables and Streamlines Query Routing
Using Fusion's query pipeline stages, customers can create custom calls to a number of different sources for a diverse set of responses. Let's say you run a site for athletic clothing. A simple query for 'shorts' would easily return the expected results from one model. A query for 'shorts that look good with a blue shirt' might be more challenging for the original model, but a complex model run by the recently launched Google Retail Search would be able to solve this. There may even be other models out there that return even better results! Let's take a look at how Fusion can route queries to multiple models to solve these complex problems.
Fusion gives you the flexibility to orchestrate and route product catalogs and signal data to Google Retail Search, Fusion, or other model services like AWS, Bing, or Milvus. When a user submits a complex query, custom SDK query pipeline stages selectively route the query to the models that have been set up. The results are gathered and scored within the context object, with each model's responses weighted by the boost score assigned to it. Fusion sorts the results by this weighted relevance, then returns the highest-ranked response to the end user. Because the query ran against multiple models, the results returned are the ones that best match the user's query.
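The fan-out-and-merge flow described above can be sketched as follows. The model names, boost values, and response shapes here are all hypothetical; in Fusion this logic would live in custom query pipeline stages rather than standalone Python.

```python
def route_and_merge(query, models, boosts):
    """Send the query to every configured model, weight each model's
    scores by its boost, and return results sorted by weighted score."""
    merged = {}
    for name, model in models.items():
        for doc_id, score in model(query):   # each model yields (id, score) pairs
            weighted = score * boosts.get(name, 1.0)
            # Keep the best weighted score seen for each document.
            merged[doc_id] = max(merged.get(doc_id, 0.0), weighted)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# Two stand-in "models": a lexical engine and a semantic one.
models = {
    "lexical":  lambda q: [("shorts-basic", 0.9), ("shirt-blue", 0.4)],
    "semantic": lambda q: [("shorts-navy", 0.8), ("shorts-basic", 0.6)],
}
boosts = {"lexical": 1.0, "semantic": 1.5}  # trust the semantic model more here

results = route_and_merge("shorts that look good with a blue shirt", models, boosts)
print(results[0][0])  # shorts-navy (0.8 * 1.5 = 1.2 is the top weighted score)
```

Tuning the boosts per model is where merchandising judgment comes in: a team that trusts its behavioral model for long, ambiguous queries can weight it accordingly without touching the lexical pipeline.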
Users can then leverage Fusion's additional features, such as experiments, business rules, or Predictive Merchandiser, to fine-tune query relevancy and take their search to the next level.
Using these custom pipelines, the output is an optimally constructed query that enhances the precision and relevance of a regular lexical Fusion query pipeline and the precision of vector search pipelines. Fusion shines when presented with keyword-style queries, directly returning results with attributes matching what was found in the query. Additionally, Fusion catches spelling errors (e.g., 'flanel' vs. 'flannel'), whereas Google Retail Search often returns nothing in the case of a misspelling. Fusion's architectural flexibility to consult multiple models uniquely differentiates Lucidworks from the rest of the search platform vendors. The more diverse the opinions we gather on a query, the stronger the results we return to the end user.
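One way a misspelling like 'flanel' can still match 'flannel' is edit-distance-based correction: if a query term is within a small number of edits of a known catalog term, treat it as that term. This is a generic illustration of the technique, not Fusion's actual spell-correction logic.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct(term, vocabulary, max_dist=1):
    """Return the closest vocabulary term within max_dist edits, else the term."""
    best = min(vocabulary, key=lambda v: edit_distance(term, v))
    return best if edit_distance(term, best) <= max_dist else term

print(correct("flanel", {"flannel", "fleece", "denim"}))  # flannel
```

'flanel' is one insertion away from 'flannel', so it corrects cleanly, while a system that requires an exact lexical match would return nothing.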
Query routing can offer exceptional query understanding in uncertain scenarios. By focusing on the user goal and running the query against multiple models, we can create better outcomes that connect customers with the products and information they want. If you’d like to know how Lucidworks can supply this functionality to your brand, please get in touch with us.
Contact us today to learn how Lucidworks can help your team create powerful search and discovery applications for your customers and employees.