Outdoor retailer REI has a customer-service-forward brand, personified in its stores by knowledgeable, helpful employees in green vests. Carrying this ethos across the wires to its online customers is a top priority of the company. But without in-person interactions with the green vests, how does REI cultivate a customer-centric culture over its website?

“Every search result is an expression of the authenticity and expertise of the REI brand,” explained Luke Warwick, REI’s Program Manager for Search and Browse. “On our site, we actually have more interactions, more experiences, and thus more brand-building experiences with customers than all of our green vests combined in the field, which is why we are so focused on relevancy.”

Warwick, along with REI Data Science Product Manager Jake Pratt and Lead Software Engineer Joshua Groppe, shared the journey REI took to improve customer experience online by factoring SKU and product availability into search results in their talk at the virtual Activate conference.

The Problem

The large variety of each product type available at REI makes search relevancy incredibly important as top results are powerful positions for promoting clicks and purchases. With over 100 bike shorts in the catalog, items high in the search results would ideally be popular items that suit the bulk of shoppers with appropriate size and color varieties.

Before REI optimized their search results for availability, the number two item on the list of bike shorts was the very popular Pearl iZUMi shorts. This seems reasonable, but when a customer clicked through from the list to the item detail page, they discovered that only an XXL size in one color was available.

Since XXL suits just a limited number of shoppers, that item appearing so high in search results was a disadvantage to customers and to the business.

At the time, the order of products in a search results page was determined by business rules and customer signals. These signals (clicks, add-to-cart, and conversions) identified popular products that should appear at the top of search results. But popular products also tend to run low on stock quickly. This created a conundrum: items with low availability, in limited sizes and colors that didn’t necessarily fit the average REI shopper, got high scores in REI’s models and thus prime positioning in search results.

When neither increasing the importance of add-to-cart and conversions nor accelerating the time decay of those signals achieved the desired results of lowering the score of these limited-supply products, REI’s Business Intelligence and Search Teams joined the effort to improve relevancy.
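To make the idea of "accelerating the time decay" concrete, here is a minimal sketch of a time-decayed signal score. The talk does not describe REI's actual formula, so the exponential half-life decay, the per-signal weights, and the names SIGNAL_WEIGHTS and decayed_signal_score are all assumptions made for illustration.

```python
import math

# Hypothetical per-event weights; the talk does not give REI's actual values.
SIGNAL_WEIGHTS = {"click": 1.0, "add_to_cart": 3.0, "conversion": 5.0}


def decayed_signal_score(events, half_life_days):
    """Sum weighted signals, halving each event's weight every half_life_days.

    events: iterable of (signal_type, age_in_days) tuples.
    """
    decay_rate = math.log(2) / half_life_days
    return sum(
        SIGNAL_WEIGHTS[signal] * math.exp(-decay_rate * age_days)
        for signal, age_days in events
    )


# A once-popular product: most of its conversions happened weeks ago.
events = [("conversion", 30), ("conversion", 28), ("add_to_cart", 2), ("click", 0)]

slow_decay = decayed_signal_score(events, half_life_days=30)
fast_decay = decayed_signal_score(events, half_life_days=7)
print(f"30-day half-life: {slow_decay:.2f}, 7-day half-life: {fast_decay:.2f}")
```

A shorter half-life shrinks the contribution of old conversions, but the score still knows nothing about which sizes and colors remain in stock, which is consistent with this tuning alone not solving the problem.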

Designing a Solution

REI hadn’t previously integrated an in-house data product directly with the digital infrastructure, nor had it presented output from a data science model directly to customers. Previously, there was always a human layer for interpretation between a model’s output and the customer.

On this new ground, Jake Pratt and the data science team prioritized simplicity, transparency, and explainability in the design. This philosophy allowed them to quickly estimate calculations for manual testing and build confidence in the model’s output. Simplicity also allowed them to quickly iterate with the limited resources on REI’s data science and engineering teams.

One drawback of designing around simplicity was an early decision to use batch rather than real-time inference. With scores produced nightly and added once daily to the search index, the experience was not as agile and responsive as it would have been with real-time scoring.

Joining Data From Multiple Systems

The necessary data had to be united from multiple operational systems. These included inventory from the ERP system, sales and return data from the POS and order management systems, product attributes and planning data from the master data system, and clickstream and performance data from web data. As is often the case, this also uncovered some dirty data that didn’t match up well, so cleaning, prepping, and merging were needed to get everything to work properly.

Most of the project was accomplished using SQL in the data warehouse, but the model relied heavily on Python and REI’s analytics stack.

How Do You Measure Availability?

Most retailers measure their inventory as a sum of the SKUs in stock, but this wouldn’t suffice for the experience REI wanted to present.

While availability could be measured at a SKU level, search results are presented at a product level. The solution needed to have a way to aggregate SKUs by product and to take into account certain attributes of the inventory that REI identified as important to serving the majority of its customers.

REI determined these three attributes defined general product availability best:

  • Sizes available
  • Color choices available
  • SKUs remaining out of total SKUs originally available

To figure out what variety of sizes should be available for a product to rank high in search results, the data science team relied on a model they had previously created for their merchandising team. That model could tell the merchandising team, for example, how much of a given purchasing budget for bike helmets should be spent on stocking each size. It was a basic statistical model that took into account sales over the previous two years, plus stock-out data to identify which sales were lost to a lack of inventory. This method answered the question, “If inventory were infinite, how much of each size would be sold?”

For the sizing model, it became clear mediums and larges were the most in demand so they were given higher weights than the smalls and extra larges.
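The "infinite inventory" question can be sketched in a few lines. This is only an illustration: the talk says the model was a basic statistical one but gives no formula, so the rule used here (demand on out-of-stock days equals the average daily sales rate on in-stock days), the function names, and the sales history are all assumptions.

```python
def unconstrained_demand(units_sold, days_in_stock, days_out_of_stock):
    """Estimate units that would have sold with no stock-outs.

    Assumes (for illustration only) that demand on out-of-stock days matches
    the average daily sales rate observed on in-stock days.
    """
    if days_in_stock == 0:
        return 0.0
    daily_rate = units_sold / days_in_stock
    return units_sold + daily_rate * days_out_of_stock


def size_weights(history):
    """Turn per-size sales history into weights that sum to 1."""
    demand = {size: unconstrained_demand(*stats) for size, stats in history.items()}
    total = sum(demand.values())
    if total == 0:
        raise ValueError("no demand observed for any size")
    return {size: units / total for size, units in demand.items()}


# Made-up two-year history: size -> (units_sold, days_in_stock, days_out_of_stock)
history = {
    "S": (200, 730, 0),
    "M": (600, 600, 130),
    "L": (500, 500, 230),
    "XL": (150, 730, 0),
}
weights = size_weights(history)
for size, weight in weights.items():
    print(f"{size}: {weight:.2f}")
```

In this invented data, large sold fewer units than medium only because it was out of stock more often; once lost sales are added back, the two sizes carry the same weight.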

Color was a little trickier since customers are more flexible in buying a different color than a different size. Someone who fits a small size helmet won’t buy an XL if that’s the only size in stock. They may buy a red shirt if their preferred blue isn’t available. This makes answering the infinite inventory question posed above more difficult because stock-out choices are less cut and dried.

The team relied on the last two years of sales data for the distribution of colors of products sold, which showed that blacks, grays, and neutrals were popular while neon hues were less in demand.

The final factor to be folded into the model was measuring customer choice. “What defines this is essentially what we call the seasonal assortment,” explained Jake Pratt. “A merchant decides for each season what to buy, where to sell it, etc. We use that information, along with a lot of logic that we’ve had to build in for edge cases, to determine what our denominator should be, what the ‘out of total SKUs’ number should be.”

This allows search results to better represent the stock that REI’s knowledgeable buyers thought should be available on the sales floor.

Put It All Together and What Do You Get

Each SKU’s availability is indicated with a one (available) or zero (unavailable). There are, however, multiple business rules that influence this binary measure. For example, some items, such as boats or fuel, can’t be shipped to certain locations, or at all.

Combining binary SKU availability with the three factors previously described in a simple algorithm produces a weighted product availability score (WPA).

In the helmet example from the talk, four of a total six possible SKUs are available, so without any additional factors the WPA would be 4/6, or about 0.67. When size, color, and customer choice weights are included in the calculations, the product’s WPA drops to 0.56. This makes logical sense, as the medium-sized helmet is unavailable in one color and the large size is unavailable in the other color. Since these are the sizes that fit the majority of the population, this limited availability affects more shoppers, who would consider the product “unavailable” because it is not available in their size.

With a lower WPA, the product will be displayed lower in search results than if the additional factors were not figured into relevancy, the exact results REI was after.
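The calculation can be sketched as a weighted share of the seasonal assortment that is currently in stock. The talk does not publish the algorithm or the weights, so everything specific here is an assumption: the weight values, the choice to multiply size and color weights, and the names SIZE_WEIGHTS, COLOR_WEIGHTS, and weighted_product_availability. Because the weights are invented, the result does not reproduce the 0.56 quoted above; it only shows the same direction of change.

```python
# Hypothetical weights for illustration; each set sums to 1.
SIZE_WEIGHTS = {"S": 0.20, "M": 0.45, "L": 0.35}
COLOR_WEIGHTS = {"black": 0.60, "neon": 0.40}


def weighted_product_availability(assortment, available):
    """Share of weighted demand that the in-stock SKUs can serve.

    assortment: all (size, color) SKUs planned for the season (the denominator).
    available: the SKUs currently sellable (binary: in the set or not).
    """
    def sku_weight(sku):
        size, color = sku
        return SIZE_WEIGHTS[size] * COLOR_WEIGHTS[color]

    total = sum(sku_weight(sku) for sku in assortment)
    in_stock = sum(sku_weight(sku) for sku in assortment if sku in available)
    return in_stock / total if total else 0.0


# Six SKUs: three sizes in two colors.
assortment = [(size, color) for size in SIZE_WEIGHTS for color in COLOR_WEIGHTS]
# Medium is missing in one color and large in the other, as in the helmet example.
available = set(assortment) - {("M", "neon"), ("L", "black")}

unweighted = len(available) / len(assortment)
weighted = weighted_product_availability(assortment, available)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

Because the two missing SKUs are in the high-demand sizes, the weighted score falls below the simple 4/6 ratio.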

Integrating the Solution with Search

With the model successfully built, the search team led by Lead Software Engineer Josh Groppe was handed a couple of challenges:

  • How should the model be factored into search relevancy calculations
  • How can the impact of the changes be measured

REI runs its product discovery search on Lucidworks Fusion, integrated with FindTuner by Innovent Solutions, which is used to manage business and merchandising rules. Relevancy is determined using signals aggregated from user clickstream data. Prior to incorporating the WPA, it was calculated with this formula:

(Document Score * Signals Boost) + Business Score

Implementing the WPA score dampens the previous score according to how available the product is: the lower the availability, the larger the reduction. Products are then reordered in search results accordingly.
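A small sketch shows how such a dampening could reorder results. The pre-WPA formula is the one quoted in the article; the way WPA is applied (multiplying the old score by it) is an assumption, since the talk recap does not give the exact form, and the product data is invented.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    document_score: float  # text-match score from the search engine
    signals_boost: float   # boost from aggregated clickstream signals
    business_score: float  # contribution from merchandising rules
    wpa: float             # weighted product availability, 0.0 to 1.0


def previous_score(p):
    """The pre-WPA formula quoted in the article."""
    return (p.document_score * p.signals_boost) + p.business_score


def availability_adjusted_score(p):
    """Assumed form of the dampening: scale the old score by WPA."""
    return previous_score(p) * p.wpa


products = [
    Product("popular shorts, XXL only", 2.0, 3.0, 1.0, 0.10),
    Product("solid seller, full size run", 2.0, 2.0, 1.0, 0.95),
]

before = [p.name for p in sorted(products, key=previous_score, reverse=True)]
after = [p.name for p in sorted(products, key=availability_adjusted_score, reverse=True)]
print("before:", before)
print("after: ", after)
```

With these made-up numbers, the popular but nearly sold-out product leads on signals alone and drops below the well-stocked one once availability is factored in.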

Measuring the Impact

To understand the true value created by this change, in addition to measuring variations in KPIs using A/B testing, benefits to the customer experience also had to be gauged.

User experience benefit was measured with something REI calls “shadow querying.” Shadow querying works like this: for every search request made by a shopper, two requests are created behind the scenes. The primary request returns results to the user. The secondary request is sent through mechanisms that include the new model. Any differences between the two result sets are recorded and analyzed.
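The shadow querying pattern described above can be sketched as follows. This is not REI's implementation: the function shadow_query, the two stub rankers, the product identifiers, and the shape of the logged record are all hypothetical, and a production version would likely run the secondary request asynchronously so it cannot slow down the shopper's response.

```python
def shadow_query(query, primary_ranker, shadow_ranker, log, top_n=10):
    """Serve the primary results and record how the shadow ranking differs."""
    primary = primary_ranker(query)[:top_n]
    shadow = shadow_ranker(query)[:top_n]
    if primary != shadow:
        log.append({
            "query": query,
            "dropped": [p for p in primary if p not in shadow],
            "added": [p for p in shadow if p not in primary],
            # product -> (position in primary, position in shadow)
            "moved": {
                p: (primary.index(p), shadow.index(p))
                for p in primary
                if p in shadow and primary.index(p) != shadow.index(p)
            },
        })
    return primary  # the shopper only ever sees the primary results


# Stub rankers standing in for the current pipeline and the one with the new model.
def current_ranker(query):
    return ["pearl-izumi-short", "rei-short", "other-short"]


def wpa_ranker(query):
    return ["rei-short", "other-short", "pearl-izumi-short"]


differences = []
shown = shadow_query("bike shorts", current_ranker, wpa_ranker, differences)
print(shown)
print(differences[0]["moved"])
```

Logging position changes per query makes it possible to analyze how the new model would reshuffle results before any shopper is exposed to it.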

With the new model successfully integrated, a product with low availability now moves from a top result to much farther down the page.

Lessons Learned

At this phase, REI has learned a few important things from this project. The first is that data hygiene is a huge challenge. The project unearthed data challenges REI had with availability outside of search. This discovery alone produced tremendous value.

Another lesson is that for a model that factors size and color into availability, products that have only one SKU, like tents, should be removed from the A/B test, as they’ll dilute the results. Single-SKU products are either available or not, so they don’t need to be put through the availability algorithm.

Above all, Luke Warwick said, “be persistent. Introducing new models into your relevance algorithm is going to run into challenges along the way, but being persistent will help you push through those and ultimately deliver the value to your customers that they deserve.”