Last week we discussed diffuse categories in part one of the series. If you haven’t read that post yet, I recommend starting there to understand how diffusion creates challenges for search teams and customers. 

In part two of the series, I’m sharing recommendations on dealing with diffusion in romance copy, the long descriptions that often accompany a product. Read on to ensure you’re building the best online shopping experience for your customer and returning the results that lead them to make the purchase.

Diffusion in Romance Copy 

Search logic often relies on a product’s category to validate that the products it wants to bring back are relevant. Each product also generally carries a long description, which you see when you click through to the product from a results page. For example, a pair of pants “looks great with crop tops” according to its long description. But when I search for “tops,” I don’t want the first results to be pants that “look great with crop tops” just because both the long description and the query contain the word “top.”

One solution is to match on every field but rank products that match on category ahead of products that match only on the long description. For example, you can match a query for “tops” to a pair of pants because those pants look great with tops, as long as the actual tops rank ahead of the pants.
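As a rough illustration of this ranking approach, here is a minimal Python sketch. The field names, the weights, and the crude plural-stripping tokenizer are all assumptions made for the example; a production search engine would supply its own analysis and scoring.

```python
import re

# Hypothetical weights: a hit in name or category outranks a hit in the long description.
FIELD_WEIGHTS = {"name": 3.0, "category": 2.0, "long_description": 0.5}

def field_tokens(text):
    """Lowercase, split on non-letters, and strip trailing 's' characters (a crude stand-in for stemming)."""
    return {word.rstrip("s") for word in re.findall(r"[a-z]+", text.lower())}

def score(product, query):
    """Sum the weights of every field that shares at least one token with the query."""
    query_tokens = field_tokens(query)
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if query_tokens & field_tokens(product.get(field, ""))
    )

def ranked_search(products, query):
    """Return all matching products, highest score first."""
    scored = [(score(p, query), p) for p in products]
    hits = [(s, p) for s, p in scored if s > 0]
    return [p for s, p in sorted(hits, key=lambda pair: pair[0], reverse=True)]

# Illustrative catalog, not real product data.
CATALOG = [
    {"name": "High-Rise Wide-Leg Pants", "category": "Pants",
     "long_description": "Looks great with crop tops."},
    {"name": "Ribbed Crop Top", "category": "Tops",
     "long_description": "A fitted cropped top in soft ribbed cotton."},
]

results = ranked_search(CATALOG, "tops")
print([p["name"] for p in results])
```

Both products come back for “tops,” but the pants match only on the low-weight description field, so they land behind the actual top.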

A preferred solution is to include conditional logic in the matching. Rather than attempting a match on all fields in one pass and prioritizing which field is more important, consider matches on the long description only after you know that you have no (or very few) matches on other fields such as product name or category.
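The conditional approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular search platform: the field names, the `MIN_PRIMARY_HITS` threshold, and the simple tokenizer are all hypothetical.

```python
import re

PRIMARY_FIELDS = ("name", "category")
FALLBACK_FIELD = "long_description"
MIN_PRIMARY_HITS = 1  # assumed threshold for "enough" primary matches

def term_set(text):
    """Lowercase, keep letters and digits, strip trailing 's' characters (crude stemming)."""
    return {word.rstrip("s") for word in re.findall(r"[a-z0-9]+", text.lower())}

def matches(product, query, fields):
    """True when every query term appears together in at least one of the given fields."""
    query_terms = term_set(query)
    if not query_terms:
        return False
    return any(query_terms <= term_set(product.get(f, "")) for f in fields)

def conditional_search(products, query, min_primary_hits=MIN_PRIMARY_HITS):
    """Search name and category first; consult the long description only if too few products match."""
    primary = [p for p in products if matches(p, query, PRIMARY_FIELDS)]
    if len(primary) >= min_primary_hits:
        return primary
    fallback = [
        p for p in products
        if p not in primary and matches(p, query, (FALLBACK_FIELD,))
    ]
    return primary + fallback

# Illustrative catalog, not real product data.
SHOP = [
    {"name": "High-Rise Wide-Leg Pants", "category": "Pants",
     "long_description": "Looks great with crop tops. Made in America from 100% cotton."},
    {"name": "Ribbed Crop Top", "category": "Tops",
     "long_description": "A fitted cropped top."},
]

print([p["name"] for p in conditional_search(SHOP, "tops")])
print([p["name"] for p in conditional_search(SHOP, "100% cotton")])
```

A query for “tops” is satisfied by name and category alone, so the pants never enter the result set. A query for “100% cotton” finds nothing in those fields and only then falls back to the long description.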

Don’t Underestimate the Power of Product Descriptions

Keep in mind that the long description is there for a good reason. It helps the customer understand how to use the product, what the product is really like, how the product was fabricated, and so forth. Consider a query such as “Made in America” or “100% cotton.” These queries are likely to succeed only if we search the romance copy.

This realization, that the long description contains some information that is useful for search and other information that is not, has led some companies to adopt disciplined content management practices. I have long admired The GAP / Banana Republic for how they break up their long description into multiple fields and, in general, they do an outstanding job with precision. But even they run into issues.

Consider a “Slub Cotton-Modal V-Neck T-Shirt” available at Banana Republic at the time this article was written. This product contains the following details:

Slub Cotton-Modal V-Neck T-Shirt
  • SUSTAINABLE: Made with a silky soft and slightly textured blend of cotton and Lenzing® modal, sourced from European beechwood trees, harvested from sustainably managed forests.
  • V-neck.
  • Center back seam shapes the waist (without flaring out at the sides).
  • Taped seams for an ultra-smooth, ultra-comfy fit.
  • Straight hem.
  • Produced in a facility that runs P.A.C.E. – Gap Inc.’s program to educate and empower women in the communities where our products are made. Learn more HERE
  • #551579.

For the most part, this detail content is outstanding. Few words here will cause this product to be recalled incorrectly. A person looking for sustainable shirts, soft shirts, Lenzing shirts, taped-seam shirts, and so forth will find this shirt. But consider someone who wants a peplum-style or fit-and-flare shirt and searches for “flare tshirt.” If the search engine treats “flaring” as a form of “flare,” this person is going to find the Slub Cotton-Modal V-Neck T-Shirt and be disappointed, because this product does not flare at the sides. Yet someone else considering this purchase will decide to buy it precisely because of the detail that the shirt will not flare at the sides.

We’re back to the devil’s bargain that I mentioned in part one of the series. Should we index this content or shouldn’t we? If we want to bring back products when someone searches for 551579, which is to say if we want to support our in-store associates who likely use the website all day, then we absolutely want to index this content.
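When the description is already broken into separate fields, one way to soften this bargain is to decide field by field how each one takes part in search. The Python sketch below is only an illustration: the field names, the three policy modes, and the choice to make the fit notes display-only are assumptions for the example, not a description of how Banana Republic or any search platform handles this content.

```python
import re

# Hypothetical split of one product's details into separate fields.
PRODUCT = {
    "name": "Slub Cotton-Modal V-Neck T-Shirt",
    "fabric": "Silky soft and slightly textured blend of cotton and Lenzing modal",
    "fit_notes": "Center back seam shapes the waist (without flaring out at the sides)",
    "style_number": "551579",
}

# Assumed policy per field:
#   "text"    = keyword match on the field's words
#   "exact"   = the whole query must equal the field value
#   "display" = shown on the product page but never searched
FIELD_POLICY = {
    "name": "text",
    "fabric": "text",
    "fit_notes": "display",
    "style_number": "exact",
}

def words(text):
    """Lowercase and split into runs of letters and digits."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def is_match(product, query, policy=FIELD_POLICY):
    """Apply each field's policy; display-only fields can never produce a match."""
    normalized = query.strip().lower()
    query_words = words(normalized)
    for field, mode in policy.items():
        value = product.get(field, "").lower()
        if mode == "exact" and normalized == value:
            return True
        if mode == "text" and query_words and query_words <= words(value):
            return True
    return False

for q in ("551579", "lenzing", "v-neck", "flaring"):
    print(q, is_match(PRODUCT, q))
```

Under this policy the style number still works for in-store associates and the fabric terms stay searchable, while the “without flaring” note remains visible to shoppers without pulling the shirt into results for flared styles. The cost is that nothing in the fit notes can be found by search, which is the same trade-off in a narrower form.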

Build a Better Digital Experience

If you are looking to set up a standard for product naming and description, we recommend that you start with the Amazon standard, which you can read here. Third-party sellers who want their products to sell on Amazon follow these rules religiously.

This is all the evidence I need: well-organized content is essential to search quality. The principle of garbage in / garbage out applies even to Amazon, a company with millions to spend on its machine learning-driven search platforms. Nobody selling thousands of products in a single category (much less multiple categories!) is going to do this perfectly. This is why at Lucidworks we have built a platform that allows you to thread the needle and instrument the search to work with your content, no matter how disciplined or diffuse.

Interested in learning more about how Lucidworks solves these types of challenges for retailers like REI, goop, and more? Drop us a line

About Peter Curran

Meet Peter Curran, General Manager of Digital Commerce
