When shopping online, today’s customers do not expect to put much effort into finding what they need. Effective search is the difference between being a successful online retailer and ending up as a parked domain for sale. Search on an ecommerce site today goes well beyond the little box in the upper left-hand corner. It weaves through every part of the site, from the first time a customer visits to their next purchase. Today’s customers expect a Google-like or Amazon-like experience with their search results, and effective online retailers anticipate their needs before they search.

Key Concerns for Online Retail Sites

Giving a customer or potential customer an experience where “it’s easy to find things” means bringing back relevant search results from the very first query. This includes results that are personalized based on who the customer is and what they’ve ordered or browsed in the past. A smart retail site doesn’t wait for the next query but targets the customer with recommended products whenever they come to the site. Further, relevant results are timely: don’t show winter coats on sale when a customer is shopping for shorts in summer. To provide this type of highly detailed relevancy, the retailer needs relevant information about:

  • The customer, including any demographic information, regional data, and past purchase and shopping behavior
  • The backend inventory, organized and indexed so the search app is always selling items that are in stock and boosting ones that are trending
  • Product information containing the various keywords and text a customer might enter into a search box
  • Data from auxiliary systems like credit processing and loyalty

Achieving effective search, targeting and personalization often means overcoming obstacles such as connecting feeds to/from legacy systems and keeping your data as fresh as possible.

Obstacles Retailers Face

Many online retailers didn’t start out online and often have legacy systems ill-suited for the realities of today’s ecommerce. Many of them have a brick-and-mortar presence and systems that serve both the online and offline lines of business. Even retailers that were born online generally didn’t debut yesterday, and not all of their systems are as easily accessible as modern-day ecommerce platforms (we’re looking at you, Blue Martini). The data in these systems may be difficult to integrate with other systems and, in some cases, may not even be “clean” enough to develop an effective schema out of the box.

Meanwhile, to personalize and target customers effectively, a retailer needs to know something about those customers. That means developing an effective customer profile which stores characteristics of the customer. This data helps produce relevant search results that align with the customer’s preferences at query time. Sometimes this data is in a custom database, other times it is in a CRM system, or stored in the ecommerce platform itself.

The schema that represents the customer must be flexible and adapt to the ever-changing needs of the business and of the customer. This means being able to add new types of data without a major system change. For example, just because people use Skype today for customer support doesn’t mean they won’t use an entirely different means of communicating or paying tomorrow. Storing the customer’s shoe size is one thing, but what kind of smartwatch they own may be something a retailer needs to know in the future, as it might be useful both for payment and for suggesting peripherals. A fixed-schema approach, such as that inherent to an RDBMS, may not be an effective and efficient way to capture and use this information.
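
As a minimal sketch of that flexibility, assume the customer profile is held as a schemaless document, here just a Python dictionary. The field names and the `add_attribute` helper are hypothetical and only illustrate the idea; they are not any particular product's API.

```python
from typing import Any


def add_attribute(profile: dict[str, Any], name: str, value: Any) -> dict[str, Any]:
    """Return a copy of the profile with one more attribute attached."""
    updated = dict(profile)
    updated[name] = value
    return updated


def supports_watch_payment(profile: dict[str, Any]) -> bool:
    """Read a field that may not exist yet; a missing field is not an error."""
    return bool(profile.get("smartwatch", {}).get("supports_payment", False))


# Hypothetical profile as it might have looked when the store launched.
profile = {"customer_id": "c-1001", "shoe_size": 10.5}

# Later the business decides it cares about wearables: no migration needed.
profile = add_attribute(
    profile, "smartwatch", {"brand": "ExampleBrand", "supports_payment": True}
)
```

Older profiles that never gained a `smartwatch` field keep working, because readers such as `supports_watch_payment` treat the field as optional.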

Finally, Google Analytics is fine for tracking how your whole site is used, but effective relevance tuning and search require more personalized and comprehensive clickstream tracking. That means capturing what a customer is looking at and storing it with the customer profile. Many online stores do this effectively; others do not. Capturing the signals associated with a customer’s search is a non-trivial task. To capture them you need hooks at both the site and search level, and integrating signals at the search level takes a fair amount of code.
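
To make the idea of signals concrete, here is a small sketch that records what a customer did with each search result and rolls it up per customer. The `Signal` fields and the action weights are assumptions chosen for illustration, not a description of how any specific platform stores signals.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    customer_id: str
    query: str
    product_id: str
    action: str  # e.g. "click", "add_to_cart", "purchase"


# Hypothetical weights: a purchase says more about intent than a click.
ACTION_WEIGHTS = {"click": 1, "add_to_cart": 3, "purchase": 5}


def aggregate_signals(signals):
    """Sum weighted signals per customer and product; unknown actions count as 0."""
    totals = defaultdict(Counter)
    for s in signals:
        totals[s.customer_id][s.product_id] += ACTION_WEIGHTS.get(s.action, 0)
    return totals


signals = [
    Signal("c-1001", "hdmi cable", "p-42", "click"),
    Signal("c-1001", "hdmi cable", "p-42", "purchase"),
    Signal("c-1001", "tv stand", "p-77", "click"),
]
totals = aggregate_signals(signals)
```

The aggregated totals are the kind of data that can be stored alongside the customer profile and fed back into ranking.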

Getting Your Data Feeds

Getting data can be hard. In an ideal world, you buy a connector rather than write and maintain one. Even if the source has a database connector or a REST or web services API, how do you normalize the data? Plus, data frequently needs to be massaged and enriched with other data. To do that you need to organize the data through pipelines, both to manage change and to provide an operational way to configure how the data is manipulated.
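
One way to picture such a pipeline is as an ordered list of small stages, each of which takes a record and returns a cleaned or enriched copy. The sketch below assumes records arrive as dictionaries; the stage names, the field names and the stock lookup are all hypothetical.

```python
from functools import reduce


def normalize_keys(record):
    """Lower-case field names and trim whitespace so different feeds line up."""
    return {key.strip().lower(): value for key, value in record.items()}


def parse_price(record):
    """Turn a price string such as '$1,299.00' into a float."""
    out = dict(record)
    if isinstance(out.get("price"), str):
        out["price"] = float(out["price"].replace("$", "").replace(",", ""))
    return out


def make_enricher(stock_levels):
    """Build a stage that adds stock data from a second (hypothetical) source."""
    def enrich(record):
        out = dict(record)
        out["in_stock"] = stock_levels.get(out.get("sku"), 0) > 0
        return out
    return enrich


def run_pipeline(record, stages):
    """Apply each stage in order to the record."""
    return reduce(lambda rec, stage: stage(rec), stages, record)


stages = [normalize_keys, parse_price, make_enricher({"A-1": 12})]
doc = run_pipeline({" SKU ": "A-1", "Price": "$1,299.00"}, stages)
```

Because the stages are just a list, adding, removing or reordering a step is a configuration change rather than a rewrite.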

Data should be indexed into a flexible schema and combined with other sources of product data. The correct characteristics for one type of product (color, shape, standard, plug type) might not be the same for another (color, wrist size, analog or digital). Customers need to be able to facet their search based on effective categorization (electronics, televisions and screen size versus clothing and dress size, for example). Moreover, these characteristics need to match your customer profiling effort for effective targeting. Oftentimes this overlaps with search; sometimes it is more specific (bargain shopper, season, closeout, priced to go, premium item).
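
A rough sketch of faceting over products that do not all share the same fields follows. A search engine would normally compute these counts for you; the helper and the sample products here are hypothetical and only show the shape of the result.

```python
from collections import Counter, defaultdict


def facet_counts(products, fields):
    """Count values per facet field, skipping products that lack the field."""
    counts = defaultdict(Counter)
    for product in products:
        for field in fields:
            if field in product:
                counts[field][product[field]] += 1
    return counts


# Different product types carry different characteristics.
products = [
    {"category": "electronics", "color": "black", "plug_type": "US"},
    {"category": "electronics", "color": "silver", "plug_type": "EU"},
    {"category": "watches", "color": "black", "wrist_size": "M", "display": "analog"},
]
facets = facet_counts(products, ["category", "color", "wrist_size"])
```

Note that `wrist_size` only produces counts for the products that have it, which is the behavior a flexible schema needs.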

Getting the data into the search index in a timely manner without affecting system operations can be a major effort unto itself. You need a system that is flexible, operationally efficient and manageable, not to mention scalable.


Tuning relevance is both an art and a science. The techniques and technologies for both are forever evolving. At one time simple keyword search was enough. Tuning relevant results today takes into account context and history. Relevance means tuning the search to both product and customer characteristics and plugging in effective algorithms. Oftentimes this means correcting terrible spelling and determining intent from fractional information with a lot of noise. Increasingly, this even involves voice search, so users can speak their queries directly into a device.
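
As one small illustration of correcting spelling, the sketch below maps each query term to its closest match in a known product vocabulary using `difflib.get_close_matches` from Python's standard library. The vocabulary and the cutoff value are assumptions; real systems typically use index-backed suggesters and signal data rather than a fixed word list.

```python
import difflib

# Hypothetical vocabulary, e.g. terms drawn from product titles.
VOCABULARY = ["television", "headphones", "sneakers", "jacket", "hdmi", "cable"]


def correct_query(query, vocabulary=VOCABULARY, cutoff=0.75):
    """Replace each term with its closest vocabulary match, if one is close enough."""
    corrected = []
    for term in query.lower().split():
        matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else term)
    return " ".join(corrected)
```

Terms with no sufficiently close match are passed through unchanged, so an unfamiliar brand name is not silently rewritten into something else.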

Targeting and Personalizing for Online Retail Customers

Search and targeting are now personalized. A customer whose favorite color is blue might see blue shirts boosted over red ones. A customer who bought a TV might see HDMI cables advertised. For goodness sake, don’t show a customer who just bought men’s dress shoes and a jacket for the first time more men’s dress shoes. Increasingly, companies are using predictive models, ranging from neural networks to simple clustering, to extract and apply these recommendations. It is said there is no free lunch in picking these systems: no single model is best for every problem, and the best retailers spend a lot of time tuning them. A system needs to anticipate change and be flexible enough to plug in new algorithms over time.
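
The blue-shirt and dress-shoe examples can be sketched as a simple re-ranking step applied to results the search engine has already scored. The profile fields, the boost factor and the choice to drop recently purchased categories entirely (rather than merely demote them) are all assumptions made for illustration.

```python
def personalized_rank(results, profile, color_boost=1.5):
    """Re-rank results: boost the preferred color, skip categories just bought."""
    preferred = profile.get("favorite_color")
    recent = set(profile.get("recently_purchased_categories", []))
    ranked = []
    for item in results:
        if item["category"] in recent:
            continue  # assumption: suppress rather than demote
        factor = color_boost if item.get("color") == preferred else 1.0
        ranked.append({**item, "score": item["score"] * factor})
    return sorted(ranked, key=lambda item: item["score"], reverse=True)


results = [
    {"id": "red-shirt", "category": "shirts", "color": "red", "score": 1.0},
    {"id": "blue-shirt", "category": "shirts", "color": "blue", "score": 0.8},
    {"id": "oxfords", "category": "dress shoes", "color": "black", "score": 2.0},
]
profile = {
    "favorite_color": "blue",
    "recently_purchased_categories": ["dress shoes"],
}
ranked = personalized_rank(results, profile)
```

With this hypothetical profile the blue shirt moves ahead of the red one and the dress shoes disappear from the list, even though they had the highest base score.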


Timeliness matters, and not just in making sure a customer sees winter clothes from mid-fall to mid-winter. Timeliness means that the data is fresh: boost the things we have readily in stock first, and keep that stock information up to date. The index needs to be updated in real time, or close enough for our business cycle. Our search results need to return instantly to meet today’s expectations. Time is money and every nickel or millisecond matters.
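
A small sketch of how season and stock might feed into a score is shown below. The season window for winter coats, the boost factor and the decision to score out-of-stock items at zero are assumptions for illustration only; the right values depend on the business.

```python
from datetime import date


def in_season(today, start, end):
    """True if today's (month, day) falls in the window, which may wrap the new year."""
    current = (today.month, today.day)
    if start <= end:
        return start <= current <= end
    return current >= start or current <= end


def timely_score(product, today, season_boost=1.5):
    """Zero out items with no stock and boost items that are in season."""
    if product["stock"] <= 0:
        return 0.0  # assumption: never promote what we cannot sell
    score = product["score"]
    season = product.get("season")  # ((start_month, start_day), (end_month, end_day))
    if season and in_season(today, *season):
        score *= season_boost
    return score


# Hypothetical product with a mid-fall to mid-winter season window.
coat = {"id": "winter-coat", "score": 1.0, "stock": 40, "season": ((10, 15), (1, 31))}
```

This only helps if the stock figure itself is fresh, which is why the index needs to track inventory closely.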

The best retail sites anticipate customer needs, target customers with relevant content before they even ask, personalize and contextualize search results, and pull together timely information. Doing this requires pulling in data from multiple sources, developing a smart customer profile, and identifying the right characteristics of products, as well as tracking signals and other customer usage information. Keeping this up and running means doing it in an operationally intelligent and change-tolerant way. Doing this is a lot of work. We’ve taken the bulk of it, done it for you, and called it Lucidworks Fusion.


At Lucidworks, we’ve spent a lot of time helping our customers tune their search, targeting and personalization. This includes some of the largest online retailers like Bluestem, Staples, and B&H Photo. It also includes brands that sell through offline channels but maintain an online presence. We’ve developed a solution called Lucidworks Fusion that is designed to capture customer signals, integrate pipelines, and connect data from various sources and make it easy to tune and manage your search and indexing solution in an operationally efficient way. Learn more about our solution here.


About Andrew C. Oliver
