Product recommendations are a crucial part of online shopping. They help customers find items they will like and help businesses increase sales. But how do these product recommendation systems work?
By
Yichen Jin
Co-Founder & CEO, Fleak
At a high level, these engines sift through shopper data to understand each shopper's preferences and interests, going beyond what the search queries alone reveal. By analyzing search behavior and product choices, they deliver personalized suggestions and offers tailored to each individual, boosting engagement and driving sales. In this article, we'll explore a method called "two-stage retrieval and reranking" that improves product recommendations with just a small tweak.
How Two-Stage Retrieval Improves Product Recommendations
Imagine you're in a huge library, searching for a novel about dinosaurs. Two-stage retrieval and reranking is like having a super smart librarian who knows exactly what you like. Here’s how it works:
Step 1: Retrieval
First, the librarian quickly grabs a bunch of books from the shelves that mention dinosaurs. These are just guesses based on what you asked for: dinosaur history, dinosaur encyclopedia, or even dinosaur coloring books.
Step 2: Reranking
Next, the librarian carefully looks at each book and decides which ones best match your taste. Since you asked for “a novel about dinosaurs”, Jurassic Park and The Lost World by Michael Crichton go to the top of the pile even though their titles do not include the word “dinosaurs”. The encyclopedia and coloring books get placed further down in the pile.
This process helps make sure you get the most relevant products recommended to you, just like finding a dinosaur novel in a big library.
Implementing a Product Recommendation System Using Two-Stage Retrieval with Reranking
The library example above explained the concept of two-stage retrieval and reranking. Now let’s dive into a step-by-step guide on how to build a product recommendation system using this technique.
Data Preparation:
First, to create a product recommendation system, we need a product catalog that we can search. An effective way to build one is to convert product details into vectors and store them in a vector database such as Pinecone. Vector databases store and retrieve vectors as high-dimensional points, and they support fast, efficient lookup of nearest neighbors in the N-dimensional space.
The sample dataset we use in this guide is Costco’s electronics catalog. You can either use this notebook code sample to save embeddings into Pinecone or refer to the Fleak embedding API blog post to create an API endpoint that handles embedding and ingests the data into the Pinecone vector database for you.
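To make the nearest-neighbor idea concrete, here is a minimal sketch of what a vector lookup does. It is not Pinecone's API: `TinyVectorIndex` is a hypothetical in-memory stand-in that scans every item by brute force, and the three-dimensional vectors are made-up values chosen only for illustration. A real vector database uses much higher-dimensional embeddings and an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical in-memory stand-in for a vector database (brute force)."""

    def __init__(self):
        self._items = {}  # item id -> (vector, metadata)

    def upsert(self, item_id, vector, metadata=None):
        """Insert or overwrite one item."""
        self._items[item_id] = (list(vector), metadata or {})

    def query(self, vector, top_k=3):
        """Return the top_k (score, id, metadata) tuples, most similar first."""
        scored = [
            (cosine_similarity(vector, stored), item_id, metadata)
            for item_id, (stored, metadata) in self._items.items()
        ]
        scored.sort(key=lambda entry: entry[0], reverse=True)
        return scored[:top_k]


# Made-up toy vectors; real embeddings come from an embedding model.
index = TinyVectorIndex()
index.upsert("action-cam", [0.9, 0.1, 0.0], {"title": "Action camera"})
index.upsert("security-cam", [0.6, 0.6, 0.1], {"title": "Outdoor security camera"})
index.upsert("laptop", [0.0, 0.1, 0.9], {"title": "Laptop"})

nearest = index.query([1.0, 0.0, 0.0], top_k=2)
for score, item_id, metadata in nearest:
    print(f"{item_id}: {score:.3f} ({metadata['title']})")
```

The query vector points almost exactly in the direction of the action camera, so that item ranks first and the laptop, which points elsewhere, is left out of the top two.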
Once the product data is stored in the vector database, we can implement a two-stage retrieval process to enhance the accuracy of the recommendations.
Two-Stage Retrieval:
Let's imagine a user is planning a ski trip and enters the query, "What camera to bring for my ski trip?" into the recommendation system.
Step 1: Initial Retrieval
First, the system takes the user’s query and converts it into a vector using an embedding model. This vector captures the essence of the query: the need for a camera that is durable, portable, and suitable for cold, snowy conditions.
Once the query is embedded as a vector, the system then searches the vector database for camera products with similar vector representations. This step retrieves a broad set of cameras that align closely with the concept of a ski trip, based on the proximity of their vectors to the query vector.
As the results show, the initial retrieval pulls up a broad set of products, including cameras that might not be perfect for a ski trip but are still somewhat related. Lorex and Wyze, for example, are both outdoor home security cameras that share the durability and portability of an ideal ski trip camera.
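The initial retrieval step can be sketched in a few lines. Everything here is an illustrative assumption: `embed` is a toy word-count embedding standing in for a real embedding model, and the three catalog descriptions are invented, not taken from the Costco dataset. A word-count vector cannot capture meaning the way a learned embedding does, but the flow is the same: embed the query, score every product by similarity, and keep the closest candidates.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Toy embedding: one word count per vocabulary entry (stand-in for a real model)."""
    cleaned = text.lower().replace("?", " ").replace(",", " ")
    counts = Counter(cleaned.split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented product descriptions, for illustration only.
catalog = {
    "action-cam": "waterproof action camera for outdoor sport and ski",
    "security-cam": "outdoor security camera weatherproof for home",
    "tv": "65 inch smart tv for home",
}
vocabulary = sorted({word for text in catalog.values() for word in text.split()})


def retrieve(query, top_k):
    """Stage 1: return the ids of the top_k products closest to the query."""
    query_vector = embed(query, vocabulary)
    ranked = sorted(
        catalog,
        key=lambda pid: cosine(query_vector, embed(catalog[pid], vocabulary)),
        reverse=True,
    )
    return ranked[:top_k]


candidates = retrieve("What camera to bring for my ski trip?", top_k=2)
print(candidates)
```

Note that the security camera still makes the candidate list because it shares vocabulary with the query. That is the broad, slightly noisy behavior the reranking stage is meant to clean up.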
Step 2: Reranking
To refine the order of search results after the initial retrieval, a reranker reorders the results based on how closely they match the user’s query and makes sure that the most relevant items appear at the top of the list.
There are several popular reranking models that are widely used in information retrieval systems:
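BM25: A probabilistic ranking function that scores documents based on how often the query terms appear in each document, weighting rarer terms more heavily (inverse document frequency) and normalizing for document length. It relies on keyword matching rather than semantic understanding.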
BERT-based Rerankers: These models leverage the BERT (Bidirectional Encoder Representations from Transformers) architecture to understand the context of the query and documents, providing a deeper level of relevance scoring.
T5 (Text-To-Text Transfer Transformer): A versatile model that can be adapted for reranking tasks by rephrasing the reranking task as a text generation problem.
Cross-encoder Rerankers: These models take the query and a document together as a single input and output a relevance score for that pair, allowing for fine-grained reranking based on semantic understanding.
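Of the options above, BM25 is simple enough to sketch in full. The version below is a minimal illustration, assuming whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant `log((N - n + 0.5) / (n + 0.5) + 1)`. The sample documents are invented. It is not the reranker used in this guide.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rarer terms get a larger inverse document frequency weight.
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term frequency saturates via k1 and is normalized by length via b.
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Invented documents, for illustration only.
docs = [
    "rugged waterproof action camera",
    "outdoor home security camera",
    "wireless noise cancelling headphones",
]
scores = bm25_scores("waterproof camera", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The first document matches both query terms, and "waterproof" is rarer than "camera", so it scores highest. The headphones share no terms with the query and score zero, which shows why purely lexical scoring misses products that are relevant but worded differently.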
Today, we are using the “bge-reranker-v2-m3” model, provided through the Pinecone Rerank API. This is a lightweight multilingual rerank model based on BGE-M3. It focuses on understanding the context of your needs rather than just matching keywords.
By checking the “Rerank Results” box, we turned on the rerank stage. Here the “Query String Path” refers to the user’s initial query, "what camera should I bring for my ski trip?". The rerank model then compares the user query with the “description” field of the 50 retrieved vector results. The initial set of results is reordered and now reflects much more accurate matches. Specifically, outdoor sports cameras offering durability, weather resistance, and portability now appear within the top 3 results, precisely the type of camera that is ideal for a ski trip.
With better product catalog data and richer user profile data, the rerank step would work even better at pushing less relevant options further down and improving top-N quality.
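The rerank stage described above can be sketched as a function that rescores each retrieved item against the query and keeps the best few. This is an assumption-laden illustration, not the Pinecone Rerank API: `rerank` and `keyword_overlap` are hypothetical names, the retrieved items are invented, and `keyword_overlap` is a deliberately crude stand-in for a real model such as bge-reranker-v2-m3, which scores meaning rather than shared words.

```python
def rerank(query, candidates, score_pair, text_field="description", top_n=3):
    """Stage 2: rescore each candidate against the query and keep the top_n.

    score_pair(query, text) is any callable returning a relevance score;
    in a real system this would call a reranking model.
    """
    scored = [
        {**item, "rerank_score": score_pair(query, item[text_field])}
        for item in candidates
    ]
    scored.sort(key=lambda item: item["rerank_score"], reverse=True)
    return scored[:top_n]


def keyword_overlap(query, text):
    """Hypothetical stand-in scorer: fraction of query words found in the text."""
    query_words = set(query.lower().replace("?", "").split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


# Invented stage-1 output, in the order the vector search returned it.
retrieved = [
    {"id": "security-cam", "description": "outdoor home security camera"},
    {"id": "action-cam", "description": "waterproof action camera for ski and snow sport"},
    {"id": "webcam", "description": "usb webcam for video calls"},
]

top = rerank(
    "what camera should I bring for my ski trip?",
    retrieved,
    keyword_overlap,
    top_n=2,
)
for item in top:
    print(item["id"], round(item["rerank_score"], 3))
```

Passing the scorer in as a parameter keeps the pipeline independent of any one model, so the stand-in can be swapped for a hosted reranker without touching the surrounding code.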
The same two-stage retrieval and reranking technique can be applied to any product recommendation system, regardless of industry or product type. If your system needs to consider other signals such as pricing or release dates, you can store these fields as metadata in the vector database. Once the two-stage retrieval is done, you can apply additional sorting logic on top of the reranked list. With Fleak, you can use native SQL to customize these sorts so that your recommendations align with your business goals.
You can adapt this method to your own projects by modifying the Python script provided here. Alternatively, you can import a pre-built template into your Fleak account to learn how to ingest vectors into Pinecone or get customized product recommendations, and start converting more sales using the production-ready API today.
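As a rough sketch of that last step, the snippet below applies business rules to an already reranked list. It is written in plain Python rather than Fleak's SQL, and every detail is an assumption made for illustration: the field names `price` and `released`, the sample items and scores, the relevance cutoff, and the "newest first, then cheapest" ordering.

```python
from datetime import date

# Invented reranked results with metadata fields stored alongside each vector.
reranked = [
    {"id": "action-cam-pro", "rerank_score": 0.92, "price": 399.99, "released": date(2024, 9, 1)},
    {"id": "action-cam-lite", "rerank_score": 0.90, "price": 199.99, "released": date(2023, 5, 15)},
    {"id": "security-cam", "rerank_score": 0.41, "price": 89.99, "released": date(2024, 1, 10)},
]


def apply_business_sort(items, min_score=0.5, max_price=None):
    """Drop weak or over-budget matches, then sort newest first and cheapest second."""
    kept = [
        item
        for item in items
        if item["rerank_score"] >= min_score
        and (max_price is None or item["price"] <= max_price)
    ]
    return sorted(
        kept,
        key=lambda item: (-item["released"].toordinal(), item["price"]),
    )


final = apply_business_sort(reranked)
print([item["id"] for item in final])

budget = apply_business_sort(reranked, max_price=250)
print([item["id"] for item in budget])
```

Filtering on the rerank score before sorting keeps a cheap but irrelevant product from floating to the top just because it wins on price or recency.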