
How We Built a Scalable Architecture for Real-Time Recommendations

Shivkumar M has over 20 years of experience shaping technology product and GTM strategy. With B2B SaaS expertise across industries, he leads product launches, adoption, and GTM as Director of Product Marketing.

The key to building flawless user experiences is timely, personalized messaging. That’s why our platform unifies customer data from multiple sources: phones, tablets, laptops, POS systems, APIs, etc., allowing marketers to better understand their users and segment them for targeted omnichannel campaigns.
But what happens when your app has 10 million users and you want each user to feel like the experience is customized to their specific needs?
That’s where our latest feature, Product Recommendations, comes in. It uses machine learning to send relevant suggestions to users across engagement channels so that every user experience is personalized and relevant.

Why Real-Time Recommendations are Necessary

Weekend recommendation emails are outdated and ineffective. Marketers now need to use more advanced engagement strategies, such as in-app notifications with videos or carousel push notifications. The ability to send personalized recommendations to users at the right moment on the right channel results in a superior user experience.
But a larger problem looms: how do you scale that kind of personalized recommendation to millions of users?

Our Solution: Building a Scalable Architecture

Here’s how we solved it: we paired the infrastructure at CleverTap with AWS.
The result delivers real-time recommendations at a scale that enterprise businesses can actually use for their millions of users.
[Figure: CleverTap Architecture]

A Scalable Architecture for Real-Time Data Analysis and User Engagement
  1. Devices: These are the end-users’ devices (phones, tablets, laptops, etc.) which send logs to Data-Collectors. The data can also come through other POS systems or a CRM database.
  2. Data-Collectors: This component receives the incoming data and serves campaigns such as in-app for mobile devices/tablets and web-popup for laptops/desktops. These are designed for very high throughput. Data-Collectors finally put the data in a queue.
  3. Queue: This buffers the high-throughput Data-Collectors from the Data-store, so bursts of incoming traffic do not slow down collection.
  4. Data-store: This de-queues incoming actions, stores them, and maintains each user’s data. It also aggregates and maintains user-level and device-level details, unifying all customer data in one place.
  5. Message Delivery Service: This is agnostic of the campaigns and journeys that are running or scheduled. A typical campaign involves a 2-step process:
    • A. Querying the data-store to collect users that satisfy the segmentation criteria, and
    • B. Serving campaigns to these users in the form of push, email, SMS, or webhooks.
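
The flow above can be sketched as a tiny in-process pipeline. This is an illustration only, not CleverTap's implementation: the names `collect`, `drain_into_store`, and `DataStore`, and the event fields, are hypothetical, and the standard-library `queue.Queue` stands in for the production queue, which this post does not name.

```python
import queue
from collections import defaultdict

# Hypothetical stand-in for the queue between collectors and the store.
event_queue = queue.Queue()


class DataStore:
    """Toy version of the Data-store: keeps every action per user."""

    def __init__(self):
        self.actions_by_user = defaultdict(list)

    def store(self, event):
        self.actions_by_user[event["user_id"]].append(event)

    def segment(self, predicate):
        # Step A of a campaign: users whose history satisfies the criteria.
        return [u for u, acts in self.actions_by_user.items() if predicate(acts)]


def collect(event):
    """Data-Collector: accept the event and enqueue it without waiting on storage."""
    event_queue.put(event)


def drain_into_store(store):
    """Data-store side: de-queue everything currently buffered."""
    drained = 0
    while True:
        try:
            event = event_queue.get_nowait()
        except queue.Empty:
            return drained
        store.store(event)
        drained += 1


store = DataStore()
collect({"user_id": "u1", "name": "Product Viewed", "product": "p9"})
collect({"user_id": "u2", "name": "App Launched"})
collect({"user_id": "u1", "name": "Purchased", "product": "p9"})
processed = drain_into_store(store)

# Step A: pick users who have at least one "Purchased" action.
buyers = store.segment(lambda acts: any(a["name"] == "Purchased" for a in acts))
```

Because `collect` only enqueues, a slow `store` call never blocks the collector; Step B (serving the campaign to `buyers`) would follow the segmentation query.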

Recommendation Pipeline

Setting up recommendations is a two-step process. Here’s how we complete all the related tasks at scale and in real time:
1. Build a product correlation matrix
For every product or piece of content, we build a list of recommendations. Depending on the business, we can refresh this model every week or every day. Some use cases might require updating these recommendations continuously in an “online” mode.

  • Collect training data
  • Deploy a machine learning algorithm to create a correlation matrix
Task                              | Components involved | Real-time
1. Collecting training data       | Data-store          | ✅*
2. Building product correlations  | AWS Batch + S3      | ✅*

* Potentially, this can be built in real time. For now, it happens in near real time.
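
To make the shape of a product correlation matrix concrete, here is a minimal sketch. It deliberately swaps in plain co-occurrence counting for the machine learning model described in this post, so it illustrates only the output (each product mapped to a ranked list of related products), not the actual algorithm. The function name, the sample sessions, and the `top_n` parameter are assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_correlations(sessions, top_n=20):
    """Map each product to its top_n most frequently co-viewed products.

    NOTE: co-occurrence counting is an illustrative substitute for the
    model described in the post; only the output shape is similar.
    """
    co_counts = defaultdict(Counter)
    for products in sessions:
        # Count each unordered pair once per session.
        for a, b in combinations(sorted(set(products)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return {
        product: [other for other, _ in counts.most_common(top_n)]
        for product, counts in co_counts.items()
    }


# Hypothetical training data: products viewed together by the same user.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "charger"],
]
correlations = build_correlations(sessions, top_n=2)
```

Here `correlations["phone"]` ranks `case` ahead of `charger`, because the phone and the case were viewed together in two sessions and the phone and the charger in only one.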
2. Serving these recommendations
Once the recommendations model is built, customers can start serving personalized recommendations based on the recent behavior of every user. This is done by:

  • Calculating recommended products based on a user’s most recent purchase or activity
  • Customizing the notification template with the recommended products or content
Task                                  | Components involved                       | Real-time
1. User information (last purchased)  | Data-store                                |
2. Serving recommended products       | Data-Collectors, Message-Delivery-Service |

Say you want to deliver recommendations through in-app notifications. As soon as a user launches the app from their device:

  1. An “App Launched” event is pushed to one of the Data-Collectors.
  2. This Data-Collector queries the Data-store to fetch recommendations based on the last content viewed or products purchased by this user.
  3. The Data-Collector then puts the recommendations in an in-app notification template and returns it as a response to the device.
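
The three in-app steps above can be sketched as a single request handler. Everything here is hypothetical: the two dictionaries stand in for Data-store lookups, and the JSON response shape, field names, and function names are assumptions made for illustration rather than CleverTap's actual payload format.

```python
import json

# Hypothetical stand-ins for Data-store contents.
LAST_VIEWED = {"u1": "phone"}
CORRELATIONS = {"phone": ["case", "charger", "screen guard"]}


def fetch_recommendations(user_id, limit=3):
    """Step 2: look up recommendations for the user's last viewed product."""
    last_item = LAST_VIEWED.get(user_id)
    return CORRELATIONS.get(last_item, [])[:limit]


def handle_event(event):
    """Steps 1-3: handle an incoming event and build the response body."""
    if event["name"] != "App Launched":
        return json.dumps({"in_app": None})
    products = fetch_recommendations(event["user_id"])
    if not products:
        # No history yet, so there is nothing personalized to show.
        return json.dumps({"in_app": None})
    notification = {
        "title": "Recommended for you",
        "products": products,
    }
    return json.dumps({"in_app": notification})


response = handle_event({"user_id": "u1", "name": "App Launched"})
```

The notification travels back in the response to the same request that reported the launch, so the device needs no second round trip to receive it.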

Similarly, to send recommended products through an email:

  1. Message-Delivery-Service queries Data-store for each user’s personalized recommendations.
  2. Message-Delivery-Service then puts the recommendations in an email template and forwards it to the email provider.

Building the Product Correlation Matrix

This phase is where Deep Learning comes into play.
Deep Learning models require a significant amount of training data. The Data-store is optimized to iterate through all events very quickly. Furthermore, its distributed nature scales up the speed linearly with the number of machines.
Preparing training data has its own set of challenges:

  • Neural networks are data hungry.
    They need huge volumes of training data. We have set a minimum limit of 5 million records.
  • Training a neural net can take up to a couple of hours.
    This training process should be decoupled from Data-store and delegated to a separate entity to cause minimum interference with regular campaigns.
  • Enriched Profiles are necessary.
    Serving recommendations in real time requires storing the last few items viewed by every user.

We overcome the above challenges with our in-house Data-store.

  • Neural networks are data hungry.
    Because Data-store is distributed, the time taken to prepare training data falls linearly as machines are added. It also widens the date range as needed to collect at least 5 million data points. The result is a set of CSV files that we then ship to S3 for downstream use.
  • Training a neural net can take up to a couple of hours.
    To minimize interference with regular usage, the Data-store triggers this job at 3:00 am. When the training data is ready in S3, we submit a new AWS Batch Job. An AWS Batch Job internally spawns a new machine, trains the neural net and saves the results in a file on S3.
    Data-store fetches this output from S3 when the AWS Batch Job finishes. This output file contains the top 20 recommendations for every item a user has viewed.
  • Enriched Profiles are necessary.
    A process similar to (1) also saves the last few items viewed for every user. On a new “Product Viewed” event, we update this data in real time. This information can also be used to solve some other interesting use cases.
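
As a rough sketch of the enriched-profile idea, the snippet below keeps a bounded list of recently viewed products per user and updates it on every “Product Viewed” event. The value of `MAX_RECENT`, the event fields, and the in-memory dictionary are assumptions made for illustration; the post says only that “the last few items” are stored.

```python
from collections import defaultdict, deque

MAX_RECENT = 5  # assumed value; the post only says "the last few items"

# user_id -> most recent product ids, newest last
recent_views = defaultdict(lambda: deque(maxlen=MAX_RECENT))


def on_event(event):
    """Update the user's recent-items list on each "Product Viewed" event."""
    if event["name"] != "Product Viewed":
        return
    views = recent_views[event["user_id"]]
    product = event["product"]
    if product in views:
        views.remove(product)  # move a repeat view to the newest position
    views.append(product)  # deque drops the oldest entry once full


for product in ["p1", "p2", "p3", "p2", "p4", "p5", "p6"]:
    on_event({"user_id": "u1", "name": "Product Viewed", "product": product})
on_event({"user_id": "u1", "name": "App Launched"})  # ignored
```

A fixed-size `deque` keeps the update cost per event constant and bounds the storage per user, which matters when the same structure is maintained for millions of profiles.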

[Figure: Recommendations Data Flow]

Real-Time Recommendations at Scale

Contextual, real-time recommendations fetch far better responses. As soon as a user adds a product to their wish list, marketers will want to show them similar and/or complementary items.
For example, streaming apps can now feature movie trailers for the top 3 recommended movies as soon as a user visits their Favorites screen. To take this a step further, they can display the latest reviews, audience ratings, and movie synopses to prompt users to hit “Play” or “Add to watchlist.”
But serving up product recommendations is not without its share of challenges:

    1. High throughput: The larger the number of simultaneous users, the higher the chance that your system gets bogged down. Yet real-time campaigns shouldn’t lag when traffic increases. In fact, high traffic is an opportunity to grow business, and any lag is a loss of revenue.
    2. Enriched Product Information: Not all product information comes with the “Product Viewed” event.
      • Dynamic properties such as ratings, price, etc. should be recent
      • Large pieces of information such as product descriptions, product reviews, etc. should be available
    3. Filter Products: Users will want to organize the way they view your product recommendations. They will want to filter by category (e.g., Movies only vs. Movies and TV series) or by availability in your current stock (e.g., available in store vs. delivery).
This is where CleverTap’s architecture excels.
    1. High throughput
      Data-Collectors are involved in sending in-app notifications, web pop-ups, and app inbox notifications. AWS ELB helps to scale according to traffic. Read our blog on how we scaled our infrastructure to handle high traffic during the FIFA and IPL championships.
      Message-Delivery-Service handles all the other channels such as push, email, webhooks, SMS, etc. We can already send as many as 25 million push notifications per minute. These can be triggered or scheduled campaigns.
    2. Enriched Product Information
      CleverTap also allows uploading a product catalog where you can specify auxiliary information such as ratings, description, image URL, deeplink, and other such properties for each product. It also allows you to re-upload the Catalog to keep it updated.
    3. Filter Products
      CleverTap lets you filter recommended products based on properties in the Catalog. For example, catalogs can have columns such as:
      • Category: Movies, TV Series, Sports, etc.
      • Out of stock: this can be true or false
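
Catalog-based enrichment and filtering can be sketched as follows. The catalog rows, column names, and the `filter_recommendations` function are hypothetical; they mirror the example columns above but are not CleverTap's actual schema or API.

```python
# Hypothetical catalog rows keyed by product id; the column names mirror
# the examples above but are not CleverTap's actual schema.
CATALOG = {
    "m1": {"category": "Movies", "out_of_stock": False, "rating": 4.5},
    "m2": {"category": "Movies", "out_of_stock": True, "rating": 4.8},
    "t1": {"category": "TV Series", "out_of_stock": False, "rating": 4.1},
    "s1": {"category": "Sports", "out_of_stock": False, "rating": 3.9},
}


def filter_recommendations(product_ids, categories=None, in_stock_only=True):
    """Keep recommended products that match the catalog filters, in order."""
    kept = []
    for pid in product_ids:
        row = CATALOG.get(pid)
        if row is None:
            continue  # not in the catalog, so nothing to enrich or filter on
        if categories is not None and row["category"] not in categories:
            continue
        if in_stock_only and row["out_of_stock"]:
            continue
        # Attach the catalog properties so templates can show rating, etc.
        kept.append({"id": pid, **row})
    return kept


recommended = ["m2", "t1", "m1", "x9", "s1"]
movies_only = filter_recommendations(recommended, categories={"Movies"})
movies_and_tv = filter_recommendations(recommended, categories={"Movies", "TV Series"})
```

Filtering preserves the original ranking, so the best remaining recommendation still comes first. Joining against the catalog at serving time also picks up dynamic properties such as ratings from the most recent upload.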

Closing the Loop

Product Recommendations help maximize revenue by increasing upsell opportunities with personalized 1:1 recommendations.
They are available just like any other personalization variable in campaigns, and give you insight into campaign stats such as CTRs, conversions and more — allowing you to track your progress.
Journeys allows the orchestration of recommendation campaigns. A/B testing is also available to help the system decide on the best variant to offer to users. And then there are Real Impact and Control Groups to help measure the effectiveness of your recommendation campaigns.
As we continue to evolve this feature, we plan to focus on:

      1. Exclusions
        If we also aggregate the last few purchased products or items viewed for every user, we can exclude them from the campaign. This will guarantee that the user always gets to see new recommendations.
      2. Track per product
        CleverTap doesn’t track which suggestions were sent in the campaign and which were clicked. Tracking these fine details will help the marketer understand the diversity and coverage of products from these recommendation campaigns.
        a) Diversity: The number of new products that the Target Group ends up viewing compared to the Control Group. A good recommendation engine should expose users to rare gems.
        b) Coverage: Percentage of recommended products with respect to the catalog. A good recommender should cover the majority of the catalog.
      3. Cold start
        New products are not a part of training data. They have very few to no entries in the product correlation matrix. We can update the neural net with this new data. This will enable new products to be a part of the offered recommendations.
      4. Online training
        Product correlations and similarities don’t change frequently for many businesses. For others, such as song-streaming apps and social platforms, recommendations can become stale quickly. Online training is the right solution for such cases. For example, Data-store can update an “online” model every 15 minutes.
      5. Use demographics and other details
        Data-store has unified aggregated information for each user. We can also provide demographics, location, technographics, and other details to the recommender for better results.
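
The exclusion and tracking ideas in the list above can be expressed in a few lines. This is one possible reading of those definitions, not a description of how the feature will be built: all function names and sample data are hypothetical, and `diversity` in particular interprets “new products” as products viewed by the Target Group that the Control Group never viewed.

```python
def exclude_seen(recommended, seen):
    """Drop products the user has already viewed or purchased, keeping order."""
    seen = set(seen)
    return [p for p in recommended if p not in seen]


def coverage(recommended_per_user, catalog_ids):
    """Percentage of catalog products that appear in at least one recommendation."""
    catalog_ids = set(catalog_ids)
    if not catalog_ids:
        return 0.0
    recommended = set()
    for products in recommended_per_user.values():
        recommended.update(products)
    return 100.0 * len(recommended & catalog_ids) / len(catalog_ids)


def diversity(target_views, control_views):
    """Distinct products viewed by the Target Group but not by the Control Group."""
    return len(set(target_views) - set(control_views))


fresh = exclude_seen(["a", "b", "c", "d"], seen=["b", "d"])
pct = coverage({"u1": ["a", "b"], "u2": ["b", "c"]}, ["a", "b", "c", "d"])
new_products = diversity(["a", "b", "c"], ["a"])
```

With this sample data, `fresh` is `["a", "c"]`, `pct` is `75.0` (three of the four catalog products were recommended to someone), and `new_products` is `2`.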


Recommendations are a powerful way to engage customers, particularly as a means of delighting and retaining users over time. Serving trending and popular products makes sense initially when the user launches the app for the first time. However, even a couple of days of interaction builds a rich user history. Even users with a very small history can be targeted through recommendation campaigns, throughout the user’s lifecycle across multiple channels.
Champion users will look forward to the next recommendation campaign because the notification will arrive in time with the right context.

Further Reading

If you’re interested, check out our series of posts that explain the machine learning algorithm that powers our recommendation engine.
How are you using Product Recommendations? We would love to hear your thoughts in the comments below.

Last updated on March 29, 2024