# Development of an AI Recommendation System
A recommendation system is an ML model that predicts what a user wants to view, buy, or read next. The difference between a good and a poor implementation is tangible: a 15–20% revenue lift in the first case versus 2–3% in the second. Several architectural approaches exist; the right choice depends on data volume, the cold-start problem, and business context.
## Architecture Selection by Data Volume
| Transaction Volume | Recommended Approach | Recall@10 | Latency |
|---|---|---|---|
| < 10K | Content-based + rules | 15–25% | < 5ms |
| 10K – 500K | Matrix Factorization (ALS) | 25–40% | < 20ms |
| 500K – 5M | Two-tower neural + MF ensemble | 35–50% | < 50ms |
| > 5M | Two-tower + GNN + re-ranking | 45–65% | < 100ms |
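In the smallest regime (< 10K transactions), collaborative signals are too sparse to learn from, so a content-based approach over item feature vectors usually suffices. A minimal sketch using cosine similarity against an averaged "liked items" profile; the feature matrix shape and the seen-item rule here are illustrative assumptions, not prescribed by the table:

```python
import numpy as np

def recommend_content_based(item_features: np.ndarray,
                            liked_item_ids: list[int],
                            top_k: int = 10) -> list[int]:
    """Rank items by cosine similarity to the mean profile of liked items."""
    # L2-normalize item feature vectors
    norms = np.linalg.norm(item_features, axis=1, keepdims=True)
    normed = item_features / np.clip(norms, 1e-12, None)
    # User profile = average of the liked items' normalized vectors
    profile = normed[liked_item_ids].mean(axis=0)
    profile /= max(np.linalg.norm(profile), 1e-12)
    scores = normed @ profile
    scores[liked_item_ids] = -np.inf  # rule: never re-recommend liked items
    return np.argsort(-scores)[:top_k].tolist()
```

For "+ rules", hard business constraints (stock availability, region, age rating) are typically applied as filters before or after this scoring step.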
## Two-Tower Neural Architecture
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset


class UserTower(nn.Module):
    """User encoder: ID embedding + top-category embeddings + behavior features."""

    def __init__(self, n_users: int, n_categories: int,
                 embedding_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users + 1, embedding_dim, padding_idx=0)
        self.category_emb = nn.Embedding(n_categories + 1, 16, padding_idx=0)
        # Input: user embedding + 5 top-category embeddings (16 each) + 10 behavior features
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim + 16 * 5 + 10, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, 64),
            nn.LayerNorm(64),
        )

    def forward(self, user_id, top_categories, behavior_features):
        user_vec = self.user_emb(user_id)
        # Flatten the 5 category embeddings into a single vector per user
        cat_vecs = self.category_emb(top_categories).view(top_categories.shape[0], -1)
        combined = torch.cat([user_vec, cat_vecs, behavior_features], dim=1)
        return self.mlp(combined)


class ItemTower(nn.Module):
    """Item/product/content encoder."""

    def __init__(self, n_items: int, n_categories: int,
                 embedding_dim: int = 64, text_dim: int = 128):
        super().__init__()
        self.item_emb = nn.Embedding(n_items + 1, embedding_dim, padding_idx=0)
        self.category_emb = nn.Embedding(n_categories + 1, 16, padding_idx=0)
        # Input: item embedding + category embedding + text features + 5 numeric item features
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim + 16 + text_dim + 5, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.LayerNorm(64),
        )

    def forward(self, item_id, category_id, text_features, item_features):
        item_vec = self.item_emb(item_id)
        cat_vec = self.category_emb(category_id)
        combined = torch.cat([item_vec, cat_vec, text_features, item_features], dim=1)
        return self.mlp(combined)


class TwoTowerModel(nn.Module):
    def __init__(self, n_users, n_items, n_categories, text_dim=128):
        super().__init__()
        self.user_tower = UserTower(n_users, n_categories)
        self.item_tower = ItemTower(n_items, n_categories, text_dim=text_dim)
        # Learnable temperature for scaling cosine similarities
        self.temperature = nn.Parameter(torch.ones(1) * 0.05)

    def forward(self, user_inputs, item_inputs):
        user_emb = self.user_tower(**user_inputs)
        item_emb = self.item_tower(**item_inputs)
        # Temperature-scaled cosine similarity between paired user/item embeddings
        user_norm = nn.functional.normalize(user_emb, dim=1)
        item_norm = nn.functional.normalize(item_emb, dim=1)
        scores = torch.sum(user_norm * item_norm, dim=1) / self.temperature
        return scores

    def get_user_embedding(self, user_inputs) -> torch.Tensor:
        with torch.no_grad():
            return nn.functional.normalize(self.user_tower(**user_inputs), dim=1)

    def get_item_embedding(self, item_inputs) -> torch.Tensor:
        with torch.no_grad():
            return nn.functional.normalize(self.item_tower(**item_inputs), dim=1)
```
## Training with In-Batch Negative Sampling
Training objective: maximize the similarity of matching user–item pairs with a contrastive loss (InfoNCE), using the other items in the batch as negatives. Typical offline metrics: Recall@10, NDCG@5, Hit Rate. The two-tower architecture enables real-time serving via ANN indices (Milvus, Pinecone): item embeddings are precomputed and indexed, so only the user tower runs per request, and the item index can be refreshed on a schedule without daily retraining.
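The objective and the Recall@K metric above can be sketched directly. Assuming the towers have already produced a batch of embeddings, in-batch InfoNCE treats the diagonal of the user×item similarity matrix as positives and every other item in the batch as a negative:

```python
import torch
import torch.nn.functional as F

def in_batch_infonce(user_emb: torch.Tensor,
                     item_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over a batch: row i's positive is item i, negatives are the rest."""
    u = F.normalize(user_emb, dim=1)
    v = F.normalize(item_emb, dim=1)
    logits = (u @ v.T) / temperature                     # (B, B) cosine similarities
    labels = torch.arange(u.shape[0], device=u.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)

def recall_at_k(scores: torch.Tensor, true_items: torch.Tensor, k: int = 10) -> float:
    """Fraction of users whose held-out item appears in their top-k scored items."""
    topk = scores.topk(k, dim=1).indices                 # (n_users, k)
    hits = (topk == true_items.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()
```

In-batch negatives make the negative set free (no separate sampling pass), at the cost of a popularity bias: frequent items appear as negatives more often, which logQ correction can compensate for.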
Integration via REST API: a FastAPI endpoint receives the user context and returns top-K candidates in under 50 ms. A personalization layer then applies business rules: no repeat items, respect for user filters, brand diversity.
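The personalization layer can be sketched as a post-retrieval filter over scored ANN candidates. The function below implements two of the rules listed above (seen-item removal and a per-brand cap for diversity); the candidate and brand data shapes are illustrative assumptions:

```python
def apply_business_rules(candidates: list[tuple[int, float]],
                         seen_items: set[int],
                         item_brand: dict[int, str],
                         max_per_brand: int = 2,
                         top_k: int = 10) -> list[int]:
    """Filter ANN candidates: drop seen items, cap items per brand, keep score order."""
    result: list[int] = []
    brand_counts: dict[str, int] = {}
    for item_id, _score in sorted(candidates, key=lambda c: -c[1]):
        if item_id in seen_items:
            continue  # rule: no repeat items
        brand = item_brand.get(item_id, "unknown")
        if brand_counts.get(brand, 0) >= max_per_brand:
            continue  # rule: brand diversity via per-brand cap
        brand_counts[brand] = brand_counts.get(brand, 0) + 1
        result.append(item_id)
        if len(result) == top_k:
            break
    return result
```

Keeping this layer outside the model means business rules can change without retraining, and the same retrieval stack can serve surfaces with different rule sets.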
Timeline: a basic two-tower recommendation system takes 3–4 weeks; a production-ready version with A/B testing, a feature store, and real-time updates takes 8–12 weeks.