# Development of an AI Recommendation System
A recommendation system is an ML model that predicts what a user wants to view, buy, or read next. The difference between a good and a poor implementation is tangible: a 15–20% revenue lift in the first case versus 2–3% in the second. Several architectural approaches exist; the right choice depends on data volume, the cold-start problem, and business context.
## Architecture Selection by Data Volume
| Transaction Volume | Recommended Approach | Recall@10 | Latency |
|---|---|---|---|
| < 10K | Content-based + rules | 15–25% | < 5ms |
| 10K – 500K | Matrix Factorization (ALS) | 25–40% | < 20ms |
| 500K – 5M | Two-tower neural + MF ensemble | 35–50% | < 50ms |
| > 5M | Two-tower + GNN + re-ranking | 45–65% | < 100ms |
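In the smallest regime (< 10K transactions), collaborative signals are too sparse to learn from, so a content-based approach over item feature vectors usually suffices. A minimal sketch using cosine similarity against an averaged "liked items" profile; the feature matrix shape and the seen-item rule here are illustrative assumptions, not prescribed by the table:

```python
import numpy as np

def recommend_content_based(item_features: np.ndarray,
                            liked_item_ids: list[int],
                            top_k: int = 10) -> list[int]:
    """Rank items by cosine similarity to the mean profile of liked items."""
    # L2-normalize item feature vectors
    norms = np.linalg.norm(item_features, axis=1, keepdims=True)
    normed = item_features / np.clip(norms, 1e-12, None)
    # User profile = average of the liked items' normalized vectors
    profile = normed[liked_item_ids].mean(axis=0)
    profile /= max(np.linalg.norm(profile), 1e-12)
    scores = normed @ profile
    scores[liked_item_ids] = -np.inf  # rule: never re-recommend liked items
    return np.argsort(-scores)[:top_k].tolist()
```

For "+ rules", hard business constraints (stock availability, region, age rating) are typically applied as filters before or after this scoring step.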
## Two-Tower Neural Architecture
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset


class UserTower(nn.Module):
    """User encoder: ID embedding + top-category embeddings + behavior features."""

    def __init__(self, n_users: int, n_categories: int,
                 embedding_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users + 1, embedding_dim, padding_idx=0)
        self.category_emb = nn.Embedding(n_categories + 1, 16, padding_idx=0)
        # Input: user embedding + 5 top-category embeddings (16 each) + 10 behavior features
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim + 16 * 5 + 10, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, 64),
            nn.LayerNorm(64),
        )

    def forward(self, user_id, top_categories, behavior_features):
        user_vec = self.user_emb(user_id)
        # Flatten the 5 category embeddings into a single vector per user
        cat_vecs = self.category_emb(top_categories).view(top_categories.shape[0], -1)
        combined = torch.cat([user_vec, cat_vecs, behavior_features], dim=1)
        return self.mlp(combined)


class ItemTower(nn.Module):
    """Item/product/content encoder."""

    def __init__(self, n_items: int, n_categories: int,
                 embedding_dim: int = 64, text_dim: int = 128):
        super().__init__()
        self.item_emb = nn.Embedding(n_items + 1, embedding_dim, padding_idx=0)
        self.category_emb = nn.Embedding(n_categories + 1, 16, padding_idx=0)
        # Input: item embedding + category embedding + text features + 5 numeric item features
        self.mlp = nn.Sequential(
            nn.Linear(embedding_dim + 16 + text_dim + 5, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, 64),
            nn.LayerNorm(64),
        )

    def forward(self, item_id, category_id, text_features, item_features):
        item_vec = self.item_emb(item_id)
        cat_vec = self.category_emb(category_id)
        combined = torch.cat([item_vec, cat_vec, text_features, item_features], dim=1)
        return self.mlp(combined)


class TwoTowerModel(nn.Module):
    def __init__(self, n_users, n_items, n_categories, text_dim=128):
        super().__init__()
        self.user_tower = UserTower(n_users, n_categories)
        self.item_tower = ItemTower(n_items, n_categories, text_dim=text_dim)
        # Learnable temperature for scaling cosine similarities
        self.temperature = nn.Parameter(torch.ones(1) * 0.05)

    def forward(self, user_inputs, item_inputs):
        user_emb = self.user_tower(**user_inputs)
        item_emb = self.item_tower(**item_inputs)
        # Temperature-scaled cosine similarity between paired user/item embeddings
        user_norm = nn.functional.normalize(user_emb, dim=1)
        item_norm = nn.functional.normalize(item_emb, dim=1)
        scores = torch.sum(user_norm * item_norm, dim=1) / self.temperature
        return scores

    def get_user_embedding(self, user_inputs) -> torch.Tensor:
        with torch.no_grad():
            return nn.functional.normalize(self.user_tower(**user_inputs), dim=1)

    def get_item_embedding(self, item_inputs) -> torch.Tensor:
        with torch.no_grad():
            return nn.functional.normalize(self.item_tower(**item_inputs), dim=1)
```
## Training with In-Batch Negative Sampling
Training objective: maximize the similarity of matching user–item pairs with a contrastive loss (InfoNCE), using the other items in the batch as negatives. Typical offline metrics: Recall@10, NDCG@5, Hit Rate. The two-tower architecture enables real-time serving via ANN indices (Milvus, Pinecone): item embeddings are precomputed and indexed, so only the user tower runs per request, and the item index can be refreshed on a schedule without daily retraining.
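The objective and the Recall@K metric above can be sketched directly. Assuming the towers have already produced a batch of embeddings, in-batch InfoNCE treats the diagonal of the user×item similarity matrix as positives and every other item in the batch as a negative:

```python
import torch
import torch.nn.functional as F

def in_batch_infonce(user_emb: torch.Tensor,
                     item_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over a batch: row i's positive is item i, negatives are the rest."""
    u = F.normalize(user_emb, dim=1)
    v = F.normalize(item_emb, dim=1)
    logits = (u @ v.T) / temperature                     # (B, B) cosine similarities
    labels = torch.arange(u.shape[0], device=u.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)

def recall_at_k(scores: torch.Tensor, true_items: torch.Tensor, k: int = 10) -> float:
    """Fraction of users whose held-out item appears in their top-k scored items."""
    topk = scores.topk(k, dim=1).indices                 # (n_users, k)
    hits = (topk == true_items.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()
```

In-batch negatives make the negative set free (no separate sampling pass), at the cost of a popularity bias: frequent items appear as negatives more often, which logQ correction can compensate for.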
Integration via REST API: a FastAPI endpoint receives the user context and returns top-K candidates in under 50 ms. A personalization layer then applies business rules: no repeat items, respect for user filters, brand diversity.
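The personalization layer can be sketched as a post-retrieval filter over scored ANN candidates. The function below implements two of the rules listed above (seen-item removal and a per-brand cap for diversity); the candidate and brand data shapes are illustrative assumptions:

```python
def apply_business_rules(candidates: list[tuple[int, float]],
                         seen_items: set[int],
                         item_brand: dict[int, str],
                         max_per_brand: int = 2,
                         top_k: int = 10) -> list[int]:
    """Filter ANN candidates: drop seen items, cap items per brand, keep score order."""
    result: list[int] = []
    brand_counts: dict[str, int] = {}
    for item_id, _score in sorted(candidates, key=lambda c: -c[1]):
        if item_id in seen_items:
            continue  # rule: no repeat items
        brand = item_brand.get(item_id, "unknown")
        if brand_counts.get(brand, 0) >= max_per_brand:
            continue  # rule: brand diversity via per-brand cap
        brand_counts[brand] = brand_counts.get(brand, 0) + 1
        result.append(item_id)
        if len(result) == top_k:
            break
    return result
```

Keeping this layer outside the model means business rules can change without retraining, and the same retrieval stack can serve surfaces with different rule sets.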
Timeline: a basic two-tower recommendation system takes 3–4 weeks; a production-ready version with A/B testing, a feature store, and real-time updates takes 8–12 weeks.