AI-Powered Spam Detection for Mobile Apps
Spam in mobile apps isn't just "buy crypto" comments. It's mass bot registration, fake like accumulation via emulators, duplicate listings from multiple accounts, and chat flooding. Each scenario requires a separate approach; no single "enable anti-spam" button exists.
Common Naive Implementation Mistakes
The most widespread anti-pattern is client-side word-blacklist filtering. First, the list is easy to bypass: "купи" (Russian for "buy") → "к-у-п-и", "кup и", Unicode homoglyphs. Second, client-side logic is visible through decompilation. Third, this isn't ML at all; it's regex.
The second anti-pattern is sending every message to the server for synchronous classification. At 50 messages per second in an active chat, this either degrades UX (send delay) or overloads the backend.
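To make the bypass concrete, here is a minimal Python sketch of a whole-word blacklist check. The names BLACKLIST and naive_blacklist_hit are illustrative only, and English "buy" stands in for the blacklisted word; the point is that trivial obfuscation slips through.

```python
import re

# Illustrative single-word blacklist (an assumption for this example).
BLACKLIST = {"buy"}


def naive_blacklist_hit(text: str) -> bool:
    """Return True if any blacklisted word appears as a whole word."""
    words = re.findall(r"\w+", text.lower())
    return any(word in BLACKLIST for word in words)


samples = [
    "buy now",         # plain word: caught
    "b-u-y now",       # separators split the word into single letters
    "b u y now",       # same trick with spaces
    "bu\u0443 now",    # Cyrillic "у" (U+0443) in place of Latin "y"
]

for sample in samples:
    print(repr(sample), naive_blacklist_hit(sample))
```

Only the first sample is flagged. Each new obfuscation would need another rule, which is why the list never catches up.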
How It Works in Practice
Behavioral Signals + Text Classification
Effective spam detection builds on two levels. The first is behavioral patterns: action frequency, intervals between events, device fingerprint, IP/ASN anomalies. These signals are collected on the client and sent in batches.
The second level is NLP text classification. For mobile apps, distilled BERT variants work well: distilbert-base-multilingual-cased in ONNX (~265 MB) on the server or MobileBERT (~95 MB) in TFLite for on-device inference. The server variant is preferable: the model can be updated without an app release.
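As one illustration of an interval-based signal, the server can measure how regular the gaps between a user's events are. The sketch below is an assumption about how such a check might look, not a prescribed method: interval_regularity, looks_automated, and the thresholds max_cv and min_events are hypothetical and would need tuning on real traffic.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between consecutive events.

    Returns None when there are fewer than two gaps or the mean gap is not positive.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    average_gap = mean(gaps)
    if average_gap <= 0:
        return None
    return pstdev(gaps) / average_gap


def looks_automated(timestamps, max_cv=0.1, min_events=5):
    """Flag a batch whose events are spaced almost perfectly evenly.

    max_cv and min_events are placeholder thresholds, not measured values.
    """
    if len(timestamps) < min_events:
        return False
    cv = interval_regularity(timestamps)
    return cv is not None and cv < max_cv


bot_like = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]       # seconds, evenly spaced
human_like = [0.0, 2.3, 2.9, 9.1, 10.0, 17.5]   # seconds, irregular

print(looks_automated(bot_like), looks_automated(human_like))
```

A signal like this is best treated as one input among several rather than a verdict on its own.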
// iOS: send a message together with behavioral metadata
import Foundation

struct MessagePayload: Encodable {
    let text: String
    let userId: String
    let sessionDuration: TimeInterval
    let messageIndexInSession: Int
    let typingDurationMs: Int // under 300 ms is suspicious for long messages
    let pasteDetected: Bool
}

func sendMessage(_ text: String) {
    let payload = MessagePayload(
        text: text,
        userId: currentUser.id,
        sessionDuration: sessionTimer.elapsed,
        messageIndexInSession: messageCount,
        typingDurationMs: typingTracker.duration,
        pasteDetected: typingTracker.wasPasted
    )
    api.postMessage(payload) { result in
        switch result {
        case .success(let msg):
            self.appendMessage(msg)
        case .failure(let error) where error == .spamDetected:
            self.showSpamWarning()
        case .failure:
            break // other errors (network, server) need their own handling
        }
    }
}
A typingDurationMs below 300 for a message longer than 50 characters almost certainly indicates paste-spam or a bot. This signal works even without ML.
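On the server, this rule reduces to a two-condition check. The Python sketch below assumes the payload fields arrive as text and typing_duration_ms; the function name and the parameter defaults (taken from the 300 ms and 50 character figures above) are illustrative.

```python
def is_suspicious_typing(
    text: str,
    typing_duration_ms: int,
    min_duration_ms: int = 300,
    min_length: int = 50,
) -> bool:
    """Flag long messages that were 'typed' faster than a human plausibly could."""
    return len(text) > min_length and typing_duration_ms < min_duration_ms


long_message = "x" * 60

print(is_suspicious_typing(long_message, 120))   # long and instant
print(is_suspicious_typing("hi", 120))           # short, so speed says little
print(is_suspicious_typing(long_message, 5000))  # long but typed slowly
```

Combining the result with pasteDetected helps separate an ordinary paste from scripted input.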
On-Device Pre-Filter to Reduce Load
For text fields in forums and marketplaces, add a lightweight on-device filter based on a TFLite text classification model (~1.5 MB). It filters out 70–80% of obvious spam without a network request:
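// Android: TFLite inference before sending
import android.content.Context
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.support.common.FileUtil

class SpamPrefilter(context: Context) {
    private val interpreter: Interpreter
    // Assumed to be an app-provided WordPiece tokenizer that returns token IDs
    private val tokenizer: BertTokenizer

    init {
        val model = FileUtil.loadMappedFile(context, "spam_lite.tflite")
        interpreter = Interpreter(model)
        tokenizer = BertTokenizer.createFromAsset(context, "vocab.txt")
    }

    fun isLikelySpam(text: String): Boolean {
        // Pad with zeros (assumed [PAD] id) or truncate to the model's fixed input length
        val inputIds = tokenizer.tokenize(text).toIntArray().copyOf(MAX_SEQ_LEN)
        val output = Array(1) { FloatArray(2) } // index 1 holds the spam score
        interpreter.run(arrayOf(inputIds), output)
        return output[0][1] > 0.85f // spam confidence threshold
    }

    private companion object {
        const val MAX_SEQ_LEN = 128 // assumption: must match the model's input shape
    }
}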
Borderline cases (confidence 0.6–0.85) are sent to the server model, while obvious spam is blocked immediately. This reduces API requests roughly threefold.
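The three-way split can be written as a small routing function. This Python sketch uses the 0.6 and 0.85 bounds from the text; the Action enum and the route function are hypothetical names, and treating scores below 0.6 as clean without a server check is an assumption the text does not spell out.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"                # assumption: low scores skip the server model
    SERVER_CHECK = "server_check"  # borderline: defer to the larger server model
    BLOCK = "block"                # obvious spam: reject on the device


def route(spam_confidence: float,
          review_floor: float = 0.6,
          block_threshold: float = 0.85) -> Action:
    """Map the on-device spam score to one of three actions."""
    if spam_confidence > block_threshold:
        return Action.BLOCK
    if spam_confidence >= review_floor:
        return Action.SERVER_CHECK
    return Action.ALLOW


for score in (0.10, 0.70, 0.85, 0.95):
    print(score, route(score).value)
```

A score of exactly 0.85 goes to the server, matching the strict "greater than" comparison in the on-device filter.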
Bot Protection on Registration
For the account creation flow, integrate the Google Play Integrity API (Android) and DeviceCheck (iOS). Both provide a token that can be verified on the server, which confirms the request came from a real device rather than an emulator or an Appium script. It's not a silver bullet, but it raises the cost of spam registration for the attacker.
Process
Audit the spam types in the app: text flooding, fake accounts, like accumulation, content duplication.
Design the signals: which behavioral metadata the client should collect and transmit.
Develop the on-device pre-filter and the server classification.
Tune the threshold values: auto-block vs. human review queue.
Monitor the false positive rate via Grafana/Datadog; run the first week in "shadow" mode (log, don't block).
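During the shadow week, the false positive rate can be computed from logged decisions once reviewers have labeled a sample. The Python sketch below assumes a hypothetical ShadowRecord with two fields; exporting the number to Grafana or Datadog is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ShadowRecord:
    flagged_as_spam: bool  # what the model would have done (logged, not enforced)
    actually_spam: bool    # label assigned later by a human reviewer


def false_positive_rate(records: Iterable[ShadowRecord]) -> Optional[float]:
    """FP / (FP + TN): share of legitimate messages the model would have blocked.

    Returns None when the sample contains no legitimate messages.
    """
    legitimate = [r for r in records if not r.actually_spam]
    if not legitimate:
        return None
    false_positives = sum(1 for r in legitimate if r.flagged_as_spam)
    return false_positives / len(legitimate)


sample = [
    ShadowRecord(flagged_as_spam=True, actually_spam=True),
    ShadowRecord(flagged_as_spam=False, actually_spam=True),
    ShadowRecord(flagged_as_spam=True, actually_spam=False),   # false positive
    ShadowRecord(flagged_as_spam=False, actually_spam=False),
    ShadowRecord(flagged_as_spam=False, actually_spam=False),
    ShadowRecord(flagged_as_spam=False, actually_spam=False),
]

print(false_positive_rate(sample))
```

Tracking this value per threshold setting shows how the auto-block cutoff trades missed spam against wrongly blocked users.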
Timeline Guidance
Basic server classification with behavioral signals takes 5–7 days. An on-device TFLite pre-filter plus Play Integrity / DeviceCheck adds 3–4 more days. A full system with a moderation dashboard and a feedback loop for model retraining takes 3–5 weeks.







