Development of Face Detection Systems
Face detection is the first stage of almost any face pipeline. The task is to find every face in an image and return a bounding box with a confidence score for each. It sounds simple, but real-world conditions complicate it: small, distant faces, profile angles, partial occlusion, poor lighting, and masks.
Modern Face Detectors
SCRFD (Sample and Computation Redistribution for Face Detection, from InsightFace): currently one of the best speed/accuracy trade-offs. SCRFD-10GF reaches 95.2% AP on the WiderFace Easy subset (about 83% on Hard).
RetinaFace: a classic detector that also predicts five facial landmarks (both eyes, nose tip, and both mouth corners), which are used to align faces before recognition.
YOLOv8 fine-tuned on WiderFace: a flexible option when the detector has to be adapted to custom requirements.
```python
from insightface.app import FaceAnalysis
import cv2

# InsightFace: detection with 5-point landmarks
app = FaceAnalysis(allowed_modules=['detection'])
app.prepare(ctx_id=0, det_size=(640, 640))  # ctx_id=0: first GPU; use -1 for CPU


def detect_faces(image_path: str) -> list[dict]:
    img = cv2.imread(image_path)  # BGR array, or None if the file cannot be read
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {image_path}")
    faces = app.get(img)
    results = []
    for face in faces:
        results.append({
            'bbox': face.bbox.astype(int).tolist(),     # [x1, y1, x2, y2]
            'confidence': float(face.det_score),
            'landmarks': face.kps.astype(int).tolist()  # 5 keypoints
        })
    return results
```
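The five keypoints are what the alignment step consumes: a similarity transform (rotation, uniform scale, translation) is fitted so that the detected landmarks land on a fixed template. The block below is a minimal sketch of that fit in plain Python, not the InsightFace implementation. The names `ARCFACE_TEMPLATE_112`, `estimate_similarity`, and `apply_affine` are hypothetical; the template values are the 112×112 reference layout commonly used for ArcFace-style recognition models.

```python
# Reference landmark positions for a 112x112 aligned crop (ArcFace-style layout).
# The constant name is illustrative, not a library identifier.
ARCFACE_TEMPLATE_112 = [
    (38.2946, 51.6963),  # eye on the left side of the image
    (73.5318, 51.5014),  # eye on the right side of the image
    (56.0252, 71.7366),  # nose tip
    (41.5493, 92.3655),  # left mouth corner
    (70.7299, 92.2041),  # right mouth corner
]


def estimate_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst.

    Returns a 2x3 matrix [[a, -b, tx], [b, a, ty]].
    """
    # Treat each 2D point as a complex number: dst ~ s * src + t,
    # where the complex factor s encodes rotation and uniform scale.
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    z_mean = sum(zs) / len(zs)
    w_mean = sum(ws) / len(ws)
    denom = sum(abs(z - z_mean) ** 2 for z in zs)
    if denom == 0:
        raise ValueError("source landmarks are degenerate")
    s = sum((w - w_mean) * (z - z_mean).conjugate()
            for z, w in zip(zs, ws)) / denom
    t = w_mean - s * z_mean
    return [[s.real, -s.imag, t.real],
            [s.imag, s.real, t.imag]]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
```

In a real pipeline the resulting matrix would typically be converted to a float32 NumPy array and passed to `cv2.warpAffine(img, M, (112, 112))` to produce the aligned crop. Keeping the landmarks as floats rather than casting them to `int` gives a slightly more accurate fit.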
Small Face Detection
Standard detectors tend to miss faces smaller than about 16×16 pixels. For surveillance cameras that cover large distances, two approaches help:
- Image tiling: split the image into overlapping tiles, run detection on each tile, and merge the results with non-maximum suppression (NMS)
- SAHI (Slicing Aided Hyper Inference): a library that automates the tiling and the merge:
```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap a YOLOv8 face detector for sliced inference
model = AutoDetectionModel.from_pretrained(
    model_type='yolov8',
    model_path='face_detector.pt',
    confidence_threshold=0.3
)

# Run detection on 512x512 slices with 20% overlap and merge the results
result = get_sliced_prediction(
    image='crowd.jpg',
    detection_model=model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2
)
```
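Manual tiling, the first option in the list, can be sketched in plain Python to show what happens under the hood: generate overlapping windows, shift each tile's boxes back into full-image coordinates, and suppress duplicates from the overlap regions with greedy NMS. This is an illustrative sketch, not SAHI's implementation. `make_tiles`, `nms`, and `detect_tiled` are hypothetical names, and `detect_tile` is an assumed callback that runs a detector on one window and returns tile-local boxes.

```python
def make_tiles(width, height, tile=512, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with overlap."""
    step = max(1, int(tile * (1 - overlap)))

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile sits flush with the border
        return positions

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.5):
    """Greedy NMS over dicts with 'bbox' and 'confidence' keys."""
    kept = []
    for det in sorted(dets, key=lambda d: d['confidence'], reverse=True):
        if all(iou(det['bbox'], k['bbox']) < iou_thr for k in kept):
            kept.append(det)
    return kept


def detect_tiled(width, height, detect_tile, tile=512, overlap=0.2, iou_thr=0.5):
    """Run detect_tile(x0, y0, x1, y1) on every window and merge the results.

    detect_tile is a caller-supplied (hypothetical) function returning a list
    of {'bbox': [x1, y1, x2, y2], 'confidence': float} in tile coordinates.
    """
    merged = []
    for x0, y0, x1, y1 in make_tiles(width, height, tile, overlap):
        for det in detect_tile(x0, y0, x1, y1):
            bx1, by1, bx2, by2 = det['bbox']
            merged.append({
                'bbox': [bx1 + x0, by1 + y0, bx2 + x0, by2 + y0],
                'confidence': det['confidence'],
            })
    return nms(merged, iou_thr)
```

A face that straddles a tile border is only partially visible in each tile, which is why the overlap should be at least as large as the biggest face expected in the scene.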
Performance on Different Hardware
| Detector | WiderFace Easy AP | CPU latency | GPU latency (T4) |
|---|---|---|---|
| SCRFD-500MF | 90.5% | 8 ms | 1.5 ms |
| SCRFD-10GF | 95.2% | 45 ms | 4 ms |
| RetinaFace-R50 | 94.9% | 90 ms | 7 ms |
| YOLOv8n (WiderFace) | 93.1% | 12 ms | 2 ms |
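Latency figures like these depend heavily on the exact CPU, input resolution, and runtime, so it is worth re-measuring on the target hardware. A minimal timing harness might look like the sketch below; `benchmark` is a hypothetical helper, and the warm-up and run counts are arbitrary defaults.

```python
import statistics
import time


def benchmark(fn, warmup=5, runs=30):
    """Return median and 95th-percentile latency of fn() in milliseconds."""
    # Warm-up calls let lazy initialization and caches settle before timing.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95 = samples[min(len(samples) - 1, int(0.95 * len(samples)))]
    return {'median_ms': statistics.median(samples), 'p95_ms': p95}
```

With the InsightFace setup shown earlier, a call such as `benchmark(lambda: app.get(img))` would time the full detection call, including pre- and post-processing. On a GPU, the timed call has to block until results are back on the host; otherwise only the launch cost is measured.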
Masked Face Detection
The pandemic turned masked face detection into a separate class of tasks. The MAFA dataset contains 35,806 annotated masked faces. Fine-tuning a standard detector on MAFA combined with WiderFace raises AP on masked faces from 65% to 89%.
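To check a gain like this on your own masked-face validation set, AP can be computed from detections and ground-truth boxes. The sketch below is a simplified single-class AP at a fixed IoU threshold with all-point interpolation. It is not the official WiderFace or MAFA evaluation protocol, and `box_iou` and `average_precision` are hypothetical names.

```python
def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """Single-class AP.

    detections: list of (image_id, confidence, bbox)
    ground_truth: dict mapping image_id to a list of bboxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    # Walk detections from most to least confident and match each one
    # to the best still-unmatched ground-truth box in the same image.
    tp_flags = []
    for img, _, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth.get(img, [])):
            if matched[img][j]:
                continue
            value = box_iou(box, gt)
            if value > best_iou:
                best_iou, best_j = value, j
        is_tp = best_iou >= iou_thr
        if is_tp:
            matched[img][best_j] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)

    # Make precision non-increasing, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Running it once on the baseline detector's outputs and once on the fine-tuned detector's outputs, against the same ground truth, gives a before/after comparison on equal terms.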
| Task | Timeline |
|---|---|
| Detection, standard conditions, ready model | 1 week |
| Custom conditions (masks, cameras, lighting) | 2–3 weeks |
| Small face detection, pipeline optimization | 3–5 weeks |







