What image formats are supported?

Common raster formats: JPEG, PNG, TIFF, and BMP. PDF documents are converted page by page to PNG at 300 DPI.

Which framework is best for Cyrillic?

PaddleOCR offers the best speed-accuracy balance for Russian. TrOCR is good for printed text but requires GPU and fine-tuning. EasyOCR is easier to use but lower quality.

How long does fine-tuning take?

Typically 2-4 weeks depending on data volume and font complexity. We design an optimal dataset for your domain, including synthetic generation.

Can OCR be integrated into an existing system?

Yes, we provide a REST API or Python package. Integration with S3, queues, and databases is standard. On-premise deployment is also available.

What accuracy is achieved on handwritten text?

On structured forms, up to 85-90%. On arbitrary handwriting, 70-80% with language model correction. In one of our projects we reached 93% on medical prescriptions after fine-tuning.

Custom OCR for Text Recognition from Images

We design and deploy artificial intelligence systems, from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps so that AI works in real business, not just in the lab.

A client needed to recognize handwritten medical prescriptions from photos – ready-made solutions achieved less than 60% accuracy. A typical situation: the OCR pipeline fails on tilted or overexposed images, and specific terms (drug names) get distorted. We are a team of AI engineers with 5+ years of experience in computer vision, having delivered over 50 text recognition projects – we built a custom OCR model that raised accuracy to 93%. Here's how modern OCR works and how we adapt it to business tasks.

OCR (Optical Character Recognition) extracts text from images. The modern pipeline consists of three stages: detection of text regions → rectification → character recognition. Each stage affects final accuracy, and a weak link anywhere degrades the result. We use PaddleOCR as the base framework in 80% of our Cyrillic projects: among open-source solutions it offers the best speed-quality balance. Clients save up to 40% of their document processing budget through automation.
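The three stages can be wired together roughly as follows. This is a minimal sketch, not the production code: `run_ocr_pipeline`, `TextLine`, and the three stage callables are hypothetical names, and any real detector or recognizer (PaddleOCR or another) would be plugged in behind them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height of a text region


@dataclass
class TextLine:
    box: Box
    text: str
    confidence: float


def run_ocr_pipeline(
    image: Any,
    detect: Callable[[Any], Iterable[Box]],
    rectify: Callable[[Any, Box], Any],
    recognize: Callable[[Any], Tuple[str, float]],
    min_confidence: float = 0.5,
) -> List[TextLine]:
    """Run detection -> rectification -> recognition on one image.

    The stage functions are injected (hypothetical interfaces), so each
    stage can be swapped or fine-tuned independently of the others.
    """
    lines: List[TextLine] = []
    for box in detect(image):                   # stage 1: find text regions
        crop = rectify(image, box)              # stage 2: crop and straighten
        text, confidence = recognize(crop)      # stage 3: read characters
        if confidence >= min_confidence:        # drop low-confidence lines
            lines.append(TextLine(box, text, confidence))
    # Return lines in reading order: top to bottom, then left to right.
    return sorted(lines, key=lambda line: (line.box[1], line.box[0]))
```

Because every stage sits behind its own callable, a weak stage can be measured and replaced without touching the rest of the pipeline.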

Which OCR framework to choose for Cyrillic?

We've tried all popular open-source solutions. For Russian, each has its strengths:

PaddleOCR (PP-OCRv4) – accuracy 92.8% on ICDAR2015, best Cyrillic support among open-source. Suitable for production: fast on CPU, easy to fine-tune.
EasyOCR – simple API, but accuracy on Russian is 5-10% lower than PaddleOCR and CPU inference is 2-3 times slower.
TrOCR (Microsoft) – transformer-based; the large model reports CER 2.89% on the IAM handwriting benchmark. However, it requires a GPU and fine-tuning for Cyrillic.
Tesseract 5 – classic, customizable for any font, but without custom training it loses to PaddleOCR on complex documents.

Framework	Cyrillic	Speed (CPU)	Best for
PaddleOCR	Excellent	Fast	General OCR, production
EasyOCR	Good	Slow	Prototypes
TrOCR	Good	Medium	Printed documents
Tesseract 5	Good	Medium	On-premise, custom fonts

According to the official benchmark, PaddleOCR achieves 92.8% accuracy on ICDAR2015 (PaddleOCR GitHub).

Why is image preprocessing important?

OCR quality directly depends on the input to the model. A mobile phone photo typically suffers from low contrast, noise, and tilt. We apply a chain of transforms:

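import cv2
import numpy as np


def preprocess_for_ocr(image: np.ndarray) -> np.ndarray:
    """Deskew, denoise, and boost the contrast of a BGR image before OCR."""
    # Deskewing: detect_skew_angle and rotate_image are project helpers
    # (not shown here); rotate only when the tilt exceeds 0.5 degrees.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    angle = detect_skew_angle(gray)
    if abs(angle) > 0.5:
        image = rotate_image(image, -angle)

    # Denoising: non-local means on the color image (h is the filter strength)
    denoised = cv2.fastNlMeansDenoisingColored(image, None, h=10)

    # Contrast enhancement: apply CLAHE to the lightness channel only,
    # so that colors are not shifted
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
    l = clahe.apply(l)
    enhanced = cv2.cvtColor(cv2.merge([l, a, b]), cv2.COLOR_LAB2BGR)

    return enhanced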

Even simple deskew improves accuracy by 3-5%. For old scans with a yellowed background, we binarize the image with global Otsu thresholding or local (adaptive) Sauvola thresholding. Preprocessing is especially critical for handwritten text: it boosts recognition accuracy by 15-20%.

Additional methods for accuracy improvement

Using a language model to correct contextual errors (e.g., confusion of '0' and 'O').
Ensemble of models for complex fonts.
Data augmentation: rotations, noise, blur to improve robustness.
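One simple way to ensemble recognizers is confidence-weighted voting over their text outputs. The sketch below assumes each model returns a `(text, confidence)` pair for the same line; `vote` is a hypothetical helper, and real ensembles may instead combine character-level probabilities.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def vote(candidates: Sequence[Tuple[str, float]]) -> Tuple[str, float]:
    """Pick the reading with the highest total confidence.

    Returns the winning text and its share of the total confidence mass.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores: Dict[str, float] = defaultdict(float)
    for text, confidence in candidates:
        scores[text] += confidence      # identical readings pool their weight
    best = max(scores, key=lambda text: scores[text])
    total = sum(scores.values())
    share = scores[best] / total if total > 0 else 0.0
    return best, share
```

For example, if two models read "Amoxicillin" with confidences 0.6 and 0.5 and a third reads "Amoxicilin" with 0.7, the pooled score of the first reading (1.1) wins over the single higher-confidence outlier.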

How we do it: a case study of medical prescription recognition

Here is a real project from our practice in detail. The task: accept prescription photos from a mobile app and recognize the drug name, dosage, and instructions. Problems: handwritten text, blurry images, and stamps overlapping the text.

Solution:

Preprocessing: CLAHE + binarization + shadow removal via morphology.
Detection: fine-tuned PaddleOCR detection model on 2000 labeled prescriptions (bbox labels).
Recognition: PP-OCRv4 recognition model fine-tuned on 50,000 synthetic prescriptions (generated with different handwriting styles).
Postprocessing: a drug dictionary (10,000 names) + LanguageTool for OCR error correction + LLM for context correction (0/O confusion).
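The dictionary part of the postprocessing step can be sketched with fuzzy matching from the Python standard library. This is an assumption about how such a lookup might look, not the project's code: `DrugDictionary`, the `cutoff` value, and the sample names are illustrative, and the LanguageTool and LLM stages are not shown.

```python
import difflib
from typing import Dict, Iterable


class DrugDictionary:
    """Snap a noisy OCR token to the closest known drug name."""

    def __init__(self, names: Iterable[str], cutoff: float = 0.8):
        # Compare case-insensitively, but return the canonical spelling.
        self._canonical: Dict[str, str] = {n.lower(): n for n in names}
        self._keys = list(self._canonical)
        self._cutoff = cutoff

    def correct(self, token: str) -> str:
        matches = difflib.get_close_matches(
            token.lower(), self._keys, n=1, cutoff=self._cutoff
        )
        # Leave the token untouched when nothing is similar enough,
        # so unknown words are never silently replaced.
        return self._canonical[matches[0]] if matches else token
```

A stricter `cutoff` lowers the risk of mapping one drug onto another with a similar name, which matters more in a medical setting than recovering every misspelling.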

Result: character-level accuracy on the test set is 93% (Character Error Rate 0.07). Processing time per image is 1.5 seconds on CPU. For comparison, Tesseract 5 without fine-tuning would give about 40-50% on such data, so our pipeline was roughly twice as accurate.
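For reference, Character Error Rate is the edit distance between the recognized text and the ground truth, divided by the number of ground-truth characters; a CER of 0.07 corresponds to the 93% figure. A minimal sketch of the metric (the function names are ours):

```python
from typing import Sequence


def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """Total edits over total reference characters across the test set."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must align")
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(levenshtein(ref, hyp)
                      for ref, hyp in zip(references, hypotheses))
    return total_edits / total_chars
```

Summing edits and characters over the whole set (rather than averaging per-image rates) keeps short lines from dominating the metric.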

Work process

Any OCR project goes through 5 stages:

Analytics: assess data, typical defects, domain dictionary.
Design: choose framework, pipeline architecture (queues, caching).
Implementation: write code, fine-tune models, integrate with your system.
Testing: measure accuracy on validation set, A/B test on real data.
Deployment and support: package in Docker, REST API or gRPC, monitor metrics.
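As a small illustration of the caching mentioned in the design stage, the sketch below keys OCR results by a hash of the image bytes so that a re-uploaded file is not recognized twice. `CachedOcr` is a hypothetical wrapper with an in-memory dict; a real deployment would more likely use a shared store, which we do not assume here.

```python
import hashlib
from typing import Callable, Dict


class CachedOcr:
    """Wrap a recognizer so identical images are processed only once."""

    def __init__(self, recognize: Callable[[bytes], str]):
        self._recognize = recognize
        self._cache: Dict[str, str] = {}
        self.hits = 0  # number of requests served from the cache

    def __call__(self, image_bytes: bytes) -> str:
        # The content hash is the key, so file names do not matter.
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        text = self._recognize(image_bytes)
        self._cache[key] = text
        return text
```

The cache should be invalidated whenever the model is retrained; including a model version in the key is one simple way to do that.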

What's included

Comprehensive pipeline documentation describing all components.
Trained model (weights + model card).
Source code with launch instructions.
Integration with your storage (S3, MinIO) and queues (RabbitMQ, Kafka).
Training your team on the system.
Accuracy guarantee (target metrics are written into the contract).

Timelines

Task	Duration
OCR with an off-the-shelf framework + API	1–2 weeks
Complex documents with preprocessing	2–4 weeks
Custom font / handwritten text	4–8 weeks

Cost is calculated individually after data analysis. Get a consultation – we'll evaluate your project in one day. Contact us to discuss details and estimated cost.