Exchange data normalization system development

Each cryptocurrency exchange is a separate universe with its own naming conventions, number formats, time units, and field semantics. The BTC/USDT pair is called BTCUSDT on Binance, XBT/USDT on Kraken, and tBTCUST on Bitfinex. Normalization is the layer that hides this incompatibility behind a single interface.

What Needs to Be Normalized

Symbols and pairs. Each exchange has its own conventions. The normalized format is BASE/QUOTE in uppercase: BTC/USDT, ETH/BTC. Exchange symbols are stored in a mapping that also supports the reverse transformation, from the normalized symbol back to the exchange symbol.
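One way to hold such a mapping is a small two-way lookup. The sketch below is illustrative only: the `SymbolMap` class and its method names are assumptions introduced here, not part of any exchange SDK.

```python
class SymbolMap:
    """Two-way lookup between exchange symbols and normalized BASE/QUOTE symbols."""

    def __init__(self, raw_to_normalized: dict[str, str]):
        self._to_normalized = dict(raw_to_normalized)
        self._to_raw = {norm: raw for raw, norm in self._to_normalized.items()}
        # If two raw symbols share a normalized name, the reverse map is ambiguous.
        if len(self._to_raw) != len(self._to_normalized):
            raise ValueError("symbol mapping is not one-to-one")

    def normalize(self, raw_symbol: str) -> str:
        return self._to_normalized[raw_symbol]

    def denormalize(self, symbol: str) -> str:
        return self._to_raw[symbol]


bitfinex_symbols = SymbolMap({"tBTCUST": "BTC/USDT"})
print(bitfinex_symbols.normalize("tBTCUST"))     # BTC/USDT
print(bitfinex_symbols.denormalize("BTC/USDT"))  # tBTCUST
```

Rejecting a mapping that is not one-to-one at construction time keeps an ambiguous reverse lookup from silently routing an order to the wrong instrument.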

Timestamps. Binance returns milliseconds by default, other exchanges return seconds, and some endpoints report microseconds or nanoseconds. The normalized format is milliseconds since the Unix epoch (UTC), stored as int64.
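A minimal sketch of the unit conversion, assuming each adapter knows which unit its exchange uses. The function name `to_millis` and the unit labels ("s", "ms", "us", "ns") are hypothetical.

```python
from decimal import Decimal

# How many of each unit fit into one millisecond.
_UNITS_PER_MS = {"ms": 1, "us": 1_000, "ns": 1_000_000}


def to_millis(raw, unit: str) -> int:
    """Convert a raw exchange timestamp to integer milliseconds since the Unix epoch."""
    if unit == "s":
        # Seconds may arrive fractional ("1704067200.123"); Decimal avoids float rounding.
        return int(Decimal(str(raw)) * 1000)
    if unit in _UNITS_PER_MS:
        # Integer division truncates sub-millisecond precision.
        return int(raw) // _UNITS_PER_MS[unit]
    raise ValueError(f"Unknown timestamp unit: {unit!r}")


print(to_millis("1704067200.123", "s"))      # 1704067200123
print(to_millis(1704067200123456789, "ns"))  # 1704067200123
```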

Numbers. REST APIs often return numbers as strings ("43250.50"), and some exchanges drop trailing zeros. The normalized format is Decimal with an explicit precision that depends on the instrument.
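The following sketch shows one way to restore a fixed precision with the standard `decimal` module. The `PRICE_TICK` table and its values are made up for illustration; in practice the precision would come from the exchange's instrument metadata.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Hypothetical per-instrument price precision.
PRICE_TICK = {
    "BTC/USDT": Decimal("0.01"),
    "ETH/BTC": Decimal("0.000001"),
}


def parse_price(raw, symbol: str) -> Decimal:
    """Parse a raw price (string or number) into a Decimal with the instrument's precision."""
    # Go through str() so a binary float is never passed to Decimal directly.
    value = Decimal(str(raw))
    return value.quantize(PRICE_TICK[symbol], rounding=ROUND_HALF_EVEN)


print(parse_price("43250.5", "BTC/USDT"))  # 43250.50
```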

Order sides. BUY/SELL, buy/sell, b/s, and 1/-1 all occur in practice. The normalized format is an enum: BUY | SELL.
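As a sketch, the side variants listed above can be folded into a single enum with an alias table. The names `Side` and `normalize_side` are illustrative assumptions.

```python
from enum import Enum


class Side(Enum):
    BUY = "BUY"
    SELL = "SELL"


# Lowercased raw values observed across exchanges, mapped to the enum.
_SIDE_ALIASES = {
    "buy": Side.BUY, "b": Side.BUY, "1": Side.BUY,
    "sell": Side.SELL, "s": Side.SELL, "-1": Side.SELL,
}


def normalize_side(raw) -> Side:
    """Map a raw side value (string or number) to Side, failing loudly on unknowns."""
    key = str(raw).strip().lower()
    try:
        return _SIDE_ALIASES[key]
    except KeyError:
        raise ValueError(f"Unknown order side: {raw!r}") from None


print(normalize_side("BUY"))  # Side.BUY
print(normalize_side(-1))     # Side.SELL
```

Raising on an unknown value, rather than defaulting to one side, is a deliberate choice here: a silently misread side is far more costly than a rejected message.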

Order statuses. Each exchange has its own statuses. Normalized mapping:

Exchange | Raw status | Normalized status
Binance | NEW | OPEN
Binance | PARTIALLY_FILLED | PARTIAL
Binance | FILLED | FILLED
Binance | CANCELED | CANCELLED
Bybit | Created | OPEN
Bybit | New | OPEN
Bybit | PartiallyFilled | PARTIAL
Bybit | Filled | FILLED
OKX | live | OPEN
OKX | partially_filled | PARTIAL
OKX | filled | FILLED
OKX | canceled | CANCELLED
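The status mapping above translates directly into a lookup table. This is a sketch covering only the statuses listed here; the names `OrderStatus`, `STATUS_MAP`, and `normalize_status` are assumptions, and real exchanges expose more statuses than these.

```python
from enum import Enum


class OrderStatus(Enum):
    OPEN = "OPEN"
    PARTIAL = "PARTIAL"
    FILLED = "FILLED"
    CANCELLED = "CANCELLED"


STATUS_MAP = {
    "binance": {
        "NEW": OrderStatus.OPEN,
        "PARTIALLY_FILLED": OrderStatus.PARTIAL,
        "FILLED": OrderStatus.FILLED,
        "CANCELED": OrderStatus.CANCELLED,
    },
    "bybit": {
        "Created": OrderStatus.OPEN,
        "New": OrderStatus.OPEN,
        "PartiallyFilled": OrderStatus.PARTIAL,
        "Filled": OrderStatus.FILLED,
    },
    "okx": {
        "live": OrderStatus.OPEN,
        "partially_filled": OrderStatus.PARTIAL,
        "filled": OrderStatus.FILLED,
        "canceled": OrderStatus.CANCELLED,
    },
}


def normalize_status(exchange: str, raw_status: str) -> OrderStatus:
    """Look up the normalized status; unknown exchanges or statuses raise ValueError."""
    try:
        return STATUS_MAP[exchange][raw_status]
    except KeyError:
        raise ValueError(f"Unmapped status {raw_status!r} for {exchange!r}") from None


print(normalize_status("bybit", "Created"))  # OrderStatus.OPEN
```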

Normalizer Architecture

The normalizer is implemented as a set of exchange-specific adapters with a common interface:

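from abc import ABC, abstractmethod
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class NormalizedTicker:
    exchange: str
    symbol: str          # BASE/QUOTE
    timestamp: int       # milliseconds UTC
    price: Decimal
    volume_24h: Decimal
    bid: Optional[Decimal] = None
    ask: Optional[Decimal] = None


@dataclass(frozen=True)
class NormalizedOrder:  # minimal illustrative field set
    exchange: str
    symbol: str
    side: str            # "BUY" | "SELL"
    status: str          # "OPEN" | "PARTIAL" | "FILLED" | "CANCELLED"


class ExchangeNormalizer(ABC):
    @abstractmethod
    def normalize_symbol(self, raw_symbol: str) -> str:
        """Converts exchange symbol to normalized format BASE/QUOTE"""

    @abstractmethod
    def normalize_ticker(self, raw_data: dict) -> NormalizedTicker:
        """Normalizes ticker data"""

    @abstractmethod
    def normalize_order(self, raw_data: dict) -> NormalizedOrder:
        """Normalizes order data"""


class BinanceNormalizer(ExchangeNormalizer):
    SYMBOL_MAP = {
        "BTCUSDT": "BTC/USDT",
        "ETHUSDT": "ETH/USDT",
        # ... from API /api/v3/exchangeInfo
    }
    STATUS_MAP = {"NEW": "OPEN", "PARTIALLY_FILLED": "PARTIAL",
                  "FILLED": "FILLED", "CANCELED": "CANCELLED"}

    def normalize_symbol(self, raw_symbol: str) -> str:
        # Use a dynamically loaded map when one has been set, else the static one
        return getattr(self, "symbol_map", self.SYMBOL_MAP)[raw_symbol]

    def normalize_ticker(self, raw: dict) -> NormalizedTicker:
        return NormalizedTicker(
            exchange="binance",
            symbol=self.normalize_symbol(raw["s"]),
            timestamp=int(raw["T"]),
            price=Decimal(raw["c"]),
            volume_24h=Decimal(raw["v"]),
        )

    def normalize_order(self, raw: dict) -> NormalizedOrder:
        symbol = self.normalize_symbol(raw["symbol"])
        return NormalizedOrder("binance", symbol, raw["side"], self.STATUS_MAP[raw["status"]])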

Dynamic Symbol Mapping Loading

Hard-coding the symbol mapping is a bad idea: exchanges list and delist pairs regularly, so a static table goes stale quickly. The better approach is to load the mapping from the Exchange Info API on startup and refresh it periodically:

async def load_symbol_map(self):
    """Normalizer method: rebuilds both symbol maps from exchangeInfo."""
    # self.rest_client is assumed to return the decoded JSON response
    exchange_info = await self.rest_client.get("/api/v3/exchangeInfo")
    self.symbol_map = {
        s["symbol"]: f"{s['baseAsset']}/{s['quoteAsset']}"
        for s in exchange_info["symbols"]
        if s["status"] == "TRADING"  # skip halted or delisted pairs
    }
    # Inverted mapping for reverse transformation
    self.reverse_map = {v: k for k, v in self.symbol_map.items()}
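The periodic update can be a small background task. The sketch below assumes an asyncio application. `refresh_forever` is a hypothetical helper, and `load` would be a coroutine function such as the normalizer's `load_symbol_map`.

```python
import asyncio


async def refresh_forever(load, interval_s: float, on_error=print):
    """Call `load` immediately, then again every `interval_s` seconds until cancelled."""
    while True:
        try:
            await load()
        except Exception as exc:
            # A failed refresh keeps the previous mapping in place; report and retry later.
            on_error(f"symbol map refresh failed: {exc}")
        await asyncio.sleep(interval_s)
```

A typical use would be `asyncio.create_task(refresh_forever(normalizer.load_symbol_map, 3600))` at startup, with the task cancelled on shutdown. The one-hour interval is an arbitrary example value.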

Normalized Data Validation

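After normalization, the result must be validated. Non-positive prices, negative volumes, timestamps from the future, and a crossed book (bid at or above ask) all point to problems in the data source:

import time

def now_ms() -> int:
    """Current Unix time in milliseconds (UTC)."""
    return time.time_ns() // 1_000_000

def validate_ticker(ticker: NormalizedTicker) -> list[str]:
    errors = []
    if ticker.price <= 0:
        errors.append(f"Invalid price: {ticker.price}")
    if ticker.volume_24h < 0:
        errors.append(f"Invalid volume: {ticker.volume_24h}")
    if ticker.timestamp > now_ms() + 5000:  # tolerate 5 s of clock skew
        errors.append(f"Future timestamp: {ticker.timestamp}")
    if ticker.bid is not None and ticker.ask is not None and ticker.bid >= ticker.ask:
        errors.append(f"Crossed book: bid={ticker.bid} ask={ticker.ask}")
    return errors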

Invalid data is logged and discarded so that it never reaches downstream systems.
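The log-and-discard step can be sketched as a generator that sits between the normalizer and its consumers. `drop_invalid` is a hypothetical helper; `validate` is any function that returns a list of error strings, such as `validate_ticker`.

```python
import logging
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
log = logging.getLogger("normalizer")


def drop_invalid(items: Iterable[T], validate: Callable[[T], list]) -> Iterator[T]:
    """Yield only the items that pass validation; log and skip the rest."""
    for item in items:
        errors = validate(item)
        if errors:
            log.warning("Discarding %r: %s", item, "; ".join(errors))
            continue
        yield item
```

Because it is a generator, the filter works equally well on a finite batch and on an unbounded stream of tickers.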

Normalizer Testing

Unit tests built on real raw data samples from each exchange are mandatory, because exchanges sometimes change their API format without warning. A set of fixed fixtures with expected normalized results makes regressions quick to detect:

def test_binance_normalizer():
    raw = {"s": "BTCUSDT", "c": "43250.50", "v": "28450.12", "T": 1704067200000}
    result = BinanceNormalizer().normalize_ticker(raw)
    assert result.symbol == "BTC/USDT"
    assert result.price == Decimal("43250.50")
    assert result.exchange == "binance"

In addition, integration tests against each exchange's sandbox API run daily in CI to catch API changes early.