Data Parsing from Marketplaces (Ozon, Wildberries) for 1C-Bitrix
The task "fetch data from Ozon/WB" splits into two fundamentally different scenarios. In the first, you sell through these marketplaces and want to pull data from your own account (orders, stock, analytics) via the official API. In the second, you want to collect public competitor data or monitor prices. These tasks carry different risks and call for different architectures.
Scenario 1: Official API — Fetching Your Marketplace Store Data
Ozon and Wildberries provide full REST APIs for sellers. This is the legal route, and "parsing" is the wrong word for it: this is API integration.
Ozon Seller API (https://api-seller.ozon.ru):
Main method groups useful for Bitrix integration:
| Method | URL | Description |
|---|---|---|
| product.list | /v2/product/list | List of products in account |
| product.info.list | /v2/product/info/list | Product details (prices, stock, statuses) |
| posting.fbo.list | /v2/posting/fbo/list | FBO orders |
| posting.fbs.list | /v3/posting/fbs/list | FBS orders |
| analytics.data | /v1/analytics/data | Sales reports |
| finance.transaction.list | /v3/finance/transaction/list | Financial transactions |
Authorization: the Client-Id and Api-Key headers. The rate limit depends on the method, typically 1–10 RPS. Exceeding it returns HTTP 429, and the response body contains a retry-after value.
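The header-based authorization and the 429 handling described above can be sketched as follows. This is a minimal illustration in Python rather than the PHP a Bitrix module would use, and it rests on several assumptions: that the methods accept POST requests with a JSON body, that exponential backoff is an acceptable reaction to 429 (the exact name of the retry-after field is not given here, so it is not read), and that the helper names build_request, send, and call_with_retry are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request

OZON_BASE_URL = "https://api-seller.ozon.ru"


def build_request(path, payload, client_id, api_key):
    """Build a POST request carrying the Client-Id and Api-Key headers."""
    return urllib.request.Request(
        OZON_BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Client-Id": client_id,
            "Api-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return (status_code, parsed_json_body)."""
    try:
        with urllib.request.urlopen(request, timeout=30) as resp:
            return resp.status, json.loads(resp.read() or b"{}")
    except urllib.error.HTTPError as err:
        raw = err.read()
        try:
            body = json.loads(raw) if raw else {}
        except ValueError:
            body = {}
        return err.code, body


def call_with_retry(request, sender=send, max_attempts=5, sleep=time.sleep):
    """Retry on HTTP 429 with exponential backoff (1 s, 2 s, 4 s, ...)."""
    delay = 1.0
    for attempt in range(1, max_attempts + 1):
        status, body = sender(request)
        if status != 429:
            return status, body
        if attempt < max_attempts:
            sleep(delay)
            delay *= 2
    # Still rate-limited after the last attempt: hand the 429 to the caller.
    return status, body
```

The sender and sleep parameters are injectable so the retry logic can be exercised without network access, for example by passing a stub that returns 429 twice and then 200.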
Wildberries API (https://statistics-api.wildberries.ru, https://suppliers-api.wildberries.ru):
WB split the API into several base URLs depending on data type:
- https://statistics-api.wildberries.ru: sales statistics, stock, orders
- https://suppliers-api.wildberries.ru: product management, prices, warehouses
- https://content-api.wildberries.ru: card content
Authorization: the header Authorization: Bearer {token}. Tokens are created in the WB seller account and carry different permissions (statistics access is separate from content management).
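Because the base URL and the token both depend on the data type, it helps to resolve them in one place. The sketch below (Python for illustration; the function name wb_request_parts, the data-type keys, and the example path are hypothetical) maps a data type to its base URL and to the token issued for that scope, and fails early if no token was configured.

```python
WB_BASE_URLS = {
    "statistics": "https://statistics-api.wildberries.ru",
    "suppliers": "https://suppliers-api.wildberries.ru",
    "content": "https://content-api.wildberries.ru",
}


def wb_request_parts(data_type, path, tokens):
    """Return (url, headers) for a WB call of the given data type.

    tokens maps a data type to the token created for that scope,
    since a statistics token does not grant content permissions.
    """
    if data_type not in WB_BASE_URLS:
        raise ValueError(f"unknown WB data type: {data_type}")
    token = tokens.get(data_type)
    if not token:
        raise PermissionError(f"no token configured for: {data_type}")
    url = WB_BASE_URLS[data_type] + path
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers


# Usage with a placeholder path (not a real WB endpoint):
url, headers = wb_request_parts(
    "statistics", "/example/path", {"statistics": "stat-token"}
)
```

Keeping tokens per scope means a leaked statistics token cannot be used to edit product cards.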
Integration into 1C-Bitrix: Architecture of Data Retrieval Module
Marketplace API data needs to be fetched regularly, structured, and stored in Bitrix. Standard architecture:
Bitrix Agents (CAgent::AddAgent()) run scheduled tasks:
- Every 15 minutes — stock and order statuses
- Every hour — new orders, price updates
- Daily — analytics reports, financial transactions
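The schedule above can be written down as data. The following sketch (Python for illustration; the task names and the due_tasks helper are hypothetical) keeps the three intervals in one table, converts them to seconds, which is the unit CAgent::AddAgent() takes for its interval, and shows the "is this task due" check that the agent mechanism performs on the Bitrix side.

```python
from datetime import datetime, timedelta

# Hypothetical task names; intervals are taken from the list above.
SCHEDULE = {
    "sync_stock_and_order_statuses": timedelta(minutes=15),
    "sync_new_orders_and_prices": timedelta(hours=1),
    "sync_analytics_and_finance": timedelta(days=1),
}


def agent_intervals():
    """Intervals in seconds, ready to pass when registering agents."""
    return {name: int(iv.total_seconds()) for name, iv in SCHEDULE.items()}


def due_tasks(now, last_run):
    """Return tasks that never ran or whose interval has elapsed."""
    due = []
    for name, interval in SCHEDULE.items():
        previous = last_run.get(name)
        if previous is None or now - previous >= interval:
            due.append(name)
    return due
```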
Retrieved data is saved in multiple places depending on type:
- Orders → b_sale_order, b_sale_basket via CSaleOrder::Add() or directly to DB for bulk import
- Analytics → HighLoad infoblock or separate partitioned tables
- Product data → b_iblock_element, b_catalog_price, b_catalog_product via infoblock API
To store raw API responses (needed for debugging and reprocessing), create a separate table:
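```sql
-- Raw marketplace API responses, kept for debugging and reprocessing
CREATE TABLE mp_api_raw_log (
    id          SERIAL PRIMARY KEY,
    marketplace VARCHAR(30) NOT NULL,
    method      VARCHAR(100) NOT NULL,
    params      JSONB,
    response    JSONB,
    status_code SMALLINT,
    created_at  TIMESTAMP DEFAULT NOW()
);

-- Speeds up per-marketplace lookups by time range
CREATE INDEX ON mp_api_raw_log (marketplace, created_at);
```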
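In Bitrix, this table is created in the module's DoInstall() method via $DB->Query(). Note that SERIAL and JSONB are PostgreSQL syntax; on MySQL the equivalents are an AUTO_INCREMENT column and the JSON type.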
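Writing to the raw log is a single insert per API call. The sketch below illustrates the idea in Python using an in-memory SQLite database as a stand-in, so it runs without a server; the column types are therefore simplified (TEXT instead of JSONB), and the function names create_log_table and log_raw_response are hypothetical. In a real module the same insert would go through the Bitrix database layer.

```python
import json
import sqlite3


def create_log_table(conn):
    """SQLite stand-in for mp_api_raw_log (JSON stored as TEXT)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS mp_api_raw_log (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            marketplace TEXT NOT NULL,
            method      TEXT NOT NULL,
            params      TEXT,
            response    TEXT,
            status_code INTEGER,
            created_at  TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def log_raw_response(conn, marketplace, method, params, response, status_code):
    """Store one request/response pair and return the new row id."""
    cur = conn.execute(
        "INSERT INTO mp_api_raw_log "
        "(marketplace, method, params, response, status_code) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            marketplace,
            method,
            json.dumps(params, ensure_ascii=False),
            json.dumps(response, ensure_ascii=False),
            status_code,
        ),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_log_table(conn)
row_id = log_raw_response(
    conn, "ozon", "/v2/product/list", {"limit": 100}, {"result": {}}, 200
)
```

Logging the response before any parsing means a failed import can be replayed from the log without calling the marketplace again.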
Scenario 2: Public Data Parsing — Competitor Prices and Positions
Here we're collecting data from public marketplace pages (search positions, competitor prices, ratings). There's no official API for this.
Technical approaches:
Direct HTTP requests work for some WB data: certain endpoints (https://card.wb.ru/cards/v2/detail?nm=..., https://catalog.wb.ru/...) return JSON without authorization. This is not an official API, and the response structure can change without notice.
For Ozon, public data is available via https://www.ozon.ru/api/composer-api.bx/page/json/v2?url=/product/{slug}, an internal frontend API that is also undocumented and comes with no stability guarantees.
A headless browser (Puppeteer, Playwright) is needed for JS-rendered pages and for sites with anti-bot protection. WB and Ozon actively use fingerprinting and behavioral analysis. For production parsing you need:
- Residential proxy rotation
- User-Agent, viewport, timing randomization
- Cloudflare/PerimeterX bypass (on both marketplaces)
Legal risks. Parsing public marketplace data is legally ambiguous. Ozon and WB user agreements prohibit automated data collection. IP blocks are standard practice. Seller accounts can be blocked if parsing is linked to them.
Processing and Storing Parsed Data in Bitrix
Competitor data is convenient to store in a HighLoad infoblock: this avoids direct SQL work and provides a ready-made query API. Structure of the HL infoblock for price monitoring:
| UF Field | Type | Purpose |
|---|---|---|
| UF_MARKETPLACE | string | ozon / wb |
| UF_PRODUCT_ID | integer | Marketplace product ID |
| UF_ARTICLE | string | Seller SKU (article number) |
| UF_PRICE | double | Current price |
| UF_PRICE_OLD | double | Original price |
| UF_RATING | double | Rating |
| UF_REVIEWS_COUNT | integer | Number of reviews |
| UF_POSITION | integer | Position in search results |
| UF_SEARCH_QUERY | string | Search query for position |
| UF_COLLECTED_AT | datetime | Collection time |
The block itself is registered via Bitrix\Highloadblock\HighloadBlockTable::add(), and its UF fields are added via CUserTypeEntity.
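On the collector side, one observation maps to one row of this HL infoblock. The sketch below (Python for illustration; the class name PriceObservation and the method to_hl_fields are hypothetical) mirrors the field table above and validates the marketplace code before producing the UF_* field set. Conversion of the timestamp into the date type Bitrix expects is left to the PHP side.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PriceObservation:
    marketplace: str            # "ozon" or "wb"
    product_id: int             # marketplace product ID
    article: str
    price: float                # current price
    price_old: Optional[float]  # original price, if shown
    rating: Optional[float]
    reviews_count: Optional[int]
    position: Optional[int]     # position in search results
    search_query: Optional[str]
    collected_at: datetime

    def to_hl_fields(self):
        """Map the observation to the UF_* fields of the HL infoblock."""
        if self.marketplace not in ("ozon", "wb"):
            raise ValueError(f"unexpected marketplace: {self.marketplace}")
        return {
            "UF_MARKETPLACE": self.marketplace,
            "UF_PRODUCT_ID": self.product_id,
            "UF_ARTICLE": self.article,
            "UF_PRICE": self.price,
            "UF_PRICE_OLD": self.price_old,
            "UF_RATING": self.rating,
            "UF_REVIEWS_COUNT": self.reviews_count,
            "UF_POSITION": self.position,
            "UF_SEARCH_QUERY": self.search_query,
            "UF_COLLECTED_AT": self.collected_at,
        }
```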
Automatic Response: Rules Based on Data
The point of collecting data is to react, not just to watch. An agent checks hourly: if a competitor's price for a comparable product is X% or more below ours, the module changes the price via CCatalogProduct::Update() or CPrice::Update(). Limits on changes (for example, never dropping below cost) are set in the module settings.
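The rule can be sketched as a pure function that only proposes a price, leaving the actual update to the Bitrix side. This Python illustration makes assumptions that go beyond the text: the gap is measured relative to our price, the new price matches the competitor's price (optionally undercutting it by undercut_pct), and cost is the only floor. The function name propose_price and all of its parameters are hypothetical.

```python
def propose_price(our_price, competitor_price, cost,
                  threshold_pct, undercut_pct=0.0):
    """Return a new price, or None if no change is warranted.

    A change is proposed only when the competitor is at least
    threshold_pct percent below our price; the result never
    drops below cost.
    """
    if our_price <= 0 or competitor_price <= 0:
        return None
    gap_pct = (our_price - competitor_price) / our_price * 100
    if gap_pct < threshold_pct:
        return None
    target = competitor_price * (1 - undercut_pct / 100)
    new_price = max(target, cost)  # floor: never sell below cost
    if new_price >= our_price:
        return None                # the floor leaves no room to move
    return round(new_price, 2)


# Competitor is 10% cheaper, threshold is 5%: match the competitor.
match = propose_price(1000, 900, cost=700, threshold_pct=5)
# Competitor is far below our cost: stop at the cost floor.
floored = propose_price(1000, 600, cost=700, threshold_pct=5)
```

Keeping the rule free of side effects makes it easy to log every proposed change and to run it in a dry-run mode before any price is written.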
Implementation Timeline
| Task | Timeline |
|---|---|
| Integration with official API of one marketplace (orders + stock) | 3–5 weeks |
| Analytics collection via official API (2 marketplaces) | 5–8 weeks |
| Monitoring public prices via HTTP requests (no JS rendering) | 4–6 weeks |
| Full parser with headless browser and anti-bot bypass | 8–14 weeks |
| Automatic repricing system based on competitor data | 10–16 weeks |







