Synthetic Monitoring Setup
Synthetic monitoring simulates user actions in a browser on a schedule, from several geographic locations. Unlike a ping check, which only confirms that a host responds, it exercises real-world scenarios: login flows, data loading, form submission.
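To make the difference from a ping check concrete, here is a minimal Python sketch of a scenario runner. It is an illustration only: `run_scenario`, `ScenarioResult`, and the demo steps are hypothetical names, and the lambdas stand in for real browser actions.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ScenarioResult:
    passed: bool
    failed_step: Optional[str]  # name of the first failing step, if any
    step_timings: List[Tuple[str, float]] = field(default_factory=list)  # (name, seconds)


def run_scenario(steps: List[Tuple[str, Callable[[], bool]]]) -> ScenarioResult:
    """Run steps in order, time each one, and stop at the first failure."""
    timings: List[Tuple[str, float]] = []
    for name, action in steps:
        started = time.perf_counter()
        try:
            ok = bool(action())
        except Exception:
            ok = False  # an exception counts as a failed step
        timings.append((name, time.perf_counter() - started))
        if not ok:
            return ScenarioResult(False, name, timings)
    return ScenarioResult(True, None, timings)


# Hypothetical stand-ins for real browser actions.
demo_steps = [
    ("open login page", lambda: True),
    ("submit credentials", lambda: True),
    ("dashboard visible", lambda: False),  # simulate a broken dashboard
]
result = run_scenario(demo_steps)
print(result.passed, result.failed_step)
```

A ping check would report this site as healthy, because the server still answers; the scenario fails at the step a user would actually notice.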
Datadog Synthetic Tests
# Create a browser test via the API (legacy `datadog` Python client)
from datadog import initialize, api

initialize(
    api_key='YOUR_DATADOG_API_KEY',
    app_key='YOUR_DATADOG_APP_KEY',
)

# create_test takes keyword arguments, so the payload dict is unpacked with **
test = api.Synthetics.create_test(**{
    'name': 'Login Flow - Production',
    'type': 'browser',
    'status': 'live',
    'locations': [
        'aws:eu-west-1',
        'aws:us-east-1',
        'aws:ap-southeast-1',
    ],
    'options': {
        'tick_every': 900,            # every 15 minutes
        'min_failure_duration': 300,  # alert after 5 minutes of downtime
        'min_location_failed': 1,
        'retry': {'count': 2, 'interval': 300},  # interval is in milliseconds
        'device_ids': ['laptop_large'],  # browser tests run on at least one device
        'monitor_options': {
            'notify_audit': False,
            'renotify_interval': 60,  # minutes
        },
    },
    'config': {
        # The start URL is the URL of the initial request
        'request': {'method': 'GET', 'url': 'https://mysite.com/login'},
        'assertions': [],
    },
    # For browser tests, steps sit at the top level of the payload, not under config
    'steps': [
        {
            'type': 'typeText',
            'name': 'Enter email',
            'params': {
                'element': {'type': 'css', 'value': 'input[name="email"]'},
                'value': '[email protected]'
            }
        },
        {
            'type': 'typeText',
            'name': 'Enter password',
            'params': {
                'element': {'type': 'css', 'value': 'input[name="password"]'},
                'value': '{{ MONITOR_PASSWORD }}'  # From secret
            }
        },
        {'type': 'click', 'params': {'element': {'type': 'css', 'value': '[type=submit]'}}},
        {
            'type': 'assertElementPresent',
            'name': 'Dashboard loaded',
            'params': {'element': {'type': 'css', 'value': '[data-testid="dashboard"]'}}
        }
    ],
    'message': 'Login flow failed! @pagerduty-production',
})
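The interaction between `min_failure_duration` and `min_location_failed` is easier to see in code. The Python sketch below is a simplified model of that decision, not Datadog's actual evaluation logic; `should_alert` and the `failing_since` structure are hypothetical.

```python
from typing import Dict, Optional


def should_alert(
    failing_since: Dict[str, Optional[float]],
    now: float,
    min_failure_duration: float = 300,
    min_location_failed: int = 1,
) -> bool:
    """Return True when enough locations have failed for long enough.

    failing_since maps a location to the timestamp (seconds) at which it
    started failing, or None if its last run passed.
    """
    long_failing = [
        loc for loc, since in failing_since.items()
        if since is not None and now - since >= min_failure_duration
    ]
    return len(long_failing) >= min_location_failed


state = {
    "aws:eu-west-1": 1000.0,     # failing since t=1000
    "aws:us-east-1": None,       # healthy
    "aws:ap-southeast-1": None,  # healthy
}
print(should_alert(state, now=1200.0))  # 200 s of failure, below the 300 s window
print(should_alert(state, now=1300.0))  # 300 s of failure in one location
print(should_alert(state, now=1300.0, min_location_failed=2))  # only one location failing
```

With `min_location_failed` set to 1, a single regional outage is enough to page someone; raising it trades sensitivity for fewer alerts caused by one flaky probe location.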
Checkly: Code-First Synthetic Monitoring
// checkly.config.ts
import { defineConfig } from 'checkly';
import { Frequency } from 'checkly/constructs';

export default defineConfig({
  projectName: 'My Site',
  logicalId: 'my-site',
  repoUrl: 'https://github.com/myorg/my-site',
  checks: {
    activated: true,
    muted: false,
    runtimeId: '2024.02',
    frequency: Frequency.EVERY_10M,
    locations: ['eu-west-1', 'us-east-1'],
    tags: ['production'],
    // Pick up check definitions (*.check.ts); each BrowserCheck points at
    // its Playwright spec through its own entrypoint
    checkMatch: '**/__checks__/**/*.check.ts',
  },
});
// __checks__/login.check.ts
import { BrowserCheck, Frequency } from 'checkly/constructs';

export const loginCheck = new BrowserCheck('login-flow', {
  name: 'Login Flow',
  frequency: Frequency.EVERY_10M,
  // Overrides the default locations from checkly.config.ts
  locations: ['eu-west-1', 'us-east-1', 'ap-southeast-1'],
  code: {
    entrypoint: './login.spec.ts',
  },
});
// __checks__/login.spec.ts
import { test, expect } from '@playwright/test';

test('Login flow works', async ({ page }) => {
  await page.goto('https://mysite.com/login');
  await page.fill('[name="email"]', process.env.MONITOR_EMAIL!);
  await page.fill('[name="password"]', process.env.MONITOR_PASSWORD!);
  await page.click('[type="submit"]');
  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.locator('h1')).toContainText('Dashboard');
});
# Deploy checks from CI
npx checkly deploy --preview
npx checkly deploy # Production
Grafana + k6: Synthetic Load Testing
// k6/synthetic-load.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  scenarios: {
    synthetic_monitoring: {
      executor: 'constant-arrival-rate',
      rate: 1, // 1 iteration per minute
      timeUnit: '1m',
      duration: '24h', // Continuous
      preAllocatedVUs: 2,
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<2000'], // 95% requests < 2s
    http_req_failed: ['rate<0.01'], // < 1% errors
  },
};

export default function () {
  // Homepage
  let res = http.get('https://mysite.com');
  check(res, {
    'homepage status 200': (r) => r.status === 200,
    'homepage < 2s': (r) => r.timings.duration < 2000,
  });
  sleep(10);

  // API health
  res = http.get('https://api.mysite.com/health');
  check(res, {
    'api healthy': (r) => r.status === 200 && r.json('status') === 'ok',
  });
}
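What the two thresholds above actually assert can be shown with a few lines of Python. This is a sketch under assumptions: `percentile` uses the nearest-rank method, which may differ slightly from how k6 computes `p(95)`, and both function names are hypothetical.

```python
import math
from typing import List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile; k6 may interpolate differently."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def thresholds_pass(durations_ms: List[float], failed: List[bool]) -> bool:
    """Mirror the two k6 thresholds on a batch of request samples."""
    p95_ok = percentile(durations_ms, 95) < 2000  # mirrors 'p(95)<2000'
    error_rate = sum(failed) / len(failed)        # share of failed requests
    return p95_ok and error_rate < 0.01           # mirrors 'rate<0.01'


# 100 requests: 95 fast, 5 slow, no failures.
durations = [120.0] * 95 + [2500.0] * 5
print(thresholds_pass(durations, [False] * 100))

# One failed request out of 100 is exactly 1%, which is not below the limit.
print(thresholds_pass(durations, [True] + [False] * 99))
```

The p95 threshold tolerates a small tail of slow requests, while the error-rate threshold is strict: the comparison is "less than", so hitting 1% exactly already fails the run.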
Alerts and Escalation
# Grafana Alerting: notifications on failure
alert_rules:
  - name: Synthetic Monitor Failure
    condition: "avg(synthetic_check_success) < 0.9"
    for: 5m
    annotations:
      summary: "Synthetic monitor failing for {{ $labels.check_name }}"
    labels:
      severity: critical
    notifications:
      - pagerduty
      - slack-ops-channel
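The `for: 5m` clause means the condition has to hold continuously before the rule fires. The following Python sketch models that behavior in a simplified way; it is not Grafana's evaluator, and `firing_at` and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def firing_at(
    samples: List[Tuple[float, float]],
    threshold: float = 0.9,
    hold_seconds: float = 300,
) -> Optional[float]:
    """Return the timestamp at which the rule would start firing, or None.

    samples is a time-ordered list of (timestamp_seconds, avg_success), where
    avg_success is the already-averaged success ratio at that evaluation.
    """
    pending_since: Optional[float] = None
    for ts, avg_success in samples:
        if avg_success < threshold:
            if pending_since is None:
                pending_since = ts  # condition just became true
            if ts - pending_since >= hold_seconds:
                return ts           # held for the full 'for' window
        else:
            pending_since = None    # recovery resets the timer
    return None


# One evaluation per minute; a short blip at 60-120 s recovers at 180 s.
evaluations = [(0, 1.0), (60, 0.8), (120, 0.85), (180, 1.0), (240, 0.7),
               (300, 0.6), (360, 0.5), (420, 0.5), (480, 0.4), (540, 0.4)]
print(firing_at(evaluations))
```

The short blip never fires because it recovers before five minutes pass; only the sustained drop that starts at 240 s reaches the hold window.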
Synthetic monitoring metrics:
- Availability (uptime %) from each region
- TTFB (Time to First Byte)
- Page load time
- User scenario execution time
- Number of consecutive failures before alert
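Two of these metrics, per-region availability and consecutive failures, can be derived directly from raw check results. The Python sketch below assumes a hypothetical `CheckRun` record; real tools expose this data in their own formats.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CheckRun:
    region: str
    success: bool
    duration_ms: float


def availability_by_region(runs: List[CheckRun]) -> Dict[str, float]:
    """Percentage of successful runs per region."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for run in runs:
        totals[run.region] += 1
        passes[run.region] += int(run.success)
    return {region: 100.0 * passes[region] / totals[region] for region in totals}


def trailing_failures(runs: List[CheckRun]) -> int:
    """Count consecutive failures at the end of a time-ordered run list."""
    count = 0
    for run in reversed(runs):
        if run.success:
            break
        count += 1
    return count


# Hypothetical sample data, oldest run first.
runs = [
    CheckRun("eu-west-1", True, 840.0),
    CheckRun("us-east-1", True, 910.0),
    CheckRun("eu-west-1", False, 30000.0),
    CheckRun("us-east-1", True, 880.0),
    CheckRun("eu-west-1", False, 30000.0),
    CheckRun("eu-west-1", False, 30000.0),
]
print(availability_by_region(runs))
eu_runs = [r for r in runs if r.region == "eu-west-1"]
print(trailing_failures(eu_runs))
```

Splitting availability by region helps separate a regional network problem from a global outage before anyone is paged.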
Setting up Checkly with browser checks from 3 regions and Slack/PagerDuty alerts takes 1–2 business days.







