Module lifecycle stage: General Availability
Overview
Site monitoring is a blackbox monitoring system built into DOP. It allows you to check the availability of websites, TCP services, and DNS records from external observation points (zones).
The agent is installed on a server outside the cluster, receives configuration from DOP, performs checks, and sends metrics back to DOP. Results are available on dashboards and through the alerting system.
Supported check types: HTTP, TCP/TLS, DNS, Ping (ICMP).
Quick Start
- Make sure site monitoring is enabled (Admin → Settings → web_monitoring_enabled)
- Create a zone: Admin → Zones → New Zone (specify a name, e.g. world)
- Create an agent: Admin → Agents → New Agent (specify a name and zone)
- Copy the agent token from the agent card
- Install the agent on an external server (see Agent Installation)
- Navigate to your project: Project → Web Monitoring → Add Site
- Enter a domain, select zones, add a probe via the “HTTPS Check” preset, and save
Within 1–2 minutes, availability data will appear on dashboards.
Core Concepts
Zones
A zone is a geographic or network observation point. Each agent belongs to one zone. A site can be checked from multiple zones simultaneously for independent availability assessment.
Creating zones: Admin → Zones → New Zone.
Sites
A site is the monitoring target: a domain name or IP address. It belongs to a project. Each site contains one or more probes and shared settings (period, ping, zones, labels).
Probes and Checks
A probe is a single request of a specific type (HTTP, TCP, or DNS) to a site. Each probe contains one or more checks — specific verification conditions (e.g. HTTP status = 200 or response time < 5s).
The agent executes all checks on each probe run and sends a gogomonia_test metric (1 = pass, 0 = fail) for each check.
Labels
Labels are arbitrary Prometheus-compatible key=value pairs added to metrics. They are used for grouping, filtering, and alert management.
Labels are set at three levels:
- Site (site_labels): added to all site metrics
- Site (check_labels): added to all site checks
- Check (check_labels): per-check labels that override site-level ones
Alert management: the label alert=yes (default) enables alerts, alert=no disables them for a specific check.
Key format: Latin letters, digits, underscores; must start with a letter or underscore (/^[a-zA-Z_][a-zA-Z0-9_]*$/).
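The label rules above can be sketched in a few lines of Python. This is an illustration only, not the DOP implementation: the function names are hypothetical, and treating any value other than alert=no as "alerts enabled" is an assumption.

```python
import re

# Key pattern quoted in the text above (anchors replaced by fullmatch).
LABEL_KEY_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_key(key: str) -> bool:
    """Return True if the key satisfies the documented label key format."""
    return LABEL_KEY_RE.fullmatch(key) is not None


def effective_check_labels(site_check_labels: dict, check_labels: dict) -> dict:
    """Merge site-level check_labels with per-check labels; per-check wins."""
    merged = dict(site_check_labels)
    merged.update(check_labels)
    return merged


def alerts_enabled(labels: dict) -> bool:
    """alert=yes is the documented default; alert=no disables alerts.

    Assumption: any value other than "no" leaves alerts enabled.
    """
    return labels.get("alert", "yes") != "no"
```

For example, merging site-level {"alert": "yes", "team": "web"} with a per-check {"alert": "no"} yields {"alert": "no", "team": "web"}, so alerts are off for that check only.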
Check Types
Each probe (HTTP/TCP/DNS) contains one or more checks. The agent executes all checks on each probe run and sends a gogomonia_test metric (1 = pass, 0 = fail) for each check.
All pattern fields (bodyMatch, contentType, answerMatch, location, headers.match, cookies.match) use the /regex/ format (Go RE2-compatible). Adding ! after the closing slash (/pattern/!) inverts the result (NOT match).
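A minimal sketch of how the /regex/ and /regex/! notation can be interpreted. It uses Python's re module rather than Go RE2, so the supported syntax is not identical, and the function names are hypothetical. Unanchored "contains" matching is assumed, in line with the bodyMatch description.

```python
import re


def parse_pattern(pattern: str):
    """Split '/regex/' or '/regex/!' into (compiled regex, inverted flag)."""
    inverted = pattern.endswith("/!")
    body = pattern[:-1] if inverted else pattern
    if len(body) < 2 or not (body.startswith("/") and body.endswith("/")):
        raise ValueError(f"expected /regex/ or /regex/!, got {pattern!r}")
    return re.compile(body[1:-1]), inverted


def pattern_passes(pattern: str, text: str) -> bool:
    """True if text satisfies the pattern, honouring the '!' inversion."""
    regex, inverted = parse_pattern(pattern)
    found = regex.search(text) is not None
    return not found if inverted else found
```

With this sketch, `pattern_passes("/error|maintenance/!", body)` passes only when neither word occurs in the body.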
HTTP Checks
status — HTTP Status Code Check
Verifies that the response code is in the list of allowed codes. Used to ensure server response correctness.
- Field: comma-separated list of codes
- Format: integers 100–599 or regex in slashes
- Examples: 200 · 200, 301, 302 · /2[0-9]{2}/ (any 2xx)
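The allow-list semantics can be illustrated as follows. This is a sketch under two assumptions that the text does not confirm: regex entries contain no commas, and a regex must match the whole status code. The function name is hypothetical.

```python
import re


def status_allowed(allowed: str, code: int) -> bool:
    """Check an HTTP status code against a comma-separated allow list.

    Entries are integers in 100-599 or a regex wrapped in slashes.
    """
    for raw in allowed.split(","):
        entry = raw.strip()
        if len(entry) >= 2 and entry.startswith("/") and entry.endswith("/"):
            # Assumption: the regex must match the full three-digit code.
            if re.fullmatch(entry[1:-1], str(code)):
                return True
        elif entry.isdigit() and 100 <= int(entry) <= 599:
            if int(entry) == code:
                return True
        else:
            raise ValueError(f"invalid status entry: {entry!r}")
    return False
```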
responseTime — Maximum Response Time
Verifies that the request completed faster than the threshold. The check fires in real time: if the response does not arrive within max, the test is marked as fail before the request completes.
- Field: max (maximum duration)
- Format: number + unit: ms, s, m, h
- Examples: 5s · 1500ms · 1m
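As an illustration of the duration format, the sketch below converts a threshold to milliseconds and compares it with a measured time. Integer-only values and a strict "faster than" comparison are assumptions; the names are hypothetical.

```python
import re

# Milliseconds per unit, for the units listed above.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_DURATION_RE = re.compile(r"(\d+)(ms|s|m|h)")


def parse_duration_ms(value: str) -> int:
    """Convert '5s', '1500ms', '1m' or '2h' to milliseconds."""
    match = _DURATION_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid duration: {value!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def response_time_passes(elapsed_ms: float, max_value: str) -> bool:
    """Pass only if the request finished faster than the threshold."""
    return elapsed_ms < parse_duration_ms(max_value)
```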
bodyMatch — Response Body Regex Check
Verifies that the HTTP response body contains (or does not contain) a specified pattern. Useful for checking that the page contains expected content and does not show an error.
- Field: regex
- Format: /regex/ or /regex/! (NOT match)
- Examples: /OK/ · /Welcome to/ · /error|maintenance/! (must not contain)
bodySize — Response Body Size
Verifies that the response body size is not less than the specified threshold. Helps detect empty responses.
- Field: min (minimum size)
- Format: number + unit: b, Kb, Mb, Gb
- Examples: min 100b · min 1Kb · min 10Mb
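A short sketch of the size threshold. The text does not say whether Kb means 1000 or 1024 bytes; binary multiples are assumed here, and the function names are hypothetical.

```python
import re

# Assumption: binary multiples (1 Kb = 1024 bytes); the text does not specify.
_UNIT_BYTES = {"b": 1, "Kb": 1024, "Mb": 1024**2, "Gb": 1024**3}
_SIZE_RE = re.compile(r"(\d+)(b|Kb|Mb|Gb)")


def parse_size_bytes(value: str) -> int:
    """Convert '100b', '1Kb', '10Mb' or '1Gb' to a byte count."""
    match = _SIZE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid size: {value!r}")
    return int(match.group(1)) * _UNIT_BYTES[match.group(2)]


def body_size_passes(body: bytes, min_value: str) -> bool:
    """Pass if the body is not smaller than the minimum size."""
    return len(body) >= parse_size_bytes(min_value)
```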
contentType — Content-Type Header Check
Verifies that the Content-Type header in the response matches the expected value. Used to ensure the server returns the correct content type rather than an error page.
- Field: regex
- Format: /regex/ or /regex/!
- Examples: /text\/html/ · /application\/json/ · /image\/.*/
location — Location Header Check (Redirects)
Verifies that the Location header in the response (for 3xx redirects) matches a pattern. Useful for ensuring a redirect leads to the correct URL.
- Field: regex
- Format: /regex/ or /regex/!
- Examples: /https:\/\/example\.com\// · /\/login$/
sslBasicConstraintsValid — TLS Certificate Check
Comprehensive TLS certificate verification: BasicConstraints correctness (CA certificates must have BasicConstraintsValid=true), leaf certificate validity period (NotBefore / NotAfter), hostname matching (exact and wildcard *.example.com).
- Fields: none (enabled as boolean)
certValidDays — Certificate Expiration
Verifies that the TLS certificate will not expire within the next N days. Used for early detection of expiring certificates.
- Field: min (minimum days until expiration)
- Format: positive integer
- Examples: 30 (warn 30 days ahead) · 7 · 90
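The expiration rule amounts to a date comparison, sketched below. The function name is hypothetical, and how the agent treats the exact boundary (exactly N days left) is not stated, so the inclusive comparison is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cert_valid_days_passes(
    not_after: datetime, min_days: int, now: Optional[datetime] = None
) -> bool:
    """True if the certificate stays valid for at least min_days more days."""
    now = now or datetime.now(timezone.utc)
    # Assumption: exactly min_days remaining still counts as a pass.
    return not_after - now >= timedelta(days=min_days)
```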
sslValidDays — Certificate Expiration (Alternative)
Same as certValidDays. Cannot be used simultaneously with certValidDays.
- Field: min (minimum days until expiration)
- Format: positive integer
headers — Response Headers Check
Verifies that HTTP response headers contain specified values. Each pair: header name + regex for value. All pairs must match (AND logic).
- Fields: name / match pairs
- name: header name (case-insensitive), e.g. Content-Type, X-Request-Id
- match: regex /pattern/ or /pattern/!
- Examples: name X-Request-Id match /^[a-f0-9-]+$/ · name Cache-Control match /no-cache/!
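The AND logic and case-insensitive header names can be sketched like this. The helper names are hypothetical, Python's re stands in for Go RE2, and failing a pair when the header is absent (even for an inverted pattern) is an assumption.

```python
import re


def _matches(pattern: str, value: str) -> bool:
    """Evaluate '/regex/' or '/regex/!' against a header value."""
    inverted = pattern.endswith("/!")
    body = pattern[:-1] if inverted else pattern
    found = re.search(body[1:-1], value) is not None
    return not found if inverted else found


def headers_pass(pairs: list, headers: dict) -> bool:
    """Every name/match pair must hold; header names compare case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for pair in pairs:
        value = lowered.get(pair["name"].lower())
        if value is None:
            # Assumption: a missing header fails the pair.
            return False
        if not _matches(pair["match"], value):
            return False
    return True
```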
cookies — Response Cookies Check
Verifies that HTTP response cookies contain specified values. Each pair: cookie name + regex for value. All pairs must match (AND logic).
- Fields: name / match pairs
- name: cookie name, e.g. session_id, csrf_token
- match: regex /pattern/ or /pattern/!
- Examples: name session_id match /^[a-f0-9]{32}$/ · name lang match /^(en|ru)$/
TCP/TLS Checks
connected — TCP Connection Check
Verifies that the TCP connection (and TLS handshake if TLS is enabled) was established successfully. Basic TCP service availability check.
- Fields: none (enabled as boolean)
responseTime — Connection Establishment Time
Same as HTTP responseTime but measures TCP connection establishment time (and TLS handshake if applicable).
- Field: max (maximum duration)
- Examples: 5s · 2s
sslBasicConstraintsValid — TLS Certificate Check
Same as HTTP sslBasicConstraintsValid. Checks the TCP/TLS connection certificate.
- Fields: none (enabled as boolean)
certValidDays — TLS Certificate Expiration
Same as HTTP certValidDays. Checks the TCP/TLS connection certificate.
- Field: min (days until expiration)
- Examples: 30 · 7
sslValidDays — TLS Certificate Expiration (Alternative)
Same as certValidDays. Cannot be used simultaneously with certValidDays.
DNS Checks
gotAnswer — Non-Empty DNS Response
Verifies that the DNS query returned at least one record. Basic check — the domain resolves.
- Fields: none (enabled as boolean)
answerMatch — DNS Response Regex Check
Verifies that at least one record in the DNS response matches a pattern. The response contains the full record string (e.g. example.com. 300 IN A 1.2.3.4).
- Field: regex
- Format: /regex/ or /regex/!
- Examples: /1\.2\.3\.4/ · /IN A/ · /IN MX.*mail\./
responseTime — DNS Query Time
Same as HTTP responseTime but measures DNS query time.
- Field: max (maximum duration)
- Examples: 2s · 500ms
Ping (ICMP)
Ping is enabled via the ping: true flag at the site level. It works completely independently of probes — it requires neither probes nor checks. The built-in ping.maxRtt test passes if RTT < 500ms (threshold is hardcoded).
Monitoring Setup
Creating Zones
A zone is a geographic or network observation point. Each agent belongs to one zone.
Create a zone: Admin → Zones → New Zone. Specify the zone name.
Creating Sites
Path: Project → Web Monitoring → Add Site.
Two Form Modes
The site creation/editing form operates in two modes.
Simple mode (default for new sites):
- Basic fields shown: host, period, ping, zones
- Adding probes via presets (ready-made templates)
- Available checks are limited (see table below)
- TCP: simplified TLS toggle
Advanced mode (“Advanced mode” toggle at the bottom of the form):
- All additional fields: hosts, probe_all_resolved_ips, site_labels, check_labels, request_defaults
- Full set of check types
- Check grouping
- Per-probe labels and period
- TCP: full TLS configuration (insecureSkipVerify, serverName)
Switching from advanced to simple mode is blocked if the form contains data available only in advanced mode. For existing sites, the mode is chosen automatically based on saved settings.
Available Checks by Mode
| Probe Type | Simple Mode | Advanced Only |
|---|---|---|
| HTTP | status, responseTime, certificate (sslBasicConstraintsValid + certValidDays) | bodyMatch, bodySize, contentType, location, headers, cookies |
| TCP | connected, responseTime, certificate | — |
| DNS | gotAnswer, responseTime | answerMatch |
Check Grouping
In advanced mode, the “Group conditions into checks” toggle is available.
Without grouping (default): each check is a separate verification with its own result (pass/fail) and its own alert.
With grouping: multiple conditions (status, bodyMatch, responseTime, etc.) are combined into a single logical check with a shared name and shared alert. A severity level is assigned to the group.
Example: a group “https-health” may include status=200 + bodyMatch="/OK/" + responseTime max=5s. If any condition is violated, a single shared alert fires.
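The difference between the two modes can be shown with a small sketch. The function names and the dictionary shape are hypothetical; only the pass/fail semantics come from the description above.

```python
def ungrouped_results(conditions: dict) -> dict:
    """Without grouping: each condition is its own check with its own result."""
    return {name: (1 if ok else 0) for name, ok in conditions.items()}


def grouped_result(group_name: str, conditions: dict) -> dict:
    """With grouping: one shared result; any violated condition fails the group."""
    return {group_name: 1 if all(conditions.values()) else 0}
```

For the "https-health" example, conditions {"status": True, "bodyMatch": True, "responseTime": False} produce three separate results without grouping, but a single failing result {"https-health": 0} with grouping.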
Grouping is enabled automatically if the site already contains checks with multiple conditions in a single config.
Site Parameters
| Parameter | Description | Mode |
|---|---|---|
| name | Display name, unique within the project | both |
| host | Domain or IP to monitor | both |
| period | Check interval. Format: \d+(s\|m), max 60s. Example: 20s, 1m | both |
| ping | Enable ICMP ping (true/false) | both |
| zones | Zones to check from. If not specified — checks from all zones | both |
| hosts | Array of specific IPs instead of DNS resolution | adv. |
| probe_all_resolved_ips | Check all resolved IPs, not just the first one | adv. |
| site_labels | Prometheus labels for all site metrics | adv. |
| check_labels | Labels for all site checks | adv. |
| request_defaults | HTTP defaults: scheme, timeout, headers | adv. |
| probes | Array of probes with checks | both |
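The period format from the table above (digits followed by s or m, at most 60s) can be checked with a few lines of Python. This is a sketch, not the validator DOP uses; the function name is hypothetical, and whether a zero period is rejected is not stated, so it is not checked here.

```python
import re

_PERIOD_RE = re.compile(r"(\d+)(s|m)")


def parse_period_seconds(value: str) -> int:
    """Convert a period such as '20s' or '1m' to seconds, enforcing the 60s cap."""
    match = _PERIOD_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid period: {value!r}")
    seconds = int(match.group(1)) * (60 if match.group(2) == "m" else 1)
    if seconds > 60:
        raise ValueError(f"period exceeds 60s: {value!r}")
    return seconds
```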
Presets
Presets are ready-made probe templates available when creating a site. They allow adding a standard configuration with a single click.
| Preset | Probe Type | Created Checks |
|---|---|---|
| HTTPS Check | HTTP GET (https) | status 200, sslBasicConstraintsValid, certValidDays min=30 |
| TCP/TLS | TCP with TLS | connected, certValidDays min=30 |
| DNS | DNS A query | gotAnswer |
| HTTP Redirect | HTTP GET | status 301/302, location |
Project Requirements
For site monitoring to work, the project must have an API token with “Metrics: write” (write_metrics) permission. This token is created automatically when the project is created.
If the token is deleted, a warning is displayed in the Web Monitoring section, and the agent does not receive the configuration for this project.
Agent Installation
Requirements
- Linux (amd64)
- Network access to DOP API (api.<domain>)
Manual Installation
- Create an agent in Admin UI: Admin → Agents → New Agent (specify name and zone)
- Copy the token from the agent card
- On the target server:
# Download binary
curl -o /usr/local/bin/gogomonia-agent \
"https://update.<domain>/gogomonia-agent/storage/latest/linux/amd64/gogomonia-agent"
chmod +x /usr/local/bin/gogomonia-agent
# Config
mkdir -p /usr/local/gogomonia-agent/etc
cat > /usr/local/gogomonia-agent/etc/config.yaml <<EOF
dop_api_url: "https://api.<domain>"
EOF
# Token
echo -n "<TOKEN>" > /usr/local/gogomonia-agent/agent.token
# Run
/usr/local/bin/gogomonia-agent
Agent Configuration
File config.yaml:
dop_api_url: "https://api.dop.example.com"
proxy:
url: "" # HTTP proxy (optional)
no_verify_tls: false
Environment Variables
| Variable | Description | Default |
|---|---|---|
| GOGOMONIA_CONFIG_PATH | Path to config.yaml | /usr/local/gogomonia-agent/etc/config.yaml |
| GOGOMONIA_TOKEN_PATH | Path to token file | /usr/local/gogomonia-agent/agent.token |
| GOGOMONIA_UUID_PATH | Path to UUID file | /usr/local/gogomonia-agent/agent.uuid |
| GOGOMONIA_PUSH_URL | Override for remote write URL | (from state API) |
Git-Based Configuration Management
The converter utility allows storing site configurations in a git repository in a declarative YAML format with Go template support, and synchronizing them with DOP via the API. This is a GitOps approach: validate on merge request, sync on merge to master.
Download
Binary: webmon-cli.
curl -o /usr/local/bin/webmon-cli \
"https://update.<domain>/webmon-cli/storage/latest/linux/amd64/webmon-cli"
chmod +x /usr/local/bin/webmon-cli
Commands
The first argument is the command: sync or validate. Parameters are set via environment variables only.
sync — Synchronize Configs to DOP
Creates, updates, and (with DELETE_ORPHANS=true) deletes sites in DOP based on YAML file contents.
export DOP_API_URL="https://api.<domain>/api/v1/observability"
export CONFIGS_DIR=/path/to/configs
export DOP_WEB_MON_TOKEN_MYPROJECT="<API-token>"
webmon-cli sync
validate — Local Validation
Validates configs without applying: field formats, allowed keys, deprecated fields, duplicates. Loads the zone list from the API for validation.
export DOP_API_URL="https://api.<domain>/api/v1/observability"
export CONFIGS_DIR=/path/to/configs
webmon-cli validate
Important: DOP_API_URL is required for validate as well (zone list loading). A token is not required for validate.
Global Environment Variables
| Variable | Description | Default |
|---|---|---|
| DOP_API_URL | Backend API URL (required, valid URL) | - |
| CONFIGS_DIR | Path to the config tree | . |
| LOG_LEVEL | Log level (zap): debug, info, warn, error | info |
| COMMON_TEMPLATES | Directories with shared .tpl templates, comma-separated | - |
| DRY_RUN | Plan without applying (true / 1 / yes) | false |
| DELETE_ORPHANS | Delete sites in DOP that are absent from configs | false |
| SKIP_DUPLICATES | Skip duplicates instead of raising an error | false |
Project Mapping and Tokens
Environment variables are set for each project directory. The suffix is derived from the folder name: all non-alphanumeric characters are replaced with _, and the result is uppercased.
Examples: folder myapp → MYAPP, folder my-client → MY_CLIENT.
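The suffix rule can be expressed as a one-line substitution. The sketch below assumes "alphanumeric" means ASCII letters and digits; the function names are hypothetical and are not part of webmon-cli.

```python
import re


def folder_suffix(folder: str) -> str:
    """Replace every non-alphanumeric character with '_' and uppercase the result."""
    # Assumption: only ASCII letters and digits count as alphanumeric.
    return re.sub(r"[^A-Za-z0-9]", "_", folder).upper()


def token_variable(folder: str) -> str:
    """Name of the token environment variable for a project folder."""
    return f"DOP_WEB_MON_TOKEN_{folder_suffix(folder)}"
```

For the folder my-client this yields DOP_WEB_MON_TOKEN_MY_CLIENT.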
| Variable | Description | For sync |
|---|---|---|
| DOP_WEB_MON_TOKEN_{FOLDER} | API token with “Site Monitoring” permission | required |
| DOP_WEB_MON_SPACE_{FOLDER} | Override workspace name | default = folder name |
| DOP_WEB_MON_PROJECT_{FOLDER} | Override project name | default = folder name |
Repository Structure
{team-repo}/
  .helpers/helpers.tpl       # shared Go templates (Sprig)
  {project}/conf.yaml        # site configuration
  .gitlab-ci.yml
- Project = any folder containing *.yaml files
- Templates (.tpl) are loaded from COMMON_TEMPLATES and from .helpers/ directories up the path from the project folder
- Templates use Go text/template syntax with the Sprig library
- Configs are rendered as templates before YAML parsing
conf.yaml Format
# Unit-level labels: added to all sites
siteLabels:                        # optional, map[string]string
  env: production
checkLabels:                       # optional, map[string]string
  alert: "yes"
# Default values for all sites
defaults:                          # optional
  ping: "true"                     # "true"/"yes" → enable ping
  period: "20s"                    # format: \d+(s|m), max 60s
  request:                         # HTTP request defaults
    scheme: https                  # http / https
    timeout: 10s                   # format: \d+(ms|s|m|h)
    headers:                       # array of {name, value}
      - name: User-Agent
        value: "monitoring/1.0"
# List of monitoring targets (required, at least 1)
targets:
  - name: example-main             # REQUIRED: unique, no spaces/quotes/parens/pipes
    host: example.com              # REQUIRED: domain or IP
    hosts:                         # optional: fixed IPs instead of DNS
      - 93.184.216.34
    probeAllResolvedIPs: "true"    # optional: check all IPs from DNS resolve
    period: 30s                    # period override
    ping: true                     # ping override
    siteLabels:                    # merge with unit-level
      env: staging
    checkLabels:                   # merge with unit-level
      alert: "no"
    request:                       # override request defaults
      scheme: https
      timeout: 5s
    probes:                        # >= 1 probe if ping=false
      - name: https-main           # REQUIRED: unique, no spaces/quotes/parens/pipes
        period: 30s                # optional: probe period override
        siteLabels:                # optional: merge with site-level
          component: api
        checkLabels:               # optional: merge with site-level
          alert: "yes"
        request:                   # full HTTP configuration
          scheme: https
          path: /healthz
          method: GET              # GET, POST, PUT, DELETE, HEAD, OPTIONS, PATCH
          port: 443                # 1-65535
          timeout: 5s              # max 300s
          headers:
            - name: Accept
              value: application/json
          basicAuth:
            username: user
            password: pass
          data: '{"key":"value"}'
          disableHttp2ForHttps: "true"
        checks:                    # REQUIRED: >= 1 check
          - name: status-ok        # REQUIRED: unique, no spaces/quotes/parens/pipes
            status: [200, 301]
          - name: response-time
            responseTime:
              max: 5s
          - name: body-check
            bodyMatch: "/OK/"
          - name: size-check
            bodySize:
              min: 100b
          - name: content-type
            contentType: "/text\\/html/"
          - name: ssl-check
            sslBasicConstraintsValid: true
          - name: cert-expiry
            certValidDays:
              min: 30
            checkLabels:
              alert: "no"
Legacy format: instead of targets, you can use sites with site (instead of host/name) and servers (instead of hosts). Both formats are supported, but targets is recommended. Mixing site + host or servers + hosts in one entry will cause an error.
TCP probe (instead of request):
tcp:
  port: 443
  timeout: 5s
  tls:
    insecureSkipVerify: false
    serverName: example.com
DNS probe (instead of request):
dns:
  type: A                  # A, AAAA, CNAME, MX, NS, TXT, SOA, SRV, PTR, CAA
  query: example.com       # REQUIRED: domain to query
  timeout: 2s
Validation rules:
- name and host are required fields of each target
- If ping: false and probes is empty, validation fails
- Probe type is determined by key presence: tcp: → tcp, dns: → dns, otherwise http
- Zones are determined from siteLabels.zone; empty zones = all zones
- certVerify / sslVerify are deprecated and will produce an error
- Unknown keys at any level cause a validation error
- certValidDays and sslValidDays cannot be used simultaneously
- bodySize supports only min (not max)
- responseTime supports only max (not min)
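Several of the validation rules above can be sketched against an already parsed target (a Python dict). This is an illustration, not the webmon-cli implementation: the function names and error messages are hypothetical, unknown-key and zone checks are omitted, and "simultaneously" is assumed to mean within the same probe.

```python
def probe_type(probe: dict) -> str:
    """Probe type by key presence: tcp -> tcp, dns -> dns, otherwise http."""
    if "tcp" in probe:
        return "tcp"
    if "dns" in probe:
        return "dns"
    return "http"


def validate_target(target: dict) -> list:
    """Return a list of error strings for one target (empty list = valid)."""
    errors = []
    for field in ("name", "host"):
        if not target.get(field):
            errors.append(f"{field} is required")

    ping = target.get("ping")
    ping_on = ping is True or str(ping).lower() in ("true", "yes")
    probes = target.get("probes") or []
    if not ping_on and not probes:
        errors.append("probes must not be empty when ping is disabled")

    for probe in probes:
        checks = probe.get("checks") or []
        used = {key for check in checks for key in check}
        # Assumption: the two expiration checks conflict within one probe.
        if {"certValidDays", "sslValidDays"} <= used:
            errors.append("certValidDays and sslValidDays cannot be combined")
        for deprecated in ("certVerify", "sslVerify"):
            if deprecated in used:
                errors.append(f"{deprecated} is deprecated")
        for check in checks:
            if "max" in (check.get("bodySize") or {}):
                errors.append("bodySize supports only min")
            if "min" in (check.get("responseTime") or {}):
                errors.append("responseTime supports only max")
    return errors
```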
CI/CD Integration
The converter is designed for use in GitLab CI: validate on MR, sync on merge to master.
stages:
  - validate
  - apply
variables:
  DOP_CONVERTER_URL: "https://update.${DOP_BASE_DOMAIN}/webmon-cli/storage/${DOP_CONVERTER_VERSION}/linux/amd64/webmon-cli"
.download_converter: &download_converter
  image: alpine:${DOP_WEB_MONITORING_ALPINE_VERSION}
  before_script:
    - apk add --no-cache curl
    - curl -sSL -o webmon-cli "${DOP_CONVERTER_URL}"
    - chmod +x webmon-cli
validate:
  <<: *download_converter
  stage: validate
  variables:
    DOP_API_URL: "${DOP_API_URL}"
    CONFIGS_DIR: "."
  script:
    - ./webmon-cli validate
apply:
  <<: *download_converter
  stage: apply
  variables:
    DOP_API_URL: "${DOP_API_URL}"
    CONFIGS_DIR: "."
  script:
    - ./webmon-cli sync
  rules:
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
CI/CD Variables:
| Variable | Description | Where to Set |
|---|---|---|
| DOP_WEB_MON_TOKEN_{FOLDER} | API token for the project directory | Settings → CI/CD → Variables (masked) |
| DOP_API_URL | Backend API URL | Settings → CI/CD → Variables |
| DOP_BASE_DOMAIN | Base domain for downloading the converter | Group-level Variables |
| DOP_CONVERTER_VERSION | Converter version (e.g. 0.4.1 or latest) | Group-level Variables |
| DOP_WEB_MONITORING_ALPINE_VERSION | Alpine image version for CI jobs | Group-level Variables |
Important: replace {FOLDER} with the project directory name after sanitization. Folder myapp → DOP_WEB_MON_TOKEN_MYAPP.
Metrics
Metric names use the technical prefix gogomonia_ — this is the internal component name, preserved for backward compatibility.
Probe Metrics (per-project tenant)
| Metric | Type | Description |
|---|---|---|
| gogomonia_test | gauge | Check/test result (1=pass, 0=fail) |
| gogomonia_probe_request_succeeded | gauge | Probe request succeeded (1/0) |
| gogomonia_http_requests_sent | counter | HTTP requests sent |
| gogomonia_http_responses_received | counter | HTTP responses received |
| gogomonia_http_response_time_ms_sum | counter | Sum of HTTP response time (ms) |
| gogomonia_http_body_size_bytes | counter | Sum of HTTP body sizes |
| gogomonia_http_requests_in_flight | gauge | HTTP requests in progress |
| gogomonia_tcp_requests_sent | counter | TCP requests |
| gogomonia_tcp_responses_received | counter | TCP successful connections |
| gogomonia_tcp_response_time_ms_sum | counter | TCP response time |
| gogomonia_tcp_requests_in_flight | gauge | TCP in progress |
| gogomonia_dns_requests_sent | counter | DNS requests |
| gogomonia_dns_responses_received | counter | DNS responses |
| gogomonia_dns_response_time_ms_sum | counter | DNS response time |
| gogomonia_dns_requests_in_flight | gauge | DNS in progress |
| gogomonia_ping_rtt_ms | gauge | Ping RTT (ms) |
| gogomonia_ping_requests_sent | counter | Ping requests sent |
| gogomonia_ping_responses_received | counter | Ping responses received |
| gogomonia_site_resolved | gauge | DNS resolve status (1=ok, 0=fail) |
| gogomonia_site_resolve_tries_sum | counter | DNS resolve attempts |
| gogomonia_site_resolve_success_sum | counter | Successful DNS resolves |
| gogomonia_server_ips | gauge | Resolved IP addresses for the site |
Labels: instance, zone, project, site, server, probe, check, test
Agent Self-Monitoring Metrics
| Metric | Type | Description |
|---|---|---|
| gogomonia_agent_info | gauge | Agent information (always 1) |
| gogomonia_agent_up | gauge | Heartbeat (1 = alive) |
| gogomonia_agent_start_time_seconds | gauge | Unix timestamp of start |
| gogomonia_agent_config_version | gauge | Applied config version |
| gogomonia_agent_config_sites_count | gauge | Number of sites in config |
| gogomonia_agent_config_probes_count | gauge | Number of probes |
| gogomonia_agent_config_checks_count | gauge | Number of checks |
| gogomonia_agent_config_last_fetch_seconds | gauge | Time of last config fetch |
| gogomonia_agent_config_fetch_errors_total | counter | Config fetch errors |
| gogomonia_agent_checks_ok | gauge | Number of checks with OK status |
| gogomonia_agent_checks_failed | gauge | Number of checks with FAIL status |
| gogomonia_agent_checks_no_data | gauge | Number of checks with no data |
| gogomonia_agent_remote_write_errors_total | counter | Remote write errors |
| gogomonia_agent_remote_write_samples_total | counter | Samples sent |
| gogomonia_agent_remote_write_queue_length | gauge | Remote write buffer size |
| gogomonia_agent_dns_resolve_errors_total | counter | DNS resolve errors |
| gogomonia_agent_http_request_errors_total | counter | HTTP request errors |
| gogomonia_agent_tcp_request_errors_total | counter | TCP request errors |
| gogomonia_agent_dns_query_errors_total | counter | DNS query errors |
| gogomonia_agent_log_push_queue_length | gauge | Log push queue length |
| gogomonia_agent_log_push_errors_total | counter | Log push errors |
Labels: instance, zone, uuid, hostname, version
Dashboards
When site monitoring is enabled, the following dashboards automatically appear in the project:
- Overall Status — summary table of all project sites: availability, response times, check status
- Zone Status — zone-level details: per-zone metrics, availability comparison across zones
- Site by Instance — site drill-down: per-server/per-instance metrics, response time scatter plot
Additional system dashboards (Go Runtime, Logs) are available for monitoring agent health.