diff --git a/docs/backup-restore.md b/docs/backup-restore.md new file mode 100644 index 0000000..ce0cdff --- /dev/null +++ b/docs/backup-restore.md @@ -0,0 +1,223 @@ +# OpenBMP Backup & Restore + +How to back up and restore the OpenBMP PostgreSQL database, what the backup +covers, and what it deliberately does not. + +--- + +## What `scripts/pg-backup.sh` backs up + +The script runs `pg_dump` inside the `obmp-psql` container and produces a +single timestamped, compressed, custom-format dump of the **entire `openbmp` +database**: + +- All BMP/BGP operational tables — `routers`, `bgp_peers`, `ip_rib`, + `base_attrs`, `global_ip_rib`, `l3vpn_rib`, the `ls_*` link-state tables. +- All history / TimescaleDB hypertables — `ip_rib_log`, `peer_event_log`, + `stat_reports`, and the `stats_*` aggregate tables. +- Reference / enrichment data — `geo_ip`, `info_asn`, `info_route`, + `rpki_validator`, `pdb_exchange_peers`. +- Schema objects — table definitions, indexes, views, functions, triggers, + enum types, and the TimescaleDB hypertable configuration. + +The dump is taken against a **live database** — `pg_dump` uses an MVCC +snapshot, so no downtime and no service stop is required. It is written +atomically (to a `.partial` file, renamed on success) so an interrupted run +never leaves a dump that looks valid but is truncated. + +Output: `${OBMP_DATA_ROOT:-/var/openbmp}/backups/openbmp-YYYYMMDD-HHMMSS.dump` + +### TimescaleDB note + +The OpenBMP database uses TimescaleDB hypertables (`ip_rib_log`, +`peer_event_log`, the `stats_*` tables, with compression policies). +**A `pg_dump` logical backup restores hypertables correctly** — the dump +captures the `_timescaledb_catalog` metadata, and on restore the hypertable +structure, chunks, and compression settings are recreated. No special flags +are needed for the dump. 
The only requirement is that the **restore target +has the TimescaleDB extension available** — which the `openbmp/postgres` +image provides, so restoring into a fresh `obmp-psql` works out of the box. + +--- + +## Scheduling + +Make the script executable once: + +```bash +chmod +x scripts/pg-backup.sh +``` + +Add a cron entry (`crontab -e`) — daily at 02:30, logging to a file: + +```cron +30 2 * * * OBMP_DATA_ROOT=/var/openbmp /home/user/obmp-docker/scripts/pg-backup.sh >> /var/openbmp/backups/pg-backup.log 2>&1 +``` + +The cron user must be able to reach the Docker daemon — run it as a user in +the `docker` group, or as root. A systemd timer is an equally valid +alternative. + +### Configuration + +All settings are environment variables with sensible defaults: + +| Variable | Default | Purpose | +|----------|---------|---------| +| `OBMP_DATA_ROOT` | `/var/openbmp` | Base data dir; backups go to `${OBMP_DATA_ROOT}/backups` | +| `OBMP_BACKUP_DIR` | (unset) | Explicit backup dir, overrides the default | +| `OBMP_PG_CONTAINER` | `obmp-psql` | Postgres container name | +| `OBMP_PG_DB` | `openbmp` | Database name | +| `OBMP_PG_USER` | `openbmp` | Database user | +| `OBMP_BACKUP_RETENTION_DAYS` | `14` | Dumps older than this are pruned each run | + +Retention only prunes files matching the script's own `openbmp-*.dump` +naming pattern — nothing else in the directory is touched. + +### Production recommendations + +- **Copy dumps off-host.** A local backup does not survive host loss. Sync + the backup directory to object storage / a backup server (e.g. nightly + `rclone`, `restic`, or your existing ISP backup tooling). +- **Size the backup volume** — at production scale (~100–150M NLRIs) the + dump can be tens of GB even compressed. See `docs/production-sizing.md`. +- **Test restores periodically** — an untested backup is not a backup. +- For tighter RPO than once-daily logical dumps, consider PostgreSQL + continuous archiving / PITR (WAL archiving + `pg_basebackup`). 
That is out + of scope for this script but worth planning for a production deployment. + +--- + +## Restore procedure + +This restores a dump into a **fresh, empty** `obmp-psql` database. Restoring +over a populated database risks conflicts — start clean. + +### 1. Stop the writers + +Stop the services that write to the database so nothing races the restore: + +```bash +docker compose -p obmp stop psql-app collector +``` + +Leave `obmp-psql` running. + +### 2. Recreate an empty database + +Drop and recreate the `openbmp` database inside the running container: + +```bash +docker exec -i obmp-psql psql -U openbmp -d postgres <<'EOSQL' +DROP DATABASE IF EXISTS openbmp; +CREATE DATABASE openbmp OWNER openbmp; +EOSQL +``` + +> Restoring into a **brand-new container**? Bring `obmp-psql` up first and let +> it initialize, but **do not** create the `config/init_db` trigger file — +> the schema comes from the dump, not from psql-app's first-run migration. + +### 3. Restore the dump + +Copy the dump into the container and run `pg_restore`: + +```bash +DUMP=/var/openbmp/backups/openbmp-YYYYMMDD-HHMMSS.dump + +docker cp "${DUMP}" obmp-psql:/tmp/restore.dump + +docker exec -i obmp-psql \ + pg_restore -U openbmp -d openbmp --no-owner --no-privileges \ + --jobs=4 /tmp/restore.dump + +docker exec obmp-psql rm -f /tmp/restore.dump +``` + +- `--no-owner --no-privileges` — the dump was created with the same flags; + objects are recreated owned by the connecting role. +- `--jobs=4` — parallel restore; raise it on a many-core host to speed up the + large `ip_rib` / `ip_rib_log` tables. Custom-format dumps support this. +- Some non-fatal warnings (e.g. about the TimescaleDB extension or existing + objects) are normal. A non-zero exit with only warnings is usually fine — + inspect the output before assuming failure. 
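The `DUMP=` placeholder in step 3 can be filled in automatically. The following is a minimal sketch, assuming dumps follow the `openbmp-YYYYMMDD-HHMMSS.dump` naming described earlier. `latest_dump` is a hypothetical helper and is not part of `scripts/pg-backup.sh`.

```shell
# Sketch: print the path of the newest dump in a backup directory.
# latest_dump is a hypothetical helper, not something the repo ships.
latest_dump() {
  # Default mirrors the documented output location; pass a directory to override.
  local dir="${1:-${OBMP_DATA_ROOT:-/var/openbmp}/backups}"
  local newest="" f
  # Globs expand in sorted order and the YYYYMMDD-HHMMSS stamp sorts
  # chronologically, so the last existing match is the most recent dump.
  # In-progress ".partial" files do not match the pattern and are skipped.
  for f in "$dir"/openbmp-*.dump; do
    [ -e "$f" ] && newest="$f"
  done
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}
```

Usage would look like `DUMP="$(latest_dump)" || echo "no dump found"`, after which the `docker cp` and `pg_restore` commands in step 3 run unchanged.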
+ +Alternatively, stream the restore without `docker cp`: + +```bash +docker exec -i obmp-psql pg_restore -U openbmp -d openbmp \ + --no-owner --no-privileges < "${DUMP}" +``` + +(Streaming via stdin disables `--jobs` parallelism — use `docker cp` for +large dumps.) + +### 4. Verify + +```bash +docker exec -i obmp-psql psql -U openbmp -d openbmp -c " + SELECT (SELECT count(*) FROM routers) AS routers, + (SELECT count(*) FROM bgp_peers) AS peers, + (SELECT count(*) FROM ip_rib) AS rib_rows;" +``` + +Confirm hypertables came back: + +```bash +docker exec -i obmp-psql psql -U openbmp -d openbmp -c " + SELECT hypertable_name FROM timescaledb_information.hypertables;" +``` + +### 5. Restart the writers + +```bash +docker compose -p obmp start collector psql-app +``` + +The collector reconnects to the routers' BMP sessions and psql-app resumes +consuming from Kafka. Live state catches up from the routers. + +--- + +## What is NOT covered + +This backup is **PostgreSQL only**. The following are out of scope and need +their own handling: + +- **Kafka data is transient.** The `obmp-kafka` topics are a short-retention + pipeline buffer (`KAFKA_LOG_RETENTION_MINUTES: 720` — 12 hours). They are + not a system of record and do not need backing up. After a restore, routers + re-send BMP and the pipeline refills naturally. + +- **InfluxDB telemetry has its own backup.** The gNMI streaming-telemetry + data lives in `obmp-influxdb` (bucket `telemetry`), not in PostgreSQL. + `pg_dump` does not touch it. 
Back it up separately with the Influx CLI: + + ```bash + # Backup + docker exec obmp-influxdb influx backup /var/lib/influxdb2/backup \ + --token "$INFLUXDB_ADMIN_TOKEN" + docker cp obmp-influxdb:/var/lib/influxdb2/backup \ + /var/openbmp/backups/influxdb-$(date +%Y%m%d) + + # Restore + docker cp /var/openbmp/backups/influxdb-YYYYMMDD \ + obmp-influxdb:/var/lib/influxdb2/restore + docker exec obmp-influxdb influx restore /var/lib/influxdb2/restore \ + --token "$INFLUXDB_ADMIN_TOKEN" + ``` + + Telemetry is also less critical than BMP data (30-day retention, + data-plane counters) — back it up if you need historical telemetry to + survive a host loss; otherwise the 30-day window simply re-fills. + +- **Grafana** — dashboards and datasources are provisioned from files in the + repo (`obmp-grafana/provisioning/` and `obmp-grafana/dashboards/`), so they + are already version-controlled in git. The Grafana database under + `${OBMP_DATA_ROOT}/grafana` (users, preferences, manually-created + dashboards, alert state) is *not* covered by this script — back up that + directory separately if it holds anything not reproducible from the repo. + +- **Configuration & secrets** — `.env`, `docker-compose.yml`, and the + `${OBMP_DATA_ROOT}/config` directory. Keep these in version control / + your secrets manager. diff --git a/docs/security-hardening.md b/docs/security-hardening.md new file mode 100644 index 0000000..f8f356e --- /dev/null +++ b/docs/security-hardening.md @@ -0,0 +1,488 @@ +# OpenBMP Production Security Hardening + +A prioritized checklist for hardening the OpenBMP Docker stack before exposing +it to a production ISP network of 40 full-table-edge routers. Work top to +bottom — items are ordered roughly by risk reduction per unit effort. + +This document **recommends** changes. It does not modify `docker-compose.yml` +or any running service. Apply the changes in a maintenance window and test. 
+ +> Threat model in brief: the stack ingests BMP from production routers, stores +> the full DFZ in PostgreSQL, and exposes Grafana to operators. The crown +> jewels are (a) the database, (b) the Grafana admin plane, and (c) the BMP +> ingest port. Everything below protects one of those three. + +--- + +## Priority 0 — Credentials (do this first) + +Every service currently ships with the placeholder credential `openbmp` and +related defaults are committed in `docker-compose.yml`: + +| Service | Setting | Current value | +|---------|---------|---------------| +| PostgreSQL | `POSTGRES_USER` / `POSTGRES_PASSWORD` | `openbmp` / `openbmp` | +| psql-app | `POSTGRES_PASSWORD` | `openbmp` | +| whois | `POSTGRES_PASSWORD` | `openbmp` | +| Grafana | `GF_SECURITY_ADMIN_PASSWORD` | `openbmp` | +| InfluxDB | `DOCKER_INFLUXDB_INIT_PASSWORD` | `openbmp123` | +| InfluxDB | `DOCKER_INFLUXDB_INIT_ADMIN_TOKEN` | `openbmp-telemetry-token` | +| Grafana datasource | `secureJsonData.password` | `openbmp` (in `openbmp-ds.yml`) | + +### 0.1 Move every secret to `.env` (or a secrets manager) + +`.env` is git-ignored. As a minimum, replace the hardcoded literals in +`docker-compose.yml` with `${VAR}` references and define them in `.env`: + +```env +# .env — never commit this file +POSTGRES_PASSWORD= +GF_SECURITY_ADMIN_PASSWORD= +INFLUXDB_ADMIN_PASSWORD= +INFLUXDB_ADMIN_TOKEN= +``` + +```yaml +# docker-compose.yml (recommended edit — operator applies) + grafana: + environment: + - GF_SECURITY_ADMIN_PASSWORD=${GF_SECURITY_ADMIN_PASSWORD:?set in .env} + psql: + environment: + - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?set in .env} +``` + +The `:?` form makes the stack fail fast if a secret is missing rather than +silently falling back to a default. 
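Compose borrows the `${VAR:?message}` form from POSIX shell parameter expansion, so the fail-fast behaviour can be tried in a shell before touching the stack. The following is a minimal sketch. `check_env` is a hypothetical preflight helper, and the two variable names are taken from the `.env` example above.

```shell
# Sketch: fail fast when a required secret is unset OR empty.
# check_env is a hypothetical helper. Its body is a subshell, written
# with ( ... ), so a failed expansion ends only the subshell and the
# caller just sees a non-zero exit status.
check_env() (
  : "${POSTGRES_PASSWORD:?set in .env}"
  : "${GF_SECURITY_ADMIN_PASSWORD:?set in .env}"
  echo "all secrets present"
)
```

Because the colon form treats an empty value as missing, a `.env` that still contains the blank `POSTGRES_PASSWORD=` template line fails the check just as an absent variable does.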
+ +Generate strong values: + +```bash +openssl rand -base64 32 # passwords +openssl rand -hex 32 # tokens +``` + +### 0.2 For a real production deployment, use a secrets manager + +`.env` on disk is better than committed literals, but it is still a +plaintext file readable by anyone in the `docker` group. For production: + +- **Docker Compose secrets** (`secrets:` block, files mounted at + `/run/secrets/...`) — the lowest-friction upgrade; keep the secret files + outside the repo, `chmod 600`, owned by root. +- **HashiCorp Vault**, **AWS Secrets Manager**, **Bitwarden Secrets**, or your + existing ISP secret store — inject at deploy time via a wrapper that renders + `.env` from the vault and shreds it after `docker compose up`. + +Whatever the choice: rotate all six credentials above on first production +deploy — they have been in git history as `openbmp` and must be considered +compromised. + +### 0.3 Rotate the Grafana datasource password in lockstep + +`obmp-grafana/provisioning/datasources/openbmp-ds.yml` carries +`secureJsonData.password`. It is read at Grafana start. When you change the +PostgreSQL password, update this file too (it supports `$__file{}` and +env-var expansion: `password: $POSTGRES_PASSWORD`) and restart Grafana. + +--- + +## Priority 1 — Network exposure / firewalling + +The host currently publishes these ports to `0.0.0.0`: 5000 (BMP), 5432 +(PostgreSQL), 9092 (Kafka), 3000 (Grafana), 8086 (InfluxDB), 4300 (whois), +9091 (Authelia). Most should not be world-reachable. + +### 1.1 BMP collector (port 5000) — restrict to router management subnets + +The collector accepts a BMP session from any source. A rogue BMP feed can +inject bogus routers/peers/prefixes into the database. Firewall it to the +router management subnets only. 
+ +`nftables` example (preferred on modern hosts): + +```nft +# /etc/nftables.conf — adjust subnets to your router management ranges +table inet obmp { + chain input { + type filter hook input priority 0; policy accept; + + # BMP ingest — only from router management subnets + tcp dport 5000 ip saddr { 10.100.0.0/24, 10.100.1.0/24 } accept + tcp dport 5000 drop + } +} +``` + +`iptables` equivalent: + +```bash +iptables -A INPUT -p tcp --dport 5000 -s 10.100.0.0/24 -j ACCEPT +iptables -A INPUT -p tcp --dport 5000 -s 10.100.1.0/24 -j ACCEPT +iptables -A INPUT -p tcp --dport 5000 -j DROP +``` + +> Traffic to container-published ports is forwarded, so it traverses Docker's +> `DOCKER-USER` chain rather than `INPUT`. Put the rules in `DOCKER-USER` so +> Docker does not bypass them. Use `-I` for every rule and insert the `DROP` +> first: `-A` would append it after any rule already in the chain (Docker +> commonly ends `DOCKER-USER` with a catch-all `RETURN`), where it may never +> match: +> ```bash +> # Each -I lands at the top, so insert in reverse: DROP first, allows last +> iptables -I DOCKER-USER -p tcp --dport 5000 -j DROP +> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.1.0/24 -j RETURN +> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.0.0/24 -j RETURN +> ``` + +### 1.2 PostgreSQL (5432), Kafka (9092), InfluxDB (8086), whois (4300) + +None of these need to be reachable from outside the stack: + +- **PostgreSQL** — only `psql-app`, `whois`, and `grafana` connect, all on the + Compose network. Bind the published port to loopback only, or drop the + `ports:` mapping entirely: + ```yaml + # docker-compose.yml — psql service + ports: + - "127.0.0.1:5432:5432" # localhost only; or remove entirely + ``` +- **Kafka 9092** — see Priority 2. +- **InfluxDB 8086** — only Grafana and Telegraf use it; bind to loopback or + drop the mapping (Telegraf uses host networking and reaches it via + localhost; Grafana reaches it on the Compose network). +- **whois 4300** — expose only if you actually offer a public whois service; + otherwise bind to loopback. + +For anything that genuinely must be reachable, restrict by source with the +firewall pattern from 1.1. 
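After rebinding or removing the port mappings, it helps to confirm that nothing is still published on all interfaces. The following is a minimal sketch that filters the output of `docker ps --format '{{.Names}}\t{{.Ports}}'`. `world_exposed` is a hypothetical helper, and the address prefixes it matches (`0.0.0.0:`, `:::`, `[::]:`) are an assumption about how the Docker CLI prints wildcard bindings.

```shell
# Sketch: list container ports that are published on every interface.
# world_exposed is a hypothetical helper. It reads "<name><TAB><ports>"
# lines on stdin, as printed by:
#   docker ps --format '{{.Names}}\t{{.Ports}}'
world_exposed() {
  awk -F'\t' '{
    # The Ports column is a comma-separated list of bindings.
    n = split($2, parts, ", ")
    for (i = 1; i <= n; i++)
      # Wildcard IPv4 (0.0.0.0) or IPv6 (:: or [::]) host address.
      if (parts[i] ~ /^(0\.0\.0\.0|\[::\]|::):/) print $1, parts[i]
  }'
}
```

Running `docker ps --format '{{.Names}}\t{{.Ports}}' | world_exposed` should then print only the bindings that are intentionally public, such as the BMP port guarded by the firewall rules in 1.1.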
+ +### 1.3 Grafana (3000) — keep it behind Authelia + +Authelia already fronts Grafana (the `auth` profile + `GF_AUTH_PROXY_*` +settings). Make that the *only* path: + +- Bind Grafana's published port to loopback: `127.0.0.1:3000:3000`, and let + the reverse proxy / Authelia terminate TLS and reach it internally. +- Do **not** leave port 3000 directly reachable — `GF_AUTH_PROXY_ENABLED=true` + trusts the `Remote-User` header, so any client that can reach 3000 directly + and set that header bypasses authentication entirely. + +--- + +## Priority 2 — Kafka transport security + +Kafka is currently **PLAINTEXT** and advertises a host-IP listener: + +```yaml +KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://${HOST_IP}:9092 +KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT +``` + +The `obmp-kafka:29092` listener is internal to the Compose network and is the +only one the collector and psql-app use. The `PLAINTEXT_HOST://...:9092` +listener exists only for outside access and is not needed by the core stack. + +**Recommended (simplest, most secure): remove the host listener.** If nothing +outside the Compose network consumes Kafka, drop the `9092` port mapping and +the `PLAINTEXT_HOST` advertised listener so Kafka is reachable only on the +internal Docker network: + +```yaml + kafka: + # remove the - "9092:9092" ports entry + environment: + KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092 + KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT + KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT +``` + +**If external Kafka access is genuinely required** (e.g. a separate analytics +consumer, or the split-host architecture in `production-sizing.md` where +Kafka and the DB are on different hosts), do **not** leave it PLAINTEXT on a +routed network. 
Enable SASL_SSL on the external listener: + +```yaml +KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,SASL_SSL://${HOST_IP}:9092 +KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,SASL_SSL:SASL_SSL +KAFKA_SASL_ENABLED_MECHANISMS: SCRAM-SHA-512 +KAFKA_SSL_KEYSTORE_LOCATION: /etc/kafka/secrets/kafka.keystore.jks +KAFKA_SSL_KEYSTORE_PASSWORD: ${KAFKA_KEYSTORE_PASSWORD} +KAFKA_SSL_KEY_PASSWORD: ${KAFKA_KEY_PASSWORD} +KAFKA_SSL_TRUSTSTORE_LOCATION: /etc/kafka/secrets/kafka.truststore.jks +KAFKA_SSL_TRUSTSTORE_PASSWORD: ${KAFKA_TRUSTSTORE_PASSWORD} +``` + +Keep the internal `PLAINTEXT://obmp-kafka:29092` listener for the collector +and psql-app — intra-Compose traffic on a private bridge does not need TLS and +adding SASL there means re-configuring both clients. At minimum, never publish +a PLAINTEXT Kafka listener on an IP that routes beyond the host. + +--- + +## Priority 3 — PostgreSQL hardening + +### 3.1 Change the default `openbmp` / `openbmp` credentials + +Covered in Priority 0. Note that `POSTGRES_USER`/`POSTGRES_PASSWORD` only take +effect when the data directory is initialized. To rotate on an existing +database, change the password in SQL and update every consumer: + +```bash +docker exec -it obmp-psql psql -U openbmp -d openbmp \ + -c "ALTER ROLE openbmp WITH PASSWORD '';" +``` + +Then update `POSTGRES_PASSWORD` for `psql-app` and `whois`, the +`secureJsonData.password` in `openbmp-ds.yml`, and restart those services. + +### 3.2 Create a least-privilege role for Grafana + +Grafana only needs to read. Do not let it connect as the owning role: + +```sql +CREATE ROLE grafana_ro LOGIN PASSWORD ''; +GRANT CONNECT ON DATABASE openbmp TO grafana_ro; +GRANT USAGE ON SCHEMA public TO grafana_ro; +GRANT SELECT ON ALL TABLES IN SCHEMA public TO grafana_ro; +ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO grafana_ro; +``` + +Point `openbmp-ds.yml` at `grafana_ro`. 
This contains a Grafana compromise to +read-only and blocks SQL-panel writes. + +### 3.3 Restrict `pg_hba.conf` + +The default OpenBMP image is permissive (`host all all all md5` or similar). +Tighten it so only the stack's own subnet can connect, and require +`scram-sha-256`: + +```conf +# pg_hba.conf (inside the obmp-psql container / mounted) +# TYPE DATABASE USER ADDRESS METHOD +local all all scram-sha-256 +host openbmp openbmp 172.16.0.0/12 scram-sha-256 # Docker bridge range +host openbmp grafana_ro 172.16.0.0/12 scram-sha-256 +hostssl openbmp openbmp 0.0.0.0/0 scram-sha-256 # only if remote DB host +# reject everything else +host all all 0.0.0.0/0 reject +``` + +Identify the actual Compose network subnet with +`docker network inspect obmp_default` and scope `ADDRESS` to it. Reload with +`docker exec obmp-psql psql -U openbmp -c "SELECT pg_reload_conf();"`. + +> `scram-sha-256` requires `password_encryption = scram-sha-256` in +> `postgresql.conf` and that passwords were set/rotated *after* that change. + +### 3.4 Enable SSL/TLS + +The Grafana datasource already requests `sslmode: "require"` — but the server +must actually present a certificate. In `postgresql.conf`: + +```conf +ssl = on +ssl_cert_file = '/var/lib/postgresql/server.crt' +ssl_key_file = '/var/lib/postgresql/server.key' +``` + +Generate a cert (self-signed is acceptable for an internal DB; use your +internal CA if you have one): + +```bash +openssl req -new -x509 -days 825 -nodes -text \ + -out server.crt -keyout server.key -subj "/CN=obmp-psql" +chmod 600 server.key # PostgreSQL refuses a world-readable key +``` + +Mount both files into the container's data directory. For the strongest +posture, move clients to `sslmode: verify-full` once a proper CA chain is in +place. This is most important if PostgreSQL runs on a separate host (the +split-host architecture in `production-sizing.md`) — intra-host Compose +traffic is lower-risk but TLS is still recommended. 
+ +### 3.5 Limit listen addresses + +If PostgreSQL must accept connections from another host (split-host layout), +keep `listen_addresses` scoped — do not leave it at `*` if a single interface +suffices: + +```conf +listen_addresses = 'localhost,172.18.0.1' # loopback + Docker bridge gateway +``` + +On a single-host deployment, drop the `5432` port mapping entirely (1.2) so +the listener is reachable only on the Compose network. + +--- + +## Priority 4 — Drop `privileged: true` on the `psql` service + +```yaml + psql: + privileged: true # <-- remove or replace + shm_size: 1536m + sysctls: + - net.ipv4.tcp_keepalive_intvl=30 + - net.ipv4.tcp_keepalive_probes=5 + - net.ipv4.tcp_keepalive_time=180 +``` + +**Why it is a risk:** `privileged: true` gives the container *all* Linux +capabilities, disables seccomp/AppArmor confinement, and grants access to all +host devices. A compromise of PostgreSQL — the process most exposed to +untrusted route data — would then be a near-complete host compromise. This is +the single largest container-isolation gap in the stack. + +**Why it is probably there:** PostgreSQL needs adequate shared memory and +benefits from the TCP keepalive `sysctls`. The compose file already sets +`shm_size: 1536m` and the `sysctls:` list explicitly — both of which Docker +applies *without* needing privileged mode. So `privileged: true` is most +likely a leftover, not a hard requirement. + +**Recommended action — test without it:** + +1. In a maintenance window, remove `privileged: true` and start the service. +2. Confirm PostgreSQL starts, the namespaced `sysctls` apply + (`docker exec obmp-psql sysctl net.ipv4.tcp_keepalive_time`), and shared + memory is honored (`docker exec obmp-psql cat /proc/meminfo | grep Shmem`, + and watch for `could not resize shared memory segment` errors in the log). +3. If everything is healthy, leave it removed. 
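The log check in step 2 can be scripted so it is repeatable across maintenance windows. The following is a minimal sketch, assuming the PostgreSQL server log is reachable through `docker logs obmp-psql` (an assumption about how the image logs). `shm_resize_errors` is a hypothetical helper.

```shell
# Sketch: count occurrences of the shared-memory error named in step 2.
# shm_resize_errors is a hypothetical helper; it reads log text on stdin
# and prints the number of matching lines (0 when there are none).
shm_resize_errors() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'could not resize shared memory segment' || true
}
```

A usage example would be `docker logs obmp-psql 2>&1 | shm_resize_errors`. A count above zero after removing `privileged: true` suggests revisiting `shm_size` before concluding that privileged mode is required.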
+ +If a specific capability turns out to be needed, add only that one instead of +going fully privileged: + +```yaml + psql: + # privileged: true <-- removed + shm_size: 1536m + cap_drop: + - ALL + cap_add: + - CHOWN + - SETUID + - SETGID + - DAC_OVERRIDE # add only capabilities proven necessary by testing + sysctls: + - net.ipv4.tcp_keepalive_intvl=30 + - net.ipv4.tcp_keepalive_probes=5 + - net.ipv4.tcp_keepalive_time=180 +``` + +The `sysctls:` block stays — those are namespaced and do not require +privileged mode. + +--- + +## Priority 5 — Container hardening (defense in depth) + +Apply across services after the higher-priority items. Test each service +individually — `read_only` in particular will surface paths a service writes +to that then need explicit `tmpfs` mounts. + +### 5.1 `no-new-privileges` + +Prevents a process inside a container from gaining privileges via setuid +binaries. Safe to apply to every service: + +```yaml + security_opt: + - no-new-privileges:true +``` + +### 5.2 Drop capabilities + +Most of these services need almost no Linux capabilities. Start from zero and +add back only what breaks: + +```yaml + cap_drop: + - ALL +``` + +- `grafana`, `whois`, `portal`, `zookeeper` — typically run fine with + `cap_drop: [ALL]`. +- `collector`, `kafka`, `psql`, `psql-app` — drop ALL, then add back any + capability proven necessary (see Priority 4 for `psql`). +- `traffic-gen*` legitimately need `NET_RAW`/`NET_ADMIN` (Scapy) — leave those + `cap_add` entries; they are already minimal. + +### 5.3 Read-only root filesystem + +Make the root filesystem immutable where the service only writes to known +volumes: + +```yaml + grafana: + read_only: true + tmpfs: + - /tmp + # /var/lib/grafana is already a bind mount — writes go there, not to rootfs + + portal: + read_only: true # nginx:alpine static site; add tmpfs for nginx + tmpfs: + - /tmp + - /var/cache/nginx + - /var/run +``` + +`read_only` is straightforward for `grafana`, `portal`, and `whois`. 
It is +trickier for `psql`, `kafka`, and `zookeeper` (they write to data volumes but +also expect a writable rootfs in places) — test individually and add `tmpfs` +mounts for any write paths, or skip `read_only` for those and rely on +`cap_drop` + `no-new-privileges`. + +### 5.4 Pin and scan images + +Images are already version-pinned (`grafana:9.1.7`, `cp-kafka:7.1.1`, +`openbmp/postgres:2.2.1`, etc.) — good. Add periodic vulnerability scanning: + +```bash +trivy image openbmp/postgres:2.2.1 +trivy image grafana/grafana:9.1.7 +``` + +Note Grafana 9.1.7 is old; review Grafana security advisories and plan an +upgrade path. Track CVEs for the pinned Confluent and OpenBMP images too. + +### 5.5 Resource limits + +Every service already has a `mem_limit`. For production also set `cpus:` (or +`deploy.resources.limits`) so a runaway query or ingest burst cannot starve +the host — this also mitigates local denial-of-service. See +`docs/production-sizing.md` for target values. + +--- + +## Priority 6 — Authelia / access control + +Authelia fronts Grafana (ROADMAP C5). For production: + +- Enforce **TOTP / 2FA** for all operator accounts; do not allow `one_factor` + for the Grafana route. +- Set short session timeouts and an inactivity expiry in the Authelia config. +- Use strong, unique passwords; back the user store with your IdP / LDAP if + available rather than the file backend. +- Ensure Authelia's own secrets (`jwt_secret`, `session.secret`, + `storage.encryption_key`) are strong and stored as secrets, not literals. +- Confirm the reverse proxy strips any client-supplied `Remote-User` header + before Authelia sets it — otherwise the auth-proxy trust model is bypassable + (see 1.3). 
+ +--- + +## Quick checklist + +- [ ] Rotate all six default credentials; remove literals from compose, move to `.env` / secrets manager +- [ ] Update `openbmp-ds.yml` datasource password to match +- [ ] Firewall BMP port 5000 to router management subnets (`DOCKER-USER` chain) +- [ ] Bind 5432 / 8086 / 4300 to loopback or drop the port mappings +- [ ] Bind Grafana 3000 to loopback; reach it only via Authelia +- [ ] Remove the Kafka `PLAINTEXT_HOST` listener + 9092 mapping (or enable SASL_SSL if external access needed) +- [ ] Create `grafana_ro` least-privilege DB role; repoint the datasource +- [ ] Tighten `pg_hba.conf`; require `scram-sha-256` +- [ ] Enable PostgreSQL `ssl = on` with a server certificate +- [ ] Test removing `privileged: true` from `psql`; replace with specific `cap_add` if needed +- [ ] Add `security_opt: [no-new-privileges:true]` to all services +- [ ] Add `cap_drop: [ALL]` and add back only required capabilities +- [ ] Add `read_only: true` + `tmpfs` to `grafana` / `portal` / `whois` +- [ ] Add `cpus:` limits per service +- [ ] Scan images with `trivy`; plan a Grafana upgrade off 9.1.7 +- [ ] Enforce TOTP and short sessions in Authelia diff --git a/obmp-grafana/provisioning/alerting/contact-points.yaml b/obmp-grafana/provisioning/alerting/contact-points.yaml new file mode 100644 index 0000000..352d9e0 --- /dev/null +++ b/obmp-grafana/provisioning/alerting/contact-points.yaml @@ -0,0 +1,71 @@ +# OpenBMP — Grafana contact points & notification policy provisioning +# Grafana 9.1.7 (apiVersion: 1) +# +# Defines WHERE alert notifications go (contact points) and WHICH alerts go +# there (the notification policy tree). Pairs with obmp-alerts.yaml in this +# directory. +# +# ---------------------------------------------------------------------- +# OPERATOR REVIEW — this file ships with PLACEHOLDERS. Fill them in. 
+# ---------------------------------------------------------------------- +# * The 'obmp-ops' contact point below has BOTH an email and a webhook +# receiver as examples. Delete whichever you do not use and fill in real +# values for the one you keep. +# * EMAIL requires Grafana SMTP to be configured (the [smtp] section of +# grafana.ini, or GF_SMTP_* env vars on the obmp-grafana container). +# Without working SMTP the email receiver silently fails. +# * WEBHOOK url: point it at your alerting system (Slack incoming webhook, +# PagerDuty Events API, Mattermost, an internal handler, etc.). +# * After editing, restart Grafana and verify under +# Alerting > Contact points > (test). +# ---------------------------------------------------------------------- + +apiVersion: 1 + +# --- Contact points ---------------------------------------------------- +contactPoints: + - orgId: 1 + name: obmp-ops + receivers: + # ---- Email receiver (requires Grafana SMTP configured) ---- + - uid: obmp-ops-email + type: email + settings: + # REPLACE with the real NOC / on-call distribution address(es). + # Comma-separate multiple recipients. + addresses: noc@example.net + singleEmail: false + disableResolveMessage: false + + # ---- Webhook receiver (Slack / PagerDuty / internal handler) ---- + # Delete this block if you only use email. + - uid: obmp-ops-webhook + type: webhook + settings: + # REPLACE with your real webhook endpoint. + url: https://hooks.example.net/services/REPLACE-ME + httpMethod: POST + disableResolveMessage: false + +# --- Notification policy tree ----------------------------------------- +# The root policy routes every alert from obmp-alerts.yaml to 'obmp-ops'. +# Sub-routes split by the `severity` label so critical alerts can page +# faster / repeat sooner than warnings. +policies: + - orgId: 1 + receiver: obmp-ops + # Group alerts that share these labels into a single notification. + group_by: ['alertname', 'service'] + # Timing for the default (warning-ish) path. 
+ group_wait: 30s + group_interval: 5m + repeat_interval: 4h + routes: + # Critical alerts (peer down, router BMP down): notify fast, repeat + # more often until resolved. + - receiver: obmp-ops + matchers: + - severity = critical + group_wait: 10s + group_interval: 2m + repeat_interval: 1h diff --git a/obmp-grafana/provisioning/alerting/obmp-alerts.yaml b/obmp-grafana/provisioning/alerting/obmp-alerts.yaml new file mode 100644 index 0000000..bb5bcd5 --- /dev/null +++ b/obmp-grafana/provisioning/alerting/obmp-alerts.yaml @@ -0,0 +1,270 @@ +# OpenBMP — Grafana unified-alerting rule provisioning +# Grafana 9.1.7 (apiVersion: 1) +# +# Provisioned alert rules for the OpenBMP BGP-monitoring stack. They query the +# PostgreSQL datasource (uid: obmp_postgres) and fire on BGP peer/router +# session loss, peer flap storms, and RPKI-invalid routes. +# +# ---------------------------------------------------------------------- +# DEPLOYMENT +# ---------------------------------------------------------------------- +# This file is read by Grafana from /etc/grafana/provisioning/alerting/. +# The compose stack bind-mounts ${OBMP_DATA_ROOT}/grafana/provisioning into +# the container, so copy this directory there and restart Grafana: +# +# cp -r obmp-grafana/provisioning/alerting ${OBMP_DATA_ROOT}/grafana/provisioning/ +# docker compose -p obmp restart grafana +# +# Pair it with contact-points.yaml (in this directory) for notifications. +# +# ---------------------------------------------------------------------- +# OPERATOR REVIEW — fields you should check before relying on these +# ---------------------------------------------------------------------- +# * folderUID: '1001' — reuses the existing 'OBMP-Base' dashboard folder so +# the rules have a home in the UI. Change it to a dedicated alerting +# folder UID if you prefer; the folder must already exist in Grafana. +# * datasourceUid: obmp_postgres — confirmed correct for this stack. 
+# * Thresholds and `for:` durations below are reasonable starting points. +# Tune them against your production baseline (40 full-table routers will +# have a different normal flap/churn profile than the lab). +# * The reduce/threshold expression UIDs (B, C) and refIds are internal to +# each rule; do not rename them without updating the matching references. +# * Alert-rule provisioning YAML is intricate. These definitions are +# intentionally minimal and well-commented. After first load, open each +# rule in the Grafana UI (Alerting > Alert rules) and confirm it +# evaluates without error before depending on it for paging. +# ---------------------------------------------------------------------- + +apiVersion: 1 + +groups: + - orgId: 1 + name: OpenBMP BGP Health + folder: OBMP-Base + # How often every rule in this group is evaluated. + interval: 1m + rules: + + # ------------------------------------------------------------------ + # (a) BGP peer down within the last 15 minutes + # ------------------------------------------------------------------ + # bgp_peers.state is an enum ('up'/'down'); .timestamp is the last + # state-change time. A peer whose state is 'down' AND changed within + # the last 15 min indicates a recent session loss. 
+ - uid: obmp-peer-down + title: BGP Peer Down (recent) + condition: C + for: 5m + data: + - refId: A + relativeTimeRange: { from: 600, to: 0 } + datasourceUid: obmp_postgres + model: + refId: A + datasource: { type: postgres, uid: obmp_postgres } + format: table + rawSql: > + SELECT count(*)::float8 AS value + FROM bgp_peers + WHERE state = 'down' + AND timestamp > (now() AT TIME ZONE 'utc') - interval '15 minutes'; + - refId: B + datasourceUid: __expr__ + model: + refId: B + type: reduce + datasource: { type: __expr__, uid: __expr__ } + expression: A + reducer: last + - refId: C + datasourceUid: __expr__ + model: + refId: C + type: threshold + datasource: { type: __expr__, uid: __expr__ } + expression: B + # Fire when one or more peers went down in the last 15 min. + conditions: + - evaluator: { type: gt, params: [0] } + labels: + severity: critical + service: bmp + annotations: + summary: One or more BGP peers went down in the last 15 minutes + description: > + {{ $values.B }} BGP peer(s) are in state 'down' with a state + change within the last 15 minutes. Check the OBMP peer + inventory and the affected routers. + + # ------------------------------------------------------------------ + # (b) Peer flap storm — >5 down-events for one peer in 1 hour + # ------------------------------------------------------------------ + # peer_event_log records every peer state transition. Counting 'down' + # events per peer over the last hour detects a flapping session even + # if the peer is currently 'up'. The inner query groups per peer; the + # outer takes the worst offender's count. 
+ - uid: obmp-peer-flap-storm + title: BGP Peer Flap Storm + condition: C + for: 0m + data: + - refId: A + relativeTimeRange: { from: 3600, to: 0 } + datasourceUid: obmp_postgres + model: + refId: A + datasource: { type: postgres, uid: obmp_postgres } + format: table + rawSql: > + SELECT coalesce(max(c), 0)::float8 AS value + FROM ( + SELECT count(*) AS c + FROM peer_event_log + WHERE state = 'down' + AND timestamp > (now() AT TIME ZONE 'utc') - interval '1 hour' + GROUP BY peer_hash_id + ) s; + - refId: B + datasourceUid: __expr__ + model: + refId: B + type: reduce + datasource: { type: __expr__, uid: __expr__ } + expression: A + reducer: last + - refId: C + datasourceUid: __expr__ + model: + refId: C + type: threshold + datasource: { type: __expr__, uid: __expr__ } + expression: B + # >5 down-events for a single peer within 1h = flap storm. + conditions: + - evaluator: { type: gt, params: [5] } + labels: + severity: warning + service: bmp + annotations: + summary: A BGP peer is flapping (more than 5 resets in the last hour) + description: > + At least one peer has logged {{ $values.B }} 'down' events in + peer_event_log within the last hour. Investigate link/session + instability on the affected peer. + + # ------------------------------------------------------------------ + # (c) RPKI-invalid routes present + # ------------------------------------------------------------------ + # ip_rib has no RPKI column on this schema, so validity is derived by + # joining against rpki_validator (ROA cache, refreshed by the psql-app + # RPKI cron). A route is "invalid" when a covering ROA exists for the + # prefix but NO ROA matches its origin AS. + # + # NOTE: rpki_validator is empty until ENABLE_RPKI=1 has run at least + # once (every ~2h). Until then this rule correctly reports 0. 
+ - uid: obmp-rpki-invalid + title: RPKI-Invalid Routes Present + condition: C + for: 10m + data: + - refId: A + relativeTimeRange: { from: 600, to: 0 } + datasourceUid: obmp_postgres + model: + refId: A + datasource: { type: postgres, uid: obmp_postgres } + format: table + rawSql: > + SELECT count(*)::float8 AS value + FROM ip_rib r + WHERE r.iswithdrawn = false + AND r.origin_as IS NOT NULL + AND EXISTS ( + SELECT 1 FROM rpki_validator v + WHERE r.prefix <<= v.prefix + AND masklen(v.prefix) <= r.prefix_len + ) + AND NOT EXISTS ( + SELECT 1 FROM rpki_validator v2 + WHERE r.prefix <<= v2.prefix + AND r.prefix_len BETWEEN masklen(v2.prefix) AND v2.prefix_len_max + AND v2.origin_as = r.origin_as + ); + - refId: B + datasourceUid: __expr__ + model: + refId: B + type: reduce + datasource: { type: __expr__, uid: __expr__ } + expression: A + reducer: last + - refId: C + datasourceUid: __expr__ + model: + refId: C + type: threshold + datasource: { type: __expr__, uid: __expr__ } + expression: B + # Any RPKI-invalid route is worth surfacing. Raise the param + # (e.g. to 10) if you expect a steady-state baseline of + # invalids and only want to alert on spikes. + conditions: + - evaluator: { type: gt, params: [0] } + labels: + severity: warning + service: routing-security + annotations: + summary: RPKI-invalid routes are present in the RIB + description: > + {{ $values.B }} route(s) in ip_rib are RPKI-invalid (a covering + ROA exists but none matches the route's origin AS). Possible + mis-origination or hijack — review the RPKI Validation dashboard. + + # ------------------------------------------------------------------ + # (d) Router BMP session down + # ------------------------------------------------------------------ + # routers.state is the BMP session state for each monitored router. + # 'down' means the router's BMP feed to the collector has dropped.
+ - uid: obmp-router-bmp-down + title: Router BMP Session Down + condition: C + for: 5m + data: + - refId: A + relativeTimeRange: { from: 600, to: 0 } + datasourceUid: obmp_postgres + model: + refId: A + datasource: { type: postgres, uid: obmp_postgres } + format: table + rawSql: > + SELECT count(*)::float8 AS value + FROM routers + WHERE state = 'down'; + - refId: B + datasourceUid: __expr__ + model: + refId: B + type: reduce + datasource: { type: __expr__, uid: __expr__ } + expression: A + reducer: last + - refId: C + datasourceUid: __expr__ + model: + refId: C + type: threshold + datasource: { type: __expr__, uid: __expr__ } + expression: B + # Any router with a down BMP session. + conditions: + - evaluator: { type: gt, params: [0] } + labels: + severity: critical + service: bmp + annotations: + summary: One or more routers have a down BMP session + description: > + {{ $values.B }} router(s) are in BMP state 'down' — the + collector is no longer receiving BMP from them. Check the + router BMP config and reachability to the collector on port 5000. diff --git a/scripts/pg-backup.sh b/scripts/pg-backup.sh new file mode 100755 index 0000000..10e9b50 --- /dev/null +++ b/scripts/pg-backup.sh @@ -0,0 +1,105 @@ +#!/usr/bin/env bash +# +# pg-backup.sh — logical backup of the OpenBMP PostgreSQL database. +# +# Performs a `pg_dump` of the `openbmp` database inside the obmp-psql +# container, writes a timestamped compressed dump to a backup directory, +# and prunes dumps older than the configured retention. +# +# Usage: +# ./pg-backup.sh +# +# Configuration (environment variables, all optional): +# OBMP_DATA_ROOT Base data dir. Default: /var/openbmp +# Backups go to ${OBMP_DATA_ROOT}/backups unless +# OBMP_BACKUP_DIR is set. +# OBMP_BACKUP_DIR Explicit backup directory. Overrides the default. +# OBMP_PG_CONTAINER Postgres container name. Default: obmp-psql +# OBMP_PG_DB Database name. Default: openbmp +# OBMP_PG_USER Database user. 
Default: openbmp +# OBMP_BACKUP_RETENTION_DAYS Prune dumps older than N days. Default: 14 +# +# Output format: +# pg_dump custom format (-Fc), which pg_dump compresses itself by default. +# Restore with `pg_restore` — see docs/backup-restore.md. +# +# This script is safe to run repeatedly (each run adds a new dump). It does +# not stop the database; pg_dump takes a consistent MVCC snapshot of a live DB. +# +# Make it executable once: +# chmod +x scripts/pg-backup.sh +# +# ---------------------------------------------------------------------- +# Scheduling via cron +# ---------------------------------------------------------------------- +# Run `crontab -e` and add (daily at 02:30, log to a file): +# +# 30 2 * * * OBMP_DATA_ROOT=/var/openbmp /home/user/obmp-docker/scripts/pg-backup.sh >> /var/openbmp/backups/pg-backup.log 2>&1 +# +# The script must be able to reach the Docker daemon, so run it as a user +# in the `docker` group (or root). For systemd-based hosts a +# systemd timer is an equally good alternative to cron. +# ---------------------------------------------------------------------- + +set -euo pipefail + +# --- Configuration ----------------------------------------------------- +OBMP_DATA_ROOT="${OBMP_DATA_ROOT:-/var/openbmp}" +BACKUP_DIR="${OBMP_BACKUP_DIR:-${OBMP_DATA_ROOT}/backups}" +PG_CONTAINER="${OBMP_PG_CONTAINER:-obmp-psql}" +PG_DB="${OBMP_PG_DB:-openbmp}" +PG_USER="${OBMP_PG_USER:-openbmp}" +RETENTION_DAYS="${OBMP_BACKUP_RETENTION_DAYS:-14}" + +TIMESTAMP="$(date +%Y%m%d-%H%M%S)" +DUMP_NAME="openbmp-${TIMESTAMP}.dump" +DUMP_PATH="${BACKUP_DIR}/${DUMP_NAME}" +DUMP_TMP="${DUMP_PATH}.partial" + +log() { printf '%s [pg-backup] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"; } +fail() { log "ERROR: $*" >&2; exit 1; } + +# --- Pre-flight checks ------------------------------------------------- +command -v docker >/dev/null 2>&1 || fail "docker command not found in PATH" + +if !
docker inspect -f '{{.State.Running}}' "${PG_CONTAINER}" 2>/dev/null | grep -q true; then + fail "container '${PG_CONTAINER}' is not running" +fi + +mkdir -p "${BACKUP_DIR}" || fail "cannot create backup directory ${BACKUP_DIR}" + +# --- Backup ------------------------------------------------------------ +# Write to a .partial file first, then atomically rename on success so a +# crashed/interrupted run never leaves a truncated dump that looks valid. +log "starting backup of database '${PG_DB}' from container '${PG_CONTAINER}'" + +if docker exec "${PG_CONTAINER}" \ + pg_dump -U "${PG_USER}" -d "${PG_DB}" -Fc --no-owner --no-privileges \ + > "${DUMP_TMP}"; then + mv -f "${DUMP_TMP}" "${DUMP_PATH}" +else + rm -f "${DUMP_TMP}" + fail "pg_dump failed; no backup written" +fi + +DUMP_SIZE="$(du -h "${DUMP_PATH}" | cut -f1)" +log "backup complete: ${DUMP_PATH} (${DUMP_SIZE})" + +# --- Prune old backups ------------------------------------------------- +# Only prune files matching our own naming pattern, so nothing else in the +# directory (logs, manual dumps) is touched. +log "pruning dumps older than ${RETENTION_DAYS} days" +PRUNED=0 +while IFS= read -r -d '' old; do + rm -f "${old}" + log " removed $(basename "${old}")" + PRUNED=$((PRUNED + 1)) +done < <(find "${BACKUP_DIR}" -maxdepth 1 -type f \ + -name 'openbmp-*.dump' -mtime "+${RETENTION_DAYS}" -print0) +log "pruned ${PRUNED} old dump(s)" + +# Also clean up any stale .partial files from previous crashed runs. +find "${BACKUP_DIR}" -maxdepth 1 -type f -name 'openbmp-*.dump.partial' \ + -mtime +1 -delete 2>/dev/null || true + +log "done"
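
Reviewer note: one way to check that the pruning step in `scripts/pg-backup.sh` only touches the script's own dumps is to replay its `find` predicates in a scratch directory. The sketch below is an illustration and not part of this patch. The `prune_old_dumps` helper, the scratch directory, and the file names are hypothetical, and it assumes GNU `find` and GNU `touch -d`, as found on typical Linux Docker hosts.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: same name pattern and age test as the prune step in
# pg-backup.sh, but deleting via -exec instead of a read loop.
# Prints the basename of every file it removes.
prune_old_dumps() {
  find "$1" -maxdepth 1 -type f -name 'openbmp-*.dump' -mtime "+$2" \
    -exec rm -f {} \; -exec basename {} \;
}

# Scratch directory standing in for ${OBMP_DATA_ROOT}/backups.
scratch="$(mktemp -d)"
trap 'rm -rf "${scratch}"' EXIT

# One old dump, one old file that does not match the pattern, one fresh dump.
touch -d '30 days ago' "${scratch}/openbmp-20240101-023000.dump"
touch -d '30 days ago' "${scratch}/manual-export.dump"
touch "${scratch}/openbmp-20240201-023000.dump"

removed="$(prune_old_dumps "${scratch}" 14)"
remaining="$(ls "${scratch}" | sort | tr '\n' ' ')"

echo "removed:   ${removed}"
echo "remaining: ${remaining}"
```

With retention set to 14 days, only the 30-day-old `openbmp-*.dump` file should be reported as removed. The equally old `manual-export.dump` and the fresh dump stay in place, which is the behaviour the retention comment in the script describes.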