Add production sizing guide for 40 full-table-edge routers

Documents compute, memory, and storage requirements for a production deployment: ~100-150M NLRI estimate, 96-128 GB RAM, 16-32 vCPU, 3-5 TB NVMe, a split-host architecture option, PostgreSQL tuning, and a BMP RIB-scope recommendation (Adj-RIB-In only initially).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:06:25 -07:00
commit f1558946ae
parent 960806fc06
1 changed file with 96 additions and 0 deletions
--- a/docs/production-sizing.md
+++ b/docs/production-sizing.md
@@ -0,0 +1,96 @@
 # OpenBMP Production Sizing — 40 Full-Table-Edge Routers
 Sizing guidance for deploying the OpenBMP stack against a production ISP
 network of **40 full-table-edge routers** with gNMI streaming telemetry.
 Derived from the OpenBMP `psql-app` sizing guidance and measured lab behavior.
 ## Workload assumptions
 | Parameter | Value |
 |-----------|-------|
 | Monitored routers | 40, full-table edge |
 | BMP RIB scope | Adj-RIB-In (see recommendation below) |
 | Full feeds per router | ~2–3 eBGP peers carrying the full DFZ |
 | Routes per full feed | ~1.2M (≈1M IPv4 + ~0.2M IPv6) |
 | **Estimated total NLRIs** | **~100–150M** in Adj-RIB-In |
 | Telemetry | gNMI via Telegraf → InfluxDB, ~50–200 interfaces/router, 10 s interval |
 | History retention | `ip_rib_log` 4 weeks, LS logs 4 months, `peer_event_log` 1 year |
 The NLRI estimate (40 routers × ~2.5 full feeds × 1.2M routes ≈ 120M, ranging
 from 96M at 2 feeds to 144M at 3 feeds) places this deployment near the top
 tier of the OpenBMP `psql-app` guidance (150M NLRIs → 64 GB heap).
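
The estimate above is plain multiplication, so it is easy to re-run once the real feed count per router is known. A minimal sketch, using only the figures from the workload table; the function and constant names are illustrative and not part of OpenBMP:

```python
# Sketch of the Adj-RIB-In NLRI estimate from the workload table.
# Names here are illustrative; only the numbers come from the table.
ROUTERS = 40
ROUTES_PER_FULL_FEED = 1_200_000  # ~1M IPv4 + ~0.2M IPv6


def estimate_nlris(routers: int, feeds_per_router: float, routes_per_feed: int) -> int:
    """Total NLRIs held in Adj-RIB-In across all monitored routers."""
    return round(routers * feeds_per_router * routes_per_feed)


if __name__ == "__main__":
    for feeds in (2, 2.5, 3):
        total = estimate_nlris(ROUTERS, feeds, ROUTES_PER_FULL_FEED)
        print(f"{feeds} feeds/router -> {total / 1e6:.0f}M NLRIs")
```

With 2, 2.5, and 3 feeds per router this yields 96M, 120M, and 144M NLRIs, which is where the ~100–150M range comes from.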
 ## BMP RIB scope — recommendation
 **Deploy with Adj-RIB-In only.** It is the OpenBMP default, is what every
 dashboard is built on, and captures the highest-value data — what each peer
 advertises. Alternatives and their cost:
 - **Loc-RIB** — adds a full post-best-path converged table per router
  (~40 × 1.2M ≈ +48M NLRIs). Add later, selectively, only where best-path
  analysis is needed; verify the IOS-XR release supports Loc-RIB BMP.
 - **Adj-RIB-Out** — adds one table per peer that each router advertises to,
  so volume multiplies further. Not recommended for the initial deployment.
 - **Post-policy Adj-RIB-In** — if inbound policy is restrictive this trims
  volume meaningfully; with permissive import it is similar to pre-policy.
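
To judge what a selective Loc-RIB rollout would cost, the increment can be computed per router. A rough sketch, assuming one ~1.2M-route converged table per Loc-RIB-enabled router and the ~120M midpoint Adj-RIB-In baseline; the helper name is hypothetical:

```python
# Sketch: extra NLRIs from enabling Loc-RIB on a subset of routers.
# Assumes one converged table of ~1.2M routes per enabled router.
ROUTES_PER_TABLE = 1_200_000
ADJ_RIB_IN_BASELINE = 120_000_000  # midpoint of the ~100-150M estimate


def loc_rib_increment(routers_with_loc_rib: int, routes_per_table: int = ROUTES_PER_TABLE) -> int:
    """Additional NLRIs stored when Loc-RIB is enabled on the given routers."""
    return routers_with_loc_rib * routes_per_table


if __name__ == "__main__":
    for enabled in (5, 40):
        extra = loc_rib_increment(enabled)
        share = 100 * extra / ADJ_RIB_IN_BASELINE
        print(f"Loc-RIB on {enabled} routers: +{extra / 1e6:.0f}M NLRIs (+{share:.0f}% of baseline)")
```

Enabling Loc-RIB on all 40 routers adds about 48M NLRIs (roughly 40% over the midpoint baseline), while a 5-router subset adds about 6M.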
 ## Compute & memory
 | Component | Lab today | Production target | Rationale |
 |-----------|-----------|-------------------|-----------|
 | **Total RAM** | 31 GB | **96–128 GB** | psql-app heap 48–64 GB + PostgreSQL shared_buffers/cache + Kafka 4–8 GB + InfluxDB + Grafana + collector |
 | **CPU** | 8 cores | **16–32 vCPU** | PostgreSQL is CPU-bound under full-table churn; the lab `psql` container already sustains ~287% CPU (≈3 cores) with 18 routers |
 | `psql-app` JVM heap (`MEM`) | 3 GB | **48–64 GB** | OpenBMP guidance: 4 GB ≈ 10M NLRIs, 64 GB ≈ 150M NLRIs |
 | `psql-app` container `mem_limit` | 4 GB | **heap + ~8 GB** | Set `PSQL_APP_MEM_LIMIT` above the JVM heap |
 | `psql` container `mem_limit` | 6 GB | **48–64 GB** | Set `PSQL_MEM_LIMIT`; PostgreSQL wants ~25% as `shared_buffers` and the rest for OS cache |
 | `kafka` container `mem_limit` | 4 GB | **8–12 GB** | Set `KAFKA_MEM_LIMIT`; full-table initial dumps from 40 routers are bursty |
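
The heap row in the table above gives two guidance points (4 GB ≈ 10M NLRIs, 64 GB ≈ 150M NLRIs). One way to pick a heap for an intermediate NLRI count is to interpolate between them. The sketch below assumes the relationship is linear and rounds up to an 8 GB step; both are assumptions made here for illustration, not OpenBMP guidance.

```python
import math

# (NLRIs, heap GB) guidance points quoted in the table.
GUIDANCE = ((10e6, 4.0), (150e6, 64.0))
CONTAINER_OVERHEAD_GB = 8  # "heap + ~8 GB" from the mem_limit row


def heap_gb(nlris: float, step_gb: int = 8) -> int:
    """Linearly interpolate the JVM heap, rounded up to a multiple of step_gb.

    Linear scaling and the rounding step are assumptions of this sketch.
    """
    (n0, h0), (n1, h1) = GUIDANCE
    raw = h0 + (nlris - n0) * (h1 - h0) / (n1 - n0)
    return math.ceil(raw / step_gb) * step_gb


def psql_app_mem_limit_gb(nlris: float) -> int:
    """Container limit: interpolated heap plus fixed non-heap overhead."""
    return heap_gb(nlris) + CONTAINER_OVERHEAD_GB


if __name__ == "__main__":
    for nlris in (100e6, 120e6, 150e6):
        print(f"{nlris / 1e6:.0f}M NLRIs: heap {heap_gb(nlris)} GB, "
              f"mem_limit {psql_app_mem_limit_gb(nlris)} GB")
```

Under these assumptions, 100M NLRIs maps to a 48 GB heap and 150M to 64 GB, matching the 48–64 GB production target.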
 ## Storage
 | Store | Lab today | Production target | Notes |
 |-------|-----------|-------------------|-------|
 | **PostgreSQL** | 25 GB | **2–4 TB NVMe SSD** | `ip_rib` current state (~100–150M rows) + `ip_rib_log` history (4-week retention, the dominant grower) + `base_attrs` + `geo_ip` (~7 GB fixed). OpenBMP guidance: 500 GB main + 1 TB TimescaleDB; add headroom. |
 | **Kafka** | 0.2 GB | **100–500 GB** | 12 h retention; sized for full-table initial-dump bursts × 40 routers |
 | **InfluxDB (telemetry)** | minimal | **50–200 GB** | 40 routers × ~50–200 interfaces × 10 s gNMI × 30 d; compresses well |
 | **Total** | — | **~3–5 TB fast NVMe** | Use NVMe; PostgreSQL random-IO under churn is the bottleneck on slow disks |
 Put the PostgreSQL data directory and the TimescaleDB tablespace on NVMe.
 `ip_rib_log` 4-week retention is the main storage tuning knob — revisit once
 production update volume is measured.
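
As a sanity check on the storage table, the per-store ranges can be summed and the telemetry volume expressed as a sample count. A small sketch using only figures from the table; the names are illustrative, and bytes per sample are deliberately left out because InfluxDB compression makes that figure deployment-specific:

```python
# Sketch: sum the per-store production targets and count telemetry samples.
STORES_GB = {
    "postgresql": (2000, 4000),
    "kafka": (100, 500),
    "influxdb": (50, 200),
}


def total_range_tb(stores: dict[str, tuple[int, int]]) -> tuple[float, float]:
    """Return the (low, high) sum of all store targets in TB."""
    low = sum(lo for lo, _ in stores.values())
    high = sum(hi for _, hi in stores.values())
    return low / 1000, high / 1000


def samples_per_metric(routers: int, interfaces: int, interval_s: int, days: int) -> int:
    """Samples retained for one per-interface metric over the retention window."""
    return routers * interfaces * (86_400 // interval_s) * days


if __name__ == "__main__":
    print("Total storage (TB):", total_range_tb(STORES_GB))
    for interfaces in (50, 200):
        n = samples_per_metric(40, interfaces, 10, 30)
        print(f"{interfaces} interfaces/router: {n:,} samples per metric")
```

The component ranges sum to about 2.15–4.7 TB, which is consistent with the ~3–5 TB total once headroom is included. At a 10 s interval and 30 days of retention, each per-interface metric retains roughly 0.5–2.1 billion samples across the fleet.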
 ## Architecture
 A single host is viable only if it is large (**≥128 GB RAM, ≥32 vCPU, multi-TB
 NVMe**). **Preferred: split services across hosts**, as follows:
 | Host | Services | Profile |
 |------|----------|---------|
 | **DB host** (heaviest) | postgres | — |
 | **Pipeline host** | kafka, zookeeper, collector, psql-app | core |
 | **Presentation host** | grafana, influxdb, telegraf, whois | core + telemetry |
 Whichever layout is chosen, every service already carries a Compose
 `mem_limit`; raise `PSQL_MEM_LIMIT` / `PSQL_APP_MEM_LIMIT` / `KAFKA_MEM_LIMIT`
 in `.env` on the production hosts.
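
The three limits can be derived from the sizing targets rather than typed by hand. The sketch below renders candidate `.env` lines; the `g` suffix is the Compose byte-value notation and is assumed to be what this repository's Compose file expects, so check the existing `.env` before copying the output.

```python
# Sketch: render candidate .env lines for the production memory limits.
# The "<n>g" value format is an assumption; confirm it against the
# repository's existing .env and Compose file.


def render_env(heap_gb: int, psql_gb: int, kafka_gb: int, overhead_gb: int = 8) -> str:
    """Return .env lines for the three mem_limit variables named in the text."""
    limits = {
        "PSQL_APP_MEM_LIMIT": heap_gb + overhead_gb,  # heap + ~8 GB
        "PSQL_MEM_LIMIT": psql_gb,
        "KAFKA_MEM_LIMIT": kafka_gb,
    }
    return "\n".join(f"{name}={gb}g" for name, gb in limits.items())


if __name__ == "__main__":
    # Upper end of the production targets from the compute & memory table.
    print(render_env(heap_gb=64, psql_gb=64, kafka_gb=12))
```

On a split-host layout, each host's `.env` needs only the variables for the services it actually runs.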
 ## PostgreSQL tuning
 - `shared_buffers` ≈ 25% of host RAM; large `effective_cache_size`.
 - Raise `work_mem` (dashboard aggregate queries) and `maintenance_work_mem`.
 - `max_wal_size` is already 10 GB; keep it or raise it for churn bursts.
 - Enable parallel query (`max_parallel_workers_per_gather`).
 - Aggressive autovacuum on churn tables (`ip_rib`, `base_attrs`, `ip_rib_log`)
  — applied in the lab; persist these settings in production provisioning.
 - TimescaleDB compression is already enabled on `ip_rib_log` and the `stats_*`
  hypertables — keep it.
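
The memory-related settings in the list above follow from the RAM available to PostgreSQL. A minimal sketch that derives `shared_buffers` at 25% as stated; the 75% figure for `effective_cache_size` is a common rule of thumb assumed here, since the list only says "large", and `work_mem` is left out because it depends on measured query concurrency.

```python
# Sketch: derive PostgreSQL memory settings from the RAM given to the DB.
# shared_buffers = 25% follows the text; cache_fraction = 0.75 is an assumption.


def pg_memory_settings(ram_gb: int, cache_fraction: float = 0.75) -> dict[str, str]:
    """Return postgresql.conf values for a host or container with ram_gb of RAM."""
    return {
        "shared_buffers": f"{ram_gb // 4}GB",
        "effective_cache_size": f"{int(ram_gb * cache_fraction)}GB",
    }


if __name__ == "__main__":
    for ram in (48, 64):
        settings = pg_memory_settings(ram)
        print(f"# PostgreSQL with {ram} GB available")
        for key, value in settings.items():
            print(f"{key} = {value}")
```

For a 64 GB PostgreSQL allocation this gives `shared_buffers = 16GB` and `effective_cache_size = 48GB`; treat both as starting points to revisit after measuring production load.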
 ## Reference bill of materials (single-host option)
 | Resource | Spec |
 |----------|------|
 | CPU | 32 vCPU |
 | RAM | 128 GB |
 | Storage | 4 TB NVMe SSD |
 | Network | 1 GbE+ to the routers' BMP source network |
 For the split-host option, divide per the architecture table — the DB host
 takes the bulk of RAM and all of the fast storage.