Policy Diff (roadmap E2 follow-up): obmp-rib-poller pulls per-router
post-policy accepted/advertised prefix counts and route-policy bindings
over CLI+NETCONF (BMP on XRv9000 24.3.1 carries only pre-policy
Adj-RIB-In). New tables in 008_obmp_policy_diff.sql; Policy Diff
dashboard joins them against BMP ip_rib for received-vs-kept-vs-rejected.
GoBGP fleet-wide feed: GoBGP re-advertises the full Bromirski table to
both labs' core routers (CML AS65020, PROX AS65021) over eBGP; as route
reflectors the cores propagate it to every R9K client, so all 18 lab
routers carry and BMP-export a full table -- an intentional stress test
of the ingestion/storage path. cml/gobgp_peering_config.py applies and
rolls back the core-side config; gobgp/README.md documents the rollback.
Kafka lag monitoring: kafka-lag-monitor samples consumer-group lag every
30s into TimescaleDB (009_kafka_lag.sql); Kafka Ingestion Lag dashboard
gives visibility into the pipeline under churn load.
Peer Detail dashboard: the Peer selector is now router-qualified
(router -> peer) so it is unambiguous in an iBGP route-reflector mesh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- postgres/scripts/007_obmp_evpn.sql: the evpn_rib landing table
(roadmap E5 step 1), applied to the live DB. Mirrors l3vpn_rib;
a dedicated consumer will populate it.
- production-sizing.md: corrected retention figures to the actual
policy values, added a measured-data section (one full feed ≈
+5 GB current state; DB now ~30 GB), and a horizontal-scaling
section — the bottleneck is the psql-app consumer + disk IOPS, so
scale psql-app as a Kafka consumer group (cap = partition count),
treat multi-collector as HA/locality not throughput.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The AS map previously exploded ~4.4M base_attrs AS_PATH rows live,
three times per load (one per panel), ~1.8s each — slow enough that
navigating away cancelled the queries mid-flight.
Add mv_as_adjacency: undirected consecutive-AS pairs with occurrence
counts over the full RIB (17k rows), refreshed hourly by pg_cron via
REFRESH ... CONCURRENTLY. The dashboard panels now read the view in
~1ms. Min-occurrence options rescaled for full-RIB counts
(2000/5000/10000/50000, default 2000 -> ~63-node graph).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>