Compare commits

56 Commits

Author SHA1 Message Date
sam
4bcf368af0 setup.sh: add OBMP_AUTH_MODE for local vs authelia bootstrap
The bootstrap previously hard-required OBMP_DOMAIN and OBMP_COOKIE_DOMAIN
even when a user just wanted a local lab deployment with Grafana's built-in
login -- those vars only feed Authelia's session-cookie domain and the
public URL it lives behind. On a fresh host with no FQDN this made
./setup.sh impossible to pass without inventing dummy values.

New OBMP_AUTH_MODE=local|authelia in .env (default local) gates the FQDN
validation, Authelia secret generation, Authelia config rendering, and the
auth-profile image pull/build. setup.sh also writes GF_SERVER_ROOT_URL into
.env -- http://HOST_IP:3000/grafana/ for local, https://OBMP_DOMAIN/grafana/
for authelia -- and docker-compose.yml now reads ${GF_SERVER_ROOT_URL}
instead of hardcoding the apodacalab.com fallback.

Back-compat: an existing .env with no OBMP_AUTH_MODE but a real OBMP_DOMAIN
or an existing AUTHELIA_SESSION_SECRET is inferred as 'authelia' and the
mode is persisted -- a re-run on a live Authelia host won't silently flip
it to local and break the next docker compose up.
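
The back-compat inference above can be sketched as follows. This is a
minimal illustration in Python, not the setup.sh code itself; the function
name and the PLACEHOLDER_DOMAINS set are assumptions, since the message
does not say how a "real" OBMP_DOMAIN is recognised.

```python
from typing import Mapping

# Hypothetical values treated as "no real domain" (an assumption; the
# commit message does not define what counts as a real OBMP_DOMAIN).
PLACEHOLDER_DOMAINS = {"", "example.com", "localhost"}


def infer_auth_mode(env: Mapping[str, str]) -> str:
    """Return 'local' or 'authelia' for a parsed .env mapping."""
    explicit = env.get("OBMP_AUTH_MODE", "").strip().lower()
    if explicit in ("local", "authelia"):
        return explicit  # an explicit setting always wins
    domain = env.get("OBMP_DOMAIN", "").strip().lower()
    has_real_domain = domain not in PLACEHOLDER_DOMAINS
    has_secret = bool(env.get("AUTHELIA_SESSION_SECRET", "").strip())
    if has_real_domain or has_secret:
        return "authelia"  # looks like a live Authelia host: keep it
    return "local"  # documented default
```

The caller would write the returned mode back into .env so that later runs
take the explicit branch.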

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 13:35:22 -07:00
sam
ef932fe1e8 Dashboard QoL: fill the viewport, push legends to bottom
Two recurring layout issues across dashboards I built this session:

  1) Right-placed legend tables ate 30% of each panel width.
  2) Default h:9 panels left ~50% of the viewport empty on a 1080p
     display (total dashboard height ~18 grid rows vs ~30 available).

Stack Resources (Telemetry-3001/stack_resources.json):
  * 3 timeseries: legend placement right -> bottom, calcs [max] -> [last,max],
    added sortBy: Max desc so top consumers float to the top of the legend.
  * Bumped all 4 panels h: 9 -> 14 (dashboard total 18 -> 28 rows).

Kafka Ingestion Lag and Live BGP Churn (Telemetry-3001/*):
  * Bumped timeseries panels h: 9 -> 12; second-row y: 13 -> 16.
    Dashboard total 22 -> 28 rows.

Policy Diff (obmp/History-1002/policy_diff.json):
  * Bumped bottom-row panels h: 8 -> 11. Total 24 -> 27 rows.

Untouched (already adequate, scrollable by design, or built earlier):
  evpn_rib (30 rows), global_table (38), router_diff (52), and the
  Maps-1006 dashboards (already h:22-28 single panels).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 19:58:33 -07:00
sam
2634aada24 Parameterize HOST_IP everywhere -- portable to another lab host
Removes hardcoded 10.40.40.202 references so a fresh clone + .env-only
edit can stand the stack up on a new compute node.

  * docker-compose.yml: rib-poller PG_DSN now uses ${HOST_IP:-...}.
  * obmp-rib-poller/poller.py: default PG_DSN host falls back to
    ${HOST_IP} env (compose passes it; manual runs honour $HOST_IP too).
  * cml/gobgp_peering_config.py: GOBGP_IP read from $HOST_IP or the
    HOST_IP= line in repo-root .env, with a small _env_default helper.
  * cml/proxmox_bmp_config.py: COLLECTOR_HOST resolved the same way.
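
A rough sketch of the lookup order described for the cml/*.py scripts
(process environment first, then the HOST_IP= line in .env). The name
env_default and its signature are illustrative; the real _env_default
helper may differ.

```python
import os
from pathlib import Path
from typing import Optional


def env_default(name: str, env_file: Path,
                fallback: Optional[str] = None) -> Optional[str]:
    """Resolve NAME from os.environ, then from KEY=VALUE lines in env_file."""
    value = os.environ.get(name)
    if value:
        return value
    if env_file.is_file():
        for raw in env_file.read_text().splitlines():
            line = raw.strip()
            if line.startswith("#") or "=" not in line:
                continue  # skip comments and non-assignment lines
            key, _, val = line.partition("=")
            if key.strip() == name:
                return val.strip().strip('"').strip("'")
    return fallback
```

For example, env_default("HOST_IP", Path(".env")) returns the exported
$HOST_IP if set, otherwise the value from the .env file, otherwise None.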

For gobgp/gobgpd.conf and gobgp-evpn/gobgpd.conf -- jauderho/gobgp is
distroless (no shell), so we can't sed-substitute at container start.
Pattern instead:

  * gobgpd.conf is now gobgpd.conf.tmpl with __HOST_IP__ placeholders
    (committed). The rendered gobgpd.conf is gitignored.
  * setup.sh renders the .tmpl(s) to .conf using $HOST_IP from .env.
  * compose `command` stays the simple `gobgpd -f /config/gobgpd.conf`.
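
The render step is plain placeholder substitution. setup.sh does this in
shell; the Python below is only an equivalent sketch, and render_template
is a hypothetical name.

```python
from pathlib import Path


def render_template(tmpl_path: Path, host_ip: str) -> Path:
    """Render gobgpd.conf.tmpl to gobgpd.conf, replacing __HOST_IP__."""
    if tmpl_path.suffix != ".tmpl":
        raise ValueError(f"not a template: {tmpl_path}")
    rendered = tmpl_path.read_text().replace("__HOST_IP__", host_ip)
    out_path = tmpl_path.with_suffix("")  # drops the trailing .tmpl
    out_path.write_text(rendered)
    return out_path
```

Because the container image has no shell, rendering has to happen on the
host before docker compose up, which is why the output file is gitignored.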

After cloning on a new host:  cp .env.example .env  -> edit HOST_IP ->
./setup.sh -> docker compose up -d. Verified locally by force-recreating
gobgp; all 6 sessions (4 cores + 2 Bromirski) re-established in <60s.

Known portability gaps still to address (separate work):
  * Hardcoded lab-router inventories in cml/*.py and
    obmp-rib-poller/poller.py.
  * The /etc/cron.d/openbmp */5 -> */15 edit inside obmp-psql-app is
    not persistent (regenerated by config_cron on every container start).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 18:34:51 -07:00
sam
2a82bd9a94 ip_rib perf tuning: per-table autovacuum + drop 4 unused indexes
Derived from the 2026-05-19 ingestion stress-test session. psql-app's
unicast_prefix drain rate caps at a few hundred msg/s under continuous
Postgres maintenance (autovacuum on ip_rib + update_global_ip_rib() /
update_chg_stats() / update_peer_rib_counts() crons) competing for
ip_rib disk I/O.

ALTER TABLE ip_rib SET (autovacuum_vacuum_scale_factor = 0.02) -- run
more often on smaller chunks. cost_limit kept at its OpenBMP-default
3000 so each run finishes fast; the consumer runs flat out between
bursts instead of being throttled continuously.

DROP INDEX for four unused/redundant indexes (every INSERT updates every
index; these all had 0 scans in ~2h of heavy activity):
  - ip_rib_hash_id_idx           (907 MB)
  - ip_rib_base_attr_hash_id_idx (558 MB)
  - ip_rib_prefix_idx            (1538 MB, GiST)
  - ip_rib_origin_as_idx         (364 MB)

9 -> 5 indexes; ~3.4 GB freed (6,715 MB -> 3,348 MB). Reduces index
write-amplification per UPSERT by ~45% and shortens autovacuum on
ip_rib by roughly the same proportion.
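
The figures above can be cross-checked with a few lines of arithmetic.
The ~45% write-amplification estimate assumes every index costs about the
same to maintain per row, which is a simplification.

```python
# Sizes of the four dropped indexes, in MB, as listed above.
dropped_mb = {
    "ip_rib_hash_id_idx": 907,
    "ip_rib_base_attr_hash_id_idx": 558,
    "ip_rib_prefix_idx": 1538,
    "ip_rib_origin_as_idx": 364,
}
freed_mb = sum(dropped_mb.values())

before_mb, after_mb = 6715, 3348
indexes_before, indexes_after = 9, 5

# Fraction of per-row index updates removed, assuming equal cost per index.
index_cut = (indexes_before - indexes_after) / indexes_before

print(f"freed {freed_mb} MB; total shrank by {before_mb - after_mb} MB")
print(f"index updates per UPSERT cut by {index_cut:.0%}")
```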

Measurement note: across-cycle 25-min runs were inconclusive on the
sustained-rate effect (inflow was near-zero by then -- gobgp stopped --
so the consumer was largely idle). The real test is re-enabling the
fleet-wide feed with the consumer-replica + 62 GiB RAM and seeing
whether unicast_prefix keeps up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 16:50:15 -07:00
sam
06019ef74c Add consumer-only psql-app replica for ingestion scale-out
psql-app-consumer: a replica gated behind the scale-out Compose profile
that scales the Kafka->Postgres ingestion path horizontally. It shares
the primary's /config read-only so
it reuses obmp-psql.yml, whose fixed group.id makes Kafka rebalance
partitions across the primary and every replica. Its command runs ONLY
the consumer jar -- no cron, RPKI/IRR/DBIP or initdb -- so it does not
duplicate the primary's DB-maintenance jobs (config_cron wires those up
unconditionally in /usr/sbin/run). Each replica brings its own consumer
and writer threads.

Measured: one consumer-only replica took the post-storm backlog drain
from a cold-start ~3.7k msg/s to ~48k msg/s; group membership 8->16. With
2 consumers feeding it, Postgres becomes the next bottleneck (~500% CPU)
-- DB write capacity is the ceiling beyond ~2-3 consumers.

  docker compose --profile scale-out up -d --scale psql-app-consumer=2

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 14:04:34 -07:00
sam
22d26f0e0f Gate psql-app startup on Postgres health (fix cold-boot race)
On a cold boot all containers start together; psql-app finishes its
RPKI/IRR/DBIP setup and opens its single Postgres connection while the
DB is still initialising -> "the database system is starting up" ->
ConsumerApp.main throws and the consumer dies. The container does NOT
exit (the wrapper keeps cron/rsyslog alive), so restart: unless-stopped
never fires and the consumer stays dead silently.

Add depends_on psql: condition: service_healthy (plus kafka) so Compose
holds psql-app until Postgres passes its pg_isready healthcheck.
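
Conceptually the gate is a poll-until-healthy loop in front of the
dependent service. Compose implements this itself; the sketch below only
illustrates the idea, with a caller-supplied probe standing in for
pg_isready (wait_until_healthy is a hypothetical name).

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], bool], timeout_s: float = 60.0,
                       interval_s: float = 1.0) -> bool:
    """Poll probe() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True  # dependency is ready: safe to start the consumer
        if time.monotonic() >= deadline:
            return False  # give up rather than start against a dead DB
        time.sleep(interval_s)
```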

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 13:53:09 -07:00
sam
d7084aba54 Add fast-path churn monitor and churn-storm load tool
obmp-churn-monitor: a decoupled fast-path BGP churn consumer. Reads
openbmp.parsed.unicast_prefix with its own Kafka consumer group and only
counts announcements/withdrawals per (router,peer) into churn_metrics
(010_churn_metrics.sql) -- no relational RIB write. Storm-tested: it
stayed real-time (tracked 1k->85k msg/s) while the psql-app bulk
pipeline lag grew 3.8M->5.6M. Live BGP Churn dashboard reads it.
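
The counting step amounts to a keyed tally. A minimal sketch, assuming each
parsed message has already been decoded into a dict with router, peer and
action keys, and that action is "add" or "del"; the real consumer's field
handling and the churn_metrics column names may differ.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, Iterable, Tuple

# Assumed mapping from the message's action field to a metric name.
ACTION_TO_METRIC = {"add": "announcements", "del": "withdrawals"}


def count_churn(
    records: Iterable[Dict[str, str]],
) -> DefaultDict[Tuple[str, str], Counter]:
    """Tally announcements/withdrawals per (router, peer); no RIB state."""
    counts: DefaultDict[Tuple[str, str], Counter] = defaultdict(Counter)
    for rec in records:
        metric = ACTION_TO_METRIC.get(rec.get("action", ""))
        if metric is None:
            continue  # ignore anything that is not an add or a del
        counts[(rec["router"], rec["peer"])][metric] += 1
    return counts
```

Keeping only counters is what lets this path stay real-time: the work per
message is constant and nothing is written per prefix.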

tools/churn_storm.py: programmatic churn-storm generator (flaps GoBGP's
eBGP sessions to the lab cores) for load testing.

Stress-test finding: a fleet-wide full table from 18 routers exceeds the
capacity of this 31 GiB host. The bottleneck is RAM, not CPU -- with 16
cores the host still hit load 33 because it was swap-thrashing (swap 2/2
full, <1.5 GiB free). Lag ran away 3.8M->20M+. Recourse: more host RAM
for bulk throughput; the fast-path consumer for visibility regardless.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 13:17:09 -07:00
sam
b681c473c0 Add Policy Diff, fleet-wide full-table feed, and Kafka lag monitoring
Policy Diff (roadmap E2 follow-up): obmp-rib-poller pulls per-router
post-policy accepted/advertised prefix counts and route-policy bindings
over CLI+NETCONF (BMP on XRv9000 24.3.1 carries only pre-policy
Adj-RIB-In). New tables in 008_obmp_policy_diff.sql; Policy Diff
dashboard joins them against BMP ip_rib for received-vs-kept-vs-rejected.

GoBGP fleet-wide feed: GoBGP re-advertises the full Bromirski table to
both labs' core routers (CML AS65020, PROX AS65021) over eBGP; as route
reflectors the cores propagate it to every R9K client, so all 18 lab
routers carry and BMP-export a full table -- an intentional stress test
of the ingestion/storage path. cml/gobgp_peering_config.py applies and
rolls back the core-side config; gobgp/README.md documents the rollback.

Kafka lag monitoring: kafka-lag-monitor samples consumer-group lag every
30s into TimescaleDB (009_kafka_lag.sql); Kafka Ingestion Lag dashboard
gives visibility into the pipeline under churn load.

Peer Detail dashboard: the Peer selector is now router-qualified
(router -> peer) so it is unambiguous in an iBGP route-reflector mesh.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:42:25 -07:00
sam
565ebdbee0 Roadmap E5: mark EVPN lab-testable scope complete
evpn_rib table, gobgp-evpn injector, obmp-evpn-consumer and the EVPN
RIB dashboard are built and verified for type-2/type-3. type-5 and
real (non-synthetic) EVPN remain limited by collector 2.2.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:31:08 -07:00
sam
4e0f3fb0ff Add EVPN RIB dashboard (roadmap E5)
Visualises BGP EVPN routes from evpn_rib: route counts and EVIs,
route-type breakdown, a per-EVI summary, and detail tables for type-2
MAC/IP advertisements (MAC, host IP, VNI/label, route-targets, ESI)
and type-3 inclusive-multicast routes. Scoped by an RD/EVI variable.
Lives in the OBMP-L3VPN folder.

Completes roadmap E5's lab-testable scope: evpn_rib table, gobgp-evpn
injector, obmp-evpn-consumer, and this dashboard — verified end to
end with synthetic type-2/type-3 routes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:30:43 -07:00
sam
107cbf6ac5 Add obmp-evpn-consumer: openbmp.parsed.evpn -> evpn_rib (roadmap E5)
A standalone Python Kafka consumer that subscribes to the
openbmp.parsed.evpn topic (which the stock psql-app ignores) and
writes BGP EVPN routes into evpn_rib. Field positions are pinned to
the verified collector 2.2.3 / v1.7 message layout; route_type is
derived from which fields populate. Profile-gated ('evpn-test')
alongside the gobgp-evpn injector.

Verified end to end: 5 injected type-2/type-3 routes land in evpn_rib
with correct RD, ethernet-tag, MAC, IP, label and route-target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:28:19 -07:00
sam
41ec96c3ac EVPN injector: drop type-5 (collector 2.2.3 mis-decodes it)
Verified against the live collector: EVPN type-2 (MAC/IP) and type-3
(inclusive multicast) parse cleanly onto openbmp.parsed.evpn, but
type-5 (IP-prefix) is mis-decoded — the IP prefix corrupts the RD
field. inject-evpn.sh now injects only type-2/3; the type-5
limitation is documented in the injector README and roadmap E5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:24:08 -07:00
sam
f7532b62ef Add modular gobgp-evpn EVPN test-route injector (roadmap E5)
A profile-gated GoBGP instance (Compose profile 'evpn-test', not part
of the normal stack) that originates synthetic BGP EVPN routes and
BMP-exports its local RIB to the collector. Verified end to end: the
injected type-2/3/5 routes are parsed by the collector and land on
the openbmp.parsed.evpn Kafka topic, ready for the EVPN consumer.

inject-evpn.sh pushes type-2 (MAC/IP), type-3 (inclusive multicast)
and type-5 (IP-prefix) routes. Start with:
  docker compose --profile evpn-test up -d gobgp-evpn

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:15:44 -07:00
sam
2d83d6c02e Add evpn_rib schema; update production sizing with measured data
- postgres/scripts/007_obmp_evpn.sql: the evpn_rib landing table
  (roadmap E5 step 1), applied to the live DB. Mirrors l3vpn_rib;
  a dedicated consumer will populate it.
- production-sizing.md: corrected retention figures to the actual
  policy values, added a measured-data section (one full feed ≈
  +5 GB current state; DB now ~30 GB), and a horizontal-scaling
  section — the bottleneck is the psql-app consumer + disk IOPS, so
  scale psql-app as a Kafka consumer group (cap = partition count),
  treat multi-collector as HA/locality not throughput.
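
Why the partition count caps consumer-group scale-out can be seen from a
toy assignment. This is a simplified round-robin, not Kafka's actual
assignor, and the partition and member counts are made up.

```python
from typing import Dict, List


def assign_partitions(partitions: int,
                      members: List[str]) -> Dict[str, List[int]]:
    """Spread partition ids over group members round-robin."""
    if not members:
        raise ValueError("consumer group has no members")
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    for p in range(partitions):
        assignment[members[p % len(members)]].append(p)
    return assignment


# Hypothetical topic with 2 partitions and 3 consumers: the third member
# is assigned nothing, so adding it buys no extra throughput.
print(assign_partitions(2, ["primary", "replica-1", "replica-2"]))
```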

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:44:09 -07:00
sam
c18d11a48f Roadmap E5: refine with EVPN research findings
The OpenBMP collector already decodes EVPN and emits openbmp.parsed.evpn;
the gap is solely the psql-app (no subscription/handler) and the missing
schema table. L2VPN-VPLS is unsupported entirely. Records the two
implementation paths: fork the Java psql-app, or run GoBMP as a second
EVPN-capable collector with a thin Postgres consumer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:31:44 -07:00
sam
e55e12b778 Add Storage & Feed Health dashboard
Surfaces the new Telegraf disk/DB metrics plus GoBGP feed health:
openbmp database size (current + trend), largest tables, host
filesystem usage % and free space, GoBGP feed route count, and the
state of the IPv4/IPv6 BGP sessions to AS57355. Lives in the
OBMP-Telemetry folder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:31:08 -07:00
sam
fc164a5689 Add disk-space and DB-size monitoring to Telegraf
Telegraf now collects host filesystem usage ([[inputs.disk]], via a
read-only /hostfs mount) and PostgreSQL database + per-table sizes
([[inputs.postgresql_extensible]]) into InfluxDB. Surfaces RIB growth
and disk pressure — relevant now that the full-table GoBGP feed has
pushed the openbmp DB to ~30 GB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:28:36 -07:00
sam
cffb835f30 Enable IPv6 feed: run GoBGP in host network mode
The IPv6 eBGP session never established because the Docker bridge
has no IPv6. Switch the gobgp container to network_mode: host so it
uses the host's real dual-stack connectivity — both sessions to
AS57355 now source from the host's public v4/v6 addresses.

Host mode shares the host's network namespace, ports included, so
disable GoBGP's inbound BGP listener (port = -1): we only originate
outbound sessions, and a non-root container cannot bind privileged port 179.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:08:55 -07:00
sam
7766525787 Roadmap: add E5 — L2VPN/EVPN needs platform work, not dashboards
This OpenBMP deployment has no EVPN/L2VPN schema; supporting it
requires collector + psql-app + schema changes upstream, not a
Grafana dashboard. Captured as E5 with a research-spike first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:01:59 -07:00
sam
6496b60311 Make L3VPN RD/VRF filter dynamic (roadmap E4)
Both L3VPN dashboards had a static custom 'rd' variable holding only
the '-' (all) sentinel — you could not actually filter by a VRF.
Convert 'rd' to a query variable that discovers route distinguishers
from l3vpn_rib. Degrades cleanly on the (currently empty) lab table:
the query always returns '-', so behaviour is unchanged until real
L3VPN data exists, then RDs auto-populate. Existing panel SQL
('$rd' = '-' OR rd = '$rd') is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 08:00:50 -07:00
sam
0451b2aa87 Add Global Internet Table dashboard (roadmap E3)
Explores the real DFZ table received from the AS57355 route server
via the GoBGP feed (the '$feed' / GoBGP BMP peer): IPv4/IPv6 prefix
counts, distinct origin ASes, prefix-length distribution, top origin
ASes by prefix count, and an overlap-based prefix lookup. Serves as
the comparison baseline for the Router Diff dashboard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:57:47 -07:00
sam
af4b816bef Fix GoBGP BMP target: use host IP, not collector hostname
GoBGP's BMP config requires a literal IP — 'obmp-collector' failed
to parse and the container crash-looped. Point BMP export at the
docker host IP (10.40.40.202) where the collector publishes port
5000; stable across container recreation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:51:41 -07:00
sam
8ced62e491 Add generic Router Diff dashboard (roadmap E2)
Generalizes the RR-specific RR Loc-RIB Diff into a comparison of up
to four selectable routers. router1/router2 are required; router3/
router4 default to a '-- none --' sentinel and drop cleanly out of
every query (no empty-IN, no dangling predicates).

Panels: per-router prefix counts, divergent-prefix count, a presence
matrix (row per prefix, column per router, cell = best-path next-hop),
a divergence detail table classifying missing / next-hop / AS-path
disagreement, and a per-prefix all-paths drill-down. Once the GoBGP
global feed (E1) is up, GLOBAL-FEED is selectable as any of the four
for lab-vs-Internet diffing.
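
The presence matrix and the sentinel handling can be sketched outside SQL.
The dashboard does this in Postgres; the Python below is only an
illustration, the function names are hypothetical, and it compares next-hop
alone (the AS-path check is left out).

```python
from typing import Dict, List, Optional

NONE_SENTINEL = "-- none --"


def presence_matrix(
    ribs: Dict[str, Dict[str, str]], routers: List[str]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Row per prefix, column per selected router, cell = next-hop or None.

    ribs maps router -> {prefix: best-path next-hop}.
    """
    active = [r for r in routers if r != NONE_SENTINEL]  # drop unset slots
    prefixes = sorted({p for r in active for p in ribs.get(r, {})})
    return {p: {r: ribs.get(r, {}).get(p) for r in active} for p in prefixes}


def divergent(row: Dict[str, Optional[str]]) -> bool:
    """True if a router is missing the prefix or next-hops disagree."""
    return len(set(row.values())) > 1
```

Filtering the sentinel before building any predicate is what avoids an
empty IN list or a dangling comparison for router3/router4.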

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:41:49 -07:00
sam
88a5546e29 Add GoBGP full-table feed container (roadmap E1)
New gobgp service: GoBGP peers eBGP-multihop with the AS57355 lab
route server (Bromirski) for the full real IPv4 + IPv6 Internet table
and BMP-exports it to the OpenBMP collector, landing in ip_rib as a
monitored peer.

Config follows the route server's published peering spec: local AS
65001, no password, keepalive 3600 / hold-time 7200, IPv4 feed on the
v4 session and IPv6 feed on the v6 session. gobgp/mrt-refresh.sh is a
cron-safe fallback that injects RouteViews MRT RIB dumps when the live
session is down. The live BGP session is not started here — bringing
gobgp up establishes the external session and loads ~1M routes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:39:12 -07:00
sam
d60c582ff6 Add roadmap Track E: Internet-scale routing analytics
Plan for a local full-Internet routing table, a generalized N-way
router diff, and VRF/RD scoping:

- E1: GoBGP container peering AS57355 (Bromirski lab route server)
  for a live full v4/v6 table, MRT RIB dumps as a 2-hourly fallback,
  BMP-exported into ip_rib as a GLOBAL-FEED peer.
- E2: generic up-to-4-router diff dashboard (presence matrix),
  generalized from the RR-specific rr_locrib_diff.
- E3: global table exploration dashboard.
- E4: VRF/RD scoping across unicast + L3VPN dashboards (built to
  schema; not lab-verifiable with CML IOS-XR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:19:34 -07:00
sam
cc0d20bf9e Back AS Relationship Map with a materialized view
The AS map previously exploded ~4.4M base_attrs AS_PATH rows live,
three times per load (one per panel), ~1.8s each — slow enough that
navigating away cancelled the queries mid-flight.

Add mv_as_adjacency: undirected consecutive-AS pairs with occurrence
counts over the full RIB (17k rows), refreshed hourly by pg_cron via
REFRESH ... CONCURRENTLY. The dashboard panels now read the view in
~1ms. Min-occurrence options rescaled for full-RIB counts
(2000/5000/10000/50000, default 2000 -> ~63-node graph).
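
What the view precomputes can be sketched in a few lines: undirected
consecutive-AS pairs with occurrence counts. The view itself is SQL; this
Python version is illustrative, and skipping prepended (repeated) ASNs is
an assumption about how self-pairs are handled.

```python
from collections import Counter
from typing import Iterable, List


def as_adjacency(paths: Iterable[List[int]]) -> Counter:
    """Count undirected consecutive-AS pairs across AS_PATHs."""
    pairs: Counter = Counter()
    for path in paths:
        for a, b in zip(path, path[1:]):
            if a == b:
                continue  # AS-path prepending, not an adjacency (assumed)
            pairs[(min(a, b), max(a, b))] += 1  # normalise direction
    return pairs
```

A min-occurrence filter is then just a threshold on the counts, for
example {k: v for k, v in pairs.items() if v >= 2000}.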

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:04:38 -07:00
sam
0190ef5fb8 Fix BGP Peer Map blank graph: connect disconnected lab components
The node graph rendered blank because the two CML/PROX labs formed
two disconnected components (iBGP-only meshes within each lab), and
Grafana's nodeGraph layout renders nothing for a disconnected graph.

Match BGP sessions to monitored routers by peer IP as well as peer
BGP-ID, so the real cross-lab eBGP sessions become graph edges. The
graph is now one connected component (30 iBGP + 4 eBGP edges) and
lays out. The companion external-neighbours table uses the same
peer-IP check so those sessions are no longer double-listed.
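
Whether an edge list forms a single component is easy to check before
handing it to the panel. A small union-find sketch; the node names in the
usage note are placeholders, not the lab inventory.

```python
from typing import Dict, Hashable, Iterable, Tuple


def count_components(nodes: Iterable[Hashable],
                     edges: Iterable[Tuple[Hashable, Hashable]]) -> int:
    """Number of connected components in an undirected graph."""
    parent: Dict[Hashable, Hashable] = {n: n for n in nodes}

    def find(x: Hashable) -> Hashable:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two components
    return len({find(n) for n in parent})
```

With two iBGP-only meshes, count_components(["c1", "c2", "p1", "p2"],
[("c1", "c2"), ("p1", "p2")]) gives 2; adding one cross-lab edge such as
("c1", "p1") brings it to 1.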

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:42:31 -07:00
sam
940f54c553 Fix BGP Peer Map blank node graph: numeric edge mainstat
The node graph rendered empty because the edges target returned a
string mainstat ('iBGP'/'eBGP'). Grafana's nodeGraph treats edge
mainStat as numeric for layout/labelling; a string value silently
breaks the layout so no nodes are drawn (the working LS map and the
original ls_topo both cast edge mainstat to an integer).

Edge mainstat is now COUNT(DISTINCT feed)::int (BMP peer-feed count
for the router pair); the iBGP/eBGP label moves to secondarystat and
detail__session_type, which accept strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:14:05 -07:00
sam
d815a4774b Use proven singlequote format for RR variable in BGP Peer Map
Switch the route-reflector membership test to
= ANY(ARRAY[${rr_loopbacks:singlequote}]::text[]) — the singlequote
format is the one already proven to interpolate correctly in this
Grafana instance (rr_locrib_diff uses it), and the ARRAY[...]::text[]
wrapper stays valid (empty array) when the variable resolves empty.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:53:43 -07:00
sam
1acdc32dda Fix LS Topology map double-quoted protocol variable
The protocol variable has includeAll enabled, so Grafana auto-quotes
its value ('IS-IS_L2'); the SQL then wrapped it again, producing
''IS-IS_L2'' and a syntax error that blanked the node graph. Replace
the quoted equality filter with IN ($protocol) — Grafana already
emits a quoted CSV — and make the variable multi-select so "All"
expands cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:20:09 -07:00
sam
0200932ea0 Fix BGP Peer Map empty-variable crash in RR detection
When the rr_loopbacks variable resolved empty, the IN
(${rr_loopbacks:singlequote}) clause expanded to IN (), a SQL
syntax error that blanked the topology panel. Switch to
= ANY(string_to_array('${rr_loopbacks:csv}', ',')), which yields
a no-match (not a syntax error) on an empty variable.
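
The difference between the two forms shows up when the rendered SQL is
written out. The helpers below only mimic the string expansion described
above (values are assumed to contain no quotes), and the column name peer
is a placeholder.

```python
from typing import List


def fmt_singlequote(values: List[str]) -> str:
    """Mimic the :singlequote format: 'a','b' (empty list -> '')."""
    return ",".join(f"'{v}'" for v in values)


def render_in(values: List[str]) -> str:
    # Old form: the list is spliced directly between the parentheses.
    return f"peer IN ({fmt_singlequote(values)})"


def render_any(values: List[str]) -> str:
    # New form: the list travels inside one string literal.
    return f"peer = ANY(string_to_array('{','.join(values)}', ','))"


for vals in (["10.0.0.1", "10.0.0.2"], []):
    print(render_in(vals))
    print(render_any(vals))
```

With an empty list the first form renders as peer IN (), which Postgres
rejects, while the second renders string_to_array('', ','), an empty array
that simply matches nothing.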

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:12:12 -07:00
sam
f6a100e673 Add OBMP-Maps dashboard suite: BGP/IGP/AS topology and geo maps
Create a new OBMP-Maps Grafana folder (folderUid 1006) with four
data-visualization dashboards built on nodeGraph and geomap panels:

- BGP Peer Map: routers as nodes, BGP sessions as edges; iBGP/eBGP
  edge typing and operator-editable rr_loopbacks variable to denote
  route reflectors; companion table for sessions to non-monitored
  neighbours.
- IGP / Link-State Topology Map: reworked from LinkState-1004 and
  moved here (uid preserved); scoped by peer feed / protocol / AS so
  the 489-node BGP-LS topology stays readable; SR-capability rings.
- AS Relationship Map: AS adjacency graph from consecutive AS_PATH
  pairs over a 200k-route sample; min-occurrence and focus-AS
  variables; nodes enriched from info_asn whois.
- Geographic Prefix Map: geomap of RIB prefixes and origin ASes by
  IP geolocation, with a note that lab 10.x loopbacks do not
  geolocate; bounded geo_ip join via a sample-size variable.

Also add a data link on the Looking Glass ASN Info panel's origin_as
column that jumps to the ASN View dashboard scoped to that AS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 00:07:41 -07:00
sam
26dea47a55 Make the ASN View origin-AS selector a free-text input
asn_num was a fixed custom variable; converting it to a textbox lets an
operator look up any origin AS and see all of its RIB prefixes, upstreams,
and downstreams.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:22:21 -07:00
sam
9d74940614 Fix ExaBGP OOM, add container health checks and resource monitoring
RCA: the exabgp container was OOM-killed — its 512m mem_limit was far too
small for the full-table feature (900K route objects in memory). Raises the
limit to a parameterized 6g default (EXABGP_MEM_LIMIT).

Adds Docker healthchecks to 14 services (port/HTTP probes) so unhealthy
containers are visible. Adds a Telegraf docker input that collects per-
container CPU/memory/IO into InfluxDB, plus a "Stack Resources" dashboard —
so resource pressure is caught before it causes an OOM crash. telegraf runs
with an overridden entrypoint so it keeps root and can read the docker socket.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:03:52 -07:00
sam
482c0cdc01 Add ipv6 unicast to ExaBGP neighbor family
The IOS-XR routers negotiate IPv6 unicast capability, but the generated
exabgp.conf declared only ipv4 unicast — producing repeated "route family
(ipv6/unicast) is not configured" errors that crashed ExaBGP. Declaring
ipv6 unicast on the neighbor matches the routers' capabilities and stops
the crash-restart cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:40:32 -07:00
sam
6d3387dfe5 Add RR next-hop sanity check to the RR Loc-RIB Diff dashboard
Adds a panel that flags the next-hop-self-on-an-RR anti-pattern: reflected
routes (those carrying ORIGINATOR_ID) whose NEXT_HOP is an RR loopback while
the route was originated by a different router — meaning the RR rewrote
next-hop to itself and has been pulled into the forwarding path. RR-originated
routes and legitimately-imported eBGP routes (originator == next-hop) are
excluded. An editable rr_loopbacks template variable keeps it environment-
agnostic — useful for validating RR behavior during an IOS-XR to Junos
migration.
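
The rule reduces to a three-part predicate per route. A sketch of the
logic only (the panel implements it in SQL); the function and parameter
names are hypothetical.

```python
from typing import Optional, Set


def rr_rewrote_next_hop(next_hop: str, originator_id: Optional[str],
                        rr_loopbacks: Set[str]) -> bool:
    """Flag a reflected route whose next-hop was rewritten to an RR."""
    if originator_id is None:
        return False  # no ORIGINATOR_ID: the route was not reflected
    if next_hop not in rr_loopbacks:
        return False  # next-hop is not a route reflector
    # originator == next-hop means the RR itself originated or imported
    # the route, which is legitimate and therefore excluded.
    return originator_id != next_hop
```

For example, with rr_loopbacks = {"10.0.0.1"}, a route with next-hop
10.0.0.1 and ORIGINATOR_ID 10.0.0.7 is flagged, while one with
ORIGINATOR_ID 10.0.0.1 is not.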

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:18:22 -07:00
sam
a662496e53 Fix telemetry dashboard variables and parameterize gNMI targets
The telemetry dashboards' router/interface variables used a keep|distinct
Flux pattern that returned only one source; switch to schema.tagValues so all
streaming routers and interfaces are listed. Parameterize telegraf.conf gNMI
addresses and credentials via GNMI_ADDRESSES/GNMI_USERNAME/GNMI_PASSWORD so
the telemetry fleet can scale without editing the config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:10:57 -07:00
sam
0732ebfa07 Add production-readiness deliverables: security, backup, alerting
Adds a prioritized security-hardening checklist, a PostgreSQL logical-backup
script (pg-backup.sh) with a documented restore procedure, and Grafana
alerting provisioning (peer-down, flap-storm, RPKI-invalid, router-down rules
plus a contact-point template). The alerting YAML and contact points need
operator review before being relied on for paging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:55:03 -07:00
sam
7e3370b5a5 Rework Grafana dashboard information architecture
Reorganizes 31 dashboards into an operator-first structure with real
navigation. Adds Router Detail and Peer Detail drilldown dashboards; merges
LS Nodes+Links and the two L3VPN dashboards; modernizes all deprecated panels
(table-old/graph/worldmap). Every dashboard gets the obmp-nav dropdown so the
whole set is reachable from anywhere. Graduates the operational "Learning"
dashboards into Operations/Routing/LinkState folders, retires the Tops folder,
and relabels folders (Base->Operations, History->Routing, Learning->Reference).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:55:03 -07:00
sam
f430758992 Scope NOC Overview "Peers Down" panels to the dashboard time range
The scorecard and table counted every bgp_peers row in a down state,
including peers removed long ago (OpenBMP never prunes bgp_peers). They now
filter on the peer's last state-change timestamp via $__timeFilter, so the
panel reflects current/recent problems rather than all-time history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:29:59 -07:00
sam
f1558946ae Add production sizing guide for 40 full-table-edge routers
Documents compute, memory, and storage requirements for a production
deployment: ~100-150M NLRI estimate, 96-128 GB RAM, 16-32 vCPU, 3-5 TB NVMe,
a split-host architecture option, PostgreSQL tuning, and a BMP RIB-scope
recommendation (Adj-RIB-In only initially).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:06:25 -07:00
sam
960806fc06 Add NOC Overview dashboard and rebuild home as a navigation hub
NOC Overview is the new flagship operator landing dashboard — health
scorecards, peer session timeline, BGP update rate, and attention tables for
peers down, churning prefixes, RPKI invalids, and topology changes. All counts
come from stats_* aggregate tables so it stays fast at production scale.
OBMP-Home is rebuilt as a lightweight navigation hub pointing at NOC Overview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:04:37 -07:00
sam
4e9bd7cc5a Add container memory limits to all services
Sets mem_limit on every service to cap the OOM/swap-exhaustion risk (the lab
host had only 5 MiB swap free). The three heavy services (psql, kafka,
psql-app) read their limits from .env so production can raise them; the rest
use lab-appropriate fixed values. Total ~25 GB, leaving headroom on the 31 GB
lab host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:04:37 -07:00
sam
8ac156ce86 Add second-lab ExaBGP peering and bulk BMP config script
Generalizes exabgp/startup.sh to template BGP neighbors from an EXABGP_PEERS
list (ip:peer_as:description), so ExaBGP peers with multiple labs. Adds
cml/proxmox_bmp_config.py to apply the bmp server block to a lab's IOS-XR
routers over SSH (BMP config is not exposed via NETCONF YANG on current XR).
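
Parsing the ip:peer_as:description entries might look like the sketch
below. startup.sh does this in shell; the Python is illustrative, the
entry separator (whitespace or commas) is an assumption, and only IPv4
addresses are handled because IPv6 literals contain colons.

```python
from typing import List, NamedTuple


class Peer(NamedTuple):
    ip: str
    peer_as: int
    description: str


def parse_peers(raw: str) -> List[Peer]:
    """Parse an EXABGP_PEERS-style string into Peer tuples (IPv4 only)."""
    peers: List[Peer] = []
    for entry in raw.replace(",", " ").split():
        # maxsplit=2 keeps any further colons inside the description
        ip, peer_as, description = entry.split(":", 2)
        peers.append(Peer(ip, int(peer_as), description))
    return peers
```

Each Peer can then be rendered into one neighbor block of the generated
exabgp.conf.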

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 19:21:11 -07:00
sam
cf4e5b07c6 Add Compose profiles, setup.sh bootstrap, and config templates for portable deployment
Pins the Compose project name and splits services into core / test / auth
profiles so the BMP collector core can deploy standalone. Adds setup.sh
(idempotent bootstrap), .env.example, and repo-resident Authelia config
templates so a fresh host deploys without manual steps. Parameterizes
hardcoded host IP and domain; points the Grafana InfluxDB datasource at the
container name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 19:21:04 -07:00
sam
31286d5d3e Add platform roadmap: multi-lab CML integration and production deployment
Four-track roadmap covering configuration centralization (inventory.yaml),
CML API automation (virl2_client), production ISP deployment (multi-vendor
IOS-XR + Junos), and packaging for distribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:38 -07:00
sam
da49b3e462 Add CML integration: XRd and ExaBGP node/image definitions and build scripts
CML 2.9 node definitions for XRd Control-Plane (third RR) and ExaBGP route
injector as Docker-based CML nodes. Includes build scripts to export Docker
images as tars for CML import, with IOS-XR startup configs for IS-IS, BGP,
and BMP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:30 -07:00
sam
541f018bc5 Add RR Loc-RIB diff dashboard and route diversity config
Dashboard compares Adj-RIB-In tables between two Route Reflectors via BMP,
showing missing prefixes, attribute diffs (next-hop, AS path), and per-client
consistency. Route diversity script deploys 29 prefixes across R9K-01-07 via
NETCONF to create verifiable next-hop differences between RRs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:19 -07:00
sam
45f4c9859d Add Authelia auth gateway, portal landing page, and subpath routing
Adds Authelia (forward-auth) and nginx portal container for single-endpoint
authenticated access via Caddy reverse proxy. Configures Grafana auth proxy
for header-based auto-login. Updates Vue UI base paths and API routes for
/exabgp/ and /traffic/ subpath serving. Adds traffic-gen responder container
on dedicated Docker network.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:09 -07:00
sam
422b98d555 Fix telemetry dashboards: update Flux queries and InfluxDB datasource URL
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:58 -07:00
sam
d691b512f9 Add full internet table injection with background worker and progress tracking
Generates realistic IPv4 routing tables (1K-900K prefixes) with DFZ-like
prefix length distribution, varied AS paths, and transit ASN diversity.
Background injection with progress API, CLI follow mode, and Vue UI
component with preset sizes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:51 -07:00
sam
1f0936763b Add traffic generator improvements: mode switching, ping, responder echo, RFC2544 fixes
Adds sender/responder mode switching via API, QuickPing component, echo-mode
responder with dedicated container, improved flow state sync, and RFC2544
test runner enhancements. Includes UI improvements across all traffic-gen
components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:41 -07:00
sam
c28c9b2527 Fix gNMI telemetry: OpenConfig paths, json_ietf encoding, SSH config
- Switch Telegraf from native IOS-XR YANG paths to OpenConfig
  (openconfig-interfaces:interfaces/interface/state/counters)
- Use json_ietf encoding instead of proto (IOS-XR 24.3.1 compat)
- Target only CORE-01/CORE-02 (R9K routers blocked by CML mgmt net)
- Update all 3 Grafana dashboard queries to match OpenConfig field
  names (in-octets, out-octets, in-pkts, out-pkts, in-errors, etc.)
- Rewrite gnmi_grpc_config.py to use SSH/CLI via paramiko instead of
  NETCONF (IOS-XR 24.3.1 rejects NETCONF gRPC edit-config)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 16:19:16 -07:00
sam
6b45f124f0 Remove __pycache__ from tracking and add to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:40:14 -07:00
sam
dcebf15bb3 Add Phase 4: gNMI streaming telemetry and traffic generator
- gNMI integration: NETCONF script to enable gRPC on all 9 routers,
  Telegraf container with gnmi input plugin, InfluxDB for time-series
  storage, 3 Grafana telemetry dashboards (utilization, errors, combined)
- Traffic generator: Scapy-based dual-mode container (sender/responder)
  with Flask API, RFC 2544 test suite (throughput, latency, frame-loss,
  back-to-back), Vue 3 web UI with flow builder, test runner, real-time
  stats monitor, and results export
- docker-compose.yml updated with influxdb, telegraf, traffic-gen,
  traffic-gen-ui services
- Full documentation in DOCS.md sections 15-16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:29:44 -07:00
sam
f23e222bc0 Add Phase 3: TE/SR analytics, anomaly detection, DB schema reference
- 4 new Grafana dashboards:
  - Database Schema Map (obmp-learn-07): interactive schema reference
    with live row counts, relationship diagrams, column details
  - TE & Segment Routing Analytics (obmp-learn-08): exposes BGP-LS TE/SR
    fields (bandwidth, admin groups, SRLG, SR SIDs, protection types)
  - Topology Change & Anomaly Detection (obmp-learn-09): link state
    change tracking, origin AS hijack detection, convergence timeline
  - Link Utilization & TE Thought Experiment (obmp-learn-10): capacity
    data from BGP-LS + streaming telemetry integration guide
- DB_SCHEMA.md: standalone database reference (33 tables, 11 views)
- 3 new ExaBGP scenarios: te_community_steering, origin_shift, path_diversity
- Updated DOCS.md with Phase 3 dashboards and scenarios

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 13:31:03 -07:00
141 changed files with 22540 additions and 3505 deletions

85
.env.example Normal file

@ -0,0 +1,85 @@
# OpenBMP stack configuration — copy to .env and fill in.
# cp .env.example .env && $EDITOR .env && ./setup.sh
# The real .env is git-ignored and never committed.
# ---------------------------------------------------------------------------
# Core deployment
# ---------------------------------------------------------------------------
# Host path for all persistent data (postgres, kafka, grafana, authelia, ...).
OBMP_DATA_ROOT=/var/openbmp
# IP of this host that routers and external clients connect to
# (Kafka external listener, BMP source, ExaBGP peering).
HOST_IP=changeme
# Auth mode:
# local — Grafana built-in login (admin / openbmp). Lab default.
# OBMP_DOMAIN / OBMP_COOKIE_DOMAIN below can stay blank.
# authelia — Authelia in front (docker compose --profile auth).
# OBMP_DOMAIN and OBMP_COOKIE_DOMAIN must be set, and a
# reverse proxy must terminate TLS at OBMP_DOMAIN.
OBMP_AUTH_MODE=local
# Public domain fronting Grafana / Authelia / portal (TLS terminates upstream).
# Only required when OBMP_AUTH_MODE=authelia.
OBMP_DOMAIN=
# Authelia session-cookie domain — the parent domain of OBMP_DOMAIN so the
# cookie is valid across subpaths/subdomains. Only required when
# OBMP_AUTH_MODE=authelia.
OBMP_COOKIE_DOMAIN=
# Grafana self-generated URL (alerts, share links). setup.sh writes this
# automatically based on OBMP_AUTH_MODE — leave blank, it will be filled in.
GF_SERVER_ROOT_URL=
# Container memory limits. Lab defaults shown; raise for production
# (see docs/production-sizing.md). psql-app's limit must exceed its MEM heap.
PSQL_MEM_LIMIT=6g
PSQL_APP_MEM_LIMIT=4g
KAFKA_MEM_LIMIT=4g
# ExaBGP — the full-table feature holds up to 900K route objects in memory.
EXABGP_MEM_LIMIT=6g
# gNMI streaming telemetry (telegraf, test profile). GNMI_ADDRESSES is a
# quoted, comma-separated host:port list — add a router here once gNMI/grpc
# is enabled on it and the management path is reachable.
GNMI_ADDRESSES="10.100.0.100:57400", "10.100.0.200:57400"
GNMI_USERNAME=changeme
GNMI_PASSWORD=changeme
# ---------------------------------------------------------------------------
# ExaBGP route injector (test profile)
# ---------------------------------------------------------------------------
EXABGP_LOCAL_IP=changeme
EXABGP_LOCAL_AS=65100
EXABGP_API_PORT=5050
# Semicolon-separated peer list, each entry "ip:peer_as:description".
EXABGP_PEERS=10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02
# ---------------------------------------------------------------------------
# CML lab API + IOS-XR NETCONF (used by cml/ automation scripts)
# ---------------------------------------------------------------------------
PROX-CML_URL=http://changeme
PROX-CML_USERNAME=changeme
PROX-CML_PASSWORD=changeme
# Default IOS-XR NETCONF credentials, plus the admin-tier override for routers
# that use a separate account.
IOSXR_NETCONF_USER=changeme
IOSXR_NETCONF_PASS=changeme
IOSXR_NETCONF_ADMIN_USER=changeme
IOSXR_NETCONF_ADMIN_PASS=changeme
# ---------------------------------------------------------------------------
# Integrations
# ---------------------------------------------------------------------------
GITEA_API_KEY=changeme
# ---------------------------------------------------------------------------
# Authelia secrets — leave BLANK; setup.sh generates them with openssl on a
# fresh host and appends them here. Existing values are never overwritten.
# ---------------------------------------------------------------------------
AUTHELIA_SESSION_SECRET=
AUTHELIA_JWT_SECRET=
AUTHELIA_STORAGE_ENCRYPTION_KEY=

4
.gitignore vendored

@ -2,4 +2,8 @@
*.log
.env
.claude/
__pycache__/
*.pyc
gobgp/gobgpd.conf
gobgp-evpn/gobgpd.conf

387
DB_SCHEMA.md Normal file

@ -0,0 +1,387 @@
# OpenBMP Database Schema Reference
PostgreSQL database `openbmp` with TimescaleDB extension for time-series data.
## Entity Relationship Diagram
```
collectors
 └── routers (collector_hash_id)
      └── bgp_peers (router_hash_id)
           ├── ip_rib (peer_hash_id) ──► base_attrs (base_attr_hash_id)
           ├── ip_rib_log (peer_hash_id)
           ├── l3vpn_rib (peer_hash_id) ──► base_attrs
           ├── ls_nodes (peer_hash_id)
           ├── ls_links (peer_hash_id) ──► ls_nodes (local/remote_node_hash_id)
           ├── ls_prefixes (peer_hash_id) ──► ls_nodes (local_node_hash_id)
           ├── peer_event_log (peer_hash_id)
           ├── stat_reports (peer_hash_id)
           └── stats_* tables (peer_hash_id)

ip_rib.prefix ◄──► global_ip_rib.prefix (aggregated view)
                    ├── rpki_origin_as ◄── rpki_validator
                    └── irr_origin_as ◄── info_route

base_attrs.origin_as ──► info_asn.asn (ASN enrichment)
routers.geo_ip_start ──► geo_ip.ip (geolocation)
```
---
## BMP Core Tables
### routers
BMP-monitored routers (one row per monitored device).
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| name | varchar(200) | Router hostname |
| ip_address | inet | Router management IP |
| router_as | bigint | Router ASN |
| bgp_id | inet | BGP router-id |
| collector_hash_id | uuid | FK to collectors |
| state | opstate | up / down |
| timestamp | timestamp | Last update time |
| description | varchar(255) | Router description |
| init_data | text | BMP init message data |
| term_reason_code | int | BMP termination reason |
### collectors
BMP collector instances.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| admin_id | varchar(64) | Admin identifier |
| name | varchar(200) | Collector name |
| ip_address | varchar(40) | Collector IP |
| state | opstate | up / down |
| router_count | smallint | Number of monitored routers |
### bgp_peers
BGP sessions per router (one row per peer per router).
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key (composite with router_hash_id) |
| router_hash_id | uuid | FK to routers |
| peer_addr | inet | Peer IP address |
| peer_as | bigint | Peer ASN |
| peer_bgp_id | inet | Peer BGP router-id |
| name | varchar(200) | Peer name |
| state | opstate | up / down |
| isl3vpnpeer | boolean | L3VPN peer flag |
| isipv4 | boolean | IPv4 peer |
| isprepolicy | boolean | Pre-policy RIB |
| islocrib | boolean | Local RIB |
| local_ip | inet | Local IP |
| local_asn | bigint | Local ASN |
| local_hold_time | smallint | Local hold time |
| remote_hold_time | smallint | Remote hold time |
| sent_capabilities | varchar(4096) | BGP capabilities sent |
| recv_capabilities | varchar(4096) | BGP capabilities received |
| table_name | varchar(255) | VRF/table name |
### peer_event_log (TimescaleDB)
Historical BGP session state changes.
| Column | Type | Description |
|--------|------|-------------|
| id | bigint | Event sequence |
| peer_hash_id | uuid | FK to bgp_peers |
| state | opstate | up / down |
| timestamp | timestamp | Event time (partition key) |
| bmp_reason | smallint | BMP reason code |
| bgp_err_code | smallint | BGP error code |
| bgp_err_subcode | smallint | BGP error subcode |
| error_text | varchar(255) | Error description |
---
## BGP Path Attributes
### base_attrs
BGP path attributes shared across routes.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| peer_hash_id | uuid | FK to bgp_peers |
| origin | varchar(16) | IGP / EGP / Incomplete |
| as_path | bigint[] | AS path array |
| as_path_count | smallint | AS path length |
| origin_as | bigint | Origin ASN |
| next_hop | inet | BGP next-hop |
| med | bigint | Multi-Exit Discriminator |
| local_pref | bigint | Local preference |
| community_list | varchar(15)[] | Standard communities |
| ext_community_list | varchar(50)[] | Extended communities (RT, etc.) |
| large_community_list | varchar(40)[] | Large communities (RFC 8092) |
| cluster_list | varchar(40)[] | Route reflector cluster list |
| isatomicagg | boolean | Atomic aggregate flag |
| originator_id | inet | RR originator ID |
| aggregator | varchar(64) | Aggregator |
**Indexes**: GIN on as_path, community_list, ext_community_list, large_community_list
---
## IP RIB Tables
### ip_rib
Current IPv4/IPv6 unicast routing table.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Route hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| base_attr_hash_id | uuid | FK to base_attrs |
| prefix | inet | IP prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| isipv4 | boolean | IPv4 flag |
| iswithdrawn | boolean | Withdrawn flag |
| labels | varchar(255) | MPLS labels |
| path_id | bigint | Add-Path ID |
| isprepolicy | boolean | Pre-policy flag |
| isadjribin | boolean | Adj-RIB-In flag |
| timestamp | timestamp | Last update |
| first_added_timestamp | timestamp | First seen |
### ip_rib_log (TimescaleDB)
Historical RIB changes — every advertisement and withdrawal.
| Column | Type | Description |
|--------|------|-------------|
| id | bigint | Change event ID |
| peer_hash_id | uuid | FK to bgp_peers |
| base_attr_hash_id | uuid | FK to base_attrs |
| prefix | inet | IP prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| iswithdrawn | boolean | Withdrawal flag |
| timestamp | timestamp | Event time (partition key) |
### global_ip_rib
Aggregated prefix summary across all peers.
| Column | Type | Description |
|--------|------|-------------|
| prefix | inet | IP prefix (composite PK) |
| prefix_len | smallint | Prefix length |
| recv_origin_as | bigint | Received origin AS |
| rpki_origin_as | bigint | RPKI-validated origin AS |
| irr_origin_as | bigint | IRR-registered origin AS |
| irr_source | varchar(32) | IRR source (RADB, RIPE, etc.) |
| num_peers | int | Total advertising peers |
| iswithdrawn | boolean | Withdrawn flag |
---
## L3VPN Tables
### l3vpn_rib
L3VPN (RFC 4364) routes with Route Distinguisher.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Route hash |
| peer_hash_id | uuid | FK to bgp_peers |
| base_attr_hash_id | uuid | FK to base_attrs |
| rd | varchar(128) | Route Distinguisher |
| prefix | inet | VPN prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| labels | varchar(255) | MPLS VPN labels |
| ext_community_list | varchar(50)[] | Route Targets |
| path_id | bigint | Add-Path ID |
| iswithdrawn | boolean | Withdrawn flag |
### l3vpn_rib_log (TimescaleDB)
Historical L3VPN route changes.
---
## Link-State Tables (BGP-LS / RFC 7752)
### ls_nodes
IS-IS / OSPF node information from BGP-LS.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Node hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| base_attr_hash_id | uuid | FK to base_attrs |
| asn | bigint | Node ASN |
| bgp_ls_id | bigint | BGP-LS Identifier |
| igp_router_id | varchar(46) | IGP Router ID |
| router_id | varchar(46) | BGP Router ID |
| protocol | ls_proto | IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3 |
| isis_area_id | varchar(46) | IS-IS area |
| ospf_area_id | varchar(16) | OSPF area |
| name | varchar(255) | Node hostname |
| flags | varchar(20) | Node flags |
| mt_ids | varchar(128) | Multi-Topology IDs |
| **sr_capabilities** | **varchar(255)** | **SR Global Block (SRGB) ranges** |
| iswithdrawn | boolean | Withdrawn flag |
### ls_links
IS-IS / OSPF links with full TE and SR attributes.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Link hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| local_node_hash_id | uuid | FK to ls_nodes (local end) |
| remote_node_hash_id | uuid | FK to ls_nodes (remote end) |
| local_router_id | varchar(46) | Local BGP Router ID |
| remote_router_id | varchar(46) | Remote BGP Router ID |
| local_igp_router_id | varchar(46) | Local IGP Router ID |
| remote_igp_router_id | varchar(46) | Remote IGP Router ID |
| interface_addr | inet | Local interface IP |
| neighbor_addr | inet | Remote interface IP |
| igp_metric | bigint | IGP metric |
| protocol | ls_proto | IGP protocol |
| mt_id | int | Multi-Topology ID |
| local_link_id | bigint | Local link identifier |
| remote_link_id | bigint | Remote link identifier |
| name | varchar(255) | Link name |
| **admin_group** | **bigint** | **TE admin group / link color bitmap** |
| **max_link_bw** | **bigint** | **Maximum link bandwidth (bytes/sec)** |
| **max_resv_bw** | **bigint** | **Maximum reservable bandwidth** |
| **unreserved_bw** | **varchar(128)** | **Unreserved BW per priority (8 values)** |
| **te_def_metric** | **bigint** | **TE default metric (for CSPF)** |
| **protection_type** | **varchar(60)** | **Link protection (FRR type)** |
| **mpls_proto_mask** | **ls_mpls_proto_mask** | **MPLS protocol support flags** |
| **srlg** | **varchar(128)** | **Shared Risk Link Group** |
| **peer_node_sid** | **varchar(128)** | **SR Peer Node SID (EPE, RFC 9086)** |
| **sr_adjacency_sids** | **varchar(255)** | **SR Adjacency SIDs** |
| iswithdrawn | boolean | Withdrawn flag |
**Bold** = TE/SR fields available via BGP-LS but not used by default dashboards.
### ls_prefixes
IS-IS / OSPF prefix information.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Prefix hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| local_node_hash_id | uuid | FK to ls_nodes |
| prefix | inet | Advertised prefix |
| prefix_len | smallint | Prefix length |
| protocol | ls_proto | IGP protocol |
| metric | bigint | Prefix metric |
| mt_id | int | Multi-Topology ID |
| ospf_route_type | ospf_route_type | Intra/Inter/Ext-1/Ext-2/NSSA |
| igp_flags | varchar(20) | IGP flags |
| route_tag | bigint | Route tag |
| **sr_prefix_sids** | **varchar(255)** | **SR Prefix SIDs (node SIDs)** |
| iswithdrawn | boolean | Withdrawn flag |
### ls_nodes_log, ls_links_log, ls_prefixes_log (TimescaleDB)
Historical link-state changes. Each log table has the same columns as its parent table plus an `id` (bigint) column, and uses `timestamp` as the partition key.
---
## Statistics Tables (TimescaleDB)
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| **stat_reports** | BMP stat messages per peer | prefixes_rejected, known_dup_prefixes, num_routes_adj_rib_in, num_routes_local_rib |
| **stats_chg_byprefix** | Per-prefix update/withdrawal counts | interval_time, prefix, updates, withdraws |
| **stats_chg_byasn** | Per-ASN update/withdrawal counts | interval_time, origin_as, updates, withdraws |
| **stats_chg_bypeer** | Per-peer update/withdrawal counts | interval_time, updates, withdraws |
| **stats_peer_rib** | Per-peer RIB size over time | interval_time, v4_prefixes, v6_prefixes |
| **stats_peer_update_counts** | Update rate statistics | interval_time, advertise_avg/min/max, withdraw_avg/min/max |
| **stats_ip_origins** | Per-ASN IP prefix counts | interval_time, asn, v4_prefixes, v6_prefixes, v4_with_rpki, v4_with_irr |
| **stats_l3vpn_chg_byprefix** | L3VPN per-prefix stats | interval_time, rd, prefix, updates, withdraws |
| **stats_l3vpn_chg_bypeer** | L3VPN per-peer stats | interval_time, updates, withdraws |
| **stats_l3vpn_chg_byrd** | L3VPN per-RD stats | interval_time, rd, updates, withdraws |
---
## Reference & Enrichment Tables
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| **rpki_validator** | RPKI ROAs | prefix, prefix_len, prefix_len_max, origin_as |
| **info_asn** | ASN WHOIS/IRR data | asn, as_name, org_name, country, source |
| **info_route** | Route IRR data | prefix, origin_as, descr, source |
| **geo_ip** | IP geolocation (DB-IP) | ip, country, city, latitude, longitude, isp_name |
| **pdb_exchange_peers** | PeeringDB IXP peering | ix_name, peer_name, peer_asn, speed, peer_ipv4/ipv6 |
---
## Views
| View | Joins | Purpose |
|------|-------|---------|
| **v_peers** | bgp_peers + routers + info_asn | Complete peer info with router name and ASN details |
| **v_ip_routes** | ip_rib + bgp_peers + base_attrs + routers | Full route detail with path attributes |
| **v_ip_routes_geo** | v_ip_routes + geo_ip | Routes with geolocation |
| **v_ip_routes_history** | ip_rib_log + base_attrs + bgp_peers + routers | Historical route changes with attributes |
| **v_l3vpn_routes** | l3vpn_rib + bgp_peers + base_attrs + routers | L3VPN routes with path attributes |
| **v_l3vpn_routes_history** | l3vpn_rib_log + base_attrs + bgp_peers + routers | Historical L3VPN changes |
| **v_ls_nodes** | ls_nodes + base_attrs + bgp_peers + routers | Link-state nodes with peer/router info |
| **v_ls_links** | ls_links + ls_nodes(x2) + routers | Links with local/remote node names + all TE/SR fields |
| **v_ls_prefixes** | ls_prefixes + ls_nodes + routers | LS prefixes with originating node info |
---
## Custom Enum Types
| Type | Values |
|------|--------|
| **opstate** | up, down |
| **ls_proto** | IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3, Direct, Static |
| **ospf_route_type** | Intra, Inter, Ext-1, Ext-2, NSSA-1, NSSA-2 |
| **ls_mpls_proto_mask** | MPLS protocol bitmask |
| **user_role** | admin, oper |
---
## Key Query Patterns
### Get all active routes with full attributes
```sql
SELECT r.prefix, r.prefix_len, ba.origin_as, ba.as_path,
       ba.med, ba.local_pref, ba.community_list, ba.next_hop
FROM ip_rib r
JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id
WHERE r.iswithdrawn = false AND r.isipv4 = true
```
### Get link-state topology with TE attributes
```sql
SELECT local_router_name, remote_router_name,
       igp_metric, te_def_metric, max_link_bw, admin_group, srlg,
       sr_adjacency_sids
FROM v_ls_links
WHERE peer_hash_id = '<peer_hash>' AND iswithdrawn = false
```
### Time-series RIB changes
```sql
SELECT date_trunc('minute', timestamp) as time,
       SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as ads,
       SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as withdrawals
FROM ip_rib_log
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY 1 ORDER BY 1
```
### RPKI validation status
```sql
SELECT CASE
         WHEN rv.origin_as IS NOT NULL AND rv.origin_as = r.origin_as THEN 'Valid'
         WHEN rv.origin_as IS NOT NULL THEN 'Invalid'
         ELSE 'NotFound'
       END as status,
       COUNT(*)
FROM ip_rib r
LEFT JOIN rpki_validator rv ON rv.prefix = r.prefix AND rv.prefix_len = r.prefix_len
WHERE r.iswithdrawn = false
GROUP BY 1
```
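
The query above joins ROAs on exact prefix and length, which is a simplification: full origin validation (RFC 6811) also treats a route as covered when a shorter ROA prefix contains it, and checks the route length against `prefix_len_max`. For spot-checking individual rows by hand, a minimal Python sketch of those rules follows. The `Roa` tuple and `validate` function are hypothetical names used only for illustration; their fields mirror the `rpki_validator` columns listed earlier.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    """One ROA row; fields mirror rpki_validator (hypothetical container)."""
    prefix: str          # ROA prefix in CIDR form, e.g. "10.0.0.0/16"
    prefix_len_max: int  # longest route length the ROA authorizes
    origin_as: int       # authorized origin ASN


def validate(route_prefix: str, route_origin_as: int, roas: list[Roa]) -> str:
    """Classify a route as Valid / Invalid / NotFound against a ROA list."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not contain the route.
        if roa_net.version != route.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches when length and origin AS both agree.
        if route.prefixlen <= roa.prefix_len_max and roa.origin_as == route_origin_as:
            return "Valid"
    # Covered by at least one ROA but matched by none: Invalid.
    return "Invalid" if covered else "NotFound"
```

With a single ROA for `10.0.0.0/16` (max length 24, AS 65001), a `/24` inside it from AS 65001 is Valid, the same `/24` from another AS is Invalid, and a `/25` is Invalid because it exceeds the max length.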

309
DOCS.md

@ -16,6 +16,8 @@
12. [Troubleshooting](#12-troubleshooting)
13. [Data Retention](#13-data-retention)
14. [Environment Variables Reference](#14-environment-variables-reference)
15. [gNMI Streaming Telemetry (Phase 4)](#15-gnmi-streaming-telemetry-phase-4)
16. [Traffic Generator (Phase 4)](#16-traffic-generator-phase-4)
---
@ -28,7 +30,7 @@ This is a **BGP Monitoring Platform (BMP) lab stack** deployed via Docker Compos
- Receives BMP (BGP Monitoring Protocol, RFC 7854) telemetry from routers on TCP port 5000
- Streams BMP data through Kafka into a TimescaleDB/PostgreSQL database
- Provides **23 Grafana dashboards** (17 operational + 6 learning-focused) for real-time and historical BGP analysis
- Provides **30 Grafana dashboards** (17 operational + 6 learning + 4 advanced analytics + 3 streaming telemetry) for real-time and historical BGP analysis
- Includes an **ExaBGP route injector** that peers with the two CORE routers and injects synthetic BGP routes, enabling testing of BGP policy, route propagation, and Grafana dashboards without needing internet connectivity
- Provides a **Vue 3 web UI** at `:5001` for point-and-click scenario management, live route tables, and peer monitoring
@ -64,7 +66,7 @@ IOS-XR Routers (9x, AS 65020)
PostgreSQL 14 + TimescaleDB
|
+---------> obmp-grafana (grafana/grafana:9.1.7) :3000
| 23 dashboards, PostgreSQL datasource
| 30 dashboards, PostgreSQL + InfluxDB datasources
+---------> obmp-whois (openbmp/whois:2.2.0) :4300
WHOIS query server backed by the DB
@ -73,6 +75,24 @@ ExaBGP (obmp-exabgp, built locally)
Peers eBGP to CORE-01 and CORE-02 (AS 65100 -> AS 65020)
HTTP API on :5050 — inject/withdraw routes on demand
Routes propagate via iBGP mesh to all 9 routers -> BMP -> DB -> Grafana
gNMI Streaming Telemetry (Phase 4):
IOS-XR Routers (gRPC :57400)
|
v
obmp-telegraf (telegraf:1.28 + gnmi plugin)
|
v
obmp-influxdb (influxdb:2.7) :8086
|
v
obmp-grafana (InfluxDB datasource -> Telemetry dashboards)
Traffic Generator (Phase 4):
obmp-traffic-gen (python:3.11 + Scapy + Flask) :5051
Dual-mode: sender (generate traffic) / responder (echo/log)
RFC 2544 testing, custom packet flows
obmp-traffic-gen-ui (Vue 3 + NGINX) :5002
```
### Container Summary
@ -87,7 +107,11 @@ ExaBGP (obmp-exabgp, built locally)
| obmp-grafana | grafana/grafana:9.1.7 | 3000 | Visualization |
| obmp-whois | openbmp/whois:2.2.0 | 4300 | WHOIS query server |
| obmp-exabgp | local build | 5050 (host net) | BGP route injector |
| obmp-exabgp-ui | local build | 5001 (host net) | Vue 3 web control panel |
| obmp-exabgp-ui | local build | 5001 (host net) | Route injector web UI |
| obmp-influxdb | influxdb:2.7 | 8086 | Time-series DB for telemetry |
| obmp-telegraf | local build | - (host net) | gNMI telemetry collector |
| obmp-traffic-gen | local build | 5051 (host net) | Scapy traffic generator |
| obmp-traffic-gen-ui | local build | 5002 (host net) | Traffic generator web UI |
---
@ -103,6 +127,37 @@ ExaBGP (obmp-exabgp, built locally)
## 4. Initial Setup (First Time)
### 4.0 Quick deploy (recommended)
`setup.sh` bootstraps a fresh host — it creates the data directories, syncs
Grafana provisioning, generates Authelia secrets, and renders config. It is
idempotent and safe to re-run.
```bash
git clone <this-repo-url>
cd obmp-docker
cp .env.example .env
$EDITOR .env # set HOST_IP and credentials; OBMP_DOMAIN / OBMP_COOKIE_DOMAIN only when OBMP_AUTH_MODE=authelia
./setup.sh
docker compose up -d # BMP collector core only
docker compose --profile test --profile auth up -d # full stack (lab tools + auth)
```
The stack uses Docker Compose **profiles**:
| Command | Brings up |
|---------|-----------|
| `docker compose up -d` | Collector core only — zookeeper, kafka, collector, psql, psql-app, grafana, whois |
| `docker compose --profile test up -d` | Core **+** ExaBGP, traffic generator, telegraf, influxdb |
| `docker compose --profile auth up -d` | Core **+** Authelia gateway and portal |
| `docker compose --profile test --profile auth up -d` | Everything |
The bare `docker compose up` is the shippable standalone BMP collector — it has
no dependency on the lab/test tooling.
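
When the stack is started from automation rather than by hand, the profile table above maps directly onto command-line arguments. The following is a small sketch of that mapping, assuming only the two profile names shown in the table (`test` and `auth`); `compose_up_argv` is a hypothetical helper and is not part of `setup.sh`.

```python
from typing import Iterable, List

# Profile names taken from the table above; core services need no profile.
KNOWN_PROFILES = {"test", "auth"}


def compose_up_argv(profiles: Iterable[str] = ()) -> List[str]:
    """Build the `docker compose ... up -d` argument list for the given profiles."""
    profiles = list(profiles)
    unknown = sorted(set(profiles) - KNOWN_PROFILES)
    if unknown:
        raise ValueError(f"unknown profile(s): {', '.join(unknown)}")
    argv = ["docker", "compose"]
    for name in profiles:
        # Each profile is passed as its own --profile flag.
        argv += ["--profile", name]
    return argv + ["up", "-d"]
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; an empty profile list yields the core-only command.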
The sections below (4.1–4.6) document the equivalent **manual** steps if you
prefer not to use `setup.sh`.
### 4.1 Clone the repository
```bash
@ -224,6 +279,22 @@ See `exabgp/iosxr_bgp_config.md` for a Python/ncclient script that pushes all of
Credentials: `username=webui`, `password=cisco`, port 830.
### 5.6 Bulk BMP config (`cml/proxmox_bmp_config.py`)
To point a whole lab of IOS-XR routers at the BMP collector at once,
`cml/proxmox_bmp_config.py` applies the `bmp server 1` block over SSH (IOS-XR
BMP config is not exposed via NETCONF YANG on current releases). It is
idempotent.
```bash
pip install paramiko
python3 cml/proxmox_bmp_config.py # all routers in the inventory
python3 cml/proxmox_bmp_config.py r9k-05 # a single router (smoke test)
```
Edit the `ROUTERS` list at the top of the script for your inventory and the
`COLLECTOR_HOST` constant for the collector address.
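
One common way to make a CLI push idempotent is to compare the desired lines with the router's running config and send only what is missing. The sketch below illustrates that idea only; it is not the code in `cml/proxmox_bmp_config.py`, `missing_lines` is a hypothetical helper, and the config lines and the `192.0.2.10` address in the usage note are placeholders.

```python
from typing import List


def missing_lines(running_config: str, desired: List[str]) -> List[str]:
    """Return the desired config lines that do not appear in the running config.

    This is a flat, whitespace-insensitive comparison. It does not track
    which parent block a line sits under, so it suits a single small block.
    """
    present = {line.strip() for line in running_config.splitlines()}
    return [line for line in desired if line.strip() not in present]
```

For example, with `desired = ["bmp server 1", " host 192.0.2.10 port 5000"]` and a running config that already contains `bmp server 1`, only the `host` line is returned; when nothing is returned, the push can be skipped entirely.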
---
## 6. Starting and Stopping
@ -312,6 +383,9 @@ python3 inject.py scenarios
| `convergence_test` | 10 | Prefixes for timing BGP convergence — announce then check ip_rib_log timestamps |
| `route_leak` | 10 | Real prefixes re-announced with short AS paths — simulates a route leak (community 65100:999) |
| `hijack_simulation` | 10 | Prefixes claimed directly by AS 65100 — simulates a prefix hijack (community 65100:hijack) |
| `te_community_steering` | 15 | Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green) |
| `origin_shift` | 5 | Prefixes with changed origin AS — simulates origin migration for anomaly detection |
| `path_diversity` | 10 | Same prefixes with different AS paths/MEDs — demonstrates best-path selection |
### 7.4 Load a scenario
@ -495,6 +569,23 @@ Six learning-focused dashboards in a separate folder, designed to teach BGP conc
> **RPKI note:** The `rpki_validator` table is populated by a cron job in `psql-app` every 2 hours. Dashboard `obmp-learn-04` will show zero counts until the cron runs — check `ENABLE_RPKI=1` in `docker-compose.yml`.
### Advanced Analytics Dashboards (folder: `OBMP-Learning`)
Four advanced dashboards that go beyond basic BMP monitoring, unlocking TE/SR data and providing heuristic analysis.
| Dashboard | UID | What it provides |
|-----------|-----|-----------------|
| Database Schema Map | `obmp-learn-07` | Interactive schema reference — live table row counts, entity relationships, column details for all 33 tables and 11 views |
| TE & Segment Routing Analytics | `obmp-learn-08` | Exposes TE/SR fields from BGP-LS: link bandwidth, admin groups, SRLG, SR SIDs, adjacency SIDs, protection types |
| Topology Change & Anomaly Detection | `obmp-learn-09` | Heuristic analysis: link state changes over time, origin AS hijack detection, convergence timeline, route consistency |
| Link Utilization & TE Thought Experiment | `obmp-learn-10` | BGP-LS capacity data (bandwidth, TE metrics) + integration guide for streaming telemetry (gNMI/MDT) |
> **TE/SR data note:** Some TE fields (admin_group, max_link_bw, srlg, sr_adjacency_sids) may be NULL if routers don't advertise those TLVs. Enable `mpls traffic-eng` under IS-IS and `segment-routing mpls` for full data.
### Database Schema Reference
A standalone database schema reference is also available at `DB_SCHEMA.md` in the repo root. It documents all 33 tables, 11 views, TE/SR columns, enum types, and common query patterns.
---
## 10. Sanity Checks
@ -810,3 +901,215 @@ Adjust in `docker-compose.yml` under the `psql-app` service environment block.
| Variable | Default | Description |
|----------|---------|-------------|
| `EXABGP_API` | `http://localhost:5050` | ExaBGP API base URL |
---
## 15. gNMI Streaming Telemetry (Phase 4)
### Overview
gNMI (gRPC Network Management Interface) adds **data-plane visibility** alongside BMP's control-plane monitoring. Telegraf collects real-time interface counters from all 9 IOS-XR routers via gNMI subscriptions and stores them in InfluxDB. Grafana queries InfluxDB for telemetry dashboards.
### Architecture
```
IOS-XR Routers (9x, gRPC port 57400)
|
gNMI subscriptions (10s sample)
|
v
obmp-telegraf (telegraf:1.28 + gnmi input plugin)
host networking → reaches routers on 10.100.0.x
|
v
obmp-influxdb (influxdb:2.7, port 8086)
bucket: "telemetry", org: "openbmp"
|
v
obmp-grafana (InfluxDB datasource, Flux queries)
3 dashboards in OBMP-Telemetry folder
```
### Enabling gRPC on Routers
The routers need gRPC enabled before Telegraf can collect telemetry. A NETCONF script is provided:
```bash
# From the host (requires ncclient: pip install ncclient)
cd /home/user/obmp-docker/gnmi
python3 gnmi_grpc_config.py
```
This connects to all 9 routers via NETCONF (port 830, credentials webui/cisco) and pushes:
```
grpc
port 57400
no-tls
```
**Verify on router:**
```
show grpc status
```
Expected: gRPC listening on port 57400.
### Telemetry Data Collected
Telegraf subscribes to two IOS-XR YANG paths at 10-second intervals:
| Subscription | YANG Path | Data |
|-------------|-----------|------|
| interface_counters | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters` | bytes/packets in/out, errors, drops, CRC |
| interface_rates | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/data-rate` | bits/sec in/out, packet rate |
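
The generic counters are cumulative totals, so throughput graphs are derived from the difference between consecutive samples. A minimal sketch of that arithmetic, assuming 64-bit octet counters and the 10-second sample interval mentioned above (`rate_bps` is a hypothetical helper; in Grafana the same step is typically done in the query itself):

```python
def rate_bps(prev_octets: int, curr_octets: int, interval_s: float,
             counter_bits: int = 64) -> float:
    """Convert two cumulative octet samples into bits per second."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:
        # Treat a decrease as a single counter wrap. A counter reset
        # (e.g. after a reload) looks the same and would give a bogus spike.
        delta += 1 << counter_bits
    return delta * 8 / interval_s
```

Two samples of 1000 and 13500 octets taken 10 seconds apart correspond to 12500 octets, or 10000 bits per second.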
### InfluxDB Access
- **URL:** `http://localhost:8086`
- **Org:** `openbmp`
- **Bucket:** `telemetry`
- **Token:** `openbmp-telemetry-token`
- **Retention:** 30 days
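The access details above are enough to query the bucket programmatically. The following sketch builds a Flux query request with the Python standard library, assuming the InfluxDB 2.x `/api/v2/query` endpoint, which takes the Flux script as the POST body with `Content-Type: application/vnd.flux`. The function names are hypothetical, and the token default is the lab value listed above.

```python
import urllib.parse
import urllib.request


def build_flux_request(flux: str,
                       base_url: str = "http://localhost:8086",
                       org: str = "openbmp",
                       token: str = "openbmp-telemetry-token") -> urllib.request.Request:
    """Build (but do not send) a Flux query request for InfluxDB 2.x."""
    url = f"{base_url}/api/v2/query?{urllib.parse.urlencode({'org': org})}"
    return urllib.request.Request(
        url,
        data=flux.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
        method="POST",
    )


def run_flux(flux: str) -> str:
    """Send the query and return the annotated-CSV response as text."""
    with urllib.request.urlopen(build_flux_request(flux), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Calling `run_flux('from(bucket:"telemetry") |> range(start: -5m) |> limit(n:5)')` should return a few CSV rows once Telegraf is writing data.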
### Grafana Telemetry Dashboards
Three dashboards in the **OBMP-Telemetry** folder:
| Dashboard | UID | Description |
|-----------|-----|-------------|
| Interface Utilization | obmp-telem-01 | Input/output bytes rate, packets rate, top interfaces by throughput |
| Interface Errors | obmp-telem-02 | CRC errors, input/output errors, drops, overruns |
| Combined BMP + Telemetry | obmp-telem-03 | Mixed datasource — BGP peer status (PostgreSQL) alongside interface counters (InfluxDB) |
All dashboards have `$router` and `$interface` template variables for filtering.
### Troubleshooting gNMI
```bash
# Check Telegraf logs for gNMI connection status
docker logs obmp-telegraf --tail 50
# Verify InfluxDB has data
curl -s -X POST "http://localhost:8086/api/v2/query?org=openbmp" \
  -H "Authorization: Token openbmp-telemetry-token" \
  -H "Content-Type: application/vnd.flux" \
  -H "Accept: application/csv" \
  --data 'from(bucket:"telemetry") |> range(start: -5m) |> limit(n:5)'
# Check InfluxDB health
curl http://localhost:8086/health
```
---
## 16. Traffic Generator (Phase 4)
### Overview
A portable, containerized traffic generator with a web UI for RFC 2544 testing and custom packet flows. Built with Scapy + Flask (backend) and Vue 3 + NGINX (frontend). The container supports **dual-mode operation**: sender (generate traffic) or responder (receive/echo packets).
### Accessing the UI
- **Web UI:** `http://localhost:5002`
- **API:** `http://localhost:5051`
### Dual-Mode Operation
Set via `TRAFFIC_GEN_MODE` environment variable in `docker-compose.yml`:
| Mode | Description |
|------|-------------|
| `sender` (default) | Generates traffic, runs RFC 2544 tests, sends custom flows |
| `responder` | Listens for incoming test packets, echoes/timestamps them, reports receive stats |
**Typical deployment:** One instance as `sender` on the host, optionally a second instance as `responder` on another endpoint. Without a responder, the sender uses ICMP echo for latency measurement (routers respond natively).
### Creating Flows
Use the **Flow Builder** panel (left sidebar) in the UI:
| Field | Default | Description |
|-------|---------|-------------|
| Name | - | Human-readable flow name |
| Destination IP | `10.100.0.100` | Target router IP |
| Source IP | `10.40.40.202` | Host IP |
| Protocol | UDP | UDP, TCP, or ICMP |
| Source Port | 50000 | (UDP/TCP only) |
| Destination Port | 5001 | (UDP/TCP only) |
| Frame Size | 512 | Packet size in bytes |
| Rate (pps) | 1000 | Packets per second |
| Duration | 30 | Seconds (0 = infinite) |
| DSCP | 0 | Differentiated Services Code Point |
After creating a flow, use the **Flows** tab to Start/Stop/Delete flows.
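The Flow Builder defaults translate directly into an offered load, which is useful to know before pointing a flow at a lab router. The sketch below models a flow with the defaults from the table and computes bits per second and total packets. The class and field names are hypothetical and do not reflect the API's actual JSON schema. It also assumes "Frame Size" is the full size counted on the wire, which the table does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flow:
    """Illustrative flow model; field names are assumptions, not the real schema."""
    name: str
    dst_ip: str = "10.100.0.100"
    src_ip: str = "10.40.40.202"
    protocol: str = "UDP"
    src_port: int = 50000
    dst_port: int = 5001
    frame_size: int = 512   # bytes
    rate_pps: int = 1000    # packets per second
    duration_s: int = 30    # 0 means run until stopped
    dscp: int = 0

    def offered_load_bps(self) -> int:
        """Offered load in bits per second: frame size x 8 x packet rate."""
        return self.frame_size * 8 * self.rate_pps

    def total_packets(self) -> Optional[int]:
        """Packets sent over the whole run, or None for an infinite flow."""
        return None if self.duration_s == 0 else self.rate_pps * self.duration_s
```

With the defaults, a flow offers 512 x 8 x 1000 = 4,096,000 bit/s (about 4.1 Mbit/s) and sends 30,000 packets in 30 seconds.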
### RFC 2544 Testing
Use the **Tests** tab to configure and run RFC 2544 tests:
| Test Type | Description |
|-----------|-------------|
| **Throughput** | Binary search for maximum zero-loss forwarding rate |
| **Latency** | Measure round-trip time at determined throughput rate |
| **Frame Loss** | Loss percentage vs. offered load curve |
| **Back-to-Back** | Maximum burst length at line rate with zero loss |
**Parameters:**
- **Base Flow:** Select a previously created flow as the test template
- **Frame Sizes:** Standard sizes: 64, 128, 256, 512, 1024, 1280, 1518 bytes
- **Trial Duration:** Per-frame-size test duration (5-300 sec)
- **Max Rate (pps):** Upper bound for binary search
- **Acceptable Loss %:** Threshold for pass/fail
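The throughput test's binary search can be sketched as follows. This is an illustration of the general RFC 2544 technique under stated assumptions, not the traffic generator's actual implementation: `measure_loss` is a hypothetical callable that runs one trial at a given rate and returns the loss percentage, and loss is assumed not to decrease as the rate rises.

```python
from typing import Callable


def find_throughput(measure_loss: Callable[[int], float],
                    max_rate_pps: int,
                    acceptable_loss_pct: float = 0.0,
                    resolution_pps: int = 1) -> int:
    """Return the highest rate (pps) whose measured loss is acceptable."""
    if max_rate_pps < 1 or resolution_pps < 1:
        raise ValueError("rates must be positive")
    # Try the upper bound first; if it passes there is nothing to search.
    if measure_loss(max_rate_pps) <= acceptable_loss_pct:
        return max_rate_pps
    passed, failed = 0, max_rate_pps  # 0 pps trivially passes
    while failed - passed > resolution_pps:
        rate = (passed + failed) // 2
        if measure_loss(rate) <= acceptable_loss_pct:
            passed = rate   # search higher
        else:
            failed = rate   # search lower
    return passed
```

Each iteration is one trial of the configured duration, so a coarser `resolution_pps` shortens the test at the cost of precision. In a real run this search would be repeated once per frame size.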
### Quick Presets
Six built-in presets are available in the **Tests** tab:
| Preset | Description |
|--------|-------------|
| quick_icmp | ICMP ping to CORE-01 at 10 pps |
| udp_flood_small | 64-byte UDP at 5000 pps |
| udp_flood_large | 1518-byte UDP at 1000 pps |
| rfc2544_throughput | Full throughput test with standard frame sizes |
| rfc2544_latency | Latency measurement with standard frame sizes |
| tcp_session | TCP flow at 500 pps |
### API Reference
| Method | Path | Description |
|--------|------|-------------|
| GET | `/healthz` | Health check + engine status |
| GET | `/interfaces` | Available network interfaces |
| GET | `/mode` | Current mode (sender/responder) |
| GET/POST | `/flows` | List / create flows |
| GET/PUT/DELETE | `/flows/<id>` | Get / update / delete flow |
| POST | `/flows/<id>/start` | Start sending |
| POST | `/flows/<id>/stop` | Stop sending |
| GET | `/flows/<id>/stats` | Real-time stats for a flow |
| GET/POST | `/tests` | List / create RFC 2544 tests |
| GET | `/tests/<id>` | Test details + results |
| POST | `/tests/<id>/start` | Start test execution |
| POST | `/tests/<id>/stop` | Abort test |
| GET | `/tests/<id>/results` | Exportable results |
| GET | `/presets` | Available test presets |
| POST | `/presets/<name>` | Create flow + test from preset |
| GET | `/stats/history` | Stats ring buffer (300 samples) |
| GET | `/responder/stats` | Responder-mode receive stats |
| POST | `/responder/reset` | Reset responder counters |
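The endpoints above can be driven from a script as well as from the UI. The sketch below wraps them with the Python standard library. The paths and the port come from the table and the access section, but the JSON request and response shapes are assumptions, since this document does not specify them, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Optional

API_BASE = "http://localhost:5051"


def api_request(method: str, path: str, payload: Optional[dict] = None,
                base: str = API_BASE) -> urllib.request.Request:
    """Build a request for the traffic generator API (JSON body is assumed)."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(f"{base}{path}", data=data,
                                  headers=headers, method=method)


def call(req: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
        return json.loads(body) if body else None
```

A typical sequence would be `call(api_request("GET", "/healthz"))`, then a `POST` to `/flows`, then a `POST` to `/flows/<id>/start` with the id returned by the create call.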
### Integration with gNMI Telemetry
The key value of combining the traffic generator with gNMI: **send traffic while watching real-time interface counters**.
1. Create a UDP flow targeting a router (e.g., R9K-01 at 10.100.0.1)
2. Open the Grafana **Interface Utilization** dashboard, select that router
3. Start the flow — gNMI counters show traffic appearing on the interface
4. Run an RFC 2544 throughput test — Grafana shows the stepped traffic pattern from binary search iterations
5. Compare Scapy-reported stats with gNMI-reported counters for cross-validation
The **Combined BMP + Telemetry** dashboard shows both control-plane (BMP BGP updates) and data-plane (gNMI interface counters) side by side, enabling correlation of BGP changes with traffic impact.
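The cross-validation in step 5 comes down to comparing two counts over the same time window. A minimal sketch, assuming both sides are expressed in the same unit (packets or bytes) and that a small tolerance is acceptable because the 10-second gNMI samples rarely align exactly with the flow's start and stop; the function names and the 2% default are illustrative choices, not values from this repo.

```python
def mismatch_pct(sent: int, observed: int) -> float:
    """Absolute difference between observed and sent, as a percentage of sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return abs(observed - sent) * 100 / sent


def counters_agree(sent: int, observed: int, tolerance_pct: float = 2.0) -> bool:
    """True when the generator and gNMI counts match within the tolerance."""
    return mismatch_pct(sent, observed) <= tolerance_pct
```

Background traffic on the interface (routing protocol packets, for example) will push the gNMI count above the generator's count, so a mismatch in that direction is not necessarily loss.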
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `TRAFFIC_GEN_API_PORT` | `5051` | Flask API listen port |
| `TRAFFIC_GEN_MODE` | `sender` | Operating mode: `sender` or `responder` |
| `INFLUXDB_TOKEN` | `openbmp-telemetry-token` | InfluxDB auth token (Telegraf) |


@ -35,6 +35,15 @@ Each docker file contains a readme file, see below:
## Using Docker Compose to run everything
> **Quick start (recommended):** copy `.env.example` to `.env`, fill it in, and
> run `./setup.sh` — it creates the data directories, syncs Grafana
> provisioning, and generates Authelia secrets. Then:
> ```
> docker compose up -d # BMP collector core
> docker compose --profile test --profile auth up -d # full stack
> ```
> See [DOCS.md](DOCS.md) section 4 for details and the manual alternative below.
### Install Docker Compose
You will need Docker Compose; install it by following the [Docker Compose](https://docs.docker.com/compose/install/)
installation instructions. Docker Compose runs everything, including restarting containers.


@ -0,0 +1,51 @@
---
# Authelia configuration template.
# setup.sh renders this to ${OBMP_DATA_ROOT}/authelia/configuration.yml,
# substituting the ${...} values from .env. Only rendered if the target
# file does not already exist — an existing deployment is never overwritten.
theme: dark

server:
  address: 'tcp://0.0.0.0:9091/authelia'
  endpoints:
    authz:
      forward-auth:
        implementation: ForwardAuth

log:
  level: info

totp:
  issuer: openbmp

authentication_backend:
  file:
    path: /config/users_database.yml
    password:
      algorithm: bcrypt
      iterations: 12

session:
  name: authelia_session
  secret: ${AUTHELIA_SESSION_SECRET}
  expiration: 12h
  inactivity: 6h
  cookies:
    - domain: ${OBMP_COOKIE_DOMAIN}
      authelia_url: https://${OBMP_DOMAIN}/authelia

identity_validation:
  reset_password:
    jwt_secret: ${AUTHELIA_JWT_SECRET}

storage:
  local:
    path: /config/db.sqlite3
  encryption_key: ${AUTHELIA_STORAGE_ENCRYPTION_KEY}

access_control:
  default_policy: one_factor

notifier:
  filesystem:
    filename: /config/notification.txt


@ -0,0 +1,15 @@
---
# Authelia user database template.
# setup.sh copies this to ${OBMP_DATA_ROOT}/authelia/users_database.yml only
# if that file does not already exist. The bcrypt hash below is the default
# demo account (username: openbmp). Change it after first login, or generate
# a new hash with:
# docker run --rm authelia/authelia:4.38 \
# authelia crypto hash generate bcrypt --password '<new-password>'
users:
  openbmp:
    displayname: "OpenBMP Demo"
    password: "$2b$12$KQiQo1bYWqadD51HlgfgO.M1JfVlA5qP2YVRoBMTPmWq6osPljUTW"
    email: demo@apodacalab.com
    groups:
      - admins

53
cml/build-cml-image.sh Executable file

@ -0,0 +1,53 @@
#!/bin/bash
# Build the ExaBGP Docker image and export it for CML 2.9 import.
#
# Usage:
# ./cml/build-cml-image.sh
#
# Output:
# /tmp/obmp-exabgp.tar — upload this to CML via:
# Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads
#
# After upload, also import the node + image definition YAMLs:
# Tools > Node and Image Definitions > Import > cml/exabgp-node-definition.yaml
# Tools > Node and Image Definitions > Import > cml/exabgp-image-definition.yaml
set -e
cd "$(dirname "$0")/.."
echo "=== Building ExaBGP Docker image ==="
docker build -t obmp-exabgp:latest ./exabgp/
echo ""
echo "=== Exporting image to /tmp/obmp-exabgp.tar ==="
docker save -o /tmp/obmp-exabgp.tar obmp-exabgp:latest
echo ""
echo "=== Image details ==="
SIZE=$(du -h /tmp/obmp-exabgp.tar | cut -f1)
echo " File: /tmp/obmp-exabgp.tar ($SIZE)"
SHA=$(sha256sum /tmp/obmp-exabgp.tar | awk '{print $1}')
echo " SHA256: $SHA"
IMAGE_ID=$(docker image inspect obmp-exabgp:latest --format='{{.Id}}')
echo " Image ID: $IMAGE_ID"
echo ""
echo "=== Next steps ==="
echo "1. Update cml/exabgp-image-definition.yaml with:"
echo " sha256: $SHA"
echo ""
echo "2. Upload to CML:"
echo " a. Tools > Node and Image Definitions > Import"
echo " Upload: cml/exabgp-node-definition.yaml"
echo " b. Tools > Node and Image Definitions > Import"
echo " Upload: cml/exabgp-image-definition.yaml"
echo " c. Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads"
echo " Upload: /tmp/obmp-exabgp.tar"
echo ""
echo "3. In your CML lab topology:"
echo " a. Drag 'ExaBGP Route Injector' from the node palette"
echo " b. Draw links to CORE-01 and CORE-02"
echo " c. Edit the boot.sh in the node config to set correct IPs"
echo " d. Start the node"

62
cml/build-xrd-image.sh Executable file

@ -0,0 +1,62 @@
#!/bin/bash
# Export the XRd control-plane Docker image for CML 2.9 import.
#
# Usage:
# ./cml/build-xrd-image.sh
#
# The XRd image already exists locally (ios-xr/xrd-control-plane:25.1.1).
# This script just exports it to a .tar file for CML upload.
set -e
IMAGE="ios-xr/xrd-control-plane:25.1.1"
OUTPUT="/tmp/xrd-control-plane.tar"
echo "=== Verifying XRd image exists ==="
if ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
echo "ERROR: Image $IMAGE not found locally."
echo "Check with: docker images | grep xrd"
exit 1
fi
echo " Image: $IMAGE"
SIZE=$(docker image inspect "$IMAGE" --format='{{.Size}}' | numfmt --to=iec 2>/dev/null || echo "unknown")
echo " Size: $SIZE"
echo ""
echo "=== Exporting image to $OUTPUT ==="
echo " (This may take a minute for ~1.3GB image...)"
docker save -o "$OUTPUT" "$IMAGE"
echo ""
echo "=== Export complete ==="
TAR_SIZE=$(du -h "$OUTPUT" | cut -f1)
echo " File: $OUTPUT ($TAR_SIZE)"
SHA=$(sha256sum "$OUTPUT" | awk '{print $1}')
echo " SHA256: $SHA"
echo ""
echo "=== Next steps ==="
echo "1. Update cml/xrd-image-definition.yaml with:"
echo " sha256: $SHA"
echo ""
echo "2. Upload to CML:"
echo " a. Tools > Node and Image Definitions > Import"
echo " Upload: cml/xrd-node-definition.yaml"
echo " b. Tools > Node and Image Definitions > Import"
echo " Upload: cml/xrd-image-definition.yaml"
echo " c. Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads"
echo " Upload: $OUTPUT"
echo " (For large files, consider SCP to CML server instead)"
echo ""
echo "3. In your CML lab topology:"
echo " a. Drag 'XRd Control-Plane (IOS-XR)' from the node palette"
echo " b. Draw links to CORE-01 (→Gi0/0/0/0) and CORE-02 (→Gi0/0/0/1)"
echo " c. Edit xrd-startup.cfg if needed (IPs, BMP target, etc.)"
echo " d. Start the node (allow ~3-5 min for XRd boot)"
echo ""
echo "4. After boot, verify via XRd console:"
echo " show isis adjacency"
echo " show bgp summary"
echo " show bmp server 1"


@ -0,0 +1,10 @@
id: obmp-exabgp.latest
node_definition_id: obmp-exabgp
description: |-
OpenBMP ExaBGP Route Injector
Python 3.11 + ExaBGP + Flask API for BGP route injection testing.
label: ExaBGP Route Injector
disk_image: obmp-exabgp.tar
read_only: false
schema_version: 0.0.1
# sha256: <UPDATE after running: sha256sum /tmp/obmp-exabgp.tar>


@ -0,0 +1,112 @@
id: obmp-exabgp
boot:
timeout: 60
completed:
- "ExaBGP Route Injector"
uses_regex: false
sim:
linux_native:
libvirt_domain_driver: docker
driver: ubuntu
ram: 512
cpus: 1
cpu_limit: 100
video:
memory: 1
general:
nature: server
description: OpenBMP ExaBGP Route Injector (Docker container)
read_only: false
configuration:
generator:
driver: null
provisioning:
files:
- editable: false
name: config.json
content: |-
{
"docker": {
"image": "obmp-exabgp:latest",
"mounts": [
"type=bind,source=cfg/boot.sh,target=/cml-boot.sh"
],
"misc_args": [],
"env": [
"EXABGP_LOCAL_AS=65100",
"EXABGP_PEER_AS=65020",
"EXABGP_API_PORT=5050"
]
},
"shell": "/bin/bash",
"day0cmd": [ "/bin/bash", "/cml-boot.sh" ],
"busybox": false
}
- editable: true
name: boot.sh
content: |-
#!/bin/bash
# CML boot script for ExaBGP container
# Configures data-plane interfaces before starting ExaBGP
#
# Interface mapping (assigned by CML topology links):
# eth0 = first connected interface (data-plane link 1)
# eth1 = second connected interface (data-plane link 2)
# ...additional interfaces as connected in topology
#
# Edit the IPs below to match your topology addressing.
# These are examples using 10.120.x.x/30 point-to-point links.
# --- Data-plane interface configuration ---
# Link to CORE-01: ExaBGP=10.120.1.2/30, CORE-01=10.120.1.1/30
ip address add 10.120.1.2/30 dev eth0
ip link set dev eth0 up
# Link to CORE-02: ExaBGP=10.120.2.2/30, CORE-02=10.120.2.1/30
ip address add 10.120.2.2/30 dev eth1
ip link set dev eth1 up
# --- Set environment for ExaBGP peering ---
export EXABGP_LOCAL_IP=10.120.1.2
export EXABGP_PEER_1=10.120.1.1
export EXABGP_PEER_2=10.120.2.1
# --- Start ExaBGP ---
exec /bin/bash /exabgp/startup.sh
media_type: raw
volume_name: cfg
device:
interfaces:
serial_ports: 1
physical:
- eth0
- eth1
- eth2
- eth3
has_loopback_zero: false
default_count: 2
ui:
label_prefix: exabgp-
icon: server
label: ExaBGP Route Injector
visible: true
group: Others
description: |-
OpenBMP ExaBGP Route Injector
BGP route injection for OpenBMP testing.
AS 65100 (eBGP) peering with IOS-XR routers (AS 65020).
Flask API on port 5050 for route management.
inherited:
image:
ram: true
cpus: false
data_volume: false
boot_disk_size: false
cpu_limit: false
node:
ram: true
cpus: false
data_volume: false
boot_disk_size: false
cpu_limit: false
schema_version: 0.0.1

215
cml/gobgp_peering_config.py Normal file

@ -0,0 +1,215 @@
#!/usr/bin/env python3
"""Peer the CML core routers with the GoBGP full-table feed (roadmap E1).
GoBGP (AS65001, 10.40.40.250) holds the full real Internet table pulled from
the Bromirski route server. This script configures CORE-01/CORE-02 (AS65020)
to peer eBGP with GoBGP and accept that table. As route reflectors the cores
then propagate it to every R9K client -- so all 9 lab routers carry and
BMP-export a full table. This is an intentional lab stress test of the
OpenBMP ingestion/storage path.
Applied per core (additive -- no existing session/policy is modified):
* route-policy GOBGP-FEED-PASS (a plain `pass` policy; eBGP needs one)
* neighbor 10.40.40.202 remote-as 65001, ebgp-multihop, mgmt-sourced,
IPv4-unicast only, with a maximum-prefix safety cap.
The matching GoBGP side is gobgp/gobgpd.conf (neighbors 10.100.0.100/.200);
restart GoBGP after applying: docker compose up -d gobgp
IOS-XR BMP config is not exposed via NETCONF on this release, so -- like
cml/proxmox_bmp_config.py -- this applies config over the SSH CLI.
Covers both labs: CML cores (AS65020) and PROX cores (AS65021).
Usage:
python3 cml/gobgp_peering_config.py # apply, all 4 cores
python3 cml/gobgp_peering_config.py cml # apply, CML cores only
python3 cml/gobgp_peering_config.py prox # apply, PROX cores only
python3 cml/gobgp_peering_config.py --remove # ROLLBACK, all cores
python3 cml/gobgp_peering_config.py --remove prox # ROLLBACK, PROX only
Rollback: `--remove` deletes the GoBGP neighbor and the GOBGP-FEED-PASS
policy from the cores. To stop the feed instantly without touching router
config, `docker compose stop gobgp` -- the eBGP sessions drop and the full
table is withdrawn fleet-wide within seconds. See gobgp/README.md.
"""
import os
import sys
import time

import paramiko


def _env_default(key, default, dotenv=".env"):
    """Resolve a value from os.environ or the repo-root .env, else default."""
    v = os.environ.get(key)
    if v:
        return v
    try:
        with open(dotenv) as fh:
            for line in fh:
                s = line.strip()
                if s and not s.startswith("#") and s.startswith(f"{key}="):
                    return s.split("=", 1)[1].strip().strip('"').strip("'")
    except FileNotFoundError:
        pass
    return default
# GoBGP runs network_mode: host, so it sources BGP TCP from the host's real
# interface IP -- NOT its router-id. The cores must peer with the host IP.
# Resolved from $HOST_IP or the HOST_IP= line in repo-root .env.
GOBGP_IP = _env_default("HOST_IP", "10.40.40.202")
GOBGP_AS = "65001"
# Additive config, built per core (asn = that core's local BGP AS:
# CML lab = 65020, PROX lab = 65021). Flat formal-form lines applied at the
# (config)# prompt.
# IPv4-unicast only: the cores have no global IPv6 address, so an ipv6-unicast
# AF on this IPv4-transport session holds the whole neighbor Idle. The IPv6
# full-table feed is a separate phase (needs a v6-transport session or v6
# addressing on the cores).
def apply_lines(asn):
    n = f"router bgp {asn} neighbor {GOBGP_IP}"
    return [
        "route-policy GOBGP-FEED-PASS",
        " pass",
        "end-policy",
        f"{n} remote-as {GOBGP_AS}",
        f"{n} description GoBGP full-table feed (lab stress test)",
        f"{n} ebgp-multihop 64",
        f"{n} update-source MgmtEth0/RP0/CPU0/0",
        f"{n} address-family ipv4 unicast route-policy GOBGP-FEED-PASS in",
        f"{n} address-family ipv4 unicast route-policy GOBGP-FEED-PASS out",
        f"{n} address-family ipv4 unicast maximum-prefix 1500000 90",
    ]


# Rollback -- remove the neighbor (and its sub-config) then the policy.
def remove_lines(asn):
    return [
        f"no router bgp {asn} neighbor {GOBGP_IP}",
        "no route-policy GOBGP-FEED-PASS",
    ]
# (name, mgmt_ip, user, password, local_asn) -- both labs.
CORES = [
("CML-CORE-01", "10.100.0.100", "webui", "cisco", "65020"),
("CML-CORE-02", "10.100.0.200", "webui", "cisco", "65020"),
("PROX-CORE-01", "10.100.1.100", "admin", "cisco", "65021"),
("PROX-CORE-02", "10.100.1.200", "admin", "cisco", "65021"),
]
def _drain(shell, settle=0.4, limit=20.0, until=None):
    out, start = "", time.time()
    while time.time() - start < limit:
        time.sleep(settle)
        if shell.recv_ready():
            out += shell.recv(65535).decode(errors="replace")
            if until and until in out:
                break
        elif until is None:
            break
        elif until in out:
            break
    return out
def configure_core(name, ip, user, pwd, asn, mode):
    verb = "applying" if mode == "apply" else "removing"
    lines = apply_lines(asn) if mode == "apply" else remove_lines(asn)
    print(f"\n=== {name} ({ip}) AS{asn} -- {verb} GoBGP peering ===")
    try:
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(ip, username=user, password=pwd, timeout=15,
                    look_for_keys=False, allow_agent=False)
        shell = ssh.invoke_shell(width=220, height=2000)
        time.sleep(2)
        shell.recv(65535)
        CFG = "(config)#"
        shell.send("terminal length 0\n")
        _drain(shell, 0.4, 5)
        shell.send("configure terminal\n")
        out = _drain(shell, 0.4, 15, until=CFG)
        if CFG not in out:
            print(f" FAIL: could not enter config mode\n {out[-200:]}")
            ssh.close()
            return False
        for line in lines:
            shell.send(line + "\n")
            time.sleep(0.4)
        _drain(shell, 0.3, 8, until=CFG)
        shell.send("show configuration\n")
        cand = _drain(shell, 0.3, 10, until=CFG)
        if GOBGP_IP not in cand and "GOBGP-FEED-PASS" not in cand:
            print(f" OK: no changes ({mode} already in effect)")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return True
        shell.send("commit\n")
        result = _drain(shell, 0.3, 25, until=CFG)
        if "fail" in result.lower() or "error" in result.lower():
            print(f" FAIL: commit error\n {result[-300:]}")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return False
        shell.send("end\n")
        _drain(shell, 1.0, 8)
        if mode == "apply":
            shell.send(f"show bgp ipv4 unicast neighbors {GOBGP_IP} | include BGP state\n")
            verify = _drain(shell, 1.0, 12)
            state = next((l.strip() for l in verify.splitlines()
                          if "BGP state" in l), "(state not yet reported)")
            print(f" committed. {state}")
        else:
            shell.send(f"show running-config router bgp | include {GOBGP_IP}\n")
            verify = _drain(shell, 1.0, 12)
            gone = GOBGP_IP not in verify.replace(f"include {GOBGP_IP}", "")
            print(f" committed. neighbor removed: {gone}")
        ssh.close()
        return True
    except Exception as e:
        print(f" FAIL: {e}")
        return False
def main():
    args = [a for a in sys.argv[1:]]
    mode = "apply"
    if "--remove" in args:
        mode = "remove"
        args.remove("--remove")
    target = args[0].lower() if args else None
    if mode == "remove":
        print("ROLLBACK: removing GoBGP peering from the core routers.")
    results = {}
    for name, ip, user, pwd, asn in CORES:
        if target and target not in name.lower():
            continue
        results[name] = configure_core(name, ip, user, pwd, asn, mode)
    print(f"\n{'='*48}\n SUMMARY ({mode})")
    for name, ok in results.items():
        print(f" {name:22s} {'OK' if ok else 'FAILED'}")
    if mode == "apply":
        print("\nNext: restart GoBGP to load the new neighbors:")
        print(" docker compose up -d gobgp")
    else:
        print("\nGoBGP container config still lists the cores; that is inert")
        print("with the neighbors removed. To fully revert, also restore the")
        print("previous gobgp/gobgpd.conf and run: docker compose up -d gobgp")
    sys.exit(0 if all(results.values()) else 1)


if __name__ == "__main__":
    main()

191
cml/proxmox_bmp_config.py Normal file

@ -0,0 +1,191 @@
#!/usr/bin/env python3
"""Apply the OpenBMP `bmp server 1` config to the Proxmox CML lab routers.
IOS-XR BMP configuration is not exposed via the device's NETCONF YANG schema
on this release, so this applies config over the SSH CLI. It is idempotent:
re-applying an identical block commits no changes.
PROX-R9K-03 was built without `bmp-activate` on its BGP neighbor-group; this
script adds it (the other 8 routers already have it from the re-addressing).
Usage:
pip install paramiko
python3 cml/proxmox_bmp_config.py # all 9 routers
python3 cml/proxmox_bmp_config.py r9k-05 # one router (smoke test)
Verify afterwards in OpenBMP:
docker exec -i obmp-psql psql -U openbmp -d openbmp \\
-c "SELECT name, ip_address, bgp_id, isconnected FROM routers ORDER BY name;"
"""
import os
import sys
import time

import paramiko


def _env_default(key, default, dotenv=".env"):
    """Resolve a value from os.environ or the repo-root .env, else default."""
    v = os.environ.get(key)
    if v:
        return v
    try:
        with open(dotenv) as fh:
            for line in fh:
                s = line.strip()
                if s and not s.startswith("#") and s.startswith(f"{key}="):
                    return s.split("=", 1)[1].strip().strip('"').strip("'")
    except FileNotFoundError:
        pass
    return default
# --- BMP collector ---------------------------------------------------------
# Resolved from $HOST_IP or the HOST_IP= line in repo-root .env.
COLLECTOR_HOST = _env_default("HOST_IP", "10.40.40.202")
COLLECTOR_PORT = "5000"
# `bmp server 1` block — flat formal form, identical to the ESXi lab.
# Each line is self-contained and applied at the (config)# prompt; a bare
# "bmp server 1" is deliberately omitted (it would drop into the bmp submode
# and the remaining flat lines would then be invalid).
BMP_LINES = [
f"bmp server 1 host {COLLECTOR_HOST} port {COLLECTOR_PORT}",
"bmp server 1 description OpenBMP-Collector",
"bmp server 1 update-source MgmtEth0/RP0/CPU0/0",
"bmp server 1 initial-delay 60",
"bmp server 1 stats-reporting-period 300",
"bmp server 1 initial-refresh delay 60 spread 30",
]
# Only PROX-R9K-03 needs this — its BMP-MONITORED neighbor-group was built
# without bmp-activate. AS 65021 is the Proxmox lab.
BMP_ACTIVATE_LINE = "router bgp 65021 neighbor-group BMP-MONITORED bmp-activate server 1"
# --- router inventory ------------------------------------------------------
# (name, mgmt_ip, user, password, needs_bmp_activate)
ROUTERS = [
("PROX-R9K-CORE-01", "10.100.1.100", "admin", "cisco", False),
("PROX-R9K-CORE-02", "10.100.1.200", "admin", "cisco", False),
("PROX-R9K-01", "10.100.1.1", "webui", "cisco", False),
("PROX-R9K-02", "10.100.1.2", "webui", "cisco", False),
("PROX-R9K-03", "10.100.1.3", "webui", "cisco", True),
("PROX-R9K-04", "10.100.1.4", "webui", "cisco", False),
("PROX-R9K-05", "10.100.1.5", "webui", "cisco", False),
("PROX-R9K-06", "10.100.1.6", "webui", "cisco", False),
("PROX-R9K-07", "10.100.1.7", "admin", "cisco", False),
]
def _drain(shell, settle=1.0, limit=15.0, until=None):
    """Read from the shell.

    If `until` is given, keep reading until that substring appears (or `limit`
    elapses). Otherwise return once output stops arriving for `settle` seconds.
    """
    out = ""
    start = time.time()
    while time.time() - start < limit:
        time.sleep(settle)
        if shell.recv_ready():
            out += shell.recv(65535).decode(errors="replace")
            if until and until in out:
                break
        elif until is None:
            break
        elif until in out:
            break
    return out
def apply_router(name, ip, user, pwd, needs_activate):
    """Apply the BMP config to one router. Returns True on success."""
    print(f"\n=== {name} ({ip}) ===")
    lines = list(BMP_LINES)
    if needs_activate:
        lines.append(BMP_ACTIVATE_LINE)
    try:
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(ip, username=user, password=pwd, timeout=15,
                    look_for_keys=False, allow_agent=False)
        shell = ssh.invoke_shell(width=220, height=1000)
        time.sleep(2)
        shell.recv(65535)  # banner
        # "(config)#" is the universal IOS-XR config-prompt suffix; it is used
        # as the wait marker so the device hostname is irrelevant.
        CFG = "(config)#"
        shell.send("terminal length 0\n")
        _drain(shell, 0.5, 5)
        # Enter config mode. IOS-XR may print an active-session banner first,
        # so wait specifically for the (config) prompt.
        shell.send("configure terminal\n")
        out = _drain(shell, 0.4, 15, until=CFG)
        if CFG not in out:
            print(f" FAIL: could not enter config mode\n {out[-200:]}")
            ssh.close()
            return False
        # Send config lines, paced.
        for line in lines:
            shell.send(line + "\n")
            time.sleep(0.4)
        _drain(shell, 0.3, 8, until=CFG)
        # Confirm the candidate actually holds changes before committing.
        shell.send("show configuration\n")
        cand = _drain(shell, 0.3, 10, until=CFG)
        if "bmp server" not in cand:
            print(" OK: no changes (config already present) -- nothing to commit")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return True
        shell.send("commit\n")
        result = _drain(shell, 0.3, 25, until=CFG)
        if "fail" in result.lower() or "error" in result.lower():
            print(f" FAIL: commit error\n {result[-300:]}")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return False
        # Leave config mode and fully drain (settle-based, no marker) so the
        # verify output is clean, not contaminated by echoed config lines.
        shell.send("end\n")
        _drain(shell, 1.0, 10)
        shell.send("show run formal bmp\n")
        verify = _drain(shell, 1.0, 12)
        ok = f"host {COLLECTOR_HOST} port {COLLECTOR_PORT}" in verify
        print(f" {'OK' if ok else 'FAIL'}: bmp server 1 "
              f"{'present' if ok else 'NOT found'} in running config")
        ssh.close()
        return ok
    except Exception as e:
        print(f" FAIL: {e}")
        return False
def main():
    target = sys.argv[1].lower() if len(sys.argv) > 1 else None
    results = {}
    for name, ip, user, pwd, needs_activate in ROUTERS:
        if target and target not in name.lower():
            continue
        results[name] = apply_router(name, ip, user, pwd, needs_activate)
    print(f"\n{'='*48}\n SUMMARY")
    for name, ok in results.items():
        print(f" {name:22s} {'OK' if ok else 'FAILED'}")
    sys.exit(0 if all(results.values()) else 1)


if __name__ == "__main__":
    main()


@ -0,0 +1,10 @@
id: xrd-control-plane.25.1.1
node_definition_id: xrd-control-plane-rr
description: |-
Cisco XRd Control-Plane 25.1.1
IOS-XR containerized routing daemon for BGP/IS-IS/BMP workloads.
label: XRd Control-Plane 25.1.1
disk_image: xrd-control-plane.tar
read_only: false
schema_version: 0.0.1
# sha256: <UPDATE after running: sha256sum /tmp/xrd-control-plane.tar>


@ -0,0 +1,179 @@
id: xrd-control-plane-rr
boot:
timeout: 300
completed:
- "IOS XR RUN"
uses_regex: false
sim:
linux_native:
libvirt_domain_driver: docker
driver: ubuntu
ram: 2048
cpus: 2
cpu_limit: 100
video:
memory: 1
general:
nature: router
description: Cisco XRd Control-Plane - IOS-XR containerized routing daemon
read_only: false
configuration:
generator:
driver: null
provisioning:
files:
- editable: false
name: config.json
content: |-
{
"docker": {
"image": "ios-xr/xrd-control-plane:25.1.1",
"mounts": [
"type=bind,source=cfg/boot.sh,target=/cml-boot.sh",
"type=bind,source=cfg/xrd-startup.cfg,target=/etc/xrd/startup.cfg"
],
"misc_args": [
"--privileged"
],
"env": [
"XR_STARTUP_CFG=/etc/xrd/startup.cfg",
"XR_MGMT_INTERFACES=linux:eth0,chksum",
"XR_INTERFACES=linux:eth1,xr_name=Gi0/0/0/0;linux:eth2,xr_name=Gi0/0/0/1;linux:eth3,xr_name=Gi0/0/0/2;linux:eth4,xr_name=Gi0/0/0/3"
]
},
"shell": "/bin/bash",
"day0cmd": [ "/bin/bash", "/cml-boot.sh" ],
"busybox": false
}
- editable: true
name: boot.sh
content: |-
#!/bin/bash
# CML boot wrapper for XRd control-plane.
# XRd handles its own init — this script configures
# data-plane interfaces before XRd starts.
#
# Interface mapping (set via XR_INTERFACES env var):
# eth0 = MgmtEth0/RP0/CPU0/0 (CML mgmt)
# eth1 = Gi0/0/0/0 (data-plane link 1, e.g. to CORE-01)
# eth2 = Gi0/0/0/1 (data-plane link 2, e.g. to CORE-02)
# eth3+ = Gi0/0/0/2+ (additional links)
#
# Linux-level IP config is handled by XRd via startup.cfg.
# Just ensure interfaces are up.
for iface in eth0 eth1 eth2 eth3 eth4; do
[ -d /sys/class/net/$iface ] && ip link set dev $iface up
done
# XRd entrypoint
exec /usr/sbin/xrd
- editable: true
name: xrd-startup.cfg
content: |-
!! XRd Control-Plane - Third Route Reflector (RR3)
!! Peers with CORE-01 and CORE-02 as RR mesh (non-client iBGP)
!! Sends BMP to OpenBMP collector at 10.40.40.202:5000
!!
hostname XRd-RR3
!
interface Loopback0
ipv4 address 10.10.255.30 255.255.255.255
!
interface Gi0/0/0/0
description to-CORE-01
ipv4 address 10.120.3.2 255.255.255.252
no shutdown
!
interface Gi0/0/0/1
description to-CORE-02
ipv4 address 10.120.4.2 255.255.255.252
no shutdown
!
router isis 1
is-type level-2-only
net 49.0001.0100.1000.0030.00
address-family ipv4 unicast
metric-style wide
!
interface Loopback0
passive
address-family ipv4 unicast
!
!
interface Gi0/0/0/0
point-to-point
address-family ipv4 unicast
!
!
interface Gi0/0/0/1
point-to-point
address-family ipv4 unicast
!
!
!
router bgp 65020
bgp router-id 10.10.255.30
address-family ipv4 unicast
!
neighbor 10.10.255.0
remote-as 65020
update-source Loopback0
address-family ipv4 unicast
!
!
neighbor 10.10.255.20
remote-as 65020
update-source Loopback0
address-family ipv4 unicast
!
!
!
bmp server 1
host 10.40.40.202 port 5000
description OpenBMP
update-source Gi0/0/0/0
flapping-delay 60
initial-delay 5
stats-reporting-period 300
initial-refresh delay 30 spread 2
!
ssh server v2
end
media_type: raw
volume_name: cfg
device:
interfaces:
serial_ports: 1
physical:
- eth0
- eth1
- eth2
- eth3
- eth4
has_loopback_zero: false
default_count: 3
ui:
label_prefix: xrd-
icon: router
label: XRd Control-Plane (IOS-XR)
visible: true
group: Cisco
description: |-
Cisco XRd Control-Plane (IOS-XR 25.1.1)
Containerized IOS-XR routing daemon for control-plane workloads.
Full BGP, IS-IS, BMP, NETCONF support.
Configured as third Route Reflector (RR3) with BMP to OpenBMP.
inherited:
image:
ram: true
cpus: true
data_volume: false
boot_disk_size: false
cpu_limit: false
node:
ram: true
cpus: true
data_volume: false
boot_disk_size: false
cpu_limit: false
schema_version: 0.0.1


@ -1,5 +1,5 @@
---
version: '3'
name: obmp
volumes:
data-volume:
driver_opts:
@ -17,7 +17,14 @@ services:
zookeeper:
restart: unless-stopped
container_name: obmp-zookeeper
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/2181'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
image: confluentinc/cp-zookeeper:7.1.1
mem_limit: 1g
volumes:
- ${OBMP_DATA_ROOT}/zk-data:/var/lib/zookeeper/data
- ${OBMP_DATA_ROOT}/zk-log:/var/lib/zookeeper/log
@ -28,7 +35,15 @@ services:
kafka:
restart: unless-stopped
container_name: obmp-kafka
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/9092'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 90s
image: confluentinc/cp-kafka:7.1.1
# Raise KAFKA_MEM_LIMIT for production (full-table initial dumps are bursty).
mem_limit: ${KAFKA_MEM_LIMIT:-4g}
# Change the mount point to where you want to store Kafka data.
# Normally 80GB or more
@ -45,7 +60,7 @@ services:
# Change/add listeners based on your FQDN that the host and other containers can access. You can use
# an IP address as well. By default, only within the compose/containers can Kafka be accessed
# using port 29092. Outside access can be enabled, but you should use an FQDN listener.
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://10.40.40.202:9092
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://${HOST_IP:-10.40.40.202}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
@ -84,7 +99,14 @@ services:
grafana:
restart: unless-stopped
container_name: obmp-grafana
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:3000/api/health || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
image: grafana/grafana:9.1.7
mem_limit: 1g
ports:
- "3000:3000"
volumes:
@ -92,7 +114,17 @@ services:
- ${OBMP_DATA_ROOT}/grafana/provisioning:/etc/grafana/provisioning/
environment:
- GF_SECURITY_ADMIN_PASSWORD=openbmp
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ENABLED=false
# setup.sh writes GF_SERVER_ROOT_URL into .env based on OBMP_AUTH_MODE
# (http://HOST_IP:3000/grafana/ for local, https://OBMP_DOMAIN/grafana/
# for authelia). The fallback only matters if compose is run before
# setup.sh — keeps Grafana up at a sane URL.
- GF_SERVER_ROOT_URL=${GF_SERVER_ROOT_URL:-http://localhost:3000/grafana/}
- GF_SERVER_SERVE_FROM_SUB_PATH=true
- GF_AUTH_PROXY_ENABLED=true
- GF_AUTH_PROXY_HEADER_NAME=Remote-User
- GF_AUTH_PROXY_HEADER_PROPERTY=username
- GF_AUTH_PROXY_AUTO_SIGN_UP=true
- GF_USERS_HOME_PAGE=d/obmp-home/obmp-home
- GF_INSTALL_PLUGINS=agenty-flowcharting-panel,grafana-piechart-panel,grafana-worldmap-panel,grafana-simple-json-datasource,vonage-status-panel
@ -118,7 +150,15 @@ services:
psql:
restart: unless-stopped
container_name: obmp-psql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U openbmp -d openbmp"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
image: openbmp/postgres:2.2.1
# Raise PSQL_MEM_LIMIT for production (see docs/production-sizing.md).
mem_limit: ${PSQL_MEM_LIMIT:-6g}
privileged: true
shm_size: 1536m
sysctls:
@ -141,7 +181,14 @@ services:
collector:
restart: unless-stopped
container_name: obmp-collector
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5000'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
image: openbmp/collector:2.2.3
mem_limit: 2g
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -156,7 +203,22 @@ services:
psql-app:
restart: unless-stopped
container_name: obmp-psql-app
# Gate startup on Postgres being ready. psql-app's consumer connects to
# Postgres once at startup and, if it loses the cold-boot race (DB still
# initialising -> "the database system is starting up"), ConsumerApp.main
# throws and the consumer dies -- and the container does NOT exit, so
# restart: unless-stopped never fires. service_healthy avoids the race.
depends_on:
psql:
condition: service_healthy
kafka:
condition: service_started
# No healthcheck — the consumer exposes no health port; Docker's
# restart-on-exit covers process death.
image: openbmp/psql-app:2.2.2
# mem_limit must exceed the MEM (JVM heap) env below. Raise both for
# production — see docs/production-sizing.md.
mem_limit: ${PSQL_APP_MEM_LIMIT:-4g}
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -197,25 +259,60 @@ services:
- POSTGRES_DROP_stats_peer_rib='4 weeks'
- POSTGRES_DROP_stats_peer_update_counts='4 weeks'
# Consumer-only psql-app replica -- horizontal ingestion scale-out.
# Profile-gated; bring up on demand (the host needs spare CPU+RAM for it):
# docker compose --profile scale-out up -d --scale psql-app-consumer=2
# It shares the primary's /config (read-only) so it reuses obmp-psql.yml,
# whose fixed group.id "obmp-psql-consumer" makes Kafka rebalance partitions
# across the primary and every replica. The command runs ONLY the consumer
# jar -- no cron, no RPKI/IRR/DBIP, no initdb -- so a replica does NOT
# duplicate the primary's DB-maintenance jobs (update_global_ip_rib,
# update_chg_stats, retention, ...), which config_cron wires up
# unconditionally in /usr/sbin/run. Each replica brings its own consumer
# AND writer threads, so it adds real write throughput (the primary's
# writer_max_threads_per_type is 1).
psql-app-consumer:
profiles: ["scale-out"]
restart: unless-stopped
image: openbmp/psql-app:2.2.2
mem_limit: ${PSQL_APP_CONSUMER_MEM_LIMIT:-4g}
depends_on:
psql:
condition: service_healthy
kafka:
condition: service_started
volumes:
- ${OBMP_DATA_ROOT}/config:/config:ro
command: ["bash","-c","cd /var/log && exec java -Xmx3G -Xms128m -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:InitiatingHeapOccupancyPercent=30 -XX:G1MixedGCLiveThresholdPercent=30 -XX:MaxGCPauseMillis=200 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5 -XX:+ExitOnOutOfMemoryError -Duser.timezone=UTC -jar /usr/local/openbmp/obmp-psql-consumer.jar -cf /config/obmp-psql.yml"]
exabgp:
restart: unless-stopped
container_name: obmp-exabgp
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5050'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
profiles: ["test"]
# The full-table feature generates up to 900K route objects in memory;
# 512m OOM-killed it. Raise EXABGP_MEM_LIMIT in .env for larger tables.
mem_limit: ${EXABGP_MEM_LIMIT:-6g}
build:
context: ./exabgp
dockerfile: Dockerfile
# Host networking so ExaBGP can reach CML routers directly on port 179
network_mode: host
environment:
# IP on the host that CML routers can reach (matches Kafka external listener)
- EXABGP_LOCAL_IP=10.40.40.202
# ExaBGP presents as AS 65100 (eBGP peer to your AS 65020 lab)
- EXABGP_LOCAL_AS=65100
- EXABGP_PEER_AS=65020
# CORE routers to peer with — these propagate routes into the iBGP mesh
- EXABGP_PEER_1=10.100.0.100
- EXABGP_PEER_2=10.100.0.200
# IP on the host that CML routers reach (BGP peering source)
- EXABGP_LOCAL_IP=${HOST_IP:-10.40.40.202}
# ExaBGP presents as AS 65100 (eBGP peer to the lab route reflectors)
- EXABGP_LOCAL_AS=${EXABGP_LOCAL_AS:-65100}
# Peer list — ";"-separated entries of "ip:peer_as:description".
# Default covers both labs: AS 65020 (ESXi) and AS 65021 (Proxmox).
- EXABGP_PEERS=${EXABGP_PEERS:-10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02;10.100.1.100:65021:PROX-R9K-CORE-01;10.100.1.200:65021:PROX-R9K-CORE-02}
# Flask API port (also on host network)
- EXABGP_API_PORT=5050
- EXABGP_API_PORT=${EXABGP_API_PORT:-5050}
volumes:
# Mount scenarios dir so you can edit/add scenarios without rebuilding
- ./exabgp/scenarios:/exabgp/scenarios
@ -224,6 +321,14 @@ services:
exabgp-ui:
restart: unless-stopped
container_name: obmp-exabgp-ui
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:5001/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 256m
build:
context: ./exabgp-ui
dockerfile: Dockerfile
@ -231,10 +336,265 @@ services:
network_mode: host
# Serves on port 5001 (host network, defined in nginx.conf)
# --- Phase 4: gNMI Streaming Telemetry ---
influxdb:
restart: unless-stopped
container_name: obmp-influxdb
healthcheck:
test: ["CMD-SHELL", "curl -fsS http://localhost:8086/health || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
profiles: ["test"]
image: influxdb:2.7
mem_limit: 2g
ports:
- "8086:8086"
volumes:
- ${OBMP_DATA_ROOT}/influxdb:/var/lib/influxdb2
environment:
- DOCKER_INFLUXDB_INIT_MODE=setup
- DOCKER_INFLUXDB_INIT_USERNAME=openbmp
- DOCKER_INFLUXDB_INIT_PASSWORD=openbmp123
- DOCKER_INFLUXDB_INIT_ORG=openbmp
- DOCKER_INFLUXDB_INIT_BUCKET=telemetry
- DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=openbmp-telemetry-token
- DOCKER_INFLUXDB_INIT_RETENTION=30d
telegraf:
restart: unless-stopped
container_name: obmp-telegraf
profiles: ["test"]
mem_limit: 512m
build:
context: ./telegraf
dockerfile: Dockerfile
network_mode: host
# Run telegraf as root and override the image entrypoint (which otherwise
# drops back to the telegraf user) so [[inputs.docker]] can read the
# Docker daemon socket for container resource metrics.
user: root
entrypoint: ["telegraf"]
volumes:
- /var/run/docker.sock:/var/run/docker.sock
# Host root, read-only — lets [[inputs.disk]] report the real host
# filesystems (Postgres/Kafka/InfluxDB data) instead of the container's.
- /:/hostfs:ro
depends_on:
- influxdb
environment:
- INFLUXDB_TOKEN=openbmp-telemetry-token
# Point gopsutil-based inputs (disk) at the host filesystem mount above.
- HOST_MOUNT_PREFIX=/hostfs
- HOST_PROC=/hostfs/proc
- HOST_SYS=/hostfs/sys
- HOST_ETC=/hostfs/etc
# PostgreSQL credentials for [[inputs.postgresql_extensible]] (DB size).
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-openbmp}
# gNMI fleet — quoted, comma-separated host:port list. Default = the two
# ESXi CORE routers; extend via GNMI_ADDRESSES in .env for more routers.
- 'GNMI_ADDRESSES=${GNMI_ADDRESSES:-"10.100.0.100:57400", "10.100.0.200:57400"}'
- GNMI_USERNAME=${GNMI_USERNAME:-webui}
- GNMI_PASSWORD=${GNMI_PASSWORD:-cisco}
# --- Phase 4: Traffic Generator ---
traffic-gen:
restart: unless-stopped
container_name: obmp-traffic-gen
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5051'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 1g
build:
context: ./traffic-gen
dockerfile: Dockerfile
network_mode: host
cap_add:
- NET_RAW
- NET_ADMIN
environment:
- TRAFFIC_GEN_PORT=5051
- TRAFFIC_GEN_MODE=sender
- RESPONDER_URL=http://172.30.0.10:5053
traffic-gen-ui:
restart: unless-stopped
container_name: obmp-traffic-gen-ui
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:5002/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 256m
build:
context: ./traffic-gen-ui
dockerfile: Dockerfile
network_mode: host
# Serves on port 5002 (host network, defined in nginx.conf)
traffic-gen-responder:
restart: unless-stopped
container_name: obmp-traffic-gen-responder
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5053'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 1g
build:
context: ./traffic-gen
dockerfile: Dockerfile
cap_add:
- NET_RAW
- NET_ADMIN
environment:
- TRAFFIC_GEN_PORT=5053
- TRAFFIC_GEN_MODE=responder
- TRAFFIC_GEN_RESPONDER_MODE=echo
- TRAFFIC_GEN_INTERFACE=eth0
networks:
traffic-test-net:
ipv4_address: 172.30.0.10
ports:
- "5053:5053"
# GoBGP -- pulls the full real Internet routing table (roadmap E1) from the
# AS57355 lab route server and BMP-exports it to the OpenBMP collector, where
# it lands in PostgreSQL ip_rib as a monitored peer. Config + MRT fallback
# script live in ./gobgp (see gobgp/README.md). Receive-only, local AS 65001.
gobgp:
restart: unless-stopped
container_name: obmp-gobgp
image: jauderho/gobgp:v4.5.0
# Host networking: the daemon uses the host's real IPv4 + IPv6 stack, so
# both the v4 and v6 eBGP sessions to AS57355 source from the host's
# public addresses (no Docker IPv6/NAT plumbing). BMP still reaches the
# collector on $HOST_IP:5000 (its published port).
network_mode: host
depends_on:
- collector
# gobgpd.conf.tmpl (in git, with __HOST_IP__ placeholders) is rendered to
# gobgpd.conf (gitignored) by setup.sh using $HOST_IP from .env. The
# gobgp image is distroless (no shell), so we cannot render at container
# start -- the rendered .conf must exist on the host before `docker
# compose up`. The same /config mount also carries mrt-refresh.sh and
# cached MRT dumps.
volumes:
- ./gobgp:/config
command: ["gobgpd", "-f", "/config/gobgpd.conf", "-t", "toml"]
# GoBGP -- modular EVPN test-route injector (roadmap E5). Profile-gated, so
# it is NOT part of the normal stack. Originates synthetic BGP EVPN routes
# and BMP-exports them so the EVPN pipeline can be exercised. Start only for
# testing: docker compose --profile evpn-test up -d gobgp-evpn
# then: bash gobgp-evpn/inject-evpn.sh
gobgp-evpn:
restart: unless-stopped
container_name: obmp-gobgp-evpn
profiles: ["evpn-test"]
image: jauderho/gobgp:v4.5.0
depends_on:
- collector
# gobgpd.conf is rendered by setup.sh from gobgpd.conf.tmpl. See gobgp/
# service notes above.
volumes:
- ./gobgp-evpn:/config
command: ["gobgpd", "-f", "/config/gobgpd.conf", "-t", "toml"]
# EVPN consumer -- subscribes to the openbmp.parsed.evpn Kafka topic (which
# the collector already populates) and writes BGP EVPN routes into evpn_rib;
# the stock psql-app does not handle EVPN. Profile-gated alongside the EVPN
# test injector: docker compose --profile evpn-test up -d
evpn-consumer:
restart: unless-stopped
container_name: obmp-evpn-consumer
profiles: ["evpn-test"]
build:
context: ./obmp-evpn-consumer
depends_on:
- kafka
- psql
environment:
- KAFKA_BROKER=obmp-kafka:29092
- EVPN_TOPIC=openbmp.parsed.evpn
- PG_DSN=host=obmp-psql port=5432 dbname=openbmp user=openbmp password=${POSTGRES_PASSWORD:-openbmp}
# Per-router BGP policy-diff collector. Pulls post-policy accepted/advertised
# prefix counts and route-policy bindings from the IOS-XR routers over CLI +
# NETCONF (BMP on XRv9000 24.3.1 only carries pre-policy Adj-RIB-In). Feeds
# the Policy Diff dashboard. Host networking: it must reach the lab
# management network (10.100.0.x) and the published Postgres port.
rib-poller:
restart: unless-stopped
container_name: obmp-rib-poller
build:
context: ./obmp-rib-poller
network_mode: host
depends_on:
- psql
environment:
- PG_DSN=host=${HOST_IP:-10.40.40.202} port=5432 dbname=openbmp user=openbmp password=${POSTGRES_PASSWORD:-openbmp}
- POLL_INTERVAL=900
- ROUTER_USER=webui
- ROUTER_PASS=cisco
# Samples Kafka consumer-group lag into PostgreSQL every 30s for the Kafka
# Lag dashboard -- visibility into the ingestion path under load (e.g. a
# full-table BGP convergence storm) and a sanity check when scaling psql-app.
kafka-lag-monitor:
restart: unless-stopped
container_name: obmp-kafka-lag-monitor
build:
context: ./kafka-lag-monitor
depends_on:
- kafka
- psql
environment:
- KAFKA_BROKER=obmp-kafka:29092
- PG_DSN=host=obmp-psql port=5432 dbname=openbmp user=openbmp password=${POSTGRES_PASSWORD:-openbmp}
- LAG_POLL_INTERVAL=30
- CONSUMER_GROUPS=obmp-psql-consumer,evpn-psql
# Decoupled fast-path BGP churn monitor. Reads openbmp.parsed.unicast_prefix
# with its own consumer group and only counts announcements/withdrawals --
# stays real-time during a churn storm even while psql-app lags, because
# counting is far cheaper than the relational RIB write. Featherweight.
churn-monitor:
restart: unless-stopped
container_name: obmp-churn-monitor
build:
context: ./obmp-churn-monitor
depends_on:
- kafka
- psql
environment:
- KAFKA_BROKER=obmp-kafka:29092
- PG_DSN=host=obmp-psql port=5432 dbname=openbmp user=openbmp password=${POSTGRES_PASSWORD:-openbmp}
- CHURN_TOPIC=openbmp.parsed.unicast_prefix
- FLUSH_INTERVAL=10
whois:
restart: unless-stopped
container_name: obmp-whois
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/43'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
image: openbmp/whois:2.2.0
mem_limit: 1g
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -249,3 +609,40 @@ services:
- POSTGRES_DB=openbmp
- POSTGRES_HOST=obmp-psql
- POSTGRES_PORT=5432
authelia:
restart: unless-stopped
container_name: obmp-authelia
profiles: ["auth"]
mem_limit: 256m
image: authelia/authelia:4.38
ports:
- "9091:9091"
volumes:
- ${OBMP_DATA_ROOT}/authelia:/config
environment:
- TZ=UTC
portal:
restart: unless-stopped
container_name: obmp-portal
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:80/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 20s
profiles: ["auth"]
mem_limit: 128m
image: nginx:alpine
ports:
- "8080:80"
volumes:
- ./portal:/usr/share/nginx/html:ro
networks:
traffic-test-net:
driver: bridge
ipam:
config:
- subnet: 172.30.0.0/24

402
docs/ROADMAP.md Normal file

@ -0,0 +1,402 @@
# OpenBMP Platform Roadmap
## Context
This BMP monitoring platform is being developed against CML virtual labs (IOS-XR) and will be deployed into an ISP production network running IOS-XR and Juniper routers/route reflectors. The two tracks share a common foundation: configuration must be environment-agnostic so the same stack runs identically against virtual or production routers.
Currently, router IPs, AS numbers, and credentials are hardcoded across 8+ files, tightly coupling the stack to a single CML lab. This roadmap addresses both the multi-lab development workflow and production deployment.
---
## Track A: Configuration Centralization (Foundation for Both Tracks)
### A1. Create `inventory.yaml` — unified topology inventory
**File**: `inventory.yaml` (new)
Single source of truth for all environments. Structure:
```yaml
platform:
  host_ip: 10.40.40.202
  bmp_port: 5000
  exabgp_port: 5050
environments:
  cml-lab1:
    type: cml  # cml | production
    description: "CML RR cluster - 9 IOS-XR virtual routers"
    cml_server: "https://10.40.40.174"
    cml_user: webui
    bgp_as: 65020
    netconf: { user: webui, password: cisco, port: 830 }
    exabgp:
      local_as: 65100
      peers:
        - { ip: 10.100.0.100, name: CORE-01, peer_as: 65020 }
        - { ip: 10.100.0.200, name: CORE-02, peer_as: 65020 }
    routers:
      CORE-01: { mgmt: 10.100.0.100, loopback: 10.10.255.0, role: rr, vendor: iosxr, gnmi: true }
      CORE-02: { mgmt: 10.100.0.200, loopback: 10.10.255.20, role: rr, vendor: iosxr, gnmi: true }
      R9K-01: { mgmt: 10.100.0.1, loopback: 10.10.255.1, role: client, vendor: iosxr }
      # ...
  cml-lab2:
    type: cml
    description: "Second CML Lab (TBD topology)"
    cml_server: "https://<lab2-ip>"
    routers: {}
  production:
    type: production
    description: "ISP production network"
    bgp_as: <prod-as>
    netconf: { user: <prod-user>, port: 830 }
    routers:
      # IOS-XR and Juniper RRs + routers
      PROD-RR1: { mgmt: x.x.x.x, role: rr, vendor: iosxr, gnmi: true }
      PROD-RR2: { mgmt: x.x.x.x, role: rr, vendor: junos }
      # ...
```
Key design decisions:
- `vendor: iosxr | junos` — drives NETCONF dialect, gNMI paths, and config templates
- `type: cml | production` — CML environments have `cml_server` for API automation; production does not
- Credentials in `inventory.yaml` (gitignored) or pulled from env vars
### A2. Create `config_loader.py` — Python inventory helper
**File**: `config_loader.py` (new)
Functions: `get_env(name)`, `get_all_routers()`, `get_routers_by_vendor(vendor)`, `get_exabgp_peers()`, `get_gnmi_targets()`, `get_routers_for_env(env_name)`
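
A minimal sketch of how these helpers might look, assuming the inventory has already been parsed into a dict shaped like the A1 example (for instance with PyYAML's `yaml.safe_load`, which is not imported here). The `INVENTORY` literal, the extra `inventory`/`env_name` parameters, and the gNMI port default are illustrative assumptions, not the project's actual `config_loader.py`.

```python
# Hypothetical sketch of config_loader.py helpers. It operates on an
# already-parsed inventory dict; the real module would load inventory.yaml.
INVENTORY = {
    "environments": {
        "cml-lab1": {
            "type": "cml",
            "exabgp": {"peers": [
                {"ip": "10.100.0.100", "name": "CORE-01", "peer_as": 65020},
                {"ip": "10.100.0.200", "name": "CORE-02", "peer_as": 65020},
            ]},
            "routers": {
                "CORE-01": {"mgmt": "10.100.0.100", "role": "rr",
                            "vendor": "iosxr", "gnmi": True},
                "R9K-01": {"mgmt": "10.100.0.1", "role": "client",
                           "vendor": "iosxr"},
            },
        },
    },
}


def get_env(name, inventory=INVENTORY):
    """Return one environment's config; raises KeyError if it is unknown."""
    return inventory["environments"][name]


def get_routers_for_env(env_name, inventory=INVENTORY):
    """Return {router_name: attrs} for a single environment."""
    return dict(get_env(env_name, inventory).get("routers") or {})


def get_all_routers(inventory=INVENTORY):
    """Flatten all environments into router dicts tagged with env and name."""
    routers = []
    for env_name, env in inventory["environments"].items():
        for name, attrs in (env.get("routers") or {}).items():
            routers.append({"env": env_name, "name": name, **attrs})
    return routers


def get_routers_by_vendor(vendor, inventory=INVENTORY):
    """Return routers whose `vendor` field matches (e.g. 'iosxr', 'junos')."""
    return [r for r in get_all_routers(inventory) if r.get("vendor") == vendor]


def get_exabgp_peers(env_name, inventory=INVENTORY):
    """Return the ExaBGP peer list for one environment (empty if none defined)."""
    return list(get_env(env_name, inventory).get("exabgp", {}).get("peers", []))


def get_gnmi_targets(port=57400, inventory=INVENTORY):
    """host:port strings for routers flagged gnmi: true (port is an assumption)."""
    return [f"{r['mgmt']}:{port}" for r in get_all_routers(inventory) if r.get("gnmi")]
```

Passing the inventory as a parameter keeps the helpers testable without touching the filesystem; the A3 refactor would call them with the default.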
### A3. Refactor hardcoded Python scripts
Replace `ROUTERS` dicts/lists with `config_loader` calls:
- `exabgp/route_diversity_config.py` (line 47)
- `exabgp/bgpls_config.py` (line 35)
- `gnmi/gnmi_grpc_config.py` (line 25)
### A4. Expand `.env` and parameterize `docker-compose.yml`
Add to `.env`:
```env
OBMP_DATA_ROOT=/var/openbmp
DOCKER_HOST_IP=10.40.40.202
EXABGP_LOCAL_IP=10.40.40.202
EXABGP_LOCAL_AS=65100
EXABGP_PEER_AS=65020
EXABGP_PEER_1=10.100.0.100
EXABGP_PEER_2=10.100.0.200
```
Replace hardcoded IPs in `docker-compose.yml` (Kafka listener, ExaBGP env vars).
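
To make the fallback behaviour of parameterized values concrete, here is a small sketch that mimics how Compose resolves `${VAR:-default}` (as in the `${HOST_IP:-10.40.40.202}` listener in `docker-compose.yml`). It is an illustration only: the `interpolate` helper is hypothetical and handles just the `${VAR}` and `${VAR:-default}` forms, not the full Compose interpolation syntax.

```python
import re

# Matches ${NAME} and ${NAME:-default}; other Compose forms are not handled.
_VAR = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def interpolate(text, env):
    """Expand variables the way `:-` does: use the default when unset or empty."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value:
            return value
        # No default given: an unset variable expands to an empty string.
        return default if default is not None else ""
    return _VAR.sub(repl, text)
```

With an empty environment, `interpolate("PLAINTEXT_HOST://${HOST_IP:-10.40.40.202}:9092", {})` keeps the lab address, so a stack started before `.env` is populated still comes up with the old behaviour.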
### A5. Telegraf config parameterization
Replace hardcoded gNMI addresses in `telegraf/telegraf.conf` with env var substitution. Pass `GNMI_TARGETS` from docker-compose.yml.
### A6. Fix InfluxDB datasource URL
`obmp-grafana/provisioning/datasources/influxdb-ds.yml`: replace `http://10.40.40.202:8086` with `http://obmp-influxdb:8086`.
---
## Track B: Multi-Lab CML Development
### B1. Dynamic ExaBGP multi-peer support
**File**: `exabgp/startup.sh`
Accept `EXABGP_PEERS` env var (comma-separated `ip:as:description`), generate N neighbor blocks. Keep `PEER_1`/`PEER_2` fallback.
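
The parsing and block generation could look roughly like the sketch below. It is written in Python rather than shell purely for illustration, and `parse_peers` / `render_neighbors` are hypothetical names. The separator is a parameter because this roadmap entry says comma-separated while the default in `docker-compose.yml` uses `;`. The sketch assumes IPv4 peers, since an IPv6 address would collide with the `:` field delimiter.

```python
def parse_peers(spec, sep=";"):
    """Parse 'ip:peer_as:description' entries; blank entries are skipped."""
    peers = []
    for entry in spec.split(sep):
        entry = entry.strip()
        if not entry:
            continue
        # maxsplit=2 lets the description itself contain ':' characters.
        ip, peer_as, description = entry.split(":", 2)
        peers.append({"ip": ip, "peer_as": int(peer_as), "description": description})
    return peers


def render_neighbors(peers, local_ip, local_as):
    """Render one ExaBGP neighbor block per peer."""
    blocks = []
    for p in peers:
        blocks.append(
            f"neighbor {p['ip']} {{\n"
            f"    description \"{p['description']}\";\n"
            f"    router-id {local_ip};\n"
            f"    local-address {local_ip};\n"
            f"    local-as {local_as};\n"
            f"    peer-as {p['peer_as']};\n"
            f"}}\n"
        )
    return "\n".join(blocks)
```

The `PEER_1`/`PEER_2` fallback would then amount to building the same peer list from the two legacy variables when `EXABGP_PEERS` is empty.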
### B2. CML API client module
**File**: `cml/cml_client.py` (new)
Python module using `virl2_client` SDK:
- Connect to CML server (creds from `inventory.yaml`)
- Upload node/image definitions
- Import/export topology YAML
- Start/stop/destroy labs
- Get node status
### B3. Topology template system
**File**: `cml/templates/xrd_rr.j2` (new)
Jinja2 templates for XRd startup config. Parameterize: hostname, loopback, link IPs, IS-IS NET, BGP AS, neighbor IPs, BMP target.
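
As a rough illustration of the parameterization, the sketch below renders a trimmed RR config using the standard library's `string.Template` as a stand-in for Jinja2 (so it runs without extra packages). The template text is abbreviated from the RR3 startup config, and `render_xrd_rr` is a hypothetical helper, not the planned `xrd_rr.j2`.

```python
from string import Template

# string.Template has no loops, so neighbor stanzas are rendered in Python
# and substituted as one block; Jinja2 would use a {% for %} loop instead.
NEIGHBOR = Template(
    " neighbor $ip\n"
    "  remote-as $bgp_as\n"
    "  update-source Loopback0\n"
    "  address-family ipv4 unicast\n"
    "  !\n"
    " !\n"
)

CONFIG = Template(
    "hostname $hostname\n"
    "!\n"
    "interface Loopback0\n"
    " ipv4 address $loopback 255.255.255.255\n"
    "!\n"
    "router bgp $bgp_as\n"
    " bgp router-id $loopback\n"
    " address-family ipv4 unicast\n"
    " !\n"
    "$neighbors"
    "!\n"
    "bmp server 1\n"
    " host $bmp_host port $bmp_port\n"
    "!\n"
    "end\n"
)


def render_xrd_rr(hostname, loopback, bgp_as, neighbor_ips, bmp_host, bmp_port=5000):
    """Render a minimal route-reflector config from per-node parameters."""
    neighbors = "".join(
        NEIGHBOR.substitute(ip=ip, bgp_as=bgp_as) for ip in neighbor_ips
    )
    return CONFIG.substitute(
        hostname=hostname, loopback=loopback, bgp_as=bgp_as,
        neighbors=neighbors, bmp_host=bmp_host, bmp_port=bmp_port,
    )
```

Link IPs and the IS-IS NET would be added as further placeholders in the same way.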
### B4. CLI deployment tool
**File**: `cml/deploy.py` (new)
```bash
python3 cml/deploy.py --env cml-lab1 status
python3 cml/deploy.py --env cml-lab1 upload-images
python3 cml/deploy.py --env cml-lab2 create
python3 cml/deploy.py --env cml-lab2 start
python3 cml/deploy.py --env cml-lab2 destroy
```
### B5. Update build scripts with API push
`cml/build-cml-image.sh` and `cml/build-xrd-image.sh` get `--push <env-name>` flag.
---
## Track C: Production ISP Deployment
### C1. Multi-vendor NETCONF support
Current scripts assume IOS-XR NETCONF only. For Juniper RRs:
- `config_loader.py` provides `vendor` field per router
- NETCONF scripts branch on vendor for dialect differences (`device_params='iosxr'` vs `device_params='junos'`)
- Route diversity, BGP-LS config scripts get Junos templates alongside IOS-XR
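
One way the vendor branch might be expressed, assuming the scripts use `ncclient` (the roadmap does not name the library, but `device_params` is its term; ncclient takes it as a dict with a `name` key). `netconf_connect_kwargs` is a hypothetical helper that only builds the arguments, so it can be checked without a router.

```python
# Maps the inventory `vendor` field to an ncclient device handler name.
NCCLIENT_DEVICE = {"iosxr": "iosxr", "junos": "junos"}


def netconf_connect_kwargs(router, netconf):
    """Build keyword arguments for ncclient.manager.connect() for one router."""
    vendor = router["vendor"]
    if vendor not in NCCLIENT_DEVICE:
        raise ValueError(f"unsupported vendor: {vendor}")
    return {
        "host": router["mgmt"],
        "port": netconf.get("port", 830),
        "username": netconf["user"],
        "password": netconf.get("password"),
        # Lab convenience only; production should verify host keys.
        "hostkey_verify": False,
        "device_params": {"name": NCCLIENT_DEVICE[vendor]},
    }
```

A caller would then do something like `manager.connect(**netconf_connect_kwargs(router, env["netconf"]))` and pick the matching IOS-XR or Junos payload template.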
### C2. Multi-vendor gNMI paths
Telegraf gNMI subscriptions currently use OpenConfig paths which work for both IOS-XR and Junos, but:
- Verify Juniper gNMI support on target hardware
- Add vendor-specific path overrides in `inventory.yaml` if needed
- Telegraf can subscribe to multiple targets with different configs via `[[inputs.gnmi]]` blocks
### C3. BMP considerations for production
- BMP collector (port 5000) accepts connections from any router — no changes needed
- Production routers need BMP config pushed (manual or via NETCONF automation)
- Consider: separate BMP server IDs per environment for dashboard filtering
- Juniper BMP config differs from IOS-XR — add Junos BMP config templates
### C4. Dashboard multi-environment awareness
- Add a Grafana template variable for environment filtering (by router name prefix or a tag)
- Consider a "Network Overview" dashboard that shows all environments side-by-side
- Existing dashboards work as-is — router dropdowns will show all BMP-reporting routers
### C5. Security hardening for production
- Move credentials out of `inventory.yaml` into environment variables or a secrets manager
- Authelia config: stronger passwords, TOTP enforcement, session timeouts
- PostgreSQL: restrict access, enable SSL
- Kafka: consider authentication if exposed beyond localhost
- BMP port: firewall to only accept connections from known router management IPs
### C6. Scalability considerations
- Monitor PostgreSQL disk usage and query performance with production-scale RIBs
- TimescaleDB compression policies for historical data (ip_rib_log, ls_*_log)
- Kafka topic partitioning if message throughput is high
- Consider read replicas or materialized views for heavy Grafana queries
---
## Track D: Packaging & Distribution
### D1. Configuration templates
- `inventory.yaml.example` — documented example with placeholder values
- `.env.example` — all environment variables with descriptions
### D2. Bootstrap script
`setup.sh` that:
- Creates required directories (`$OBMP_DATA_ROOT/authelia`, etc.)
- Copies example configs if originals don't exist
- Validates inventory.yaml syntax
- Generates Telegraf config from inventory
### D3. Published Docker images
Push custom images to a registry (Docker Hub or GHCR):
- `obmp-exabgp`
- `obmp-exabgp-ui`
- `obmp-traffic-gen`
- `obmp-traffic-gen-ui`
- `obmp-portal`
Replace `build:` with `image:` in docker-compose.yml (keep build as override).
### D4. Documentation
- `docs/quickstart.md` — 5-minute setup guide
- `docs/adding-a-lab.md` — how to add a CML lab environment
- `docs/production-deployment.md` — production hardening checklist
- `docs/architecture.md` — system diagram, data flow, port map
---
## Track E: Internet-Scale Routing Analytics
Adds a local copy of the real global routing table, generalizes router
comparison to an N-way diff, and threads VRF/RD scoping through the
dashboards. The full-table feed (E1) is the foundation — E2/E3 consume it.
### E1. GoBGP full-table feed → BMP → `ip_rib`
**Files**: `docker-compose.yml` (new `gobgp` service), `gobgp/gobgpd.conf` (new), `gobgp/mrt-refresh.sh` (new)
Stand up a GoBGP container that obtains a full Internet table (IPv4 ~1M +
IPv6 ~200k) and BMP-exports it to the existing OpenBMP collector, so the
global table lands in `ip_rib` as an ordinary monitored peer — every
existing dashboard and the diff then work against it for free.
- **Primary feed** — eBGP multihop session to Łukasz Bromirski's lab route
server, **AS57355** (`85.232.240.179`, `2001:1a68:2c:2::179`). Local ASN
private (e.g. 65199); announce nothing; `ebgp-multihop` TTL ~64; receive-only.
- **BMP export** — GoBGP `[[bmp-servers]]` block at the collector (port 5000),
`route-monitoring-policy = pre-policy`.
- **Fallback / seed** — `gobgp/mrt-refresh.sh`, run every 2h (host cron or a
sidecar): download the latest RouteViews (`archive.routeviews.org`) or
RIPE-RIS MRT RIB dump and `gobgp mrt inject` it into the same instance.
- **Identification** — distinct BMP router name (e.g. `GLOBAL-FEED`) so
dashboards can include/exclude it.
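
For the MRT fallback, picking which dump to fetch might look like the sketch below. The URL layout (`bgpdata/YYYY.MM/RIBS/rib.YYYYMMDD.HHMM.bz2`), the even-hour 2-hourly cadence, and the one-hour publication lag are assumptions to verify against the archive listing; `latest_rib_url` is a hypothetical helper, not the contents of `mrt-refresh.sh`.

```python
from datetime import datetime, timedelta, timezone


def latest_rib_url(now, base="http://archive.routeviews.org/bgpdata", lag_hours=1):
    """URL of the newest 2-hourly RIB dump expected to exist at `now`.

    `now` must be timezone-aware. `lag_hours` leaves time for the dump to be
    published before it is requested.
    """
    t = now.astimezone(timezone.utc) - timedelta(hours=lag_hours)
    # Round down to the previous even hour (00:00, 02:00, ...).
    t = t.replace(hour=t.hour - t.hour % 2, minute=0, second=0, microsecond=0)
    return f"{base}/{t:%Y.%m}/RIBS/rib.{t:%Y%m%d.%H%M}.bz2"
```

The script would download that file, decompress it, and hand it to `gobgp mrt inject`; falling back to the previous slot on a 404 is a sensible extension.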
Caveats:
- The route server is a single volunteer-run host, no SLA — the MRT fallback
is the reliability backstop, not optional.
- A full table roughly triples `ip_rib` size — see E-scale below.
- The feed carries **no VRF/L3VPN** routes — global unicast only.
### E2. Generic multi-router diff dashboard
**File**: `obmp-grafana/dashboards/.../router_diff.json` (new, uid `router-diff`), generalized from `rr_locrib_diff.json`
Replace the hardwired RR1-vs-RR2 model with up to **4 selectable routers**:
- Template vars `router1`-`router4` (query type); `router1`/`router2` required,
`router3`/`router4` default to a "— none —" sentinel and their panels hide
when unset.
- **Presence matrix** — rows = prefixes, columns = selected routers, cell =
present / next-hop / origin-AS; the core view.
- **Divergence view** — table of prefixes where the selected routers disagree
(missing on some, or differing best-path attributes).
- Keep the per-prefix all-paths drill-down from the RR diff.
- The global feed (E1) is selectable as any of the 4 → "lab vs the real
Internet." The existing `rr-locrib-diff` stays as the RR-specific quick view.
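
The presence matrix and divergence view reduce to a simple set computation. The sketch below shows the logic in Python over in-memory RIBs; the dashboard itself would do this in SQL against `ip_rib`, and the function names and the `(next_hop, origin_as)` tuple shape are illustrative assumptions.

```python
def presence_matrix(ribs):
    """Build {prefix: {router: attrs or None}} from {router: {prefix: attrs}}.

    `attrs` is a hashable best-path summary, here (next_hop, origin_as);
    None marks a prefix that is absent on that router.
    """
    prefixes = sorted(set().union(*(rib.keys() for rib in ribs.values())))
    return {p: {r: ribs[r].get(p) for r in ribs} for p in prefixes}


def divergent(matrix):
    """Prefixes missing on some router or with differing attributes."""
    return [p for p, row in matrix.items() if len(set(row.values())) > 1]
```

Because absent entries are `None`, "missing on some" and "differing best-path attributes" fall out of the same distinct-value test, which maps directly onto a `COUNT(DISTINCT ...)` style query.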
### E3. Global table exploration dashboard
**File**: `obmp-grafana/dashboards/.../global_table.json` (new)
Explorable dashboard over the `GLOBAL-FEED` peer: prefix count by AFI,
origin-AS distribution, prefix-length histogram, search by prefix/AS,
more-/less-specific lookups. Doubles as the comparison baseline for E2.
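
The prefix-length histogram and the more-/less-specific lookups can be sketched with the standard `ipaddress` module. This is an in-memory illustration with hypothetical function names; against PostgreSQL the same containment tests are available as the `<<` and `>>` operators on `inet`/`cidr` columns.

```python
import ipaddress
from collections import Counter


def length_histogram(prefixes):
    """Count prefixes per prefix length (IPv4 and IPv6 mixed together)."""
    return Counter(ipaddress.ip_network(p).prefixlen for p in prefixes)


def more_specifics(prefixes, covering):
    """Prefixes strictly contained in `covering`."""
    cov = ipaddress.ip_network(covering)
    nets = (ipaddress.ip_network(p) for p in prefixes)
    # The version check comes first because subnet_of() raises on mixed families.
    return [str(n) for n in nets
            if n.version == cov.version and n != cov and n.subnet_of(cov)]


def less_specifics(prefixes, target):
    """Prefixes that strictly contain `target`."""
    tgt = ipaddress.ip_network(target)
    nets = (ipaddress.ip_network(p) for p in prefixes)
    return [str(n) for n in nets
            if n.version == tgt.version and n != tgt and n.supernet_of(tgt)]
```

For a real dashboard the histogram would usually be split per AFI, since IPv4 and IPv6 lengths are not comparable.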
### E4. VRF / RD awareness
**Files**: existing unicast + L3VPN dashboards
Thread a Route-Distinguisher / VRF scoping dimension through the dashboards:
- Add a `vrf` / `rd` template variable to the L3VPN dashboards and unicast
dashboards where applicable.
- VRF/RD columns and filters on RIB tables.
- The diff (E2) gains a per-VRF scope.
Constraint (stated plainly): CML IOS-XR images can't originate L3VPN routes
and the global feed carries none — so E4 is **built to the L3VPN schema and
unverifiable in this lab**; it validates only against production routers.
Keep E4 scope minimal until there's a real L3VPN source.
### E5. L2VPN / EVPN support — platform-level, not a dashboard task
L2VPN/EVPN was requested alongside L3VPN. **It cannot be done as a dashboard
change.** Research findings on where the gap actually is:
- **Collector** (`openbmp/collector`) — *already decodes EVPN*. It has an
`EVPN.cpp` parser and emits a parsed `openbmp.parsed.evpn` Kafka topic
(RD, ESI, MAC, ethernet-tag, IP, labels, route-targets). No work needed.
- **psql-app** (`openbmp/psql-app`) — **drops it**. It never subscribes to
`openbmp.parsed.evpn`, has no `EvpnQuery` handler, and the PostgreSQL
schema has no EVPN table. This is the whole gap.
- **L2VPN-VPLS (AFI 25, SAFI 65)** — not supported anywhere; only EVPN (AFI 25, SAFI 70).
Two viable paths:
1. **Fork the psql-app** (Java): subscribe to the evpn topic, add an
`EvpnQuery` class, add an `evpn_rib` table + history/stats. Keeps one
unified schema; cost is owning a Java fork of a slow-moving upstream and
inheriting the collector's older EVPN parser (likely no RFC 9251/9572
route types).
2. **Run GoBMP** (`sbezverk/gobmp`, Go) as a second collector — strongest,
most current EVPN decoding — plus a thin Kafka→Postgres consumer landing
an `evpn_rib` table. Less code than the Java fork, but two collectors and
two ingest paths.
Recommended: path 2 for fastest EVPN visibility; path 1 if a single unified
OpenBMP schema outweighs the extra effort. Either way, then build EVPN
dashboards (per-EVI, MAC mobility, RT scoping).
**Status — lab-testable scope DONE (path 1, type-2/3):**
- `evpn_rib` table — `postgres/scripts/007_obmp_evpn.sql`.
- `gobgp-evpn` — profile-gated synthetic EVPN injector (`evpn-test` profile).
- `obmp-evpn-consumer` — standalone Python consumer, `openbmp.parsed.evpn` →
`evpn_rib` (the gap path 1 describes, done without forking the Java
psql-app — a small isolated container instead).
- `EVPN RIB` Grafana dashboard (OBMP-L3VPN folder).
- Verified end to end with synthetic type-2/type-3 routes.
**Known limitation:** collector 2.2.3 **mis-decodes EVPN type-5** (IP-prefix)
— the prefix corrupts the RD field — so type-5 is not ingested. Full type-5
support still needs path 2 (GoBMP) or a newer/fixed collector. Real EVPN
(vs the synthetic injector) also needs an EVPN-capable BMP source — the CML
IOS-XR lab has none.
### E-scale. PostgreSQL sizing for a full table
A full v4+v6 table is ~1.2M prefixes; with attributes and history this is a
multi-GB addition to `ip_rib` / `ip_rib_log`. Before enabling E1 continuously:
confirm disk headroom on `$OBMP_DATA_ROOT`, apply TimescaleDB compression to
`ip_rib_log` (also flagged in C6). The `mv_as_adjacency` materialized view
(already in place — `postgres/scripts/006_obmp_matviews.sql`) becomes far
more valuable once real-Internet AS paths are present.
---
## Implementation Order
| Priority | Step | Track | Description |
|----------|------|-------|-------------|
| 1 | A1 | Foundation | Create `inventory.yaml` |
| 2 | A2 | Foundation | Create `config_loader.py` |
| 3 | A3 | Foundation | Refactor hardcoded Python scripts |
| 4 | A4 | Foundation | Parameterize `.env` + docker-compose |
| 5 | A5-A6 | Foundation | Telegraf + InfluxDB datasource fixes |
| 6 | B1 | CML Dev | Dynamic ExaBGP multi-peer |
| 7 | B2-B4 | CML Dev | CML API client + deploy CLI |
| 8 | C1 | Production | Multi-vendor NETCONF (Junos support) |
| 9 | C3 | Production | Junos BMP config templates |
| 10 | C5 | Production | Security hardening |
| 11 | D1-D2 | Packaging | Config templates + bootstrap script |
| 12 | D3 | Packaging | Publish Docker images to registry |
| 13 | D4 | Packaging | Documentation |
| 14 | E1 | Analytics | GoBGP full-table feed (AS57355 live + MRT fallback) |
| 15 | E2 | Analytics | Generic 4-router diff dashboard |
| 16 | E3 | Analytics | Global table exploration dashboard |
| 17 | E4 | Analytics | VRF/RD scoping for L3VPN (schema-level only, lab-unverifiable) |
| 18 | E5 | Platform | L2VPN/EVPN support — research spike, then collector/schema work |
Steps 1-5 (Track A) unblock everything else. Steps 6-7 and 8-10 can proceed in parallel once the foundation is in place. Track E is independent of A-D: E1 is the foundation for E2/E3; E4 can proceed any time but is lab-unverifiable.
---
## Verification
1. **Config centralization**: Change a router IP in `inventory.yaml`, verify all scripts pick it up
2. **ExaBGP multi-peer**: Set 3+ peers, restart, verify BGP sessions establish
3. **CML API**: `deploy.py --env cml-lab1 status` connects and lists nodes
4. **BMP multi-source**: Router from lab 2 sends BMP, appears in `SELECT * FROM routers` and Grafana
5. **Junos support**: NETCONF script connects to a Juniper router, pushes config
6. **Production dry-run**: Point a test router from the ISP network at the collector, verify end-to-end
7. **Clean deploy**: Clone repo on a fresh host, run `setup.sh`, `docker compose up`, confirm stack starts
---
## Risks
- **Router name collisions**: Enforce unique hostnames across all environments
- **Address space overlap**: Each environment needs distinct management subnets
- **Juniper BMP differences**: Junos BMP implementation may differ in supported tables/TLVs — test early
- **Production scale**: 500K-route labs are slow; production full tables will stress PostgreSQL more
- **Credentials in inventory**: Must be gitignored; consider env var fallback for CI/CD
- **Volunteer route server (E1)**: the AS57355 full-table feed has no SLA and can flap or be retired — the 2-hourly MRT fallback is mandatory, not optional
- **Full-table DB growth (E1)**: a live global feed roughly triples `ip_rib`; size disk and enable `ip_rib_log` compression before turning it on continuously
- **VRF work unverifiable (E4)**: no L3VPN source in the CML lab — E4 ships to schema correctness only, validated later against production

223
docs/backup-restore.md Normal file

@ -0,0 +1,223 @@
# OpenBMP Backup & Restore
How to back up and restore the OpenBMP PostgreSQL database, what the backup
covers, and what it deliberately does not.
---
## What `scripts/pg-backup.sh` backs up
The script runs `pg_dump` inside the `obmp-psql` container and produces a
single timestamped, compressed, custom-format dump of the **entire `openbmp`
database**:
- All BMP/BGP operational tables — `routers`, `bgp_peers`, `ip_rib`,
`base_attrs`, `global_ip_rib`, `l3vpn_rib`, the `ls_*` link-state tables.
- All history / TimescaleDB hypertables — `ip_rib_log`, `peer_event_log`,
`stat_reports`, and the `stats_*` aggregate tables.
- Reference / enrichment data — `geo_ip`, `info_asn`, `info_route`,
`rpki_validator`, `pdb_exchange_peers`.
- Schema objects — table definitions, indexes, views, functions, triggers,
enum types, and the TimescaleDB hypertable configuration.
The dump is taken against a **live database**: `pg_dump` uses an MVCC
snapshot, so no downtime and no service stop is required. It is written
atomically (to a `.partial` file, renamed on success) so an interrupted run
never leaves a dump that looks valid but is truncated.
Output: `${OBMP_DATA_ROOT:-/var/openbmp}/backups/openbmp-YYYYMMDD-HHMMSS.dump`
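The write-then-rename behavior described above can be illustrated with a short Python sketch. This is not the code in `scripts/pg-backup.sh`; the helper name `write_atomically` and the choice of Python are assumptions made only to show the pattern.

```python
import os
import tempfile


def write_atomically(final_path: str, data: bytes) -> None:
    """Write data so that final_path is either absent or complete.

    Hypothetical helper: it mirrors the '.partial, then rename' idea and is
    not the actual implementation in scripts/pg-backup.sh.
    """
    partial_path = final_path + ".partial"
    with open(partial_path, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before renaming
    # os.replace is atomic when both paths live on the same filesystem
    os.replace(partial_path, final_path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "openbmp-20260101-023000.dump")
    write_atomically(target, b"example dump bytes")
    print(os.listdir(tmp))
```

If the process dies before `os.replace`, only the `.partial` file is left behind, so anything that looks for `openbmp-*.dump` never picks up a truncated dump.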
### TimescaleDB note
The OpenBMP database uses TimescaleDB hypertables (`ip_rib_log`,
`peer_event_log`, the `stats_*` tables, with compression policies).
**A `pg_dump` logical backup restores hypertables correctly** — the dump
captures the `_timescaledb_catalog` metadata, and on restore the hypertable
structure, chunks, and compression settings are recreated. No special flags
are needed for the dump. The only requirement is that the **restore target
has the TimescaleDB extension available** — which the `openbmp/postgres`
image provides, so restoring into a fresh `obmp-psql` works out of the box.
---
## Scheduling
Make the script executable once:
```bash
chmod +x scripts/pg-backup.sh
```
Add a cron entry (`crontab -e`) — daily at 02:30, logging to a file:
```cron
30 2 * * * OBMP_DATA_ROOT=/var/openbmp /home/user/obmp-docker/scripts/pg-backup.sh >> /var/openbmp/backups/pg-backup.log 2>&1
```
The cron user must be able to reach the Docker daemon — run it as a user in
the `docker` group, or as root. A systemd timer is an equally valid
alternative.
### Configuration
All settings are environment variables with sensible defaults:
| Variable | Default | Purpose |
|----------|---------|---------|
| `OBMP_DATA_ROOT` | `/var/openbmp` | Base data dir; backups go to `${OBMP_DATA_ROOT}/backups` |
| `OBMP_BACKUP_DIR` | (unset) | Explicit backup dir, overrides the default |
| `OBMP_PG_CONTAINER` | `obmp-psql` | Postgres container name |
| `OBMP_PG_DB` | `openbmp` | Database name |
| `OBMP_PG_USER` | `openbmp` | Database user |
| `OBMP_BACKUP_RETENTION_DAYS` | `14` | Dumps older than this are pruned each run |
Retention only prunes files matching the script's own `openbmp-*.dump`
naming pattern — nothing else in the directory is touched.
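The retention rule can be sketched as a pure function. This is an illustration of the rule as described here, not the script's pruning code; `dumps_to_prune` and its arguments are hypothetical names.

```python
import fnmatch
from datetime import datetime, timedelta


def dumps_to_prune(files, now, retention_days=14):
    """Return the dump names older than the retention window.

    files maps file name -> last-modified datetime. Only names matching
    'openbmp-*.dump' are ever considered; anything else in the directory
    is left alone. Hypothetical sketch of the rule described above.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(
        name
        for name, mtime in files.items()
        if fnmatch.fnmatchcase(name, "openbmp-*.dump") and mtime < cutoff
    )


listing = {
    "openbmp-20260501-023000.dump": datetime(2026, 5, 1, 2, 30),
    "openbmp-20260520-023000.dump": datetime(2026, 5, 20, 2, 30),
    "pg-backup.log": datetime(2026, 1, 1),  # old, but not a dump: never pruned
}
print(dumps_to_prune(listing, now=datetime(2026, 5, 24, 2, 30)))
```

With the default 14-day window only the 1 May dump is selected; the log file is ignored regardless of its age because it does not match the naming pattern.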
### Production recommendations
- **Copy dumps off-host.** A local backup does not survive host loss. Sync
the backup directory to object storage / a backup server (e.g. nightly
`rclone`, `restic`, or your existing ISP backup tooling).
- **Size the backup volume.** At production scale (~100–150M NLRIs) the
dump can be tens of GB even compressed. See `docs/production-sizing.md`.
- **Test restores periodically** — an untested backup is not a backup.
- For tighter RPO than once-daily logical dumps, consider PostgreSQL
continuous archiving / PITR (WAL archiving + `pg_basebackup`). That is out
of scope for this script but worth planning for a production deployment.
---
## Restore procedure
This restores a dump into a **fresh, empty** `obmp-psql` database. Restoring
over a populated database risks conflicts — start clean.
### 1. Stop the writers
Stop the services that write to the database so nothing races the restore:
```bash
docker compose -p obmp stop psql-app collector
```
Leave `obmp-psql` running.
### 2. Recreate an empty database
Drop and recreate the `openbmp` database inside the running container:
```bash
docker exec -i obmp-psql psql -U openbmp -d postgres <<'EOSQL'
DROP DATABASE IF EXISTS openbmp;
CREATE DATABASE openbmp OWNER openbmp;
EOSQL
```
> Restoring into a **brand-new container**? Bring `obmp-psql` up first and let
> it initialize, but **do not** create the `config/init_db` trigger file —
> the schema comes from the dump, not from psql-app's first-run migration.
### 3. Restore the dump
Copy the dump into the container and run `pg_restore`:
```bash
DUMP=/var/openbmp/backups/openbmp-YYYYMMDD-HHMMSS.dump
docker cp "${DUMP}" obmp-psql:/tmp/restore.dump
docker exec -i obmp-psql \
pg_restore -U openbmp -d openbmp --no-owner --no-privileges \
--jobs=4 /tmp/restore.dump
docker exec obmp-psql rm -f /tmp/restore.dump
```
- `--no-owner --no-privileges` — the dump was created with the same flags;
objects are recreated owned by the connecting role.
- `--jobs=4` — parallel restore; raise it on a many-core host to speed up the
large `ip_rib` / `ip_rib_log` tables. Custom-format dumps support this.
- Some non-fatal warnings (e.g. about the TimescaleDB extension or existing
objects) are normal. A non-zero exit with only warnings is usually fine —
inspect the output before assuming failure.
Alternatively, stream the restore without `docker cp`:
```bash
docker exec -i obmp-psql pg_restore -U openbmp -d openbmp \
--no-owner --no-privileges < "${DUMP}"
```
(Streaming via stdin disables `--jobs` parallelism — use `docker cp` for
large dumps.)
### 4. Verify
```bash
docker exec -i obmp-psql psql -U openbmp -d openbmp -c "
SELECT (SELECT count(*) FROM routers) AS routers,
(SELECT count(*) FROM bgp_peers) AS peers,
(SELECT count(*) FROM ip_rib) AS rib_rows;"
```
Confirm hypertables came back:
```bash
docker exec -i obmp-psql psql -U openbmp -d openbmp -c "
SELECT hypertable_name FROM timescaledb_information.hypertables;"
```
### 5. Restart the writers
```bash
docker compose -p obmp start collector psql-app
```
The collector reconnects to the routers' BMP sessions and psql-app resumes
consuming from Kafka. Live state catches up from the routers.
---
## What is NOT covered
This backup is **PostgreSQL only**. The following are out of scope and need
their own handling:
- **Kafka data is transient.** The `obmp-kafka` topics are a short-retention
pipeline buffer (`KAFKA_LOG_RETENTION_MINUTES: 720` — 12 hours). They are
not a system of record and do not need backing up. After a restore, routers
re-send BMP and the pipeline refills naturally.
- **InfluxDB telemetry has its own backup.** The gNMI streaming-telemetry
data lives in `obmp-influxdb` (bucket `telemetry`), not in PostgreSQL.
`pg_dump` does not touch it. Back it up separately with the Influx CLI:
```bash
# Backup
docker exec obmp-influxdb influx backup /var/lib/influxdb2/backup \
--token "$INFLUXDB_ADMIN_TOKEN"
docker cp obmp-influxdb:/var/lib/influxdb2/backup \
/var/openbmp/backups/influxdb-$(date +%Y%m%d)
# Restore
docker cp /var/openbmp/backups/influxdb-YYYYMMDD \
obmp-influxdb:/var/lib/influxdb2/restore
docker exec obmp-influxdb influx restore /var/lib/influxdb2/restore \
--token "$INFLUXDB_ADMIN_TOKEN"
```
Telemetry is also less critical than BMP data (30-day retention,
data-plane counters) — back it up if you need historical telemetry to
survive a host loss; otherwise the 30-day window simply re-fills.
- **Grafana** — dashboards and datasources are provisioned from files in the
repo (`obmp-grafana/provisioning/` and `obmp-grafana/dashboards/`), so they
are already version-controlled in git. The Grafana database under
`${OBMP_DATA_ROOT}/grafana` (users, preferences, manually-created
dashboards, alert state) is *not* covered by this script — back up that
directory separately if it holds anything not reproducible from the repo.
- **Configuration & secrets**: `.env`, `docker-compose.yml`, and the
`${OBMP_DATA_ROOT}/config` directory. Keep these in version control /
your secrets manager.

146
docs/production-sizing.md Normal file

@ -0,0 +1,146 @@
# OpenBMP Production Sizing — 40 Full-Table-Edge Routers
Sizing guidance for deploying the OpenBMP stack against a production ISP
network of **40 full-table-edge routers** with gNMI streaming telemetry.
Derived from the OpenBMP `psql-app` sizing guidance and measured lab behavior.
## Workload assumptions
| Parameter | Value |
|-----------|-------|
| Monitored routers | 40, full-table edge |
| BMP RIB scope | Adj-RIB-In (see recommendation below) |
| Full feeds per router | ~2–3 eBGP peers carrying the full DFZ |
| Routes per full feed | ~1.2M (≈1M IPv4 + ~0.2M IPv6) |
| **Estimated total NLRIs** | **~100–150M** in Adj-RIB-In |
| Telemetry | gNMI via Telegraf → InfluxDB, ~50–200 interfaces/router, 10 s interval |
| History retention | `ip_rib_log` 2 months, LS logs 8 weeks, `peer_event_log` 4 months (lab policy defaults; tunable) |
The NLRI estimate (40 × ~2.5 feeds × 1.2M) places this deployment at the top
of the OpenBMP `psql-app` guidance tier (150M NLRIs → 64 GB heap).
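The estimate can be worked through in a few lines of Python. The two heap guidance points (4 GB at 10M NLRIs, 64 GB at 150M NLRIs) are the ones quoted in this document; interpolating linearly between them is an assumption made for illustration, and the function names are hypothetical.

```python
def estimate_nlris(routers, feeds_per_router, routes_per_feed):
    """Total Adj-RIB-In NLRIs: routers x full feeds x routes per feed."""
    return round(routers * feeds_per_router * routes_per_feed)


def interpolate_heap_gb(nlris):
    """Rough psql-app heap estimate between the two quoted guidance points.

    Assumption: heap scales linearly between 4 GB @ 10M and 64 GB @ 150M.
    Treat the result as a starting point, not a guarantee.
    """
    lo_n, lo_gb = 10_000_000, 4
    hi_n, hi_gb = 150_000_000, 64
    frac = (nlris - lo_n) / (hi_n - lo_n)
    return lo_gb + frac * (hi_gb - lo_gb)


for feeds in (2, 2.5, 3):
    total = estimate_nlris(40, feeds, 1_200_000)
    print(f"{feeds} feeds/router -> {total / 1e6:.0f}M NLRIs, "
          f"~{interpolate_heap_gb(total):.0f} GB heap")
```

Running the loop for 2, 2.5, and 3 feeds per router shows how sensitive the heap target is to the number of full feeds each edge router actually carries.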
## Measured data point (lab, 2026)
Real numbers from the lab after adding **one** full-table feed (GoBGP →
AS57355, ~1.04M IPv4 + ~0.25M IPv6 routes):
| Metric | Before feed | After 1 full feed |
|--------|-------------|-------------------|
| `openbmp` DB size | ~25 GB | **~30 GB** |
| `ip_rib` (current state) | small | 5.3 GB |
| `ip_rib_log` (history hypertable) | — | 7.75 GB, 82/97 chunks compressed |
| `base_attrs` | ~1 GB | 2.3 GB |
| `geo_ip` (fixed reference data) | 8.8 GB | 8.8 GB |
So **one full feed ≈ +5 GB current-state**, plus history that accrues against
the 2-month `ip_rib_log` retention. The ~1.3M-route initial dump ingested in
minutes with no Kafka consumer lag. Extrapolating linearly, 40 routers × ~2.5
feeds ≈ 100 feed-equivalents → on the order of **0.5 TB current state** before
history and indexes; the 2–4 TB storage target below holds with headroom.
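The linear extrapolation is simple enough to keep as a small helper and re-run once production numbers are measured. The ~5 GB per feed figure is the lab measurement above; the function name is hypothetical, and linear scaling is an assumption.

```python
def current_state_gb(routers, feeds_per_router, gb_per_feed=5.0):
    """Extrapolate current-state size from the measured per-feed cost.

    Assumption: size grows linearly with the number of full feeds. History
    (ip_rib_log) and fixed reference data are deliberately excluded.
    """
    feed_equivalents = routers * feeds_per_router
    return feed_equivalents * gb_per_feed


print(current_state_gb(40, 2.5))  # 100 feed-equivalents at ~5 GB each
```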
## BMP RIB scope — recommendation
**Deploy with Adj-RIB-In only.** It is the OpenBMP default, is what every
dashboard is built on, and captures the highest-value data — what each peer
advertises. Alternatives and their cost:
- **Loc-RIB** — adds a full post-best-path converged table per router
(~40 × 1.2M ≈ +48M NLRIs). Add later, selectively, only where best-path
analysis is needed; verify the IOS-XR release supports Loc-RIB BMP.
- **Adj-RIB-Out** — multiplies further (per advertised peer). Not recommended
for the initial deployment.
- **Post-policy Adj-RIB-In** — if inbound policy is restrictive this trims
volume meaningfully; with permissive import it is similar to pre-policy.
## Compute & memory
| Component | Lab today | Production target | Rationale |
|-----------|-----------|-------------------|-----------|
| **Total RAM** | 31 GB | **96–128 GB** | psql-app heap 48–64 GB + PostgreSQL shared_buffers/cache + Kafka 4–8 GB + InfluxDB + Grafana + collector |
| **CPU** | 8 cores | **16–32 vCPU** | PostgreSQL is CPU-bound under full-table churn; lab psql already sustains ~287% (3 cores) at 18 routers |
| `psql-app` JVM heap (`MEM`) | 3 GB | **48–64 GB** | OpenBMP guidance: 4 GB ≈ 10M NLRIs, 64 GB ≈ 150M NLRIs |
| `psql-app` container `mem_limit` | 4 GB | **heap + ~8 GB** | Set `PSQL_APP_MEM_LIMIT` above the JVM heap |
| `psql` container `mem_limit` | 6 GB | **48–64 GB** | Set `PSQL_MEM_LIMIT`; PostgreSQL wants ~25% as `shared_buffers` and the rest for OS cache |
| `kafka` container `mem_limit` | 4 GB | **8–12 GB** | Set `KAFKA_MEM_LIMIT`; full-table initial dumps from 40 routers are bursty |
## Storage
| Store | Lab today | Production target | Notes |
|-------|-----------|-------------------|-------|
| **PostgreSQL** | 30 GB | **2–4 TB NVMe SSD** | `ip_rib` current state (~100–150M rows) + `ip_rib_log` history (2-month retention, the dominant grower) + `base_attrs` + `geo_ip` (~9 GB fixed). OpenBMP guidance: 500 GB main + 1 TB TimescaleDB; add headroom. |
| **Kafka** | 0.2 GB | **100–500 GB** | 12 h retention; sized for full-table initial-dump bursts × 40 routers |
| **InfluxDB (telemetry)** | minimal | **50–200 GB** | 40 routers × ~50–200 interfaces × 10 s gNMI × 30 d; compresses well |
| **Total** | — | **~3–5 TB fast NVMe** | Use NVMe; PostgreSQL random-IO under churn is the bottleneck on slow disks |
Put the PostgreSQL data directory and the TimescaleDB tablespace on NVMe.
`ip_rib_log` retention (2 months in the lab) is the main storage tuning knob
— revisit once production update volume is measured; halving it roughly
halves the dominant history table.
## Architecture
A single host is viable only if large (**≥128 GB RAM, ≥32 vCPU, multi-TB
NVMe**). **Preferred: split services across hosts:**
| Host | Services | Profile |
|------|----------|---------|
| **DB host** (heaviest) | postgres | — |
| **Pipeline host** | kafka, zookeeper, collector, psql-app | core |
| **Presentation host** | grafana, influxdb, telegraf, whois | core + telemetry |
Whichever layout: every service already carries a Compose `mem_limit` — raise
`PSQL_MEM_LIMIT` / `PSQL_APP_MEM_LIMIT` / `KAFKA_MEM_LIMIT` in `.env` for the
production hosts.
## Horizontal scaling — where it actually helps
The ingestion bottleneck is **not** the collector or Kafka — it is the
`psql-app` consumer writing to PostgreSQL, and ultimately **disk IOPS**.
Plan scaling accordingly:
- **Scale `psql-app` as a Kafka consumer group.** Run multiple `psql-app`
containers with the **same group ID**; Kafka rebalances partitions across
them and fails over automatically. This is the real throughput lever and
also provides HA. **Hard cap = Kafka partition count** — the compose sets
`KAFKA_NUM_PARTITIONS: 8`, so ≤ 8 useful instances. **Raise the partition
count before scaling past a few consumers** — it cannot easily be reduced
later.
- **Disk IOPS is the named bottleneck.** Target **≥ 5000 IOPS** (NVMe) for
the PostgreSQL store; this buys more headroom than any container count.
- **Multiple collectors are an HA / locality decision, not a throughput
one.** A BMP session is one stateful TCP connection and cannot be load
balanced — you distribute routers by pointing each router's `bmp server`
config at a specific collector. All collectors feed one Kafka. Shard
collectors for fault isolation / POP locality, not for performance, and
note a dead collector's routers go dark until reconfigured (no
auto-failover at the collector tier).
- Within one `psql-app`, writer threads already auto-scale per type
(`writer_max_threads_per_type`); the consumer-group is the across-instance
layer on top.
Bursts (every collector restart triggers simultaneous full-table dumps from
all peers) are absorbed by Kafka — size Kafka retention so a slow consumer
never loses data during a convergence storm.
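The partition cap on consumer-group scaling can be seen in a toy model. This is a simplified round-robin illustration, not Kafka's actual assignor, and the instance names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin.

    Simplified illustration only: Kafka's real assignors differ in detail,
    but the cap is the same, since a partition is owned by at most one
    consumer in a group.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def idle_consumers(assignment):
    """Consumers that received no partition and therefore do no work."""
    return [c for c, parts in assignment.items() if not parts]


instances = [f"psql-app-{i}" for i in range(10)]
layout = assign_partitions(8, instances)  # 8 = KAFKA_NUM_PARTITIONS
print(idle_consumers(layout))
```

With 8 partitions and 10 instances, the last two instances own nothing: they add standby capacity for failover but no throughput.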
## PostgreSQL tuning
- `shared_buffers` ≈ 25% of host RAM; large `effective_cache_size`.
- Raise `work_mem` (dashboard aggregate queries) and `maintenance_work_mem`.
- `max_wal_size` already 10 GB — keep or raise for churn bursts.
- Enable parallel query (`max_parallel_workers_per_gather`).
- Aggressive autovacuum on churn tables (`ip_rib`, `base_attrs`, `ip_rib_log`)
— applied in the lab; persist these settings in production provisioning.
- TimescaleDB compression is already enabled on `ip_rib_log` and the `stats_*`
hypertables — keep it.
## Reference bill of materials (single-host option)
| Resource | Spec |
|----------|------|
| CPU | 32 vCPU |
| RAM | 128 GB |
| Storage | 4 TB NVMe SSD |
| Network | 1 GbE+ to the routers' BMP source network |
For the split-host option, divide per the architecture table — the DB host
takes the bulk of RAM and all of the fast storage.

488
docs/security-hardening.md Normal file

@ -0,0 +1,488 @@
# OpenBMP Production Security Hardening
A prioritized checklist for hardening the OpenBMP Docker stack before exposing
it to a production ISP network of 40 full-table-edge routers. Work top to
bottom — items are ordered roughly by risk reduction per unit effort.
This document **recommends** changes. It does not modify `docker-compose.yml`
or any running service. Apply the changes in a maintenance window and test.
> Threat model in brief: the stack ingests BMP from production routers, stores
> the full DFZ in PostgreSQL, and exposes Grafana to operators. The crown
> jewels are (a) the database, (b) the Grafana admin plane, and (c) the BMP
> ingest port. Everything below protects one of those three.
---
## Priority 0 — Credentials (do this first)
Every service currently ships with the placeholder credential `openbmp` or a
closely related default, and these values are committed in `docker-compose.yml`:
| Service | Setting | Current value |
|---------|---------|---------------|
| PostgreSQL | `POSTGRES_USER` / `POSTGRES_PASSWORD` | `openbmp` / `openbmp` |
| psql-app | `POSTGRES_PASSWORD` | `openbmp` |
| whois | `POSTGRES_PASSWORD` | `openbmp` |
| Grafana | `GF_SECURITY_ADMIN_PASSWORD` | `openbmp` |
| InfluxDB | `DOCKER_INFLUXDB_INIT_PASSWORD` | `openbmp123` |
| InfluxDB | `DOCKER_INFLUXDB_INIT_ADMIN_TOKEN` | `openbmp-telemetry-token` |
| Grafana datasource | `secureJsonData.password` | `openbmp` (in `openbmp-ds.yml`) |
### 0.1 Move every secret to `.env` (or a secrets manager)
`.env` is git-ignored. As a minimum, replace the hardcoded literals in
`docker-compose.yml` with `${VAR}` references and define them in `.env`:
```env
# .env — never commit this file
POSTGRES_PASSWORD=<long-random-string>
GF_SECURITY_ADMIN_PASSWORD=<long-random-string>
INFLUXDB_ADMIN_PASSWORD=<long-random-string>
INFLUXDB_ADMIN_TOKEN=<long-random-token>
```
```yaml
# docker-compose.yml (recommended edit — operator applies)
grafana:
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GF_SECURITY_ADMIN_PASSWORD:?set in .env}
psql:
  environment:
    - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?set in .env}
```
The `:?` form makes the stack fail fast if a secret is missing rather than
silently falling back to a default.
Generate strong values:
```bash
openssl rand -base64 32 # passwords
openssl rand -hex 32 # tokens
```
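Where `openssl` is not available, Python's standard `secrets` module is a reasonable substitute. This is a sketch with hypothetical helper names; note that `token_urlsafe` emits URL-safe base64 without padding, so the output is not byte-for-byte the same format as `openssl rand -base64`.

```python
import secrets


def new_password(nbytes=32):
    """Random password from nbytes of OS entropy, URL-safe base64 encoded."""
    return secrets.token_urlsafe(nbytes)


def new_token(nbytes=32):
    """Random token from nbytes of OS entropy, hex encoded."""
    return secrets.token_hex(nbytes)


print("POSTGRES_PASSWORD=" + new_password())
print("INFLUXDB_ADMIN_TOKEN=" + new_token())
```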
### 0.2 For a real production deployment, use a secrets manager
`.env` on disk is better than committed literals, but it is still a
plaintext file readable by anyone in the `docker` group. For production:
- **Docker Compose secrets** (`secrets:` block, files mounted at
`/run/secrets/...`) — the lowest-friction upgrade; keep the secret files
outside the repo, `chmod 600`, owned by root.
- **HashiCorp Vault**, **AWS Secrets Manager**, **Bitwarden Secrets**, or your
existing ISP secret store — inject at deploy time via a wrapper that renders
`.env` from the vault and shreds it after `docker compose up`.
Whatever the choice: rotate every credential in the table above on first
production deploy. The defaults have been in git history and must be
considered compromised.
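A pre-flight check can catch a forgotten rotation before the stack starts. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the variable names are the ones suggested for `.env` in 0.1, and the list of known defaults is taken from the table at the top of this section.

```python
KNOWN_DEFAULTS = {"openbmp", "openbmp123", "openbmp-telemetry-token"}
REQUIRED = (
    "POSTGRES_PASSWORD",
    "GF_SECURITY_ADMIN_PASSWORD",
    "INFLUXDB_ADMIN_PASSWORD",
    "INFLUXDB_ADMIN_TOKEN",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_weak_secrets(env_text, required=REQUIRED):
    """List required secrets that are missing or still a shipped default."""
    values = parse_env(env_text)
    problems = []
    for name in required:
        value = values.get(name, "")
        if not value:
            problems.append(f"{name}: missing")
        elif value in KNOWN_DEFAULTS:
            problems.append(f"{name}: still a known default")
    return problems


sample = """
# .env
POSTGRES_PASSWORD=openbmp
GF_SECURITY_ADMIN_PASSWORD=Zk3v9-example-only
INFLUXDB_ADMIN_TOKEN=openbmp-telemetry-token
"""
for problem in find_weak_secrets(sample):
    print(problem)
```

A wrapper could refuse to run `docker compose up` while the returned list is non-empty.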
### 0.3 Rotate the Grafana datasource password in lockstep
`obmp-grafana/provisioning/datasources/openbmp-ds.yml` carries
`secureJsonData.password`. It is read at Grafana start. When you change the
PostgreSQL password, update this file too (it supports `$__file{}` and
env-var expansion: `password: $POSTGRES_PASSWORD`) and restart Grafana.
---
## Priority 1 — Network exposure / firewalling
The host currently publishes these ports to `0.0.0.0`: 5000 (BMP), 5432
(PostgreSQL), 9092 (Kafka), 3000 (Grafana), 8086 (InfluxDB), 4300 (whois),
9091 (Authelia). Most should not be world-reachable.
### 1.1 BMP collector (port 5000) — restrict to router management subnets
The collector accepts a BMP session from any source. A rogue BMP feed can
inject bogus routers/peers/prefixes into the database. Firewall it to the
router management subnets only.
`nftables` example (preferred on modern hosts):
```nft
# /etc/nftables.conf — adjust subnets to your router management ranges
table inet obmp {
    chain input {
        type filter hook input priority 0; policy accept;
        # BMP ingest: only from router management subnets
        tcp dport 5000 ip saddr { 10.100.0.0/24, 10.100.1.0/24 } accept
        tcp dport 5000 drop
    }
}
```
`iptables` equivalent:
```bash
iptables -A INPUT -p tcp --dport 5000 -s 10.100.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -s 10.100.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -j DROP
```
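> Docker publishes container ports through its own `iptables` chains, so
> rules in `INPUT` do not filter them. Put the rules in the `DOCKER-USER`
> chain instead. Insert the `DROP` first and the allow rules afterwards, so
> the allow rules end up above it; a `DROP` appended with `-A` can land after
> a trailing `RETURN` rule in that chain and never match:
> ```bash
> iptables -I DOCKER-USER -p tcp --dport 5000 -j DROP
> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.1.0/24 -j RETURN
> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.0.0/24 -j RETURN
> ```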
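Before applying the firewall rules, it is worth confirming that every router's BMP source address actually falls inside the allow-list, since a router outside it will simply fail to connect. The sketch below uses Python's standard `ipaddress` module; the subnets are the example ranges from the rules above, and the router addresses are made up for illustration.

```python
import ipaddress

# Example ranges from the firewall rules above; replace with your own.
ALLOWED_SUBNETS = ("10.100.0.0/24", "10.100.1.0/24")


def routers_outside_allowlist(router_ips, subnets=ALLOWED_SUBNETS):
    """Return the router addresses that the firewall rules would drop."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]
    return [
        ip
        for ip in router_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Hypothetical management addresses, e.g. pulled from inventory.yaml
print(routers_outside_allowlist(["10.100.0.5", "10.100.1.200", "10.100.2.1"]))
```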
### 1.2 PostgreSQL (5432), Kafka (9092), InfluxDB (8086), whois (4300)
None of these need to be reachable from outside the stack:
- **PostgreSQL** — only `psql-app`, `whois`, and `grafana` connect, all on the
Compose network. Bind the published port to loopback only, or drop the
`ports:` mapping entirely:
```yaml
# docker-compose.yml — psql service
ports:
- "127.0.0.1:5432:5432" # localhost only; or remove entirely
```
- **Kafka 9092** — see Priority 2.
- **InfluxDB 8086** — only Grafana and Telegraf use it; bind to loopback or
drop the mapping (Telegraf uses host networking and reaches it via
localhost; Grafana reaches it on the Compose network).
- **whois 4300** — expose only if you actually offer a public whois service;
otherwise bind to loopback.
For anything that genuinely must be reachable, restrict by source with the
firewall pattern from 1.1.
### 1.3 Grafana (3000) — keep it behind Authelia
Authelia already fronts Grafana (the `auth` profile + `GF_AUTH_PROXY_*`
settings). Make that the *only* path:
- Bind Grafana's published port to loopback: `127.0.0.1:3000:3000`, and let
the reverse proxy / Authelia terminate TLS and reach it internally.
- Do **not** leave port 3000 directly reachable — `GF_AUTH_PROXY_ENABLED=true`
trusts the `Remote-User` header, so any client that can reach 3000 directly
and set that header bypasses authentication entirely.
---
## Priority 2 — Kafka transport security
Kafka is currently **PLAINTEXT** and advertises a host-IP listener:
```yaml
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://${HOST_IP}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
```
The `obmp-kafka:29092` listener is internal to the Compose network and is the
only one the collector and psql-app use. The `PLAINTEXT_HOST://...:9092`
listener exists only for outside access and is not needed by the core stack.
**Recommended (simplest, most secure): remove the host listener.** If nothing
outside the Compose network consumes Kafka, drop the `9092` port mapping and
the `PLAINTEXT_HOST` advertised listener so Kafka is reachable only on the
internal Docker network:
```yaml
kafka:
  # remove the - "9092:9092" ports entry
  environment:
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
```
**If external Kafka access is genuinely required** (e.g. a separate analytics
consumer, or the split-host architecture in `production-sizing.md` where
Kafka and the DB are on different hosts), do **not** leave it PLAINTEXT on a
routed network. Enable SASL_SSL on the external listener:
```yaml
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,SASL_SSL://${HOST_IP}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,SASL_SSL:SASL_SSL
KAFKA_SASL_ENABLED_MECHANISMS: SCRAM-SHA-512
KAFKA_SSL_KEYSTORE_LOCATION: /etc/kafka/secrets/kafka.keystore.jks
KAFKA_SSL_KEYSTORE_PASSWORD: ${KAFKA_KEYSTORE_PASSWORD}
KAFKA_SSL_KEY_PASSWORD: ${KAFKA_KEY_PASSWORD}
KAFKA_SSL_TRUSTSTORE_LOCATION: /etc/kafka/secrets/kafka.truststore.jks
KAFKA_SSL_TRUSTSTORE_PASSWORD: ${KAFKA_TRUSTSTORE_PASSWORD}
```
Keep the internal `PLAINTEXT://obmp-kafka:29092` listener for the collector
and psql-app — intra-Compose traffic on a private bridge does not need TLS and
adding SASL there means re-configuring both clients. At minimum, never publish
a PLAINTEXT Kafka listener on an IP that routes beyond the host.
---
## Priority 3 — PostgreSQL hardening
### 3.1 Change the default `openbmp` / `openbmp` credentials
Covered in Priority 0. Note that `POSTGRES_USER`/`POSTGRES_PASSWORD` only take
effect when the data directory is initialized. To rotate on an existing
database, change the password in SQL and update every consumer:
```bash
docker exec -it obmp-psql psql -U openbmp -d openbmp \
-c "ALTER ROLE openbmp WITH PASSWORD '<new-strong-password>';"
```
Then update `POSTGRES_PASSWORD` for `psql-app` and `whois`, the
`secureJsonData.password` in `openbmp-ds.yml`, and restart those services.
### 3.2 Create a least-privilege role for Grafana
Grafana only needs to read. Do not let it connect as the owning role:
```sql
CREATE ROLE grafana_ro LOGIN PASSWORD '<strong-password>';
GRANT CONNECT ON DATABASE openbmp TO grafana_ro;
GRANT USAGE ON SCHEMA public TO grafana_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO grafana_ro;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO grafana_ro;
```
Point `openbmp-ds.yml` at `grafana_ro`. This contains a Grafana compromise to
read-only and blocks SQL-panel writes.
### 3.3 Restrict `pg_hba.conf`
The default OpenBMP image is permissive (`host all all all md5` or similar).
Tighten it so only the stack's own subnet can connect, and require
`scram-sha-256`:
```conf
# pg_hba.conf (inside the obmp-psql container / mounted)
# TYPE DATABASE USER ADDRESS METHOD
local all all scram-sha-256
host openbmp openbmp 172.16.0.0/12 scram-sha-256 # Docker bridge range
host openbmp grafana_ro 172.16.0.0/12 scram-sha-256
hostssl openbmp openbmp 0.0.0.0/0 scram-sha-256 # only if remote DB host
# reject everything else
host all all 0.0.0.0/0 reject
```
Identify the actual Compose network subnet with
`docker network inspect obmp_default` and scope `ADDRESS` to it. Reload with
`docker exec obmp-psql psql -U openbmp -c "SELECT pg_reload_conf();"`.
> `scram-sha-256` requires `password_encryption = scram-sha-256` in
> `postgresql.conf` and that passwords were set/rotated *after* that change.
### 3.4 Enable SSL/TLS
The Grafana datasource already requests `sslmode: "require"` — but the server
must actually present a certificate. In `postgresql.conf`:
```conf
ssl = on
ssl_cert_file = '/var/lib/postgresql/server.crt'
ssl_key_file = '/var/lib/postgresql/server.key'
```
Generate a cert (self-signed is acceptable for an internal DB; use your
internal CA if you have one):
```bash
openssl req -new -x509 -days 825 -nodes -text \
-out server.crt -keyout server.key -subj "/CN=obmp-psql"
chmod 600 server.key # PostgreSQL refuses a key readable by group or others
```
Mount both files into the container's data directory. For the strongest
posture, move clients to `sslmode: verify-full` once a proper CA chain is in
place. This is most important if PostgreSQL runs on a separate host (the
split-host architecture in `production-sizing.md`) — intra-host Compose
traffic is lower-risk but TLS is still recommended.
### 3.5 Limit listen addresses
If PostgreSQL must accept connections from another host (split-host layout),
keep `listen_addresses` scoped — do not leave it at `*` if a single interface
suffices:
```conf
listen_addresses = 'localhost,172.18.0.1' # loopback + Docker bridge gateway
```
On a single-host deployment, drop the `5432` port mapping entirely (1.2) so
the listener is reachable only on the Compose network.
---
## Priority 4 — Drop `privileged: true` on the `psql` service
```yaml
psql:
  privileged: true # <-- remove or replace
  shm_size: 1536m
  sysctls:
    - net.ipv4.tcp_keepalive_intvl=30
    - net.ipv4.tcp_keepalive_probes=5
    - net.ipv4.tcp_keepalive_time=180
```
**Why it is a risk:** `privileged: true` gives the container *all* Linux
capabilities, disables seccomp/AppArmor confinement, and grants access to all
host devices. A compromise of PostgreSQL — the process most exposed to
untrusted route data — would then be a near-complete host compromise. This is
the single largest container-isolation gap in the stack.
**Why it is probably there:** PostgreSQL needs adequate shared memory and
benefits from the TCP keepalive `sysctls`. The compose file already sets
`shm_size: 1536m` and the `sysctls:` list explicitly — both of which Docker
applies *without* needing privileged mode. So `privileged: true` is most
likely a leftover, not a hard requirement.
**Recommended action — test without it:**
1. In a maintenance window, remove `privileged: true` and start the service.
2. Confirm PostgreSQL starts, the namespaced `sysctls` apply
(`docker exec obmp-psql sysctl net.ipv4.tcp_keepalive_time`), and shared
memory is honored (`docker exec obmp-psql df -h /dev/shm` should report the
configured `shm_size`; `/proc/meminfo` inside a container shows host-wide
values, so it is not a reliable check). Also watch the log for
`could not resize shared memory segment` errors.
3. If everything is healthy, leave it removed.
If a specific capability turns out to be needed, add only that one instead of
going fully privileged:
```yaml
psql:
  # privileged: true <-- removed
  shm_size: 1536m
  cap_drop:
    - ALL
  cap_add:
    - CHOWN
    - SETUID
    - SETGID
    - DAC_OVERRIDE # add only capabilities proven necessary by testing
  sysctls:
    - net.ipv4.tcp_keepalive_intvl=30
    - net.ipv4.tcp_keepalive_probes=5
    - net.ipv4.tcp_keepalive_time=180
```
The `sysctls:` block stays — those are namespaced and do not require
privileged mode.
---
## Priority 5 — Container hardening (defense in depth)
Apply across services after the higher-priority items. Test each service
individually — `read_only` in particular will surface paths a service writes
to that then need explicit `tmpfs` mounts.
### 5.1 `no-new-privileges`
Prevents a process inside a container from gaining privileges via setuid
binaries. Safe to apply to every service:
```yaml
security_opt:
- no-new-privileges:true
```
### 5.2 Drop capabilities
Most of these services need almost no Linux capabilities. Start from zero and
add back only what breaks:
```yaml
cap_drop:
- ALL
```
- `grafana`, `whois`, `portal`, `zookeeper` — typically run fine with
`cap_drop: [ALL]`.
- `collector`, `kafka`, `psql`, `psql-app` — drop ALL, then add back any
capability proven necessary (see Priority 4 for `psql`).
- `traffic-gen*` legitimately need `NET_RAW`/`NET_ADMIN` (Scapy) — leave those
`cap_add` entries; they are already minimal.
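To track progress on 5.1 and 5.2 across services, a small audit helper can flag what is still missing. This is a sketch only: it assumes the compose file has already been loaded into a Python dict (for example by parsing the output of `docker compose config --format json` with `json.loads`), and the function names are hypothetical.

```python
def audit_service(name, svc):
    """Return a list of hardening gaps for one compose service definition."""
    findings = []
    if svc.get("privileged"):
        findings.append(f"{name}: runs with privileged: true")

    # Accept both 'no-new-privileges:true' and 'no-new-privileges=true'.
    sec_opts = [
        str(opt).replace(" ", "").replace("=", ":")
        for opt in svc.get("security_opt") or []
    ]
    if "no-new-privileges:true" not in sec_opts:
        findings.append(f"{name}: missing no-new-privileges")

    dropped = [str(cap).upper() for cap in svc.get("cap_drop") or []]
    if "ALL" not in dropped:
        findings.append(f"{name}: does not drop ALL capabilities")
    return findings


def audit_compose(compose):
    """Audit every service in a parsed compose document, sorted by name."""
    findings = []
    for name, svc in sorted((compose.get("services") or {}).items()):
        findings.extend(audit_service(name, svc or {}))
    return findings
```

The helper only reports gaps; it does not decide which capabilities a service needs, so `traffic-gen*` keeping `NET_RAW`/`NET_ADMIN` alongside `cap_drop: [ALL]` still passes.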
### 5.3 Read-only root filesystem
Make the root filesystem immutable where the service only writes to known
volumes:
```yaml
grafana:
  read_only: true
  tmpfs:
    - /tmp
  # /var/lib/grafana is already a bind mount; writes go there, not to rootfs
portal:
  read_only: true   # nginx:alpine static site; add tmpfs for nginx
  tmpfs:
    - /tmp
    - /var/cache/nginx
    - /var/run
```
`read_only` is straightforward for `grafana`, `portal`, and `whois`. It is
trickier for `psql`, `kafka`, and `zookeeper` (they write to data volumes but
also expect a writable rootfs in places) — test individually and add `tmpfs`
mounts for any write paths, or skip `read_only` for those and rely on
`cap_drop` + `no-new-privileges`.
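One way to find the write paths before enabling `read_only` is to run the service normally for a while and inspect `docker diff <container>`, which lists files added (`A`), changed (`C`), or deleted (`D`) in the container's writable layer. The sketch below assumes you pass it that captured text; `tmpfs_candidates` is a hypothetical helper, and it only counts added files, so changed files still need a manual look.

```python
import posixpath
from collections import Counter


def tmpfs_candidates(diff_output):
    """Count files added under each directory in `docker diff` output.

    Directories with additions are candidates for a tmpfs mount (or a volume)
    once the root filesystem is made read-only.
    """
    counts = Counter()
    for line in diff_output.splitlines():
        kind, _, path = line.strip().partition(" ")
        if kind == "A" and path:
            counts[posixpath.dirname(path)] += 1
    return dict(counts)
```

For an nginx-based container such as `portal`, output along these lines would point at `/var/run` and `/var/cache/nginx`, matching the `tmpfs` entries shown above.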
### 5.4 Pin and scan images
Images are already version-pinned (`grafana:9.1.7`, `cp-kafka:7.1.1`,
`openbmp/postgres:2.2.1`, etc.) — good. Add periodic vulnerability scanning:
```bash
trivy image openbmp/postgres:2.2.1
trivy image grafana/grafana:9.1.7
```
Note Grafana 9.1.7 is old; review Grafana security advisories and plan an
upgrade path. Track CVEs for the pinned Confluent and OpenBMP images too.
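For periodic scans it helps to reduce each report to a severity summary. The following is a sketch that assumes the report comes from `trivy image --format json <image>` and has a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field; verify that layout against your Trivy version. The function names are hypothetical.

```python
from collections import Counter


def severity_counts(report):
    """Count vulnerabilities per severity in a parsed Trivy JSON report."""
    counts = Counter()
    for result in report.get("Results") or []:
        # Targets without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return dict(counts)


def has_blocking_findings(report, blocking=("CRITICAL", "HIGH")):
    """True if the report contains any vulnerability at a blocking severity."""
    counts = severity_counts(report)
    return any(counts.get(level, 0) > 0 for level in blocking)
```

Trivy can also gate a pipeline on its own with `--severity HIGH,CRITICAL --exit-code 1`; the Python summary is mainly useful when results from several images are collected into one report.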
### 5.5 Resource limits
Every service already has a `mem_limit`. For production also set `cpus:` (or
`deploy.resources.limits`) so a runaway query or ingest burst cannot starve
the host — this also mitigates local denial-of-service. See
`docs/production-sizing.md` for target values.
---
## Priority 6 — Authelia / access control
Authelia fronts Grafana (ROADMAP C5). For production:
- Enforce **TOTP / 2FA** for all operator accounts; do not allow `one_factor`
for the Grafana route.
- Set short session timeouts and an inactivity expiry in the Authelia config.
- Use strong, unique passwords; back the user store with your IdP / LDAP if
available rather than the file backend.
- Ensure Authelia's own secrets (`jwt_secret`, `session.secret`,
`storage.encryption_key`) are strong and stored as secrets, not literals.
- Confirm the reverse proxy strips any client-supplied `Remote-User` header
before Authelia sets it — otherwise the auth-proxy trust model is bypassable
(see 1.3).
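The header-stripping rule in the last bullet amounts to the filter below. This is an illustration of the logic only, not reverse-proxy or Authelia configuration; the header set is an assumption based on the `Remote-*` names Authelia typically forwards, and `strip_client_auth_headers` is a hypothetical name.

```python
# Identity headers that only the auth layer may set (assumed list).
AUTH_HEADERS = frozenset(
    {"remote-user", "remote-groups", "remote-name", "remote-email"}
)


def strip_client_auth_headers(headers):
    """Drop client-supplied identity headers; header names are case-insensitive."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in AUTH_HEADERS
    }
```

The proxy must apply the equivalent of this to every inbound request before the authentication result is attached, so a forged `Remote-User: admin` never reaches Grafana.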
---
## Quick checklist
- [ ] Rotate all six default credentials; remove literals from compose, move to `.env` / secrets manager
- [ ] Update `openbmp-ds.yml` datasource password to match
- [ ] Firewall BMP port 5000 to router management subnets (`DOCKER-USER` chain)
- [ ] Bind 5432 / 8086 / 4300 to loopback or drop the port mappings
- [ ] Bind Grafana 3000 to loopback; reach it only via Authelia
- [ ] Remove the Kafka `PLAINTEXT_HOST` listener + 9092 mapping (or enable SASL_SSL if external access needed)
- [ ] Create `grafana_ro` least-privilege DB role; repoint the datasource
- [ ] Tighten `pg_hba.conf`; require `scram-sha-256`
- [ ] Enable PostgreSQL `ssl = on` with a server certificate
- [ ] Test removing `privileged: true` from `psql`; replace with specific `cap_add` if needed
- [ ] Add `security_opt: [no-new-privileges:true]` to all services
- [ ] Add `cap_drop: [ALL]` and add back only required capabilities
- [ ] Add `read_only: true` + `tmpfs` to `grafana` / `portal` / `whois`
- [ ] Add `cpus:` limits per service
- [ ] Scan images with `trivy`; plan a Grafana upgrade off 9.1.7
- [ ] Enforce TOTP and short sessions in Authelia

View File

@ -41,6 +41,7 @@
<AnnounceForm v-else-if="activeTab === 'inject'" @routes-changed="fetchRoutes" />
<PeerStatus v-else-if="activeTab === 'peers'" :peers="peers" />
<ChurnControl v-else-if="activeTab === 'churn'" />
<FullTable v-else-if="activeTab === 'full-table'" @routes-changed="fetchRoutes" />
</div>
</main>
</div>
@ -63,6 +64,7 @@ import RouteTable from './components/RouteTable.vue'
import AnnounceForm from './components/AnnounceForm.vue'
import PeerStatus from './components/PeerStatus.vue'
import ChurnControl from './components/ChurnControl.vue'
import FullTable from './components/FullTable.vue'
const health = ref(null)
const routes = ref([])
@ -75,6 +77,7 @@ const tabs = [
{ id: 'inject', label: 'Inject' },
{ id: 'peers', label: 'Peers' },
{ id: 'churn', label: 'Churn' },
{ id: 'full-table', label: 'Full Table' },
]
async function fetchHealth() {

View File

@ -1,4 +1,4 @@
const BASE = '/api'
const BASE = '/exabgp/api'
async function req(method, path, body) {
const opts = { method, headers: { 'Content-Type': 'application/json' } }
@ -18,4 +18,7 @@ export const api = {
announce: payload => req('POST', '/announce', payload),
withdraw: prefixes => req('POST', '/withdraw', { prefixes }),
withdrawAll: () => req('POST', '/withdraw/all'),
fullTableStart: (count, batchSize) => req('POST', '/full-table/start', { count, batch_size: batchSize }),
fullTableStatus: () => req('GET', '/full-table/status'),
fullTableStop: () => req('POST', '/full-table/stop'),
}

View File

@ -0,0 +1,477 @@
<template>
<div class="full-table">
<h2 class="section-title">Full Table Injection</h2>
<p class="section-desc">
Inject a realistic IPv4 routing table into ExaBGP for stress testing.
Routes are generated with varied AS paths, prefix lengths, and communities matching real DFZ distribution.
</p>
<div class="config-card">
<!-- Level selector -->
<div class="form-group">
<label>Table Size</label>
<div class="level-grid">
<button
v-for="level in levels"
:key="level.count"
class="level-btn"
:class="{ selected: selectedCount === level.count }"
:disabled="injecting"
@click="selectedCount = level.count"
>
<span class="level-count">{{ level.label }}</span>
<span class="level-desc">{{ level.desc }}</span>
</button>
</div>
</div>
<!-- Custom count -->
<div class="form-group">
<label>Custom Count</label>
<input
v-model.number="selectedCount"
type="number"
min="100"
max="950000"
step="1000"
:disabled="injecting"
class="mono-input"
/>
</div>
<!-- Action buttons -->
<div class="action-row">
<button v-if="!injecting" class="btn-start" @click="startInjection" :disabled="!selectedCount">
<span>&#9654;</span> Inject {{ formatNum(selectedCount) }} Routes
</button>
<button v-else class="btn-stop" @click="stopInjection">
<span>&#9632;</span> Stop Injection
</button>
<button
v-if="!injecting && lastCompleted"
class="btn-withdraw"
@click="withdrawAll"
:disabled="withdrawing"
>
{{ withdrawing ? 'Withdrawing...' : 'Withdraw All' }}
</button>
</div>
</div>
<!-- Status display -->
<div v-if="injecting || statusMsg" class="status-card">
<div class="status-header">
<span class="status-dot" :class="injecting ? 'dot-active' : 'dot-idle'"></span>
<span class="status-text">{{ statusMsg || 'Idle' }}</span>
</div>
<!-- Progress bar -->
<div v-if="state.total > 0" class="progress-section">
<div class="progress-labels">
<span>{{ formatNum(state.injected) }} / {{ formatNum(state.total) }}</span>
<span>{{ state.progress_pct || 0 }}%</span>
</div>
<div class="progress-track">
<div class="progress-fill" :style="{ width: (state.progress_pct || 0) + '%' }"></div>
</div>
</div>
<!-- Stats row -->
<div v-if="state.total > 0" class="stats-row">
<div class="stat-item">
<span class="stat-label">Rate</span>
<span class="stat-val">{{ formatNum(state.rate_pps || 0) }}/s</span>
</div>
<div class="stat-item">
<span class="stat-label">Elapsed</span>
<span class="stat-val">{{ state.elapsed_sec || 0 }}s</span>
</div>
<div class="stat-item">
<span class="stat-label">Active Routes</span>
<span class="stat-val">{{ formatNum(state.active_routes || 0) }}</span>
</div>
</div>
<!-- Error -->
<div v-if="state.error" class="inject-error">{{ state.error }}</div>
</div>
</div>
</template>
<script setup>
import { ref, onUnmounted } from 'vue'
import { api } from '../api.js'
const emit = defineEmits(['routes-changed'])
const levels = [
{ count: 1000, label: '1K', desc: 'Quick test' },
{ count: 10000, label: '10K', desc: 'Light load' },
{ count: 50000, label: '50K', desc: 'Medium load' },
{ count: 100000, label: '100K', desc: 'Stress test' },
{ count: 500000, label: '500K', desc: 'Heavy load' },
{ count: 900000, label: '900K', desc: 'Full DFZ' },
]
const selectedCount = ref(10000)
const injecting = ref(false)
const statusMsg = ref('')
const lastCompleted = ref(false)
const withdrawing = ref(false)
const state = ref({})
let pollTimer = null
function formatNum(n) {
if (n == null) return '0'
return Number(n).toLocaleString()
}
async function startInjection() {
try {
statusMsg.value = 'Starting injection...'
injecting.value = true
lastCompleted.value = false
state.value = {}
await api.fullTableStart(selectedCount.value, 1000)
startPolling()
} catch (e) {
statusMsg.value = `Start failed: ${e.message}`
injecting.value = false
}
}
async function stopInjection() {
try {
await api.fullTableStop()
statusMsg.value = 'Stop requested...'
} catch (e) {
statusMsg.value = `Stop failed: ${e.message}`
}
}
async function withdrawAll() {
withdrawing.value = true
try {
const data = await api.withdrawAll()
statusMsg.value = `Withdrew ${data.count} routes`
lastCompleted.value = false
state.value = {}
emit('routes-changed')
} catch (e) {
statusMsg.value = `Withdraw failed: ${e.message}`
} finally {
withdrawing.value = false
}
}
function startPolling() {
stopPolling()
pollStatus()
pollTimer = setInterval(pollStatus, 2000)
}
function stopPolling() {
if (pollTimer) {
clearInterval(pollTimer)
pollTimer = null
}
}
async function pollStatus() {
  try {
    const data = await api.fullTableStatus()
    state.value = data
    if (data.active) {
      statusMsg.value = `Injecting: ${formatNum(data.injected)} / ${formatNum(data.total)} (${data.rate_pps || 0}/s)`
    } else if (data.error) {
      statusMsg.value = `Error: ${data.error}`
      injecting.value = false
      stopPolling()
    } else {
      // Not active and no error: the run finished or was stopped, possibly
      // before any route was announced, so always leave the polling state.
      statusMsg.value = `Complete: ${formatNum(data.injected)} routes in ${data.elapsed_sec}s (${data.rate_pps}/s)`
      injecting.value = false
      lastCompleted.value = data.injected > 0
      stopPolling()
      emit('routes-changed')
    }
  } catch (e) {
    // Transient fetch error: keep polling
  }
}
onUnmounted(() => {
stopPolling()
})
</script>
<style scoped>
.full-table {
display: flex;
flex-direction: column;
gap: 18px;
max-width: 680px;
}
.section-title {
font-size: 14px;
font-weight: 600;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.08em;
}
.section-desc {
color: var(--muted);
font-size: 13px;
line-height: 1.6;
margin-top: -8px;
}
.config-card {
background: var(--card-bg);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 20px;
display: flex;
flex-direction: column;
gap: 18px;
}
.form-group {
display: flex;
flex-direction: column;
gap: 6px;
}
label {
font-size: 12px;
font-weight: 600;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.05em;
}
.level-grid {
display: grid;
grid-template-columns: repeat(3, 1fr);
gap: 8px;
}
.level-btn {
display: flex;
flex-direction: column;
align-items: center;
gap: 2px;
padding: 12px 8px;
background: var(--bg);
border: 1px solid var(--border);
border-radius: var(--radius);
color: var(--text);
transition: all 0.15s;
}
.level-btn:hover:not(:disabled) {
border-color: var(--accent);
background: rgba(79, 156, 249, 0.08);
}
.level-btn.selected {
border-color: var(--accent);
background: rgba(79, 156, 249, 0.15);
box-shadow: 0 0 0 1px var(--accent);
}
.level-count {
font-size: 18px;
font-weight: 700;
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
color: var(--accent);
}
.level-desc {
font-size: 11px;
color: var(--muted);
}
.mono-input {
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
background: var(--bg);
color: var(--text);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 7px 10px;
font-size: 13px;
outline: none;
max-width: 200px;
}
.mono-input:focus {
border-color: var(--accent);
}
.action-row {
display: flex;
gap: 10px;
padding-top: 4px;
border-top: 1px solid var(--border);
}
.btn-start {
padding: 9px 22px;
background: rgba(72, 187, 120, 0.15);
color: #48bb78;
border: 1px solid rgba(72, 187, 120, 0.3);
font-weight: 700;
font-size: 14px;
display: flex;
align-items: center;
gap: 7px;
}
.btn-start:hover:not(:disabled) {
background: rgba(72, 187, 120, 0.25);
}
.btn-stop {
padding: 9px 22px;
background: rgba(252, 129, 129, 0.15);
color: #fc8181;
border: 1px solid rgba(252, 129, 129, 0.3);
font-weight: 700;
font-size: 14px;
display: flex;
align-items: center;
gap: 7px;
animation: pulse-border 1.5s ease-in-out infinite;
}
.btn-stop:hover {
background: rgba(252, 129, 129, 0.25);
}
.btn-withdraw {
padding: 9px 18px;
background: rgba(246, 173, 85, 0.15);
color: #f6ad55;
border: 1px solid rgba(246, 173, 85, 0.3);
font-weight: 600;
font-size: 13px;
}
.btn-withdraw:hover:not(:disabled) {
background: rgba(246, 173, 85, 0.25);
}
.status-card {
background: var(--card-bg);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 16px 18px;
display: flex;
flex-direction: column;
gap: 12px;
}
.status-header {
display: flex;
align-items: center;
gap: 10px;
}
.status-dot {
width: 10px;
height: 10px;
border-radius: 50%;
flex-shrink: 0;
}
.dot-active {
background: #48bb78;
box-shadow: 0 0 8px #48bb78;
animation: pulse-dot 1s ease-in-out infinite;
}
.dot-idle {
background: var(--muted);
}
.status-text {
font-size: 14px;
font-weight: 600;
color: var(--text);
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
}
.progress-section {
display: flex;
flex-direction: column;
gap: 5px;
}
.progress-labels {
display: flex;
justify-content: space-between;
font-size: 11px;
color: var(--muted);
}
.progress-track {
height: 6px;
background: var(--border);
border-radius: 3px;
overflow: hidden;
}
.progress-fill {
height: 100%;
background: var(--accent);
border-radius: 3px;
transition: width 0.5s ease;
}
.stats-row {
display: flex;
gap: 24px;
}
.stat-item {
display: flex;
flex-direction: column;
gap: 2px;
}
.stat-label {
font-size: 10px;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.05em;
}
.stat-val {
font-size: 15px;
font-weight: 700;
color: var(--text);
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
}
.inject-error {
font-size: 12px;
color: #fc8181;
padding: 6px 10px;
background: rgba(252, 129, 129, 0.08);
border-radius: 4px;
border: 1px solid rgba(252, 129, 129, 0.2);
}
@keyframes pulse-dot {
0%, 100% { opacity: 1; }
50% { opacity: 0.5; }
}
@keyframes pulse-border {
0%, 100% { border-color: rgba(252, 129, 129, 0.3); }
50% { border-color: rgba(252, 129, 129, 0.6); }
}
</style>

View File

@ -2,6 +2,7 @@ import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
export default defineConfig({
base: '/exabgp/',
plugins: [vue()],
server: {
proxy: {

View File

@ -48,12 +48,16 @@ peer_states = {}
# ExaBGP command helpers
# ---------------------------------------------------------------------------
_quiet_mode = False
def _send(cmd: str):
"""Write a command to ExaBGP via stdout."""
with _stdout_lock:
sys.stdout.write(cmd + '\n')
sys.stdout.flush()
log.info('→ ExaBGP: %s', cmd)
if not _quiet_mode:
log.info('→ ExaBGP: %s', cmd)
def _build_announce(prefix, next_hop='self', as_path=None, communities=None, med=None, local_pref=None):
@ -162,7 +166,22 @@ def api_withdraw_all():
# ---------------------------------------------------------------------------
sys.path.insert(0, '/exabgp')
from scenarios import SCENARIOS
from scenarios import SCENARIOS, generate_full_internet
# ---------------------------------------------------------------------------
# Full-table background injection
# ---------------------------------------------------------------------------
_injection_state = {
'active': False,
'total': 0,
'injected': 0,
'elapsed_sec': 0,
'rate_pps': 0,
'error': None,
'stop_requested': False,
}
_injection_lock = threading.Lock()
@app.route('/scenarios', methods=['GET'])
@ -223,6 +242,131 @@ def get_peers():
return jsonify({'peers': peer_states})
# ---------------------------------------------------------------------------
# Full-table injection endpoints
# ---------------------------------------------------------------------------
def _injection_worker(count, batch_size):
    """Background thread: generate and inject full internet table."""
    global _quiet_mode
    try:
        _quiet_mode = True  # suppress per-route logging
        log.info('Generating %d full-table prefixes...', count)
        routes = generate_full_internet(count)
        with _injection_lock:
            _injection_state['total'] = len(routes)
        log.info('Generated %d routes, starting injection at batch_size=%d', len(routes), batch_size)
        start_time = time.time()
        # Count routes actually announced, so the total stays correct when the
        # loop is stopped early or the generated list is empty.
        injected = 0
        for route in routes:
            with _injection_lock:
                if _injection_state['stop_requested']:
                    log.info('Injection stopped by user at %d/%d', injected, len(routes))
                    break
            prefix = route['prefix']
            announce_route(
                prefix,
                next_hop=route.get('next_hop', 'self'),
                as_path=route.get('as_path', []),
                communities=route.get('communities', []),
                med=route.get('med'),
                local_pref=route.get('local_pref'),
            )
            injected += 1
            # Update progress periodically (every batch_size routes)
            if injected % batch_size == 0:
                elapsed = time.time() - start_time
                rate = injected / elapsed if elapsed > 0 else 0
                with _injection_lock:
                    _injection_state['injected'] = injected
                    _injection_state['elapsed_sec'] = round(elapsed, 1)
                    _injection_state['rate_pps'] = round(rate, 1)
                log.info('Injection progress: %d/%d (%.0f/s)', injected, len(routes), rate)
        elapsed = time.time() - start_time
        rate = injected / elapsed if elapsed > 0 else 0
        with _injection_lock:
            _injection_state['injected'] = injected
            _injection_state['elapsed_sec'] = round(elapsed, 1)
            _injection_state['rate_pps'] = round(rate, 1)
            _injection_state['active'] = False
        log.info('Injection complete: %d routes in %.1fs (%.0f/s)', injected, elapsed, rate)
    except Exception as e:
        log.error('Injection error: %s', e)
        with _injection_lock:
            _injection_state['error'] = str(e)
            _injection_state['active'] = False
    finally:
        _quiet_mode = False
@app.route('/full-table/start', methods=['POST'])
def start_full_table():
    """Start background injection of a full IPv4 routing table.

    POST body (all optional):
        count: Number of prefixes (default 900000)
        batch_size: Progress update interval (default 1000)
    """
    data = request.get_json(force=True) if request.data else {}
    try:
        count = int(data.get('count', 900000))
        batch_size = int(data.get('batch_size', 1000))
    except (TypeError, ValueError):
        return jsonify({'error': 'count and batch_size must be integers'}), 400
    if count < 1 or batch_size < 1:
        return jsonify({'error': 'count and batch_size must be positive'}), 400
    # Check and claim the 'active' flag under a single lock so two concurrent
    # requests cannot both start a worker.
    with _injection_lock:
        if _injection_state['active']:
            return jsonify({
                'error': 'Injection already in progress',
                'state': dict(_injection_state),
            }), 409
        _injection_state.update({
            'active': True,
            'total': count,
            'injected': 0,
            'elapsed_sec': 0,
            'rate_pps': 0,
            'error': None,
            'stop_requested': False,
        })
    t = threading.Thread(target=_injection_worker, args=(count, batch_size), daemon=True)
    t.start()
    log.info('Started full-table injection: %d prefixes', count)
    return jsonify({
        'status': 'started',
        'count': count,
        'message': f'Generating and injecting {count} prefixes in background. GET /full-table/status to track progress.',
    })
@app.route('/full-table/status', methods=['GET'])
def full_table_status():
"""Get current full-table injection progress."""
with _injection_lock:
state = dict(_injection_state)
if state['total'] > 0:
state['progress_pct'] = round(state['injected'] / state['total'] * 100, 1)
else:
state['progress_pct'] = 0
state['active_routes'] = len(active_routes)
return jsonify(state)
@app.route('/full-table/stop', methods=['POST'])
def stop_full_table():
"""Stop an in-progress full-table injection."""
with _injection_lock:
if not _injection_state['active']:
return jsonify({'error': 'No injection in progress'}), 400
_injection_state['stop_requested'] = True
return jsonify({'status': 'stop_requested', 'injected_so_far': _injection_state['injected']})
# ---------------------------------------------------------------------------
# ExaBGP event loop (main thread)
# ---------------------------------------------------------------------------

View File

@ -12,6 +12,9 @@ Usage:
inject.py withdraw-all
inject.py scenario <name>
inject.py withdraw-scenario <name>
inject.py full-table [--count N] [--follow] # inject full IPv4 table (background)
inject.py full-table-status # show injection progress
inject.py full-table-stop # stop injection
inject.py churn [--count N] [--interval SEC] # cycle announce/withdraw for ip_rib_log population
inject.py monitor # live-refresh terminal view
@ -29,8 +32,8 @@ import requests
API = os.environ.get('EXABGP_API', 'http://localhost:5050')
def _post(path, data=None):
r = requests.post(f'{API}{path}', json=data or {}, timeout=10)
def _post(path, data=None, timeout=10):
r = requests.post(f'{API}{path}', json=data or {}, timeout=timeout)
r.raise_for_status()
return r.json()
@ -174,6 +177,101 @@ def cmd_withdraw_scenario(args):
print(f"Withdrew scenario '{args.name}': {data['count']} routes withdrawn")
def cmd_full_table(args):
"""Inject a full IPv4 routing table for stress testing."""
count = args.count
print(f"Starting full-table injection: {count} prefixes")
print("This generates routes in background. Use 'inject.py full-table-status' to track.\n")
data = _post('/full-table/start', {'count': count, 'batch_size': args.batch_size}, timeout=120)
print(data.get('message', 'Started'))
if args.follow:
print()
_follow_injection()
def cmd_full_table_status(args):
"""Show full-table injection progress."""
data = _get('/full-table/status')
active = data.get('active', False)
total = data.get('total', 0)
injected = data.get('injected', 0)
pct = data.get('progress_pct', 0)
rate = data.get('rate_pps', 0)
elapsed = data.get('elapsed_sec', 0)
error = data.get('error')
active_routes = data.get('active_routes', 0)
if error:
print(f"ERROR: {error}")
elif active:
bar_len = 40
filled = int(bar_len * pct / 100)
bar = '#' * filled + '-' * (bar_len - filled)
print(f"[{bar}] {pct:.1f}%")
print(f" Injected: {injected:,} / {total:,} ({rate:.0f} routes/s)")
print(f" Elapsed: {elapsed:.0f}s")
print(f" Active routes in ExaBGP: {active_routes:,}")
elif total > 0:
print(f"Injection complete: {injected:,} / {total:,} routes in {elapsed:.0f}s ({rate:.0f}/s)")
print(f"Active routes in ExaBGP: {active_routes:,}")
else:
print("No injection running or completed.")
print(f"Active routes: {active_routes:,}")
def cmd_full_table_stop(args):
"""Stop an in-progress full-table injection."""
try:
data = _post('/full-table/stop')
# Default to 0, not '?': the ',' format spec raises ValueError on a str.
print(f"Stop requested. Injected so far: {data.get('injected_so_far', 0):,}")
except requests.exceptions.HTTPError as e:
if e.response.status_code == 400:
print("No injection in progress.")
else:
raise
def _follow_injection():
"""Poll injection status until complete."""
import shutil
lines_printed = 0
try:
while True:
data = _get('/full-table/status')
active = data.get('active', False)
total = data.get('total', 0)
injected = data.get('injected', 0)
pct = data.get('progress_pct', 0)
rate = data.get('rate_pps', 0)
elapsed = data.get('elapsed_sec', 0)
active_routes = data.get('active_routes', 0)
# Move cursor up to overwrite
if lines_printed > 0:
print(f"\033[{lines_printed}A", end='')
bar_len = 40
filled = int(bar_len * pct / 100)
bar = '#' * filled + '-' * (bar_len - filled)
output_lines = [
f" [{bar}] {pct:.1f}%",
f" Injected: {injected:,} / {total:,} ({rate:.0f} routes/s) elapsed: {elapsed:.0f}s",
f" Active routes: {active_routes:,}",
]
print('\n'.join(output_lines))
lines_printed = len(output_lines)
if not active:
print(f"\nDone! {injected:,} routes injected in {elapsed:.0f}s")
break
time.sleep(2)
except KeyboardInterrupt:
print("\n\nFollowing stopped (injection continues in background).")
def cmd_churn(args):
"""
Cycle announce/withdraw on the 'churn' scenario to generate ip_rib_log
@ -236,6 +334,17 @@ def main():
p = sub.add_parser('withdraw-scenario', help='Withdraw a named scenario')
p.add_argument('name')
p = sub.add_parser('full-table', help='Inject full IPv4 routing table (background)')
p.add_argument('--count', type=int, default=900000, metavar='N',
help='Number of prefixes to inject (default: 900000)')
p.add_argument('--batch-size', type=int, default=1000, metavar='N',
help='Progress update interval (default: 1000)')
p.add_argument('--follow', '-f', action='store_true',
help='Follow progress until complete')
sub.add_parser('full-table-status', help='Show full-table injection progress')
sub.add_parser('full-table-stop', help='Stop full-table injection')
p = sub.add_parser('churn', help='Cycle announce/withdraw to populate ip_rib_log')
p.add_argument('--count', type=int, default=0, metavar='N',
help='Number of cycles (0 = infinite)')
@ -255,6 +364,9 @@ def main():
'withdraw-all': cmd_withdraw_all,
'scenario': cmd_scenario,
'withdraw-scenario': cmd_withdraw_scenario,
'full-table': cmd_full_table,
'full-table-status': cmd_full_table_status,
'full-table-stop': cmd_full_table_stop,
'churn': cmd_churn,
}

View File

@ -0,0 +1,658 @@
#!/usr/bin/env python3
"""
Route Diversity Configuration Script
=====================================
Adds loopbacks, static routes, route-policies, and BGP redistribution
to R9K-01 through R9K-07 to create locally-originated routes that
produce meaningful RR Loc-RIB diffs between CORE-01 and CORE-02.
IS-IS topology (natural asymmetry, no metric tuning needed):
CORE-01 <-> R9K-01, R9K-02, R9K-03, R9K-04, R9K-05
CORE-02 <-> R9K-05, R9K-06, R9K-07
R9K-04 <-> R9K-06 (cross-link)
R9K-05 is dual-homed to both COREs
Overlapping prefixes from CORE-01-side and CORE-02-side routers produce
next-hop diffs because each RR picks the client with lowest IGP cost.
Address plan:
Loopbacks: 10.110.{router_id}.{1,2}/32 (unique per router)
Overlap LBs: 10.110.{100-103}.1/32 (shared across router pairs)
Static routes: 10.111.{router_id}.0/24 (unique per router)
Overlap statics: 10.111.{100-103}.0/24 (shared across router pairs)
Usage:
python3 route_diversity_config.py # apply all config
python3 route_diversity_config.py --verify-only # just check current state
python3 route_diversity_config.py --rollback # remove all added config
"""
from ncclient import manager
import xml.etree.ElementTree as ET
import sys
import argparse
# YANG namespaces
IFMGR_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ifmgr-cfg'
IPV4IO_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-io-cfg'
STATIC_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ip-static-cfg'
BGP_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-bgp-cfg'
ISIS_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-cfg'
RPL_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-policy-repository-cfg'
# ──────────────────────────────────────────────────────────────────────
# Router definitions
# ──────────────────────────────────────────────────────────────────────
ROUTERS = {
'R9K-01': {
'mgmt': '10.100.0.1',
'loopbacks': [
('Loopback10', '10.110.1.1', '255.255.255.255'),
('Loopback11', '10.110.1.2', '255.255.255.255'),
('Loopback100', '10.110.100.1', '255.255.255.255'), # overlap with R9K-06
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-04, R9K-07
],
'statics': [
('10.111.1.0', 24, 100), # unique, tag=100 → LP=200
('10.111.100.0', 24, 100), # overlap with R9K-06 (tag=200)
('10.111.103.0', 24, 100), # overlap with R9K-04, R9K-07 (same tag)
],
},
'R9K-02': {
'mgmt': '10.100.0.2',
'loopbacks': [
('Loopback10', '10.110.2.1', '255.255.255.255'),
('Loopback11', '10.110.2.2', '255.255.255.255'),
('Loopback101', '10.110.101.1', '255.255.255.255'), # overlap with R9K-07
],
'statics': [
('10.111.2.0', 24, 100),
('10.111.101.0', 24, 100), # overlap with R9K-07 (tag=300)
],
},
'R9K-03': {
'mgmt': '10.100.0.3',
'loopbacks': [
('Loopback10', '10.110.3.1', '255.255.255.255'),
('Loopback11', '10.110.3.2', '255.255.255.255'),
('Loopback102', '10.110.102.1', '255.255.255.255'), # overlap with R9K-05
],
'statics': [
('10.111.3.0', 24, 100),
('10.111.102.0', 24, 100), # overlap with R9K-04 (tag=200)
],
},
'R9K-04': {
'mgmt': '10.100.0.4',
'loopbacks': [
('Loopback10', '10.110.4.1', '255.255.255.255'),
('Loopback11', '10.110.4.2', '255.255.255.255'),
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-01, R9K-07
],
'statics': [
('10.111.4.0', 24, 200), # tag=200 → LP=150
('10.111.102.0', 24, 200), # overlap with R9K-03 (tag=100)
('10.111.103.0', 24, 100), # overlap with R9K-01, R9K-07 (same tag)
],
},
'R9K-05': {
'mgmt': '10.100.0.5',
'loopbacks': [
('Loopback10', '10.110.5.1', '255.255.255.255'),
('Loopback11', '10.110.5.2', '255.255.255.255'),
('Loopback102', '10.110.102.1', '255.255.255.255'), # overlap with R9K-03
],
'statics': [
('10.111.5.0', 24, 100),
],
},
'R9K-06': {
'mgmt': '10.100.0.6',
'loopbacks': [
('Loopback10', '10.110.6.1', '255.255.255.255'),
('Loopback11', '10.110.6.2', '255.255.255.255'),
('Loopback100', '10.110.100.1', '255.255.255.255'), # overlap with R9K-01
],
'statics': [
('10.111.6.0', 24, 200), # tag=200 → LP=150
('10.111.100.0', 24, 200), # overlap with R9K-01 (tag=100)
],
},
'R9K-07': {
'mgmt': '10.100.0.7',
'loopbacks': [
('Loopback10', '10.110.7.1', '255.255.255.255'),
('Loopback11', '10.110.7.2', '255.255.255.255'),
('Loopback101', '10.110.101.1', '255.255.255.255'), # overlap with R9K-02
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-01, R9K-04
],
'statics': [
('10.111.7.0', 24, 300), # tag=300 → LP=100
('10.111.101.0', 24, 300), # overlap with R9K-02 (tag=100)
('10.111.103.0', 24, 100), # overlap with R9K-01, R9K-04 (same tag)
],
},
}
# ──────────────────────────────────────────────────────────────────────
# Route-policy (RPL text blob)
# ──────────────────────────────────────────────────────────────────────
ROUTE_POLICY_NAME = 'REDIST-TO-BGP'
ROUTE_POLICY_BODY = """\
route-policy REDIST-TO-BGP
if tag is 100 then
set local-preference 200
set med 50
set community (65020:100) additive
pass
elseif tag is 200 then
set local-preference 150
set med 100
set community (65020:200) additive
pass
elseif tag is 300 then
set local-preference 100
set med 200
set community (65020:300) additive
pass
else
set local-preference 100
pass
endif
end-policy
"""
# ──────────────────────────────────────────────────────────────────────
# XML builders
# ──────────────────────────────────────────────────────────────────────
def loopback_xml(name, addr, mask):
"""Create a loopback interface with an IPv4 address."""
return f"""
<config>
<interface-configurations xmlns="{IFMGR_NS}">
<interface-configuration>
<active>act</active>
<interface-name>{name}</interface-name>
<interface-virtual/>
<ipv4-network xmlns="{IPV4IO_NS}">
<addresses>
<primary>
<address>{addr}</address>
<netmask>{mask}</netmask>
</primary>
</addresses>
</ipv4-network>
</interface-configuration>
</interface-configurations>
</config>
"""
def static_route_xml(prefix, prefix_len, tag):
"""Create a static route to Null0 with a tag."""
return f"""
<config>
<router-static xmlns="{STATIC_NS}">
<default-vrf>
<address-family>
<vrfipv4>
<vrf-unicast>
<vrf-prefixes>
<vrf-prefix>
<prefix>{prefix}</prefix>
<prefix-length>{prefix_len}</prefix-length>
<vrf-route>
<vrf-next-hop-table>
<vrf-next-hop-interface-name>
<interface-name>Null0</interface-name>
<tag>{tag}</tag>
</vrf-next-hop-interface-name>
</vrf-next-hop-table>
</vrf-route>
</vrf-prefix>
</vrf-prefixes>
</vrf-unicast>
</vrfipv4>
</address-family>
</default-vrf>
</router-static>
</config>
"""
def route_policy_xml(name, body):
"""Create/replace a route-policy (RPL text blob)."""
return f"""
<config>
<routing-policy xmlns="{RPL_NS}">
<route-policies>
<route-policy>
<route-policy-name>{name}</route-policy-name>
<rpl-route-policy>{body}</rpl-route-policy>
</route-policy>
</route-policies>
</routing-policy>
</config>
"""
def isis_passive_xml(intf_name):
"""Add a loopback to IS-IS instance 1 (passive by default for loopbacks)."""
return f"""
<config>
<isis xmlns="{ISIS_NS}">
<instances>
<instance>
<instance-name>1</instance-name>
<interfaces>
<interface>
<interface-name>{intf_name}</interface-name>
<running/>
<interface-afs>
<interface-af>
<af-name>ipv4</af-name>
<saf-name>unicast</saf-name>
<interface-af-data/>
</interface-af>
</interface-afs>
</interface>
</interfaces>
</instance>
</instances>
</isis>
</config>
"""
def bgp_redistribute_xml():
"""Configure redistribute connected + static with REDIST-TO-BGP policy."""
return f"""
<config>
<bgp xmlns="{BGP_NS}">
<instance>
<instance-name>default</instance-name>
<instance-as>
<as>0</as>
<four-byte-as>
<as>65020</as>
<bgp-running/>
<default-vrf>
<global>
<global-afs>
<global-af>
<af-name>ipv4-unicast</af-name>
<enable/>
<connected-routes>
<route-policy-name>{ROUTE_POLICY_NAME}</route-policy-name>
</connected-routes>
<static-routes>
<route-policy-name>{ROUTE_POLICY_NAME}</route-policy-name>
</static-routes>
</global-af>
</global-afs>
</global>
</default-vrf>
</four-byte-as>
</instance-as>
</instance>
</bgp>
</config>
"""
# ──────────────────────────────────────────────────────────────────────
# Rollback XML builders (delete operations)
# ──────────────────────────────────────────────────────────────────────
NC_NS = 'urn:ietf:params:xml:ns:netconf:base:1.0'
def delete_loopback_xml(name):
return f"""
<config>
<interface-configurations xmlns="{IFMGR_NS}">
<interface-configuration xmlns:nc="{NC_NS}" nc:operation="delete">
<active>act</active>
<interface-name>{name}</interface-name>
</interface-configuration>
</interface-configurations>
</config>
"""
def delete_static_route_xml(prefix, prefix_len):
return f"""
<config>
<router-static xmlns="{STATIC_NS}">
<default-vrf>
<address-family>
<vrfipv4>
<vrf-unicast>
<vrf-prefixes>
<vrf-prefix xmlns:nc="{NC_NS}" nc:operation="delete">
<prefix>{prefix}</prefix>
<prefix-length>{prefix_len}</prefix-length>
</vrf-prefix>
</vrf-prefixes>
</vrf-unicast>
</vrfipv4>
</address-family>
</default-vrf>
</router-static>
</config>
"""
def delete_bgp_redistribute_xml():
return f"""
<config>
<bgp xmlns="{BGP_NS}">
<instance>
<instance-name>default</instance-name>
<instance-as>
<as>0</as>
<four-byte-as>
<as>65020</as>
<bgp-running/>
<default-vrf>
<global>
<global-afs>
<global-af>
<af-name>ipv4-unicast</af-name>
<enable/>
<connected-routes xmlns:nc="{NC_NS}" nc:operation="delete"/>
<static-routes xmlns:nc="{NC_NS}" nc:operation="delete"/>
</global-af>
</global-afs>
</global>
</default-vrf>
</four-byte-as>
</instance-as>
</instance>
</bgp>
</config>
"""
def delete_isis_interface_xml(intf_name):
return f"""
<config>
<isis xmlns="{ISIS_NS}">
<instances>
<instance>
<instance-name>1</instance-name>
<interfaces>
<interface xmlns:nc="{NC_NS}" nc:operation="delete">
<interface-name>{intf_name}</interface-name>
</interface>
</interfaces>
</instance>
</instances>
</isis>
</config>
"""
def delete_route_policy_xml(name):
return f"""
<config>
<routing-policy xmlns="{RPL_NS}">
<route-policies>
<route-policy xmlns:nc="{NC_NS}" nc:operation="delete">
<route-policy-name>{name}</route-policy-name>
</route-policy>
</route-policies>
</routing-policy>
</config>
"""
# ──────────────────────────────────────────────────────────────────────
# Configuration functions
# ──────────────────────────────────────────────────────────────────────
def nc_connect(mgmt_ip):
"""Open NETCONF session."""
return manager.connect(
host=mgmt_ip,
port=830,
username='webui',
password='cisco',
hostkey_verify=False,
device_params={'name': 'iosxr'},
timeout=30,
)
def configure_router(label, cfg):
"""Apply full route-diversity config to a single router."""
mgmt_ip = cfg['mgmt']
print(f"\n{'─'*60}")
print(f" Configuring {label} ({mgmt_ip})")
print(f"{'─'*60}")
lb_names = [lb[0] for lb in cfg['loopbacks']]
static_prefixes = [f"{s[0]}/{s[1]}" for s in cfg['statics']]
print(f" Loopbacks: {', '.join(lb_names)}")
print(f" Statics: {', '.join(static_prefixes)}")
try:
with nc_connect(mgmt_ip) as m:
# Phase 1: Route-policy (must exist before BGP references it)
print(f" → Creating route-policy {ROUTE_POLICY_NAME}...")
m.edit_config(target='candidate', config=route_policy_xml(ROUTE_POLICY_NAME, ROUTE_POLICY_BODY))
# Phase 2: Loopback interfaces
for name, addr, mask in cfg['loopbacks']:
print(f" → Creating {name} ({addr})...")
m.edit_config(target='candidate', config=loopback_xml(name, addr, mask))
# Phase 3: Static routes
for prefix, plen, tag in cfg['statics']:
print(f" → Static {prefix}/{plen} → Null0 tag={tag}...")
m.edit_config(target='candidate', config=static_route_xml(prefix, plen, tag))
# Phase 4: IS-IS passive on new loopbacks
for name, _, _ in cfg['loopbacks']:
print(f" → IS-IS passive: {name}...")
m.edit_config(target='candidate', config=isis_passive_xml(name))
# Phase 5: BGP redistribution
print(f" → BGP redistribute connected + static...")
m.edit_config(target='candidate', config=bgp_redistribute_xml())
# Phase 6: Commit
print(f" → Committing...")
m.commit()
print(f" ✓ {label} done.")
return True
except Exception as e:
print(f" ✗ ERROR on {label}: {e}")
return False
def rollback_router(label, cfg):
"""Remove all route-diversity config from a single router."""
mgmt_ip = cfg['mgmt']
print(f"\n{'─'*60}")
print(f" Rolling back {label} ({mgmt_ip})")
print(f"{'─'*60}")
try:
with nc_connect(mgmt_ip) as m:
# Remove BGP redistribution first (references the policy)
print(f" → Removing BGP redistribute...")
try:
m.edit_config(target='candidate', config=delete_bgp_redistribute_xml())
except Exception as e:
print(f" (skip — may not exist: {e})")
# Remove IS-IS interfaces
for name, _, _ in cfg['loopbacks']:
print(f" → Removing IS-IS interface {name}...")
try:
m.edit_config(target='candidate', config=delete_isis_interface_xml(name))
except Exception as e:
print(f" (skip: {e})")
# Remove static routes
for prefix, plen, _ in cfg['statics']:
print(f" → Removing static {prefix}/{plen}...")
try:
m.edit_config(target='candidate', config=delete_static_route_xml(prefix, plen))
except Exception as e:
print(f" (skip: {e})")
# Remove loopbacks
for name, _, _ in cfg['loopbacks']:
print(f" → Removing {name}...")
try:
m.edit_config(target='candidate', config=delete_loopback_xml(name))
except Exception as e:
print(f" (skip: {e})")
# Remove route-policy
print(f" → Removing route-policy {ROUTE_POLICY_NAME}...")
try:
m.edit_config(target='candidate', config=delete_route_policy_xml(ROUTE_POLICY_NAME))
except Exception as e:
print(f" (skip: {e})")
print(f" → Committing rollback...")
m.commit()
print(f" ✓ {label} rolled back.")
return True
except Exception as e:
print(f" ✗ ERROR rolling back {label}: {e}")
return False
def verify_router(label, cfg):
"""Check if route-diversity config is present on a router."""
mgmt_ip = cfg['mgmt']
try:
with nc_connect(mgmt_ip) as m:
# Check loopbacks
filt_intf = f"""<filter>
<interface-configurations xmlns="{IFMGR_NS}"/>
</filter>"""
r_intf = str(m.get_config(source='running', filter=filt_intf))
found_lbs = []
for name, _, _ in cfg['loopbacks']:
if name in r_intf:
found_lbs.append(name)
# Check route-policy
filt_rpl = f"""<filter>
<routing-policy xmlns="{RPL_NS}"/>
</filter>"""
r_rpl = str(m.get_config(source='running', filter=filt_rpl))
has_policy = ROUTE_POLICY_NAME in r_rpl
# Check BGP redistribute
filt_bgp = f"""<filter>
<bgp xmlns="{BGP_NS}">
<instance><instance-name>default</instance-name></instance>
</bgp>
</filter>"""
r_bgp = str(m.get_config(source='running', filter=filt_bgp))
has_redist_connected = 'connected-routes' in r_bgp and ROUTE_POLICY_NAME in r_bgp
has_redist_static = 'static-routes' in r_bgp and ROUTE_POLICY_NAME in r_bgp
total_lbs = len(cfg['loopbacks'])
lb_str = f"{len(found_lbs)}/{total_lbs}"
pol = '✓' if has_policy else '✗'
rc = '✓' if has_redist_connected else '✗'
rs = '✓' if has_redist_static else '✗'
ok = len(found_lbs) == total_lbs and has_policy and has_redist_connected and has_redist_static
status = 'OK' if ok else 'INCOMPLETE'
print(f" {label:8s} LBs={lb_str:5s} Policy={pol} Redist-C={rc} Redist-S={rs} [{status}]")
except Exception as e:
print(f" {label:8s} verify error: {e}")
# ──────────────────────────────────────────────────────────────────────
# Main
# ──────────────────────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description='Route Diversity Configuration for RR Diff Analysis')
parser.add_argument('--verify-only', action='store_true', help='Only verify current state')
parser.add_argument('--rollback', action='store_true', help='Remove all added config')
args = parser.parse_args()
print("Route Diversity Configuration Script")
print("=" * 60)
print(f"Targets: {len(ROUTERS)} routers ({', '.join(ROUTERS.keys())})")
print()
if args.verify_only:
print("Verify-only mode")
print('-' * 60)
print(f" {'Router':8s} {'LBs':5s} {'Policy':6s} {'Redist-C':8s} {'Redist-S':8s} Status")
for label, cfg in ROUTERS.items():
verify_router(label, cfg)
return
if args.rollback:
print("ROLLBACK mode — removing all route-diversity config")
print('-' * 60)
results = []
for label, cfg in ROUTERS.items():
ok = rollback_router(label, cfg)
results.append((label, ok))
failed = [l for l, ok in results if not ok]
print()
if failed:
print(f"FAILED rollback: {', '.join(failed)}")
sys.exit(1)
else:
print("All routers rolled back successfully.")
return
# Apply mode
results = []
for label, cfg in ROUTERS.items():
ok = configure_router(label, cfg)
results.append((label, ok))
# Post-apply verification
print(f"\n{'='*60}")
print("Post-apply verification")
print('=' * 60)
print(f" {'Router':8s} {'LBs':5s} {'Policy':6s} {'Redist-C':8s} {'Redist-S':8s} Status")
for label, cfg in ROUTERS.items():
verify_router(label, cfg)
failed = [l for l, ok in results if not ok]
print()
if failed:
print(f"FAILED: {', '.join(failed)}")
sys.exit(1)
else:
total_lbs = sum(len(c['loopbacks']) for c in ROUTERS.values())
total_statics = sum(len(c['statics']) for c in ROUTERS.values())
print(f"All routers configured successfully.")
print(f" {total_lbs} loopbacks + {total_statics} static routes created")
print()
print("Wait ~60s for BGP convergence and BMP collection, then verify:")
print()
print(" # Check new prefixes in OpenBMP")
print(" docker exec -i obmp-psql psql -U openbmp -d openbmp -c \\")
print(" \"SELECT prefix::text, COUNT(*) FROM ip_rib")
print(" WHERE (prefix::text LIKE '10.110.%' OR prefix::text LIKE '10.111.%')")
print(" AND iswithdrawn = false GROUP BY prefix ORDER BY prefix;\"")
if __name__ == '__main__':
main()
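
The verification pass above matches loopback names by substring against the stringified reply, so a name such as `Loopback11` would also match `Loopback110`. A minimal sketch of an exact-match alternative, assuming the reply text parses as XML and that interface names sit in `interface-name` elements as in the builders above. `loopbacks_present` and the sample reply are hypothetical and not part of the script:

```python
import xml.etree.ElementTree as ET


def loopbacks_present(reply_xml, expected_names):
    """Return the expected interface names that appear exactly in the reply.

    Hypothetical helper: compares whole <interface-name> element text rather
    than searching the raw string, and ignores XML namespaces.
    """
    root = ET.fromstring(reply_xml)
    # '{*}' matches the element in any namespace (Python 3.8+).
    configured = {
        el.text.strip()
        for el in root.iterfind('.//{*}interface-name')
        if el.text
    }
    return [name for name in expected_names if name in configured]


# Illustrative reply, not captured from a real router; the namespace is a placeholder.
SAMPLE_REPLY = """
<data>
  <interface-configurations xmlns="http://example.invalid/ifmgr">
    <interface-configuration>
      <active>act</active>
      <interface-name>Loopback110</interface-name>
    </interface-configuration>
  </interface-configurations>
</data>
"""

print(loopbacks_present(SAMPLE_REPLY, ['Loopback11', 'Loopback110']))
```

Only `Loopback110` is reported here, whereas a plain substring test would count both names as present.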

@ -363,10 +363,178 @@ _HIJACK_ROUTES = [
]
# ---------------------------------------------------------------------------
# Scenario: te_community_steering
# Routes tagged with TE communities representing different "colors" for
# community-based TE policy steering. Shows how communities drive path
# selection when routers apply route-policy based on community values.
# ---------------------------------------------------------------------------
_TE_COMMUNITY_ROUTES = [
# Red paths (community 65020:100) — high-priority, low-latency
_r('10.210.0.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.1.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.2.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.3.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.4.0/24', [65100, 65020], communities=['65020:100'], med=10),
# Blue paths (community 65020:200) — bulk transfer, cost-optimized
_r('10.220.0.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.1.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.2.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.3.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.4.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
# Green paths (community 65020:300) — backup/diverse paths
_r('10.230.0.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.1.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.2.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.3.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.4.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
]
# ---------------------------------------------------------------------------
# Scenario: origin_shift
# Simulates an origin AS change: prefixes initially associated with
# well-known origin ASNs are re-announced with a different origin.
# Use: load internet_sample first, then load origin_shift to see the
# origin_as column change in ip_rib_log (visible on Anomaly dashboard).
# ---------------------------------------------------------------------------
_ORIGIN_SHIFT_ROUTES = [
# These prefixes overlap with internet_sample but have different origin ASNs
_r('8.8.8.0/24', [65100, 64999], communities=['65100:origin-shift']), # was 15169 (Google)
_r('1.1.1.0/24', [65100, 64998], communities=['65100:origin-shift']), # was 13335 (Cloudflare)
_r('9.9.9.0/24', [65100, 64997], communities=['65100:origin-shift']), # was 19281 (Quad9)
_r('208.67.222.0/24', [65100, 64996], communities=['65100:origin-shift']), # was 36692 (OpenDNS)
_r('156.154.70.0/24', [65100, 64995], communities=['65100:origin-shift']), # was 19318 (Neustar)
]
# ---------------------------------------------------------------------------
# Scenario: path_diversity
# Multiple announcements of the same prefix with different AS paths,
# MEDs, and communities. Demonstrates best-path selection:
# - Shorter AS path wins (unless local-pref overrides)
# - Lower MED preferred among paths from same neighbor AS
# - Communities tag paths for policy identification
# ---------------------------------------------------------------------------
_PATH_DIVERSITY_ROUTES = [
# Prefix 1: 3 paths with varying length and MED
_r('10.250.0.0/24', [65100, 174], communities=['65100:path-a'], med=50),
_r('10.250.0.0/24', [65100, 174, 3356], communities=['65100:path-b'], med=100),
_r('10.250.0.0/24', [65100, 174, 3356, 15169], communities=['65100:path-c'], med=150),
# Prefix 2: paths with same length but different MED
_r('10.250.1.0/24', [65100, 1299, 15169], communities=['65100:low-med'], med=10),
_r('10.250.1.0/24', [65100, 3356, 15169], communities=['65100:high-med'], med=500),
# Prefix 3: local-pref override (higher local-pref wins over shorter path)
_r('10.250.2.0/24', [65100, 2914], communities=['65100:low-lp'], local_pref=50),
_r('10.250.2.0/24', [65100, 2914, 7018], communities=['65100:high-lp'], local_pref=200),
# Prefix 4: transit diversity
_r('10.250.3.0/24', [65100, 174, 32934], communities=['65100:via-cogent']),
_r('10.250.3.0/24', [65100, 3356, 32934], communities=['65100:via-lumen']),
_r('10.250.3.0/24', [65100, 2914, 32934], communities=['65100:via-ntt']),
]
# ---------------------------------------------------------------------------
# Registry
# ---------------------------------------------------------------------------
# ---------------------------------------------------------------------------
# Full Internet Table Generator
# Generates realistic-looking IPv4 prefixes across the routable address space
# with varied AS paths, prefix lengths, origins, and communities.
# Configurable count: 10K (quick test) to 900K+ (full table stress test).
# ---------------------------------------------------------------------------
# Well-known transit ASNs for realistic path construction
_TRANSIT_ASNS = [174, 701, 1299, 2914, 3257, 3356, 6461, 6762, 7018, 3491, 5400, 1239]
# Realistic origin ASNs (mix of large providers and small networks)
_ORIGIN_POOL = [
13335, 15169, 16509, 8075, 20940, 32934, 714, 54113, 13414, 7922,
36459, 46489, 14618, 16276, 24940, 47541, 35916, 49981, 9808, 4134,
4837, 9121, 12322, 3320, 6830, 5511, 1273, 6939, 4766, 9318,
23693, 38001, 45102, 58453, 10026, 18881, 28573, 7738, 26599, 8151,
11888, 17676, 4713, 7545, 9299, 50304, 51167, 60068, 41095, 34984,
]
# IANA-allocated first octets for routable IPv4 (subset for realism)
_ROUTABLE_FIRST_OCTETS = list(range(1, 56)) + list(range(57, 127)) + list(range(128, 224))
def generate_full_internet(count=900000):
"""Generate a realistic full IPv4 routing table.
Distributes prefixes across the IPv4 address space with realistic
prefix lengths (/8 through /24) and varied AS paths.
Args:
count: Number of prefixes to generate (default 900K).
Returns:
List of route dicts.
"""
import random
rng = random.Random(42) # deterministic for reproducibility
routes = []
generated = set()
# Prefix length distribution (approximates real DFZ):
# /24: ~55%, /23: ~8%, /22: ~7%, /21: ~5%, /20: ~5%,
# /19: ~4%, /18: ~3%, /17: ~2%, /16: ~5%, /15-/8: ~6%
prefix_len_weights = {
24: 55, 23: 8, 22: 7, 21: 5, 20: 5,
19: 4, 18: 3, 17: 2, 16: 5, 15: 2,
14: 1, 13: 1, 12: 1, 11: 0.5, 10: 0.3,
9: 0.1, 8: 0.1,
}
plen_choices = list(prefix_len_weights.keys())
plen_weights = list(prefix_len_weights.values())
# AS path length distribution: 1-hop: 5%, 2-hop: 30%, 3-hop: 40%, 4-hop: 20%, 5-hop: 5%
path_len_weights = [5, 30, 40, 20, 5]
while len(routes) < count:
# Pick a routable first octet uniformly at random (rng.choice applies no weighting)
first = rng.choice(_ROUTABLE_FIRST_OCTETS)
plen = rng.choices(plen_choices, weights=plen_weights, k=1)[0]
# Generate random prefix within this /8
if plen <= 8:
prefix = f'{first}.0.0.0/{plen}'
elif plen <= 16:
second = rng.randint(0, 255) & (0xFF << (16 - plen))
prefix = f'{first}.{second}.0.0/{plen}'
elif plen <= 24:
second = rng.randint(0, 255)
third = rng.randint(0, 255) & (0xFF << (24 - plen))
prefix = f'{first}.{second}.{third}.0/{plen}'
else:
continue
if prefix in generated:
continue
generated.add(prefix)
# Build realistic AS path
path_len = rng.choices([1, 2, 3, 4, 5], weights=path_len_weights, k=1)[0]
origin = rng.choice(_ORIGIN_POOL) if rng.random() < 0.3 else (64512 + rng.randint(0, 65535 - 64512))
transits = rng.sample(_TRANSIT_ASNS, min(path_len - 1, len(_TRANSIT_ASNS)))
as_path = [65100] + transits[:path_len - 1] + [origin]
# Occasionally add communities (~20% of routes)
communities = []
if rng.random() < 0.2:
communities.append(f'65100:{rng.choice([100, 200, 300, 400, 500])}')
routes.append(_r(prefix, as_path, communities=communities or None))
return routes
SCENARIOS = {
'internet_sample': {
'description': 'Partial internet table (~80 IPv4 + 14 IPv6 prefixes with realistic AS paths)',
@ -404,4 +572,16 @@ SCENARIOS = {
'description': '10 prefixes announced as if directly originated by AS 65100 — simulates a prefix hijack (community 65100:hijack)',
'routes': _HIJACK_ROUTES,
},
'te_community_steering': {
'description': 'Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green)',
'routes': _TE_COMMUNITY_ROUTES,
},
'origin_shift': {
'description': '5 prefixes with changed origin AS — simulates origin migration/hijack for anomaly detection',
'routes': _ORIGIN_SHIFT_ROUTES,
},
'path_diversity': {
'description': 'Same prefixes with different AS paths and MEDs — demonstrates best-path selection and path diversity',
'routes': _PATH_DIVERSITY_ROUTES,
},
}
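
The masking in `generate_full_internet` clears the host bits of the second or third octet so that each generated string is a valid network address. A minimal sketch of that step in isolation, checked with the standard `ipaddress` module. `aligned_prefix` is a hypothetical helper that mirrors the generator's arithmetic and is not part of the module:

```python
import ipaddress


def aligned_prefix(first, second, third, plen):
    """Build a network-aligned IPv4 prefix string for /8 through /24."""
    if not 8 <= plen <= 24:
        raise ValueError('sketch covers /8 through /24 only')
    if plen == 8:
        return f'{first}.0.0.0/8'
    if plen <= 16:
        # Keep the top (plen - 8) bits of the second octet.
        second &= 0xFF << (16 - plen)
        return f'{first}.{second}.0.0/{plen}'
    # Keep the top (plen - 16) bits of the third octet.
    third &= 0xFF << (24 - plen)
    return f'{first}.{second}.{third}.0/{plen}'


for plen in (8, 12, 16, 19, 24):
    prefix = aligned_prefix(203, 77, 213, plen)
    # strict=True raises ValueError if any host bit is set.
    ipaddress.ip_network(prefix, strict=True)
    print(prefix)
```

The loop completes without raising, which is the property the generator relies on when it emits prefixes without validating them.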

@ -3,16 +3,23 @@ set -e
LOCAL_IP=${EXABGP_LOCAL_IP:-10.40.40.202}
LOCAL_AS=${EXABGP_LOCAL_AS:-65100}
PEER_AS=${EXABGP_PEER_AS:-65020}
PEER_1=${EXABGP_PEER_1:-10.100.0.100}
PEER_2=${EXABGP_PEER_2:-10.100.0.200}
API_PORT=${EXABGP_API_PORT:-5050}
# Peer list — ";"-separated entries of "ip:peer_as:description".
# Default reproduces the original single-lab (AS 65020) config.
EXABGP_PEERS=${EXABGP_PEERS:-10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02}
echo "================================================================"
echo " ExaBGP Route Injector"
echo " Local: ${LOCAL_IP} AS${LOCAL_AS}"
echo " Peers: ${PEER_1}, ${PEER_2} (AS${PEER_AS})"
echo " API: http://0.0.0.0:${API_PORT}"
echo " Peers:"
IFS=';' read -ra PEER_ENTRIES <<< "$EXABGP_PEERS"
for entry in "${PEER_ENTRIES[@]}"; do
[ -z "$entry" ] && continue
IFS=':' read -r p_ip p_as p_desc <<< "$entry"
echo " - ${p_ip} AS${p_as} (${p_desc})"
done
echo "================================================================"
# Generate ExaBGP 5.x env file — ExaBGP looks here based on pip install prefix
@ -22,41 +29,30 @@ sed -i 's/drop = true/drop = false/' /usr/local/etc/exabgp/exabgp.env
sed -i 's/cli = true/cli = false/' /usr/local/etc/exabgp/exabgp.env
sed -i "s/destination = 'stdout'/destination = 'stderr'/" /usr/local/etc/exabgp/exabgp.env
# Generate exabgp.conf from environment
# Generate exabgp.conf — one neighbor block per peer-list entry
cat > /tmp/exabgp.conf << EOF
process api {
run /usr/local/bin/python3 /exabgp/api/server.py;
encoder text;
}
EOF
neighbor ${PEER_1} {
for entry in "${PEER_ENTRIES[@]}"; do
[ -z "$entry" ] && continue
IFS=':' read -r p_ip p_as p_desc <<< "$entry"
cat >> /tmp/exabgp.conf << EOF
neighbor ${p_ip} {
router-id ${LOCAL_IP};
local-address ${LOCAL_IP};
local-as ${LOCAL_AS};
peer-as ${PEER_AS};
description "CML-R9K-CORE-01";
hold-time 90;
family {
ipv4 unicast;
}
api {
processes [ api ];
neighbor-changes;
}
}
neighbor ${PEER_2} {
router-id ${LOCAL_IP};
local-address ${LOCAL_IP};
local-as ${LOCAL_AS};
peer-as ${PEER_AS};
description "CML-R9K-CORE-02";
peer-as ${p_as};
description "${p_desc}";
hold-time 90;
family {
ipv4 unicast;
ipv6 unicast;
}
api {
@ -65,5 +61,6 @@ neighbor ${PEER_2} {
}
}
EOF
done
exec exabgp server /tmp/exabgp.conf
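
The peer list format used above is a `;`-separated sequence of `ip:peer_as:description` entries. A minimal Python sketch of the same parsing, for illustration only. `parse_peers` is a hypothetical helper; the entrypoint itself stays in shell:

```python
def parse_peers(spec):
    """Split an EXABGP_PEERS-style string into (ip, peer_as, description) tuples."""
    peers = []
    for entry in spec.split(';'):
        if not entry:
            continue  # tolerate empty entries such as a trailing ';'
        # Split on the first two ':' only, so the description may contain ':'.
        ip, peer_as, desc = entry.split(':', 2)
        peers.append((ip, int(peer_as), desc))
    return peers


DEFAULT = '10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02'
for ip, peer_as, desc in parse_peers(DEFAULT):
    print(f'{ip} AS{peer_as} ({desc})')
```

Because `:` is the field separator, an IPv6 peer address would not survive this format as written. That is an observation about the format, not something the diff addresses.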

gnmi/gnmi_grpc_config.py Normal file

@ -0,0 +1,158 @@
#!/usr/bin/env python3
"""
gNMI gRPC Configuration Script
===============================
Enables gRPC dial-in telemetry on all 9 IOS-XR routers so that
Telegraf (or any gNMI collector) can subscribe to streaming
telemetry data.
What this script applies per router:
- gRPC server on port 57400 with TLS disabled
Uses SSH/CLI (paramiko) instead of NETCONF because IOS-XR 24.3.1
rejects the NETCONF edit-config for gRPC with "Need to enable GRPC first".
Router targets:
CORE-01 (10.100.0.100)
CORE-02 (10.100.0.200)
R9K-01 (10.100.0.1) through R9K-07 (10.100.0.7)
"""
import paramiko
import time
import sys
ROUTERS = [
('10.100.0.100', 'CORE-01'),
('10.100.0.200', 'CORE-02'),
('10.100.0.1', 'R9K-01'),
('10.100.0.2', 'R9K-02'),
('10.100.0.3', 'R9K-03'),
('10.100.0.4', 'R9K-04'),
('10.100.0.5', 'R9K-05'),
('10.100.0.6', 'R9K-06'),
('10.100.0.7', 'R9K-07'),
]
USERNAME = 'webui'
PASSWORD = 'cisco'
GRPC_PORT = 57400
CONFIG_COMMANDS = [
'configure terminal',
'grpc',
f'port {GRPC_PORT}',
'no-tls',
'commit',
'end',
]
def configure_router(mgmt_ip, label):
"""Apply gRPC configuration via SSH CLI."""
print(f"\n{'─'*60}")
print(f" Configuring {label} ({mgmt_ip})")
print(f"{'─'*60}")
print(f" Applying: gRPC port={GRPC_PORT} no-tls")
try:
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(mgmt_ip, username=USERNAME, password=PASSWORD, timeout=10)
shell = client.invoke_shell()
time.sleep(1)
shell.recv(65535) # clear banner
output = ''  # collect output from every command so a failed commit is not missed
for cmd in CONFIG_COMMANDS:
shell.send(cmd + '\n')
time.sleep(1.5)
output += shell.recv(65535).decode()
client.close()
if 'error' in output.lower() or 'fail' in output.lower():
print(f" ✗ ERROR on {label}: {output.strip()}")
return False
print(f" ✓ {label} done.")
return True
except Exception as e:
print(f" ✗ ERROR on {label}: {e}")
return False
def verify_router(mgmt_ip, label):
"""Verify gRPC configuration via SSH."""
try:
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(mgmt_ip, username=USERNAME, password=PASSWORD, timeout=10)
shell = client.invoke_shell()
time.sleep(1)
shell.recv(65535)
shell.send('show running-config grpc\n')
time.sleep(3)
output = shell.recv(65535).decode()
client.close()
has_port = f'port {GRPC_PORT}' in output
has_notls = 'no-tls' in output
p = '✓' if has_port else '✗'
t = '✓' if has_notls else '✗'
status = 'OK' if (has_port and has_notls) else 'INCOMPLETE'
print(f" {label:8s} grpc-port={p} no-tls={t} [{status}]")
return has_port and has_notls
except Exception as e:
print(f" {label:8s} verify error: {e}")
return False
def main():
print("gNMI gRPC Configuration Script")
print("================================")
print(f"Targets: all {len(ROUTERS)} routers")
print()
results = []
for mgmt_ip, label in ROUTERS:
ok = configure_router(mgmt_ip, label)
results.append((mgmt_ip, label, ok))
# Verification pass
print(f"\n{'='*60}")
print("Post-apply verification")
print('='*60)
print(f" {'Router':8s} {'gRPC Port':9s} {'No-TLS':6s} Status")
all_ok = True
for mgmt_ip, label, applied_ok in results:
if applied_ok:
if not verify_router(mgmt_ip, label):
all_ok = False
else:
print(f" {label:8s} skipped (apply failed)")
all_ok = False
failed = [label for _, label, ok in results if not ok]
print()
if failed:
print(f"FAILED: {', '.join(failed)}")
sys.exit(1)
elif all_ok:
print("All routers configured successfully.")
print()
print(f"gRPC is now listening on port {GRPC_PORT} (no TLS) on all routers.")
print("Next: start Telegraf with gNMI input plugin to begin collecting telemetry.")
else:
print("Some routers may have incomplete configuration. Check output above.")
sys.exit(1)
if __name__ == '__main__':
main()
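
After the CLI verification, a TCP connect to the gRPC port is a cheap extra check that the listener is reachable from the collector host. A minimal sketch, assuming plain TCP reachability is a sufficient signal (it does not exercise gNMI itself). `grpc_port_open` is a hypothetical helper, not part of the script:

```python
import socket


def grpc_port_open(host, port=57400, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

It could be called once per management address in `ROUTERS`, for example `grpc_port_open('10.100.0.100')`, after the verification pass reports OK.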

gobgp-evpn/README.md Normal file

@ -0,0 +1,45 @@
# gobgp-evpn — modular EVPN test-route injector
A **profile-gated, non-production** GoBGP instance for exercising the EVPN
ingestion pipeline (roadmap E5). The CML IOS-XR lab cannot originate EVPN
routes, so this container synthesises them.
## What it does
`gobgp-evpn` runs GoBGP with no BGP peers, BMP-exporting its local RIB
(`route-monitoring-policy = local-rib`) to the OpenBMP collector. Routes
injected with `inject-evpn.sh` are parsed by the collector and published to
the `openbmp.parsed.evpn` Kafka topic, where the EVPN consumer picks them up
and writes the `evpn_rib` table.
## Usage
```sh
# start the injector (not started by a normal `docker compose up`)
docker compose --profile evpn-test up -d gobgp-evpn
# push synthetic type-2 / type-3 / type-5 EVPN routes
bash gobgp-evpn/inject-evpn.sh
# inspect what GoBGP holds
docker exec obmp-gobgp-evpn gobgp global rib -a evpn
# stop it when done testing
docker compose --profile evpn-test stop gobgp-evpn
```
## Notes
- Local AS 65010, router-id 10.40.40.251 — distinct from the production
`gobgp` global-table feed (AS 65001).
- It is *not* part of the default stack: the `evpn-test` Compose profile
keeps it out of production and lets it be started/stopped on demand.
## Collector type-5 limitation
The OpenBMP collector 2.2.3 parses EVPN **type-2 (MAC/IP)** and **type-3
(inclusive multicast)** cleanly, but **mis-decodes type-5 (IP-prefix)**: the
IP prefix bleeds into the RD field on the `openbmp.parsed.evpn` topic
(observed garbage RDs such as `6154:3523870730`). `inject-evpn.sh` therefore
injects only type-2 and type-3. Full type-5 support needs a newer collector
or the GoBMP path — see `docs/ROADMAP.md` Track E (E5).
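
One way to notice the mis-decode is to compare the RDs seen downstream with the RDs the injector actually originates. A minimal sketch, assuming the observed RD strings have already been collected; how to read them from the Kafka topic or `evpn_rib` is left out because the field names are not given here. `unexpected_rds` is a hypothetical helper:

```python
# RDs used by inject-evpn.sh for its type-2 and type-3 routes.
EXPECTED_RDS = {'65010:100', '65010:200'}


def unexpected_rds(observed):
    """Return observed RD strings that the injector never originated, sorted."""
    return sorted(set(observed) - EXPECTED_RDS)


# '6154:3523870730' is the garbage RD quoted above; the list itself is illustrative.
print(unexpected_rds(['65010:100', '6154:3523870730', '65010:200']))
```

A format check alone would not catch this case, since the garbage value still has the shape of an `ASN:number` RD.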

@ -0,0 +1,29 @@
# GoBGP -- modular EVPN test-route injector (roadmap E5)
#
# A profile-gated, throwaway GoBGP instance whose only job is to originate
# synthetic BGP EVPN routes and BMP-export them to the OpenBMP collector, so
# the EVPN ingestion pipeline (collector -> Kafka openbmp.parsed.evpn ->
# evpn-consumer -> evpn_rib) can be exercised. NOT a production component --
# start it only when testing:
# docker compose --profile evpn-test up -d gobgp-evpn
# bash gobgp-evpn/inject-evpn.sh
#
# It has no BGP peers; routes are injected straight into the local RIB, so
# BMP export uses route-monitoring-policy = local-rib.
[global]
[global.config]
as = 65010
router-id = "10.40.40.251"
# No inbound BGP listener -- we only originate locally and BMP-export.
port = -1
# --- BMP export to the OpenBMP collector ------------------------------------
[[bmp-servers]]
[bmp-servers.config]
address = "__HOST_IP__"
port = 5000
# local-rib: the injected EVPN routes live in the loc-rib (there are no
# BGP peers / no adj-rib-in), so export the local RIB.
route-monitoring-policy = "local-rib"
statistics-timeout = 3600

gobgp-evpn/inject-evpn.sh Executable file

@ -0,0 +1,34 @@
#!/usr/bin/env bash
#
# inject-evpn.sh -- push synthetic BGP EVPN routes into the gobgp-evpn
# instance so the EVPN ingestion pipeline can be tested end to end.
#
# Run from the docker host after starting the injector:
# docker compose --profile evpn-test up -d gobgp-evpn
# bash gobgp-evpn/inject-evpn.sh
#
# Routes land in gobgp-evpn's local RIB and are BMP-exported to the collector
# (route-monitoring-policy = local-rib), parsed onto the openbmp.parsed.evpn
# Kafka topic. Re-running is harmless (GoBGP de-dupes identical routes).
set -euo pipefail
G=(docker exec obmp-gobgp-evpn gobgp global rib add -a evpn)
echo "Injecting EVPN type-2 (MAC/IP advertisement) routes..."
"${G[@]}" macadv aa:bb:cc:00:00:01 10.200.10.1 etag 100 label 10100 rd 65010:100 rt 65010:100 encap vxlan
"${G[@]}" macadv aa:bb:cc:00:00:02 10.200.10.2 etag 100 label 10100 rd 65010:100 rt 65010:100 encap vxlan
"${G[@]}" macadv aa:bb:cc:00:00:03 10.200.20.1 etag 200 label 10200 rd 65010:200 rt 65010:200 encap vxlan
echo "Injecting EVPN type-3 (inclusive multicast) routes..."
"${G[@]}" multicast 10.40.40.251 etag 100 rd 65010:100 rt 65010:100
"${G[@]}" multicast 10.40.40.251 etag 200 rd 65010:200 rt 65010:200
# NOTE: EVPN type-5 (IP-prefix) routes are intentionally NOT injected.
# The OpenBMP collector 2.2.3 parses type-2 (MAC/IP) and type-3 (multicast)
# cleanly, but mis-decodes the type-5 NLRI — the IP prefix bleeds into the
# RD field (observed RDs like '6154:3523870730'). Type-5 visibility needs a
# newer collector or the GoBMP path — see docs/ROADMAP.md E5.
echo
echo "Current EVPN RIB on gobgp-evpn:"
docker exec obmp-gobgp-evpn gobgp global rib -a evpn

gobgp/README.md Normal file

@ -0,0 +1,162 @@
# GoBGP global Internet table feed (roadmap E1)
This service runs [GoBGP](https://github.com/osrg/gobgp) to pull the **full real
Internet routing table** (IPv4 ~1M + IPv6 ~200k routes) from Łukasz Bromirski's
lab route server (**AS57355**) and BMP-export every received route to the
OpenBMP collector. The table lands in PostgreSQL `ip_rib` as a monitored peer.
- Image: `jauderho/gobgp:v4.5.0` — community-maintained, multi-arch, tracks
upstream GoBGP releases (rebuilt within an hour of each release). Chosen
because the official `osrg/gobgp` image is published less consistently.
- Local AS: **65001** (private). Router-id: `10.40.40.250`.
- The session is **receive-only** — we announce nothing to the route server.
## Files
| File | Purpose |
|------------------|----------------------------------------------------------------|
| `gobgpd.conf` | GoBGP daemon config (global, neighbors, BMP export). TOML. |
| `mrt-refresh.sh` | MRT full-table fallback loader (cron-driven). |
| `mrt/` | Created at runtime; cached RouteViews RIB dumps. |
## Bring it up
The `gobgp` service is defined in the repo `docker-compose.yml`, on the same
default compose network as `collector`, and `depends_on` it.
```sh
docker compose config # validate compose is well-formed
docker compose up -d gobgp # start (collector must be running)
docker logs -f obmp-gobgp
```
> The live BGP cutover is performed by a human — bringing the container up is
> all that is needed; GoBGP initiates the eBGP-multihop sessions automatically.
## Confirm the session and route count
```sh
# session state — expect both neighbors in "Establ"
docker exec obmp-gobgp gobgp neighbor
# received route counts — expect ~1M IPv4, ~200k IPv6
docker exec obmp-gobgp gobgp global rib summary -a ipv4
docker exec obmp-gobgp gobgp global rib summary -a ipv6
```
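
The session check can also be scripted. A minimal sketch that scans saved `gobgp neighbor` output for peers not in `Establ`, assuming the state is the last column before the `|` separator. The sample text below is illustrative, not a capture, and `non_established` is a hypothetical helper:

```python
import ipaddress


def non_established(neighbor_output):
    """Return (peer, state) pairs for peers whose state is not 'Establ'."""
    problems = []
    for line in neighbor_output.splitlines():
        # Assumed layout: peer columns, then '|', then route counters.
        left = line.split('|', 1)[0].split()
        if len(left) < 3:
            continue
        try:
            ipaddress.ip_address(left[0])  # skips the header and other text
        except ValueError:
            continue
        if left[-1] != 'Establ':
            problems.append((left[0], left[-1]))
    return problems


SAMPLE = """\
Peer                   AS  Up/Down State       |#Received  Accepted
85.232.240.179      57355 01:02:03 Establ      |  1000000   1000000
2001:1a68:2c:2::179 57355    never Active      |        0         0
"""

print(non_established(SAMPLE))
```

An empty list means every listed peer is established. The helper takes text rather than calling `docker exec` so it can be tried against saved output.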
## How the data appears in OpenBMP
GoBGP opens an outbound **BMP** session to `obmp-collector:5000` with
`route-monitoring-policy = "pre-policy"` (Adj-RIB-In, pre import-policy —
consistent with the rest of the OpenBMP fleet).
In OpenBMP / PostgreSQL the source is identified by the **BMP router**, which
GoBGP reports using its `router-id` (`10.40.40.250`) and `local-as` (`65001`):
- `routers` table — a row with `ip_address` / name derived from `10.40.40.250`.
- `bgp_peers` table — two peer rows for `85.232.240.179` and
`2001:1a68:2c:2::179`, both `peer_as = 57355`.
- `ip_rib` — every prefix from the global table, attributed to those peers.
To find it in Grafana/SQL, filter on `peer_as = 57355` or the router-id above.
## Fleet-wide full-table feed into the CML lab (stress test)
GoBGP additionally re-advertises the full table to the two CML core routers
(CORE-01/CORE-02, AS65020). As route reflectors the cores propagate it to all
seven R9K clients, so every lab router carries and BMP-exports a full table —
an intentional stress test of the OpenBMP ingestion/storage path (the database
grows toward ~55-65 GB).
- **GoBGP side:** `gobgpd.conf` neighbors `10.100.0.100` / `10.100.0.200`
(peer-as 65020, eBGP-multihop, IPv4+IPv6, `prefix-limit` caps). The
route-server sessions carry `default-export-policy = "reject-route"` so the
lab's own routes can never leak back to AS57355.
- **Router side:** `cml/gobgp_peering_config.py` adds the `neighbor
10.40.40.202` config (with `maximum-prefix 1.5M`/`400k` caps) to both cores.
GoBGP is host-networked, so it sources BGP TCP from the host IP
`10.40.40.202`, not its router-id `10.40.40.250` — the cores peer with the
host IP.
### Apply
```sh
python3 cml/gobgp_peering_config.py # configure both cores
docker compose up -d --force-recreate gobgp # load gobgpd.conf changes
```
> A volume-mounted config change does NOT trigger a recreate on its own —
> `--force-recreate` is required for GoBGP to re-read `gobgpd.conf`.
### Rollback
**Emergency stop** (fastest — feed off within seconds, no router change):
```sh
docker compose stop gobgp
```
Stopping GoBGP drops the eBGP sessions; the cores withdraw the full table and
the withdrawal propagates to every client. The `ip_rib` rows are marked
withdrawn and aged out by the existing TimescaleDB retention.
**Full revert** (also removes the router-side config):
```sh
python3 cml/gobgp_peering_config.py --remove # delete neighbor from cores
docker compose stop gobgp
```
To keep the Bromirski feed running but drop only the lab injection, delete the
two `10.100.0.x` `[[neighbors]]` blocks from `gobgpd.conf` and run
`docker compose up -d --force-recreate gobgp`.
### What to watch during convergence
```sh
docker exec obmp-gobgp gobgp neighbor # 4 sessions Establ
docker logs --tail 20 obmp-psql-app # consumer lag
docker exec obmp-psql psql -U openbmp -d openbmp -c \
"SELECT count(*) FROM ip_rib WHERE iswithdrawn = false;" # row growth
```
If `psql-app` consumer lag climbs without draining, or PostgreSQL CPU/IO
saturates, use the emergency stop above.
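One way to make "climbs without draining" concrete is sketched below. It treats lag as climbing when the most recent readings never decrease and end higher than they started. The function name, the five-sample window and the idea of feeding it periodic total-lag readings are all assumptions for illustration, not a rule taken from this repo.

```python
def lag_is_climbing(samples: list[int], window: int = 5) -> bool:
    """True if the last `window` lag samples never decrease and end higher."""
    if window < 2 or len(samples) < window:
        return False  # not enough data to call a trend
    recent = samples[-window:]
    non_decreasing = all(b >= a for a, b in zip(recent, recent[1:]))
    return non_decreasing and recent[-1] > recent[0]


print(lag_is_climbing([10, 50, 200, 900, 4000]))  # steadily rising
print(lag_is_climbing([10, 900, 400, 200, 50]))   # spiked, then draining
```

A spike that drains again is expected during convergence; a sustained climb across the whole window is the signal to consider the emergency stop.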
## MRT fallback
AS57355 is a **single volunteer-run host with no SLA** — it can and does go
away. `mrt-refresh.sh` keeps the global table in `ip_rib` warm when the live
feed is down:
1. If any AS57355 session is `Established`, the script does nothing — the live
feed is authoritative and must not be overwritten with a stale dump.
2. Otherwise it downloads the latest full RIB dump from RouteViews
(`https://archive.routeviews.org/route-views/bgpdata/YYYY.MM/RIBS/rib.YYYYMMDD.HHMM.bz2`,
published every 2 hours UTC) and runs `gobgp mrt inject global <file>`,
which installs every prefix into the running daemon. BMP export to the
collector then happens automatically.
The script is idempotent (re-uses an already-downloaded dump), guarded by a
`flock` against overlapping runs, and prunes to the 4 most recent dumps.
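The dump lookup in step 2 can be sketched in Python as follows. It mirrors the URL pattern and the 4-hour lookback described above, and lists the even-hour candidates newest first. The function name `candidate_rib_urls` is hypothetical, and the sketch only builds URLs, whereas the real script also probes each one before downloading.

```python
from datetime import datetime, timedelta, timezone

RV_BASE = "https://archive.routeviews.org/route-views/bgpdata"


def candidate_rib_urls(now: datetime, lookback_hours: int = 4) -> list[str]:
    """List RIB dump URLs for even UTC hours, newest first."""
    urls = []
    for h in range(lookback_hours + 1):
        ts = now.astimezone(timezone.utc) - timedelta(hours=h)
        if ts.hour % 2 != 0:
            continue  # only even hours carry RIB dumps
        urls.append(
            f"{RV_BASE}/{ts:%Y.%m}/RIBS/rib.{ts:%Y%m%d}.{ts:%H}00.bz2"
        )
    return urls


for url in candidate_rib_urls(datetime(2026, 5, 24, 1, 30, tzinfo=timezone.utc)):
    print(url)
```

For a run at 01:30 UTC this yields the 00:00 dump of the same day followed by the 22:00 dump of the previous day, which is why the lookback has to cross day (and month) boundaries correctly.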
### Schedule it (host crontab, 2-hour cadence)
```cron
0 */2 * * * docker exec obmp-gobgp /config/mrt-refresh.sh >> /var/log/gobgp-mrt.log 2>&1
```
Run it once manually to verify:
```sh
docker exec obmp-gobgp /config/mrt-refresh.sh
```
## Caveats
- **No SLA.** AS57355 is a volunteer lab route server; treat the live feed as
best-effort and rely on the MRT fallback for continuity.
- eBGP-multihop TTL is set to 64 — the route server is many hops away.
- A full table is ~1M+ prefixes; expect a noticeable load spike in the
collector and PostgreSQL when the session first establishes or an MRT dump
is injected.

gobgp/gobgpd.conf.tmpl Normal file

@ -0,0 +1,170 @@
# GoBGP daemon configuration -- OpenBMP "global Internet table" feed (roadmap E1)
#
# Pulls the full real Internet routing table (IPv4 ~1M + IPv6 ~200k routes)
# from Lukasz Bromirski's lab route server (AS57355) and BMP-exports every
# received route to the OpenBMP collector, where it lands in PostgreSQL ip_rib.
# Peering spec: https://lukasz.bromirski.net/post/bgp-w-labie-3/
#
# It ALSO re-advertises the full table to the two CML core routers
# (CORE-01/CORE-02, AS65020) over eBGP. As route reflectors the cores
# propagate it to every R9K client -- so all 9 lab routers carry and
# BMP-export a full table. This is an intentional lab stress test of the
# OpenBMP ingestion/storage path (~9x full feeds; DB grows to ~55-65 GB).
#
# Local AS is 65001 (the value the Bromirski route server expects).
# Bromirski peering: eBGP multihop, no password, keepalive 3600 / hold 7200.
# TOML syntax targets GoBGP v3.x / v4.x.
[global]
[global.config]
as = 65001
router-id = "10.40.40.250"
# We only originate outbound sessions (to the route server and to the
# two cores) so the inbound BGP listener stays disabled (port -1) -- no
# privileged (<1024) bind needed under docker network_mode: host.
port = -1
# Note: once we peer with the cores, GoBGP learns the cores' lab routes over
# eBGP. To guarantee none of that leaks back to AS57355 (which asks peers to
# announce NOTHING), the route-server sessions below carry an apply-policy
# with default-export-policy = "reject-route" -- every export is dropped.
# --- Neighbor: route server, IPv4 feed --------------------------------------
# The IPv4 transport session carries the full IPv4 table only.
[[neighbors]]
[neighbors.config]
neighbor-address = "85.232.240.179"
peer-as = 57355
description = "AS57355 Bromirski lab route-server (IPv4 feed)"
[neighbors.timers.config]
keepalive-interval = 3600
hold-time = 7200
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
# we initiate the session; no local-address pinning
passive-mode = false
[neighbors.apply-policy.config]
# reject every export toward the route server
default-export-policy = "reject-route"
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv4-unicast"
# --- Neighbor: route server, IPv6 feed --------------------------------------
# The IPv6 transport session carries the full IPv6 table only.
[[neighbors]]
[neighbors.config]
neighbor-address = "2001:1a68:2c:2::179"
peer-as = 57355
description = "AS57355 Bromirski lab route-server (IPv6 feed)"
[neighbors.timers.config]
keepalive-interval = 3600
hold-time = 7200
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
passive-mode = false
[neighbors.apply-policy.config]
# reject every export toward the route server
default-export-policy = "reject-route"
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv6-unicast"
# --- Neighbor: CML CORE-01 (AS65020) ----------------------------------------
# GoBGP initiates outbound to the core's mgmt IP (reachable from the docker
# host -- the cores already reach the host for BMP). GoBGP sources the session
# from the host IP __HOST_IP__. eBGP multihop: the host is several hops from
# the core. Default export policy (accept) re-advertises the full Bromirski
# table to the core. prefix-limit is a safety cap on what the core can send
# back (its lab routes only -- small).
# IPv4-unicast only: the cores have no global IPv6 address, so an ipv6 AF
# would hold the session Idle. IPv6 full-table feed is a separate phase.
[[neighbors]]
[neighbors.config]
neighbor-address = "10.100.0.100"
peer-as = 65020
description = "CML CORE-01 -- full-table injection (lab stress test)"
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
passive-mode = false
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv4-unicast"
[neighbors.afi-safis.prefix-limit.config]
max-prefixes = 2000000
shutdown-threshold-pct = 90
# --- Neighbor: CML CORE-02 (AS65020) ----------------------------------------
[[neighbors]]
[neighbors.config]
neighbor-address = "10.100.0.200"
peer-as = 65020
description = "CML CORE-02 -- full-table injection (lab stress test)"
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
passive-mode = false
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv4-unicast"
[neighbors.afi-safis.prefix-limit.config]
max-prefixes = 2000000
shutdown-threshold-pct = 90
# --- Neighbor: PROX CORE-01 (AS65021) ---------------------------------------
# Second lab. Same IPv4-unicast-only full-table injection as the CML cores.
[[neighbors]]
[neighbors.config]
neighbor-address = "10.100.1.100"
peer-as = 65021
description = "PROX CORE-01 -- full-table injection (lab stress test)"
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
passive-mode = false
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv4-unicast"
[neighbors.afi-safis.prefix-limit.config]
max-prefixes = 2000000
shutdown-threshold-pct = 90
# --- Neighbor: PROX CORE-02 (AS65021) ---------------------------------------
[[neighbors]]
[neighbors.config]
neighbor-address = "10.100.1.200"
peer-as = 65021
description = "PROX CORE-02 -- full-table injection (lab stress test)"
[neighbors.ebgp-multihop.config]
enabled = true
multihop-ttl = 64
[neighbors.transport.config]
passive-mode = false
[[neighbors.afi-safis]]
[neighbors.afi-safis.config]
afi-safi-name = "ipv4-unicast"
[neighbors.afi-safis.prefix-limit.config]
max-prefixes = 2000000
shutdown-threshold-pct = 90
# --- BMP export to the OpenBMP collector ------------------------------------
# GoBGP connects OUT to the collector. GoBGP's BMP config requires a literal
# IP (it cannot resolve a hostname), so we target the docker host IP where the
# collector publishes port 5000 -- stable across container recreation, unlike
# the collector's internal docker IP. Matches HOST_IP in .env.
# route-monitoring-policy = "pre-policy" exports the Adj-RIB-In (received
# routes, pre import-policy) -- consistent with the rest of the OpenBMP fleet.
[[bmp-servers]]
[bmp-servers.config]
address = "__HOST_IP__"
port = 5000
route-monitoring-policy = "pre-policy"
statistics-timeout = 3600

gobgp/mrt-refresh.sh Executable file

@ -0,0 +1,104 @@
#!/usr/bin/env bash
#
# mrt-refresh.sh -- MRT full-table fallback loader for the OpenBMP GoBGP feed.
#
# Roadmap E1. The live route server (AS57355) is a single volunteer-run host
# with no SLA. When it is unreachable, the global table in PostgreSQL ip_rib
# would otherwise age out. This script downloads the latest RouteViews full
# MRT RIB dump and injects it into the running gobgpd so the table stays warm.
#
# Designed to be idempotent and cron-safe at a 2-hour cadence:
# - it only downloads a dump it does not already have,
# - it only injects when the live route server session is NOT established,
# - concurrent runs are guarded by a flock.
#
# Run it INSIDE the gobgp container (it shells out to the local `gobgp` CLI):
# docker exec obmp-gobgp /config/mrt-refresh.sh
#
# Example crontab entry on the docker host (every 2 hours):
# 0 */2 * * * docker exec obmp-gobgp /config/mrt-refresh.sh >> /var/log/gobgp-mrt.log 2>&1
set -euo pipefail
# --- tunables ---------------------------------------------------------------
MRT_DIR="${MRT_DIR:-/config/mrt}"
RV_BASE="${RV_BASE:-https://archive.routeviews.org/route-views/bgpdata}"
GOBGP="${GOBGP:-gobgp}"
LOCKFILE="${LOCKFILE:-/tmp/gobgp-mrt-refresh.lock}"
# RouteViews publishes a full RIB dump every 2 hours; dumps land a few minutes
# after the even hour, so we look back a safe margin.
LOOKBACK_HOURS="${LOOKBACK_HOURS:-4}"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*"; }
# --- single-instance guard --------------------------------------------------
exec 9>"${LOCKFILE}"
if ! flock -n 9; then
log "another mrt-refresh run is in progress; exiting"
exit 0
fi
mkdir -p "${MRT_DIR}"
# --- skip if the live route server is up ------------------------------------
# If any AS57355 neighbor is Established, the live feed is authoritative and
# we must NOT inject a stale MRT dump on top of it. Match on the AS column so
# an Established session to a lab core cannot mask a route-server outage.
neighbors="$(${GOBGP} neighbor 2>/dev/null || true)"
if grep -qE '[[:space:]]57355[[:space:]].*Establ' <<<"${neighbors}"; then
    log "an AS57355 session is Established; live feed is healthy, skipping MRT inject"
    exit 0
fi
log "no Established AS57355 session; proceeding with MRT fallback"
# --- locate the most recent available RIB dump -----------------------------
# RouteViews RIB dumps:
# <RV_BASE>/YYYY.MM/RIBS/rib.YYYYMMDD.HHMM.bz2
# RIB dumps are taken at even hours (00,02,04,...,22) UTC.
found_url=""
found_file=""
now_epoch="$(date -u +%s)"
for ((h = 0; h <= LOOKBACK_HOURS; h++)); do
ts_epoch=$(( now_epoch - h * 3600 ))
hh="$(date -u -d "@${ts_epoch}" +%H)"
# only even hours carry RIB dumps
if (( 10#${hh} % 2 != 0 )); then
continue
fi
ym="$(date -u -d "@${ts_epoch}" +%Y.%m)"
ymd="$(date -u -d "@${ts_epoch}" +%Y%m%d)"
fname="rib.${ymd}.${hh}00.bz2"
url="${RV_BASE}/${ym}/RIBS/${fname}"
if curl -fsI --max-time 30 "${url}" >/dev/null 2>&1; then
found_url="${url}"
found_file="${fname}"
break
fi
done
if [[ -z "${found_url}" ]]; then
log "ERROR: no RouteViews RIB dump found within ${LOOKBACK_HOURS}h lookback"
exit 1
fi
dest="${MRT_DIR}/${found_file}"
# --- download (idempotent) --------------------------------------------------
if [[ -s "${dest}" ]]; then
log "already have ${found_file}; reusing cached copy"
else
log "downloading ${found_url}"
tmp="${dest}.partial.$$"
curl -fsSL --max-time 600 -o "${tmp}" "${found_url}"
mv -f "${tmp}" "${dest}"
log "downloaded $(du -h "${dest}" | cut -f1) -> ${dest}"
fi
# --- inject into the running gobgpd -----------------------------------------
# `gobgp mrt inject global` reads the bz2 dump directly and installs every
# prefix into the global RIB; BMP export to the collector follows automatically.
log "injecting ${found_file} into gobgpd global RIB"
${GOBGP} mrt inject global "${dest}"
log "MRT inject complete"
# --- housekeeping: keep only the 4 most recent dumps ------------------------
( cd "${MRT_DIR}" && ls -1t rib.*.bz2 2>/dev/null | tail -n +5 | xargs -r rm -f )
log "done"


@ -0,0 +1,8 @@
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY monitor.py .
CMD ["python", "-u", "monitor.py"]


@ -0,0 +1,96 @@
#!/usr/bin/env python3
"""kafka-lag-monitor -- samples Kafka consumer-group lag into PostgreSQL.

Every LAG_POLL_INTERVAL seconds it records, per (group, topic, partition), the
committed offset, log-end offset and lag, plus the group's active member
count. The Kafka Lag dashboard reads kafka_consumer_lag / kafka_consumer_members
(postgres/scripts/009_kafka_lag.sql) so the ingestion path can be sanity-
checked -- watch lag spike during a BGP convergence storm and drain again, and
confirm the member count when psql-app is scaled out.
"""
import os
import time
import traceback

import psycopg2
from confluent_kafka import Consumer, TopicPartition, ConsumerGroupTopicPartitions
from confluent_kafka.admin import AdminClient

BROKER = os.environ.get("KAFKA_BROKER", "obmp-kafka:29092")
PG_DSN = os.environ.get(
    "PG_DSN",
    "host=obmp-psql port=5432 dbname=openbmp user=openbmp password=openbmp",
)
INTERVAL = int(os.environ.get("LAG_POLL_INTERVAL", "30"))
GROUPS = [g.strip() for g in os.environ.get(
    "CONSUMER_GROUPS", "obmp-psql-consumer,evpn-psql").split(",") if g.strip()]


def sample_group(admin, consumer, group):
    """Return (lag_rows, member_count) for one consumer group.

    lag_rows: [(group, topic, partition, committed, log_end, lag), ...]
    """
    futs = admin.list_consumer_group_offsets(
        [ConsumerGroupTopicPartitions(group)])
    offsets = futs[group].result(timeout=30)
    rows = []
    for tp in offsets.topic_partitions:
        # A negative offset means nothing has been committed yet.
        committed = tp.offset if (tp.offset is not None and tp.offset >= 0) else 0
        try:
            _, log_end = consumer.get_watermark_offsets(
                TopicPartition(tp.topic, tp.partition), timeout=10, cached=False)
        except Exception:
            continue
        rows.append((group, tp.topic, tp.partition, committed, log_end,
                     max(log_end - committed, 0)))
    members = None
    try:
        desc = admin.describe_consumer_groups([group])[group].result(timeout=30)
        members = len(desc.members)
    except Exception:
        pass
    return rows, members


def main():
    print(f"kafka-lag-monitor starting; broker={BROKER}, groups={GROUPS}, "
          f"interval={INTERVAL}s", flush=True)
    admin = AdminClient({"bootstrap.servers": BROKER})
    consumer = Consumer({"bootstrap.servers": BROKER,
                         "group.id": "kafka-lag-monitor-probe",
                         "enable.auto.commit": False})
    while True:
        start = time.time()
        try:
            conn = psycopg2.connect(PG_DSN)
            with conn.cursor() as cur:
                for group in GROUPS:
                    try:
                        rows, members = sample_group(admin, consumer, group)
                    except Exception as e:
                        print(f" [{group}] sample failed: {e}", flush=True)
                        continue
                    for r in rows:
                        cur.execute(
                            "INSERT INTO kafka_consumer_lag (group_id, topic, "
                            "partition, committed, log_end, lag) "
                            "VALUES (%s,%s,%s,%s,%s,%s)", r)
                    if members is not None:
                        cur.execute(
                            "INSERT INTO kafka_consumer_members (group_id, "
                            "members) VALUES (%s,%s)", (group, members))
                    total = sum(r[5] for r in rows)
                    print(f" [{group}] {len(rows)} partitions, "
                          f"total lag={total}, members={members}", flush=True)
            conn.commit()
            conn.close()
        except Exception as e:
            print(f"cycle failed: {e}", flush=True)
            traceback.print_exc()
        # Sleep out the remainder of the interval, never a negative amount.
        time.sleep(max(0, INTERVAL - (time.time() - start)))


if __name__ == "__main__":
    main()


@ -0,0 +1,2 @@
confluent-kafka==2.5.3
psycopg2-binary==2.9.9


@ -0,0 +1,8 @@
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY monitor.py .
CMD ["python", "-u", "monitor.py"]


@ -0,0 +1,102 @@
#!/usr/bin/env python3
"""obmp-churn-monitor -- decoupled fast-path BGP churn-rate monitor.

Reads openbmp.parsed.unicast_prefix from Kafka with its OWN consumer group and
only COUNTS announcements/withdrawals per (router, peer) -- no relational RIB
maintenance, no per-route DB upserts. Because counting is orders of magnitude
cheaper than psql-app's work, this stays real-time even when the main
ingestion pipeline lags minutes behind during a churn storm.

It does NOT make the bulk DB write faster -- it guarantees churn *visibility*
survives a storm the bulk pipeline cannot keep up with. Aggregates flush every
FLUSH_INTERVAL seconds to churn_metrics (postgres/scripts/010_churn_metrics.sql);
the Live BGP Churn dashboard reads it.

The consumer starts at the topic head (auto.offset.reset=latest, no commits):
on restart it jumps to current churn rather than replaying a stale backlog.
"""
import collections
import os
import time

import psycopg2
from confluent_kafka import Consumer

BROKER = os.environ.get("KAFKA_BROKER", "obmp-kafka:29092")
TOPIC = os.environ.get("CHURN_TOPIC", "openbmp.parsed.unicast_prefix")
PG_DSN = os.environ.get(
    "PG_DSN",
    "host=obmp-psql port=5432 dbname=openbmp user=openbmp password=openbmp",
)
FLUSH_INTERVAL = int(os.environ.get("FLUSH_INTERVAL", "10"))

# Tab-separated field positions in an openbmp.parsed.unicast_prefix data row.
F_ACTION = 0     # "add" (announce) or "del" (withdraw)
F_ROUTER_IP = 4  # BMP router management IP
F_PEER_IP = 7    # BGP peer address
F_PEER_ASN = 8   # BGP peer ASN


def flush(counts, window):
    """Write accumulated per-(router,peer) counts to churn_metrics."""
    if not counts:
        return
    try:
        conn = psycopg2.connect(PG_DSN)
        with conn.cursor() as cur:
            for (router, peer, asn), (adds, dels) in counts.items():
                cur.execute(
                    "INSERT INTO churn_metrics (router_ip, peer_ip, peer_asn, "
                    "adds, dels) VALUES (%s,%s,%s,%s,%s)",
                    (router or None, peer or None, asn, adds, dels))
        conn.commit()
        conn.close()
        tot_a = sum(v[0] for v in counts.values())
        tot_d = sum(v[1] for v in counts.values())
        print(f"flush: {len(counts)} sessions, +{tot_a} -{tot_d} "
              f"({(tot_a + tot_d) / max(window, 1):.0f} msg/s) over "
              f"{window:.0f}s", flush=True)
    except Exception as e:
        print(f"flush failed: {e}", flush=True)


def main():
    print(f"obmp-churn-monitor: topic={TOPIC}, flush={FLUSH_INTERVAL}s", flush=True)
    consumer = Consumer({
        "bootstrap.servers": BROKER,
        "group.id": "obmp-churn-monitor",
        "auto.offset.reset": "latest",
        "enable.auto.commit": False,
    })
    consumer.subscribe([TOPIC])
    counts = collections.defaultdict(lambda: [0, 0])  # key -> [adds, dels]
    last_flush = time.time()
    while True:
        msg = consumer.poll(1.0)
        if msg is not None and not msg.error():
            text = msg.value().decode("utf-8", errors="replace")
            for line in text.split("\n"):
                # Only data rows start with an action; header lines are skipped.
                if not (line.startswith("add\t") or line.startswith("del\t")):
                    continue
                f = line.split("\t")
                if len(f) <= F_PEER_ASN:
                    continue
                try:
                    asn = int(f[F_PEER_ASN])
                except ValueError:
                    asn = None
                key = (f[F_ROUTER_IP], f[F_PEER_IP], asn)
                if f[F_ACTION] == "add":
                    counts[key][0] += 1
                else:
                    counts[key][1] += 1
        # Flush on a timer even when no message arrived in this poll.
        now = time.time()
        if now - last_flush >= FLUSH_INTERVAL:
            flush(counts, now - last_flush)
            counts.clear()
            last_flush = now


if __name__ == "__main__":
    main()


@ -0,0 +1,2 @@
confluent-kafka==2.5.3
psycopg2-binary==2.9.9


@ -0,0 +1,8 @@
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY consumer.py .
CMD ["python", "-u", "consumer.py"]


@ -0,0 +1,223 @@
#!/usr/bin/env python3
"""obmp-evpn-consumer -- OpenBMP EVPN -> PostgreSQL (roadmap E5).

Subscribes to the Kafka topic `openbmp.parsed.evpn` (the OpenBMP collector
already decodes EVPN and publishes it there) and writes BGP EVPN routes into
the `evpn_rib` table. The stock openbmp/psql-app never consumes this topic;
this process fills that gap.

Field positions are pinned to the collector 2.2.3 / message-bus v1.7 layout,
verified off the live topic. The collector parses EVPN type-2 (MAC/IP) and
type-3 (inclusive multicast) cleanly; type-5 (IP-prefix) is mis-decoded
upstream and is not relied on here.
"""
import os
import sys
import time

import psycopg2
from psycopg2.extras import execute_values
from confluent_kafka import Consumer, KafkaException

KAFKA_BROKER = os.environ.get("KAFKA_BROKER", "obmp-kafka:29092")
TOPIC = os.environ.get("EVPN_TOPIC", "openbmp.parsed.evpn")
GROUP_ID = os.environ.get("EVPN_GROUP", "evpn-psql")
PG_DSN = os.environ.get(
    "PG_DSN", "host=obmp-psql port=5432 dbname=openbmp user=openbmp password=openbmp"
)
BATCH_SECONDS = 2.0

# 0-indexed field positions in a parsed EVPN data row (collector 2.2.3, v1.7).
F_ACTION, F_HASH = 0, 2
F_BASE_ATTR, F_PEER_HASH = 5, 6
F_TIMESTAMP = 9
F_ORIGIN_AS = 13
F_EXT_COMM = 19
F_PATH_ID = 24
F_RD, F_RD_TYPE = 27, 28
F_ORIG_RTR_IP = 30
F_ETH_TAG, F_ESI = 31, 32
F_MAC_LEN, F_MAC = 33, 34
F_IP_LEN, F_IP = 35, 36
F_LABEL1, F_LABEL2 = 37, 38
MIN_FIELDS = 39


def log(msg):
    print(f"[{time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())}] {msg}", flush=True)


def nz(s):
    """Strip a field and map empty strings to None."""
    s = (s or "").strip()
    return s or None


def to_int(s):
    s = nz(s)
    if s is None:
        return None
    try:
        return int(s)
    except ValueError:
        return None


def hex_to_int(s):
    s = nz(s)
    if s is None:
        return None
    try:
        return int(s, 16)
    except ValueError:
        return None


def parse_rts(field):
    """The ext-community field looks like 'rt=65010:100 encap=8'; keep the RTs."""
    rts = [t[3:] for t in (field or "").split() if t.startswith("rt=")]
    return rts or None


def derive_route_type(mac, orig_rtr_ip):
    if mac:
        return 2  # MAC/IP advertisement
    if orig_rtr_ip:
        return 3  # inclusive multicast
    return 5  # IP-prefix


def parse_message(raw):
    """OpenBMP message: 'K: V' header lines, a blank line, then R tab-sep rows."""
    text = raw.decode("utf-8", errors="replace")
    if "\n\n" not in text:
        return []
    _, body = text.split("\n\n", 1)
    return [ln.split("\t") for ln in body.splitlines() if "\t" in ln]


def row_to_record(r):
    if len(r) < MIN_FIELDS:
        return None
    mac = nz(r[F_MAC])
    orig_rtr_ip = nz(r[F_ORIG_RTR_IP])
    return {
        "action": r[F_ACTION].strip().lower(),
        "hash_id": r[F_HASH].strip(),
        "peer_hash_id": r[F_PEER_HASH].strip(),
        "base_attr_hash_id": nz(r[F_BASE_ATTR]),
        "rd": r[F_RD].strip() or "0:0",
        "rd_type": to_int(r[F_RD_TYPE]),
        "route_type": derive_route_type(mac, orig_rtr_ip),
        "origin_as": to_int(r[F_ORIGIN_AS]),
        "eth_segment_id": nz(r[F_ESI]),
        "eth_tag_id": hex_to_int(r[F_ETH_TAG]),
        "mac": mac,
        "mac_len": to_int(r[F_MAC_LEN]),
        "ip": nz(r[F_IP]),
        "ip_len": to_int(r[F_IP_LEN]),
        "orig_router_ip": orig_rtr_ip,
        "mpls_label1": to_int(r[F_LABEL1]),
        "mpls_label2": to_int(r[F_LABEL2]),
        "ext_community_list": parse_rts(r[F_EXT_COMM]),
        "path_id": to_int(r[F_PATH_ID]),
        "timestamp": nz(r[F_TIMESTAMP]),
    }


INSERT_COLS = (
    "hash_id", "peer_hash_id", "base_attr_hash_id", "rd", "rd_type", "route_type",
    "origin_as", "eth_segment_id", "eth_tag_id", "mac", "mac_len", "ip", "ip_len",
    "orig_router_ip", "mpls_label1", "mpls_label2", "ext_community_list", "path_id",
    "timestamp",
)

INSERT_SQL = f"""
INSERT INTO evpn_rib ({", ".join(INSERT_COLS)}, iswithdrawn)
VALUES %s
ON CONFLICT (peer_hash_id, hash_id) DO UPDATE SET
    base_attr_hash_id = EXCLUDED.base_attr_hash_id, rd = EXCLUDED.rd,
    rd_type = EXCLUDED.rd_type, route_type = EXCLUDED.route_type,
    origin_as = EXCLUDED.origin_as, eth_segment_id = EXCLUDED.eth_segment_id,
    eth_tag_id = EXCLUDED.eth_tag_id, mac = EXCLUDED.mac, mac_len = EXCLUDED.mac_len,
    ip = EXCLUDED.ip, ip_len = EXCLUDED.ip_len,
    orig_router_ip = EXCLUDED.orig_router_ip, mpls_label1 = EXCLUDED.mpls_label1,
    mpls_label2 = EXCLUDED.mpls_label2, ext_community_list = EXCLUDED.ext_community_list,
    path_id = EXCLUDED.path_id, timestamp = EXCLUDED.timestamp, iswithdrawn = false
"""

DELETE_SQL = """
UPDATE evpn_rib SET iswithdrawn = true, base_attr_hash_id = NULL, timestamp = %s
WHERE peer_hash_id = %s AND hash_id = %s
"""


def flush(conn, adds, dels):
    """Upsert announced routes, then mark withdrawn routes, in one transaction."""
    if not adds and not dels:
        return
    with conn.cursor() as cur:
        if adds:
            tuples = [
                tuple(rec[c] for c in INSERT_COLS) + (False,) for rec in adds
            ]
            execute_values(cur, INSERT_SQL, tuples)
        for rec in dels:
            cur.execute(DELETE_SQL, (rec["timestamp"], rec["peer_hash_id"], rec["hash_id"]))
    conn.commit()
    log(f"flushed {len(adds)} add/update, {len(dels)} withdraw")


def connect_pg():
    """Block until PostgreSQL is reachable and evpn_rib exists."""
    while True:
        try:
            conn = psycopg2.connect(PG_DSN)
            conn.autocommit = False
            with conn.cursor() as cur:
                cur.execute("SELECT 1 FROM evpn_rib LIMIT 0")
            log("connected to PostgreSQL; evpn_rib present")
            return conn
        except psycopg2.Error as e:
            log(f"PostgreSQL not ready ({e}); retrying in 5s")
            time.sleep(5)


def main():
    log(f"starting: kafka={KAFKA_BROKER} topic={TOPIC} group={GROUP_ID}")
    conn = connect_pg()
    consumer = Consumer({
        "bootstrap.servers": KAFKA_BROKER,
        "group.id": GROUP_ID,
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,
    })
    consumer.subscribe([TOPIC])
    # Latest record per (peer_hash_id, hash_id); last write wins. PostgreSQL
    # rejects an INSERT ... ON CONFLICT DO UPDATE that touches the same key
    # twice in one statement, and applying every add before every withdraw
    # would lose the order of a withdraw followed by a re-announce.
    pending = {}
    last_flush = time.time()
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is not None and not msg.error():
                for row in parse_message(msg.value()):
                    rec = row_to_record(row)
                    if rec is None:
                        continue
                    pending[(rec["peer_hash_id"], rec["hash_id"])] = rec
            elif msg is not None and msg.error():
                raise KafkaException(msg.error())
            if pending and time.time() - last_flush >= BATCH_SECONDS:
                adds = [r for r in pending.values() if r["action"] != "del"]
                dels = [r for r in pending.values() if r["action"] == "del"]
                try:
                    flush(conn, adds, dels)
                except psycopg2.Error as e:
                    # Keep the batch and retry it after reconnecting.
                    log(f"DB write failed ({e}); reconnecting")
                    conn = connect_pg()
                    continue
                # Commit Kafka offsets only after the DB write succeeded.
                consumer.commit(asynchronous=False)
                pending.clear()
                last_flush = time.time()
    except KeyboardInterrupt:
        log("shutting down")
    finally:
        consumer.close()
        conn.close()


if __name__ == "__main__":
    sys.exit(main())


@ -0,0 +1,2 @@
confluent-kafka==2.5.3
psycopg2-binary==2.9.9


@ -1,237 +1,59 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "OpenBMP navigation hub. Start at the NOC Overview, then drill into the operational dashboards.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 1,
"links": [],
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 3,
"w": 24,
"x": 0,
"y": 0
},
"id": 6,
"links": [],
"gridPos": {"h": 6,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {
"content": "# OpenBMP\n\n*Select a dashboard*",
"content": "# OpenBMP\n\nBGP Monitoring Protocol analytics. **Start here → [NOC Overview](/d/obmp-noc-overview/noc-overview)** — the at-a-glance health view.\n\n| Tier | What it covers |\n|------|----------------|\n| **NOC Overview** | Is the network healthy right now? Routers, peers, flaps, churn, RPKI, topology |\n| **Operations** | Router & peer inventory, per-router and per-peer detail, session health |\n| **Routing** | Prefix explorer, top talkers & churn, AS-path, communities, RPKI security |\n| **Link State** | IGP topology, nodes, links, TE & Segment Routing |\n| **L3VPN** | VPNv4/VPNv6 RIB and prefix history |\n| **Telemetry** | gNMI interface utilization, errors, BMP+telemetry correlation |\n| **Reference** | Database schema map, RR Loc-RIB diff |\n\nUse the **OBMP Dashboards** dropdown (top-right) or the panels below to navigate.",
"mode": "markdown"
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 4,
"x": 0,
"y": 3
},
"id": 7,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-tops"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Tops",
"gridPos": {"h": 16,"w": 8,"x": 0,"y": 6},
"id": 2,
"options": {"maxItems": 100,"showHeadings": false,"showRecentlyViewed": false,"showSearch": true,"showStarred": false,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "All Dashboards",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 4,
"y": 3
},
"id": 8,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-base"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Base",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 9,
"y": 3
},
"id": 4,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-history"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Prefix History",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 14,
"y": 3
},
"id": 9,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-l3vpn"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "L3VPN",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 19,
"y": 3
},
"gridPos": {"h": 16,"w": 8,"x": 8,"y": 6},
"id": 3,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-linkstate"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Link State",
"options": {"maxItems": 20,"showHeadings": false,"showRecentlyViewed": true,"showSearch": false,"showStarred": false,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "Recently Viewed",
"type": "dashlist"
},
{
"gridPos": {"h": 16,"w": 8,"x": 16,"y": 6},
"id": 4,
"options": {"maxItems": 20,"showHeadings": false,"showRecentlyViewed": false,"showSearch": false,"showStarred": true,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "Starred",
"type": "dashlist"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"tags": ["obmp","obmp-nav"],
"templating": {"list": []},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "OBMP-Home",
"uid": "obmp-home",
"version": 1,
"version": 2,
"weekStart": ""
}
}
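
Reviewer note: a hand-merged dashboard file can end up with the same key twice in one object (for example two `"version"` entries), and most JSON parsers silently keep the last one. The sketch below is one possible pre-commit check, not part of this change set. The function name `find_duplicate_keys` and the sample string are hypothetical; only the standard-library `json` module is used.

```python
import json


def find_duplicate_keys(text):
    """Return keys that appear more than once within a single JSON object."""
    duplicates = []

    def check_pairs(pairs):
        # pairs is the ordered list of (key, value) tuples for one object
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(text, object_pairs_hook=check_pairs)
    return duplicates


# Hypothetical sample, shaped like a dashboard with repeated keys
sample = '{"uid": "obmp-home", "version": 1, "version": 2, "panels": [{"id": 3, "id": 4}]}'
print(sorted(find_duplicate_keys(sample)))
```

Running a check like this over each dashboard file before provisioning would surface repeated keys that a plain `json.load` hides.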

@ -0,0 +1,207 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Network-operations overview: answers 'is the network healthy right now?' at a glance. Update-rate and churn panels read from the stats_* aggregate tables so they stay fast at production scale.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": true,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total routers reporting BMP to the collector.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Routers\" FROM routers","refId": "A"}],
"title": "Routers Monitored",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routers whose BMP session is not up. Should be 0.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 3,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Routers Down\" FROM routers WHERE state != 'up'","refId": "A"}],
"title": "Routers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers currently up (pre-policy Adj-RIB-In sessions).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 6,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Up\" FROM bgp_peers WHERE isprepolicy = true AND state = 'up'","refId": "A"}],
"title": "Peers Up",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers that went down within the selected time range. Investigate any non-zero value. (Removed/decommissioned peers fall outside the range and are not counted.)",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 9,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Down\" FROM bgp_peers WHERE isprepolicy = true AND state != 'up' AND $__timeFilter(timestamp)","refId": "A"}],
"title": "Peers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peer session down-events in the last hour. Sustained flapping needs investigation.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 12,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (1h)\" FROM peer_event_log WHERE state = 'down' AND timestamp > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Flap Events (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total BGP updates across all peers in the last 5 minutes (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 15,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(updates),0) AS \"RIB Updates (5m)\" FROM stats_chg_bypeer WHERE interval_time > NOW() - INTERVAL '5 minutes'","refId": "A"}],
"title": "RIB Updates (5m)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes whose origin AS conflicts with a covering ROA (RPKI-invalid). Potential hijacks or misconfigurations.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 18,"y": 0},
"id": 7,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"RPKI Invalid\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (SELECT 1 FROM rpki_validator rv WHERE rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as)\n AND NOT EXISTS (SELECT 1 FROM rpki_validator rv WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max)","refId": "A"}],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP-LS link and node changes in the last hour. A spike indicates topology instability.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 20}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 21,"y": 0},
"id": 8,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time,\n (SELECT count(*) FROM ls_links_log WHERE timestamp > NOW() - INTERVAL '1 hour')\n + (SELECT count(*) FROM ls_nodes_log WHERE timestamp > NOW() - INTERVAL '1 hour') AS \"LS Changes (1h)\"","refId": "A"}],
"title": "LS Topology Changes (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Per-peer session state over the selected range. Red (DOWN) segments mark periods when a session was down.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 4},
"id": 9,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom","showLegend": false},"mergeValues": true,"rowHeight": 0.9,"showValue": "never","tooltip": {"mode": "single"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2","refId": "A"}],
"title": "Peer Session State",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate across all peers (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 4},
"id": 10,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer\nWHERE $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "BGP Update Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peers that went down within the selected time range. Empty is healthy. Widen the time range to see longer-standing issues. Click a peer to open Peer Detail.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 0,"text": "DOWN"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}]},
{"matcher": {"id": "byName","options": "peer_hash_id"},"properties": [{"id": "custom.hidden","value": true}]}
]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 13},
"id": 11,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Last Change"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n p.hash_id AS peer_hash_id,\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last Change\",\n p.error_text AS \"Reason\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true AND p.state != 'up' AND $__timeFilter(p.timestamp)\nORDER BY p.timestamp DESC","refId": "A"}],
"title": "Peers Down",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Most-churned prefixes in the last hour (from stats_chg_byprefix). Click a prefix to open Prefix Explorer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Total Changes"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 50},{"color": "red","value": 500}]}}]},
{"matcher": {"id": "byName","options": "Prefix"},"properties": [{"id": "links","value": [{"title": "Open in Prefix Explorer","url": "/d/prefix-hist/prefix-explorer?var-prefix=${__value.text}"}]}]}
]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 13},
"id": 12,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Total Changes"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n (host(prefix) || '/' || prefix_len) AS \"Prefix\",\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\",\n SUM(updates + withdraws) AS \"Total Changes\"\nFROM stats_chg_byprefix\nWHERE interval_time > NOW() - INTERVAL '1 hour'\nGROUP BY prefix, prefix_len\nORDER BY \"Total Changes\" DESC\nLIMIT 25","refId": "A"}],
"title": "Top Churning Prefixes (1h)",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes whose observed origin AS conflicts with a covering ROA — potential hijacks or leaks.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "Status"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"Invalid": {"color": "red","index": 0}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 22},
"id": 13,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized AS (ROA)\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (SELECT 1 FROM rpki_validator rv2 WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max)\nORDER BY r.prefix\nLIMIT 50","refId": "A"}],
"title": "RPKI Invalid Routes — Potential Hijacks",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP-LS link changes — topology churn over the selected range.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "Action"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"updated": {"color": "blue","index": 0},"withdrawn": {"color": "orange","index": 1}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 22},
"id": 14,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n timestamp AS \"Time\",\n COALESCE(interface_addr::text, '') AS \"Local\",\n COALESCE(neighbor_addr::text, '') AS \"Neighbor\",\n CASE WHEN iswithdrawn THEN 'withdrawn' ELSE 'updated' END AS \"Action\"\nFROM ls_links_log\nWHERE $__timeFilter(timestamp)\nORDER BY timestamp DESC\nLIMIT 50","refId": "A"}],
"title": "Recent LS Topology Changes",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","noc","overview"],
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "NOC Overview",
"uid": "obmp-noc-overview",
"version": 1
}
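
Reviewer note: the two RPKI panels above share one predicate: a covering ROA names a different origin AS, and no covering ROA matches both the observed origin AS and the maximum prefix length. The sketch below restates that predicate in Python to make the logic easier to check by hand. It is an illustration under assumptions, not the collector's code: the function name `is_origin_conflict` and the `(prefix, max_len, origin_as)` tuple shape are hypothetical, and the sample prefixes and AS numbers come from documentation ranges.

```python
from ipaddress import ip_network


def is_origin_conflict(route_prefix, origin_as, roas):
    """Mirror the panel's SQL: True when a covering ROA has another origin AS
    and no covering ROA matches this origin AS within its max length.

    roas is an iterable of (roa_prefix, max_len, roa_origin_as) tuples
    (hypothetical shape, standing in for rows of rpki_validator).
    """
    route = ip_network(route_prefix)
    covering = []
    for roa_prefix, max_len, roa_as in roas:
        roa_net = ip_network(roa_prefix)
        # subnet_of raises TypeError across IP versions, so compare versions first
        if roa_net.version == route.version and route.subnet_of(roa_net):
            covering.append((max_len, roa_as))

    has_other_origin = any(roa_as != origin_as for _, roa_as in covering)
    has_match = any(
        roa_as == origin_as and route.prefixlen <= max_len
        for max_len, roa_as in covering
    )
    return has_other_origin and not has_match


roas = [("192.0.2.0/24", 24, 64500)]
print(is_origin_conflict("192.0.2.0/24", 64501, roas))     # different origin AS
print(is_origin_conflict("192.0.2.0/24", 64500, roas))     # matching ROA
print(is_origin_conflict("198.51.100.0/24", 64501, roas))  # no covering ROA
```

Like the SQL, this does not flag a route whose only covering ROAs name the same origin AS but a shorter maximum length. RFC 6811 classifies such a route as invalid as well, so the panel counts may be lower than a full origin-validation result.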

@ -0,0 +1,217 @@
{
"uid": "obmp-learn-07",
"title": "Database Schema Map",
"schemaVersion": 39,
"tags": [
"obmp-learning",
"obmp",
"obmp-nav",
"reference"
],
"editable": true,
"time": {
"from": "now-6h",
"to": "now"
},
"templating": {
"list": []
},
"panels": [
{
"id": 1,
"title": "Table Row Counts",
"type": "table",
"gridPos": {
"h": 12,
"w": 8,
"x": 0,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT 'routers' as table_name, count(*) as rows FROM routers\nUNION ALL SELECT 'collectors', count(*) FROM collectors\nUNION ALL SELECT 'bgp_peers', count(*) FROM bgp_peers\nUNION ALL SELECT 'peer_event_log', count(*) FROM peer_event_log\nUNION ALL SELECT 'base_attrs', count(*) FROM base_attrs\nUNION ALL SELECT 'ip_rib', count(*) FROM ip_rib\nUNION ALL SELECT 'ip_rib_log', count(*) FROM ip_rib_log\nUNION ALL SELECT 'l3vpn_rib', count(*) FROM l3vpn_rib\nUNION ALL SELECT 'global_ip_rib', count(*) FROM global_ip_rib\nUNION ALL SELECT 'ls_nodes', count(*) FROM ls_nodes\nUNION ALL SELECT 'ls_links', count(*) FROM ls_links\nUNION ALL SELECT 'ls_prefixes', count(*) FROM ls_prefixes\nUNION ALL SELECT 'ls_nodes_log', count(*) FROM ls_nodes_log\nUNION ALL SELECT 'ls_links_log', count(*) FROM ls_links_log\nUNION ALL SELECT 'ls_prefixes_log', count(*) FROM ls_prefixes_log\nUNION ALL SELECT 'rpki_validator', count(*) FROM rpki_validator\nUNION ALL SELECT 'info_asn', count(*) FROM info_asn\nUNION ALL SELECT 'info_route', count(*) FROM info_route\nUNION ALL SELECT 'stat_reports', count(*) FROM stat_reports\nUNION ALL SELECT 'geo_ip', count(*) FROM geo_ip\nORDER BY table_name",
"format": "table"
}
]
},
{
"id": 2,
"title": "Table Relationships",
"type": "text",
"gridPos": {
"h": 12,
"w": 8,
"x": 8,
"y": 0
},
"options": {
"mode": "markdown",
"content": "## Entity Relationships\n\n### BMP Core Chain\n```\ncollectors\n \u2514\u2500\u2500 routers (collector_hash_id)\n \u2514\u2500\u2500 bgp_peers (router_hash_id)\n \u251c\u2500\u2500 ip_rib (peer_hash_id)\n \u251c\u2500\u2500 ip_rib_log (peer_hash_id)\n \u251c\u2500\u2500 l3vpn_rib (peer_hash_id)\n \u251c\u2500\u2500 ls_nodes (peer_hash_id)\n \u251c\u2500\u2500 ls_links (peer_hash_id)\n \u251c\u2500\u2500 ls_prefixes (peer_hash_id)\n \u251c\u2500\u2500 peer_event_log (peer_hash_id)\n \u2514\u2500\u2500 stat_reports (peer_hash_id)\n```\n\n### Path Attributes\n```\nip_rib \u2500\u2500(base_attr_hash_id)\u2500\u2500\u25ba base_attrs\n \u2502 \u251c\u2500\u2500 as_path (bigint[])\n \u2502 \u251c\u2500\u2500 origin_as\n \u2502 \u251c\u2500\u2500 next_hop\n \u2502 \u251c\u2500\u2500 med / local_pref\n \u2502 \u251c\u2500\u2500 community_list[]\n \u2502 \u251c\u2500\u2500 ext_community_list[]\n \u2502 \u2514\u2500\u2500 large_community_list[]\n \u2502\n \u2514\u2500\u2500(prefix)\u2500\u2500\u25ba global_ip_rib\n \u251c\u2500\u2500 rpki_origin_as\n \u251c\u2500\u2500 irr_origin_as\n \u2514\u2500\u2500 num_peers\n```\n\n### Link-State Topology\n```\nls_nodes \u25c4\u2500\u2500 ls_links (local_node_hash_id, remote_node_hash_id)\nls_nodes \u25c4\u2500\u2500 ls_prefixes (local_node_hash_id)\n```\n\n### Reference Data\n```\nrpki_validator \u2500\u2500(prefix, origin_as)\u2500\u2500\u25ba validates ip_rib\ninfo_asn \u2500\u2500(asn)\u2500\u2500\u25ba enriches base_attrs.origin_as\ninfo_route \u2500\u2500(prefix)\u2500\u2500\u25ba enriches ip_rib.prefix\ngeo_ip \u2500\u2500(ip)\u2500\u2500\u25ba geolocates routers, peers\n```"
}
},
{
"id": 3,
"title": "BMP Core Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 0
},
"options": {
"mode": "markdown",
"content": "## BMP Core Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **routers** | BMP-monitored routers | hash_id, name, ip_address, router_as, state, bgp_id |\n| **collectors** | BMP collector instances | hash_id, admin_id, name, ip_address, router_count |\n| **bgp_peers** | BGP sessions per router | hash_id, router_hash_id, peer_addr, peer_as, state, isl3vpnpeer |\n| **peer_event_log** | Session state history (TimescaleDB) | peer_hash_id, state, timestamp, bmp_reason, bgp_err_code |\n| **stat_reports** | BMP statistics messages | peer_hash_id, prefixes_rejected, num_routes_adj_rib_in, num_routes_local_rib |\n| **users** | Access control | username, password, type (admin/oper) |"
}
},
{
"id": 4,
"title": "RIB & Path Attribute Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 8
},
"options": {
"mode": "markdown",
"content": "## RIB & Path Attribute Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **base_attrs** | BGP path attributes | hash_id, as_path[], as_path_count, origin_as, next_hop, med, local_pref, community_list[], ext_community_list[], large_community_list[], cluster_list, originator_id |\n| **ip_rib** | IPv4/IPv6 unicast RIB | hash_id, peer_hash_id, prefix, prefix_len, origin_as, iswithdrawn, labels, path_id |\n| **ip_rib_log** | RIB change history (TimescaleDB) | peer_hash_id, prefix, prefix_len, origin_as, iswithdrawn, timestamp |\n| **l3vpn_rib** | L3VPN/MPLS VPN routes | hash_id, peer_hash_id, rd, prefix, labels, ext_community_list[] |\n| **l3vpn_rib_log** | L3VPN change history (TimescaleDB) | peer_hash_id, rd, prefix, iswithdrawn, timestamp |\n| **global_ip_rib** | Aggregated prefix summary | prefix, recv_origin_as, rpki_origin_as, irr_origin_as, num_peers |"
}
},
{
"id": 5,
"title": "Link-State Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 12
},
"options": {
"mode": "markdown",
"content": "## Link-State Tables (BGP-LS / RFC 7752)\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **ls_nodes** | IS-IS/OSPF nodes | hash_id, peer_hash_id, igp_router_id, name, protocol, asn, sr_capabilities, isis_area_id |\n| **ls_links** | IS-IS/OSPF links + TE/SR | hash_id, local/remote_node_hash_id, interface_addr, neighbor_addr, igp_metric, **te_def_metric**, **max_link_bw**, **max_resv_bw**, **unreserved_bw**, **admin_group**, **srlg**, **sr_adjacency_sids**, **peer_node_sid**, **protection_type**, **mpls_proto_mask** |\n| **ls_prefixes** | IS-IS/OSPF prefixes | hash_id, local_node_hash_id, prefix, metric, sr_prefix_sids, igp_flags |\n| **ls_nodes_log** | Node change history (TimescaleDB) | Same as ls_nodes + timestamp |\n| **ls_links_log** | Link change history (TimescaleDB) | Same as ls_links + timestamp |\n| **ls_prefixes_log** | Prefix change history (TimescaleDB) | Same as ls_prefixes + timestamp |\n\n**Bold columns** = TE/SR fields not used by any existing dashboard"
}
},
{
"id": 6,
"title": "Statistics Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 12
},
"options": {
"mode": "markdown",
"content": "## Statistics Tables (TimescaleDB Hypertables)\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **stat_reports** | BMP stat messages | peer_hash_id, prefixes_rejected, known_dup_prefixes, num_routes_adj_rib_in |\n| **stats_chg_byprefix** | Per-prefix churn stats | interval_time, peer_hash_id, prefix, updates, withdraws |\n| **stats_chg_byasn** | Per-ASN churn stats | interval_time, peer_hash_id, origin_as, updates, withdraws |\n| **stats_chg_bypeer** | Per-peer churn stats | interval_time, peer_hash_id, updates, withdraws |\n| **stats_peer_rib** | Per-peer RIB size | interval_time, peer_hash_id, v4_prefixes, v6_prefixes |\n| **stats_peer_update_counts** | Update rate statistics | interval_time, peer_hash_id, advertise_avg/min/max, withdraw_avg/min/max |\n| **stats_ip_origins** | Per-ASN prefix counts | interval_time, asn, v4_prefixes, v6_prefixes, v4_with_rpki, v4_with_irr |"
}
},
{
"id": 7,
"title": "Reference & Enrichment Tables",
"type": "text",
"gridPos": {
"h": 6,
"w": 12,
"x": 0,
"y": 20
},
"options": {
"mode": "markdown",
"content": "## Reference & Enrichment Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **rpki_validator** | RPKI ROAs | prefix, prefix_len, prefix_len_max, origin_as |\n| **info_asn** | ASN WHOIS/IRR data | asn, as_name, org_name, country, source |\n| **info_route** | Route IRR data | prefix, prefix_len, origin_as, descr, source |\n| **geo_ip** | IP geolocation (DB-IP) | ip, country, city, latitude, longitude, isp_name |\n| **pdb_exchange_peers** | PeeringDB IXP data | ix_name, peer_name, peer_asn, speed, peer_ipv4/ipv6 |"
}
},
{
"id": 8,
"title": "Views Quick Reference",
"type": "text",
"gridPos": {
"h": 6,
"w": 12,
"x": 12,
"y": 20
},
"options": {
"mode": "markdown",
"content": "## Database Views\n\n| View | Joins | Purpose |\n|------|-------|---------|\n| **v_peers** | bgp_peers + routers + info_asn | Complete peer info with router name and ASN details |\n| **v_ip_routes** | ip_rib + bgp_peers + base_attrs + routers | Full route detail with path attributes |\n| **v_ip_routes_geo** | v_ip_routes + geo_ip | Routes with geolocation |\n| **v_ip_routes_history** | ip_rib_log + base_attrs + bgp_peers + routers | Historical route changes with attributes |\n| **v_l3vpn_routes** | l3vpn_rib + bgp_peers + base_attrs + routers | L3VPN routes with path attributes |\n| **v_l3vpn_routes_history** | l3vpn_rib_log + base_attrs + bgp_peers + routers | Historical L3VPN changes |\n| **v_ls_nodes** | ls_nodes + base_attrs + bgp_peers + routers | Link-state nodes with peer/router info |\n| **v_ls_links** | ls_links + ls_nodes(x2) + routers | Links with local/remote node names + TE fields |\n| **v_ls_prefixes** | ls_prefixes + ls_nodes + routers | LS prefixes with originating node info |\n\n### Enum Types\n- **opstate**: up, down\n- **ls_proto**: IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3, Direct, Static\n- **ospf_route_type**: Intra, Inter, Ext-1, Ext-2, NSSA-1, NSSA-2\n- **ls_mpls_proto_mask**: MPLS protocol bitmask"
}
},
{
"id": 9,
"title": "ls_links Column Details",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT column_name, data_type, \n CASE \n WHEN column_name IN ('admin_group','max_link_bw','max_resv_bw','unreserved_bw','te_def_metric','protection_type','srlg','sr_adjacency_sids','peer_node_sid','mpls_proto_mask') THEN 'TE/SR'\n WHEN column_name IN ('hash_id','peer_hash_id','base_attr_hash_id','local_node_hash_id','remote_node_hash_id') THEN 'FK/Key'\n ELSE 'Core'\n END as category\nFROM information_schema.columns \nWHERE table_name = 'ls_links' AND table_schema = 'public'\nORDER BY ordinal_position",
"format": "table"
}
]
},
{
"id": 10,
"title": "ip_rib Column Details",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT column_name, data_type,\n CASE \n WHEN column_name IN ('hash_id','peer_hash_id','base_attr_hash_id') THEN 'FK/Key'\n ELSE 'Core'\n END as category\nFROM information_schema.columns \nWHERE table_name = 'ip_rib' AND table_schema = 'public'\nORDER BY ordinal_position",
"format": "table"
}
]
}
],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
]
}
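
Reviewer note: the `gridPos` values in this dashboard can be checked mechanically against Grafana's 24-column grid. The sketch below is a hypothetical helper (`layout_problems` is not an existing tool) that reports panels wider than the grid and pairs of panels whose rectangles intersect. The sample data copies the `gridPos` of panels 1 to 6 from the Database Schema Map dashboard above.

```python
def layout_problems(panels, columns=24):
    """Return human-readable layout problems for a list of panel dicts."""
    problems = []
    rects = []
    for panel in panels:
        pos = panel["gridPos"]
        if pos["x"] + pos["w"] > columns:
            problems.append(f"panel {panel['id']} exceeds {columns} columns")
        # Store as (id, left, top, right, bottom); right and bottom are exclusive
        rects.append((panel["id"], pos["x"], pos["y"],
                      pos["x"] + pos["w"], pos["y"] + pos["h"]))

    for i, a in enumerate(rects):
        for b in rects[i + 1:]:
            overlaps_x = a[1] < b[3] and b[1] < a[3]
            overlaps_y = a[2] < b[4] and b[2] < a[4]
            if overlaps_x and overlaps_y:
                problems.append(f"panels {a[0]} and {b[0]} overlap")
    return problems


# gridPos values copied from panels 1-6 of the Database Schema Map dashboard
schema_map_panels = [
    {"id": 1, "gridPos": {"h": 12, "w": 8, "x": 0, "y": 0}},
    {"id": 2, "gridPos": {"h": 12, "w": 8, "x": 8, "y": 0}},
    {"id": 3, "gridPos": {"h": 8, "w": 8, "x": 16, "y": 0}},
    {"id": 4, "gridPos": {"h": 8, "w": 8, "x": 16, "y": 8}},
    {"id": 5, "gridPos": {"h": 8, "w": 12, "x": 0, "y": 12}},
    {"id": 6, "gridPos": {"h": 8, "w": 12, "x": 12, "y": 12}},
]
print(layout_problems(schema_map_panels))
```

For this data the check reports that panels 4 and 6 intersect: panel 4 occupies rows 8 to 16 in the right-hand column, while panel 6 starts at row 12. Grafana generally rearranges overlapping panels when it loads a dashboard, so the rendered layout may differ from what the JSON suggests.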

@ -1,160 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "AS path length distribution and analysis. Teaches how BGP AS paths reflect internet topology and how to detect anomalies like route leaks or AS path prepending.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Internet routes typically have 2-5 hops. A /32 or /24 appearing with only a 1-hop AS path from an unexpected ASN is a classic hijack indicator. Routes with 10+ hops may indicate prepending.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"gradientMode": "none","lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 10,"w": 12,"x": 0,"y": 0},
"id": 1,
"options": {"barRadius": 0,"barWidth": 0.7,"groupWidth": 0.7,"legend": {"calcs": [],"displayMode": "list","placement": "bottom"},"orientation": "auto","tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n ba.as_path_count AS \"AS Path Length (hops)\",\n COUNT(*) AS \"Prefix Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false\n AND r.isipv4 = true\n AND ba.as_path_count > 0\nGROUP BY ba.as_path_count\nORDER BY ba.as_path_count",
"refId": "A"
}
],
"title": "AS Path Length Distribution (Active IPv4 Routes)",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Average AS path length on the internet is ~4-5 hops. Your lab has shorter paths since ExaBGP is a single eBGP hop away.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 8}]},
"unit": "short",
"decimals": 1
}
},
"gridPos": {"h": 5,"w": 6,"x": 12,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(AVG(ba.as_path_count)::numeric, 1) AS \"Avg AS Path Length\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count > 0",
"refId": "A"
}
],
"title": "Average AS Path Length",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Routes with only a 1-hop AS path are either directly connected or possibly hijacked. In your lab, ExaBGP injects routes starting with AS 65100.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 20}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 6,"x": 18,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(*) AS \"Direct (1-hop) Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count = 1",
"refId": "A"
}
],
"title": "1-Hop Routes (Direct/Possible Hijack)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: The longest paths reveal the most AS-level hops in your network. AS path prepending intentionally lengthens paths to make a route less preferred.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "AS Path Length"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 10}]}}]},
{"matcher": {"id": "byName","options": "AS Path"},"properties": [{"id": "custom.width","value": 400}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 10},
"id": 4,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "AS Path Length"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.as_path_count AS \"AS Path Length\",\n ba.as_path::text AS \"AS Path\",\n ba.origin_as AS \"Origin AS\",\n ba.next_hop AS \"Next Hop\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nORDER BY ba.as_path_count DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Longest AS Paths (Top 30)",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Origin AS is the rightmost ASN in the AS path — the network that first originated the prefix. Most internet prefixes are originated by their owning organization.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Route Count"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "custom.width","value": 200}]}
]
},
"gridPos": {"h": 12,"w": 12,"x": 0,"y": 20},
"id": 5,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Route Count"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n ba.origin_as AS \"Origin AS\",\n COALESCE(ia.as_name, 'Unknown') AS \"AS Name\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN info_asn ia ON ia.asn = ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.origin_as, ia.as_name\nORDER BY COUNT(*) DESC\nLIMIT 20",
"refId": "A"
}
],
"title": "Top Origin ASNs by Route Count",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A transit AS (appearing frequently in AS paths but not as origin) is a carrier. The most frequent transit ASNs in your lab correspond to simulated Tier-1 carriers (174=Cogent, 3356=Lumen, 1299=Telia, etc.)",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"fillOpacity": 80,"lineWidth": 0},"unit": "short"}
},
"gridPos": {"h": 12,"w": 12,"x": 12,"y": 20},
"id": 6,
"options": {"barRadius": 0,"barWidth": 0.7,"groupWidth": 0.7,"legend": {"calcs": [],"displayMode": "list","placement": "bottom"},"orientation": "horizontal","tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n asn_val AS \"Transit ASN\",\n COUNT(*) AS \"Appearances in AS Paths\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nCROSS JOIN LATERAL unnest(ba.as_path) AS asn_val\nWHERE r.iswithdrawn = false AND asn_val != ba.origin_as\nGROUP BY asn_val\nORDER BY COUNT(*) DESC\nLIMIT 15",
"refId": "A"
}
],
"title": "Most Common Transit ASNs",
"type": "barchart"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","as-path","topology"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "AS Path Analysis",
"uid": "obmp-learn-03",
"version": 1
}
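The "Most Common Transit ASNs" query above unnests every AS path and counts each ASN that is not the route's origin. The following is a minimal Python sketch of the same counting, assuming the origin AS is the last element of each path (as the "Top Origin ASNs" panel describes). The function name `transit_counts` and the sample paths are hypothetical and not part of the dashboard.

```python
from collections import Counter
from typing import Iterable, Sequence


def transit_counts(as_paths: Iterable[Sequence[int]]) -> Counter:
    """Count appearances of non-origin ASNs across a set of AS paths.

    Mirrors the SQL: unnest(as_path) filtered by asn_val != origin_as.
    Assumption: the origin AS is the rightmost ASN in each path.
    """
    counts: Counter = Counter()
    for path in as_paths:
        if not path:
            continue  # skip empty paths (e.g. locally originated routes)
        origin = path[-1]
        # Like the SQL, a prepended ASN is counted once per appearance.
        counts.update(asn for asn in path if asn != origin)
    return counts


# Hypothetical sample paths, loosely modelled on the lab description.
sample = [
    [65100, 174, 3356],
    [65100, 174, 1299],
    [65100],
]
top = transit_counts(sample).most_common(15)  # same cap as LIMIT 15
```

As in the query, the result includes the lab's own injecting AS whenever it sits in front of a different origin, so a real report may want to exclude it.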

@@ -1,201 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "Explore BGP path attributes: communities, MED, local-pref and how they influence routing policy decisions.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "datasource","uid": "grafana"},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {
"content": "## BGP Path Attributes — What They Mean\n\n### BGP Communities (RFC 1997)\nCommunities are 32-bit tags attached to routes, written as **ASN:value** (e.g., `65000:100`). They carry policy signals between routers and ASes.\n\n**Well-known communities:**\n| Community | Decimal | Meaning |\n|-----------|---------|----------|\n| `65535:0` | NO_EXPORT | Do not advertise outside this AS or confederation |\n| `65535:1` | NO_ADVERTISE | Do not advertise to any peer |\n| `65535:666` | BLACKHOLE | Drop traffic destined for this prefix (RFC 7999) |\n\nPrivate communities (e.g., `65001:200`) are operator-defined — they may encode region, customer tier, or traffic-engineering intent.\n\n### Local Preference (local-pref)\n- **Scope:** iBGP only — never sent to eBGP peers.\n- **Effect:** Higher local-pref wins. Default is **100**.\n- **Use case:** Prefer one upstream provider over another for all outbound traffic.\n\n### Multi-Exit Discriminator (MED)\n- **Scope:** Sent to directly connected eBGP peers to influence *inbound* traffic.\n- **Effect:** Lower MED wins (when comparing routes from the same AS).\n- **Use case:** Tell a peer which of your links to prefer when sending traffic to you.\n\n> **Tip:** Use the panels below to explore what communities and attributes are actually present in the current RIB. Run `inject.py attributes` to load routes with varied communities and MED values.",
"mode": "markdown"
},
"title": "BGP Attribute Reference — Communities, Local-Pref, MED",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Each row is a unique community string (format ASN:value) seen across all active routes. High route counts for a community mean many routes share that policy tag. Look for well-known communities: 65535:0 (NO_EXPORT), 65535:1 (NO_ADVERTISE), 65535:666 (BLACKHOLE).",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Routes Tagged"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 10},{"color": "yellow","value": 100}]}}]}
]
},
"gridPos": {"h": 11,"w": 12,"x": 0,"y": 8},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Routes Tagged"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n comm AS \"Community\",\n COUNT(*) AS \"Routes Tagged\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false AND ba.community_list IS NOT NULL\nGROUP BY comm\nORDER BY COUNT(*) DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Top BGP Communities in Current RIB",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Routes with notable BGP attributes — tagged with communities or using non-default local-pref / MED values. These routes carry explicit policy information. Examine the Communities column for operator-defined tags and the Local Pref column to see traffic engineering decisions.",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Local Pref"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 101},{"color": "red","value": 200}]}}]},
{"matcher": {"id": "byName","options": "MED"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100}]}}]}
]
},
"gridPos": {"h": 11,"w": 12,"x": 12,"y": 8},
"id": 3,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix::text AS \"Prefix\",\n ba.origin_as AS \"Origin AS\",\n ba.community_list::text AS \"Communities\",\n ba.local_pref AS \"Local Pref\",\n ba.med AS \"MED\",\n ba.as_path_count AS \"Path Length\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND (ba.community_list IS NOT NULL OR ba.med IS NOT NULL OR ba.local_pref IS NOT NULL)\nORDER BY r.prefix\nLIMIT 100",
"refId": "A"
}
],
"title": "Routes with Notable Attributes",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: MED (Multi-Exit Discriminator) is used to influence inbound traffic from a directly connected AS. Lower MED is preferred. If most routes show 'Not Set', MED is not being used for traffic engineering. A single dominant MED value means a simple policy; many different values indicate fine-grained control.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 19},
"id": 4,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": -30,"xTickLabelSpacing": 100},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.med::text, 'Not Set') AS \"MED Value\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.med\nORDER BY ba.med NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "MED Value Distribution",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Local preference is an iBGP attribute — it never crosses AS boundaries. Default is 100. Routes with local-pref above 100 are preferred over the default path; below 100 they are used as last-resort. Non-100 values indicate active traffic-engineering policy. Run 'inject.py attributes' to inject routes with varied local-pref values.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 19},
"id": 5,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": -30,"xTickLabelSpacing": 100},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.local_pref::text, 'Not Set') AS \"Local Pref\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.local_pref\nORDER BY ba.local_pref DESC NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "Local Preference Value Distribution",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This count tells you how widely BGP communities are used in your network. A value of 0 means no community tagging — communities are an opt-in feature. Run 'inject.py attributes' to add routes with community strings.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 1}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 0,"y": 28},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Routes with Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.community_list IS NOT NULL\n AND array_length(ba.community_list, 1) > 0",
"refId": "A"
}
],
"title": "Routes with Communities",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: The number of distinct community strings seen across all active routes. A diverse set indicates fine-grained policy tagging. A single value means one uniform policy tag is applied.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 1},{"color": "yellow","value": 50}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 8,"y": 28},
"id": 7,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(DISTINCT comm) AS \"Unique Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false",
"refId": "A"
}
],
"title": "Unique Community Values",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Routes with a local-pref other than the default (100) have been explicitly policy-engineered. A high count here means your network actively uses local-pref to prefer specific paths. A value of 0 means all paths are at default preference.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100},{"color": "red","value": 1000}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 16,"y": 28},
"id": 8,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Custom Local-Pref Routes\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.local_pref IS NOT NULL\n AND ba.local_pref != 100",
"refId": "A"
}
],
"title": "Routes with Non-Default Local-Pref",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","communities","attributes","policy"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "BGP Attribute Explorer",
"uid": "obmp-learn-06",
"version": 1
}
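The reference panel above describes communities as 32-bit tags written as ASN:value. The following is a minimal Python sketch of that encoding, assuming the standard split into a high and a low 16-bit half. The helper names `format_community` and `describe_community` are hypothetical, and the lookup table holds only the three well-known values discussed in the dashboard.

```python
# Well-known community values (RFC 1997 and RFC 7999), keyed by 32-bit value.
WELL_KNOWN = {
    0xFFFFFF01: "NO_EXPORT",     # 65535:65281
    0xFFFFFF02: "NO_ADVERTISE",  # 65535:65282
    0xFFFF029A: "BLACKHOLE",     # 65535:666
}


def format_community(value: int) -> str:
    """Render a 32-bit community as 'ASN:value' (high 16 bits : low 16 bits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"community out of 32-bit range: {value}")
    return f"{value >> 16}:{value & 0xFFFF}"


def describe_community(value: int) -> str:
    """Append the well-known name, if any, to the ASN:value form."""
    text = format_community(value)
    name = WELL_KNOWN.get(value)
    return f"{text} ({name})" if name else text


# Hypothetical operator-defined tag 65000:100, packed into 32 bits.
private_tag = (65000 << 16) | 100
```

Whether `community_list` stores these strings or raw integers depends on the collector schema; the dashboard queries only treat each element as an opaque value to group by.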

@@ -1,152 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "Prefix stability analysis and route churn visualization. Teaches how to identify unstable routes and understand BGP churn.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This chart shows BGP advertisements and withdrawals bucketed per hour. A healthy network has steady low churn. Spikes in withdrawals indicate route instability events — link failures, IBGP reconvergence, or policy changes. Run 'inject.py churn' to generate synthetic churn data and observe it here.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"drawStyle": "bars","fillOpacity": 60,"lineWidth": 1,"spanNulls": false,"stacking": {"group": "A","mode": "none"}},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "Advertisements"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "Withdrawals"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["sum","max"],"displayMode": "list","placement": "bottom"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'1h'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Advertisements vs Withdrawals Rate (per hour)",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A prefix with more than 30 updates per day is considered unstable — it is flapping or being re-announced frequently. The Stability column categorizes each prefix. Run 'inject.py churn' to generate churn data and observe it here. Sort by 'Total Updates' to find the most problematic prefixes.",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Stability"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "mappings","value": [{"options": {"Very Stable": {"color": "green","index": 0},"Stable": {"color": "blue","index": 1},"Moderate": {"color": "yellow","index": 2},"Unstable": {"color": "red","index": 3}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Total Updates"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 7},{"color": "red","value": 30}]}}]}
]
},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 9},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Total Updates"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n prefix::text AS \"Prefix\",\n COUNT(*) AS \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) AS \"Withdrawals\",\n SUM(CASE WHEN NOT iswithdrawn THEN 1 ELSE 0 END) AS \"Announcements\",\n MAX(timestamp) AS \"Last Change\",\n CASE\n WHEN COUNT(*) = 1 THEN 'Very Stable'\n WHEN COUNT(*) <= 7 THEN 'Stable'\n WHEN COUNT(*) <= 30 THEN 'Moderate'\n ELSE 'Unstable'\n END AS \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY \"Total Updates\" DESC\nLIMIT 100",
"refId": "A"
}
],
"title": "Top Churning Prefixes",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This bar chart shows how many prefixes fall into each stability tier. In a healthy network, the vast majority of prefixes should be 'Very Stable' (only announced once during the window). A large 'Unstable' bar is a red flag. Run 'inject.py churn' to shift prefixes into the Unstable tier.",
"fieldConfig": {
"defaults": {
"color": {"mode": "fixed","fixedColor": "blue"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "1. Very Stable (1 update)"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "2. Stable (2-7 updates)"},"properties": [{"id": "color","value": {"fixedColor": "blue","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "3. Moderate (8-30 updates)"},"properties": [{"id": "color","value": {"fixedColor": "yellow","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "4. Unstable (31+ updates)"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 9,"w": 14,"x": 0,"y": 21},
"id": 3,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN cnt = 1 THEN '1. Very Stable (1 update)'\n WHEN cnt <= 7 THEN '2. Stable (2-7 updates)'\n WHEN cnt <= 30 THEN '3. Moderate (8-30 updates)'\n ELSE '4. Unstable (31+ updates)'\n END AS \"Stability Tier\",\n COUNT(*) AS \"Prefix Count\"\nFROM (\n SELECT prefix, COUNT(*) as cnt\n FROM ip_rib_log\n WHERE $__timeFilter(timestamp)\n GROUP BY prefix\n) sub\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Prefix Distribution by Stability Tier",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This is the single most churning prefix in the selected time range. If a prefix appears here repeatedly across time ranges, it may warrant investigation — check the AS path and peers announcing it.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null}]},
"unit": "string",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 10,"x": 14,"y": 21},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "center","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {"titleSize": 14,"valueSize": 18}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, prefix::text AS \"Most Churned Prefix\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY COUNT(*) DESC\nLIMIT 1",
"refId": "A"
}
],
"title": "Most Churned Prefix",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This counts how many distinct prefixes had at least one update event in the selected time window. During a normal steady state this number should be low. After a major routing event (e.g., upstream link failure) you may see thousands of prefixes change simultaneously.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 500},{"color": "red","value": 2000}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 4,"w": 10,"x": 14,"y": 26},
"id": 5,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Prefixes with Updates\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)",
"refId": "A"
}
],
"title": "Total Unique Prefixes with Updates",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","churn","stability"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Route Churn & Stability Score",
"uid": "obmp-learn-05",
"version": 1
}
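The churn dashboard above buckets prefixes into four stability tiers by their update count within the selected window. The following is a minimal Python sketch of the same thresholds (1, 2-7, 8-30, 31+), mirroring the SQL CASE expressions. The function names and the sample counts are hypothetical.

```python
from collections import Counter
from typing import Mapping


def stability_tier(update_count: int) -> str:
    """Map a per-prefix update count to the dashboard's stability tier."""
    if update_count < 1:
        raise ValueError("a prefix in the log has at least one update")
    if update_count == 1:
        return "Very Stable"
    if update_count <= 7:
        return "Stable"
    if update_count <= 30:
        return "Moderate"
    return "Unstable"


def tier_histogram(updates_per_prefix: Mapping[str, int]) -> Counter:
    """Count how many prefixes fall into each tier (the bar chart's data)."""
    return Counter(stability_tier(n) for n in updates_per_prefix.values())


# Hypothetical per-prefix update counts for one time window.
sample_counts = {
    "10.0.0.0/24": 1,
    "10.0.1.0/24": 5,
    "10.0.2.0/24": 30,
    "10.0.3.0/24": 31,
}
histogram = tier_histogram(sample_counts)
```

Because the tiers depend on raw counts, widening the dashboard time range moves prefixes toward the Unstable tier even when their update rate is unchanged.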

@@ -1,144 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "BGP peer session health, uptime, and flap analysis. Teaches session stability and how to diagnose flapping peers.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A healthy BGP mesh shows all peers UP continuously. Any gap in the UP state represents a session flap — investigate the reset reason.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom"},"mergeValues": true,"rowHeight": 0.9,"showValue": "auto","tooltip": {"mode": "single"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2",
"refId": "A"
}
],
"title": "Peer Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Current state of all BGP peers. Learn: 'bmp_reason' tells you why BMP reporting stopped. 'bgp_err_code' shows BGP NOTIFICATION error codes.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "custom.width","value": 200}]},
{"matcher": {"id": "byName","options": "AS"},"properties": [{"id": "custom.width","value": 80}]}
]
},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 8},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": false,"displayName": "State"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last State Change\",\n p.error_text AS \"Last Error\",\n p.local_hold_time AS \"Hold Time\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "Current Peer State",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Flap count = number of times a peer went from UP to DOWN. A peer flapping more than 2 times per hour needs investigation.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Flap Count"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]}}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 20},
"id": 3,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Flap Count"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n COUNT(CASE WHEN e.state = 'down' THEN 1 END) AS \"Flap Count\",\n MIN(e.timestamp) AS \"First Event\",\n MAX(e.timestamp) AS \"Last Event\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY p.name, p.peer_addr, p.peer_as\nORDER BY \"Flap Count\" DESC",
"refId": "A"
}
],
"title": "Peer Flap Analysis",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "yellow","value": 50},{"color": "green","value": 90}]},"unit": "percent","max": 100,"min": 0}},
"gridPos": {"h": 8,"w": 8,"x": 0,"y": 30},
"id": 4,
"options": {"orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"showThresholdLabels": false,"showThresholdMarkers": true,"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Mesh Health %\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Overall Peer Mesh Health",
"type": "gauge"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"unit": "short","mappings": [{"options": {"0": {"color": "red","index": 0,"text": "DOWN"}},"type": "value"}]}},
"gridPos": {"h": 8,"w": 8,"x": 8,"y": 30},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) AS \"Peers UP\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Peers Currently UP",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 8,"w": 8,"x": 16,"y": 30},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(CASE WHEN state = 'down' THEN 1 END) AS \"Flap Events (24h)\"\nFROM peer_event_log\nWHERE timestamp > NOW() - INTERVAL '24 hours' AND state = 'down'",
"refId": "A"
}
],
"title": "Flap Events (24h)",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","peers","flap"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Peer Session Health & Flap Analysis",
"uid": "obmp-learn-02",
"version": 1
}


@ -1,150 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "RPKI (Resource Public Key Infrastructure) validation status. Teaches BGP routing security and how RPKI prevents prefix hijacks by validating route origin.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match — this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin — unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical — they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** — they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.",
"datasource": {"type": "datasource","uid": "grafana"},
"gridPos": {"h": 10,"w": 8,"x": 0,"y": 0},
"id": 1,
"options": {"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match — this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin — unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical — they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** — they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.","mode": "markdown"},
"pluginVersion": "9.1.7",
"title": "RPKI Learning Guide",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total ROAs (Route Origin Authorizations) loaded from the RPKI validator. If 0, the cron job has not yet run.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "yellow","value": 1},{"color": "green","value": 100000}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 8,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI ROAs Loaded\" FROM rpki_validator",
"refId": "A"
}
],
"title": "RPKI ROAs Loaded",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes with a matching valid ROA — origin AS and prefix length both match.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Valid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\nWHERE r.iswithdrawn = false AND r.isipv4 = true",
"refId": "A"
}
],
"title": "RPKI Valid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes where a ROA exists but the origin AS does NOT match — high-priority investigation needed.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 16,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI Invalid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as\n )\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: ExaBGP-injected routes (AS 65100) will be NotFound since they use synthetic ASNs not registered in RPKI. Real internet prefixes with valid ROAs will appear as Valid.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"hideFrom": {"legend": false,"tooltip": false,"viz": false}},
"mappings": []
},
"overrides": []
},
"gridPos": {"h": 10,"w": 10,"x": 0,"y": 10},
"id": 5,
"options": {"displayLabels": ["percent","name"],"legend": {"displayMode": "list","placement": "bottom"},"pieType": "donut","tooltip": {"mode": "single"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN rv_valid.prefix IS NOT NULL THEN 'Valid'\n WHEN rv_any.prefix IS NOT NULL THEN 'Invalid'\n ELSE 'NotFound'\n END AS \"RPKI Status\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN rpki_validator rv_valid\n ON rv_valid.prefix >>= r.prefix AND rv_valid.origin_as = ba.origin_as AND r.prefix_len <= rv_valid.prefix_len_max\nLEFT JOIN rpki_validator rv_any\n ON rv_any.prefix >>= r.prefix AND rv_any.origin_as != ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "RPKI Validation Status Distribution",
"type": "piechart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefixes that have a ROA but the observed origin AS does not match. These are the most security-critical routes — each one represents a potential hijack or misconfiguration.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Status"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"Invalid": {"color": "red","index": 0},"Valid": {"color": "green","index": 1},"NotFound": {"color": "yellow","index": 2}},"type": "value"}]}]}
]
},
"gridPos": {"h": 14,"w": 14,"x": 10,"y": 10},
"id": 6,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized Origin AS (ROA)\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv2\n WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max\n )\nORDER BY r.prefix\nLIMIT 50",
"refId": "A"
}
],
"title": "RPKI Invalid Routes — Potential Hijacks",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","rpki","security"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "RPKI Validation Status",
"uid": "obmp-learn-04",
"version": 1
}


@ -1,137 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "BGP update and withdrawal rates over time. Teaches what normal BGP traffic looks like and how to detect route churn or instability.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A healthy network has far more advertisements than withdrawals. A withdrawal spike often signals a link failure or route flap.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"drawStyle": "bars","fillOpacity": 60,"lineWidth": 1,"spanNulls": false,"stacking": {"group": "A","mode": "none"}},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "Advertisements"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "Withdrawals"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["sum","max"],"displayMode": "list","placement": "bottom"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'5m'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Updates Over Time — Advertisements vs Withdrawals",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100},{"color": "red","value": 1000}]},"unit": "short","mappings": []}},
"gridPos": {"h": 5,"w": 6,"x": 0,"y": 10},
"id": 2,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Total Updates (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Total Updates (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Withdrawal rate above 30% is unusual. Above 50% may indicate a route leak or oscillation event.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 20},{"color": "red","value": 50}]},"unit": "percent","max": 100}},
"gridPos": {"h": 5,"w": 6,"x": 6,"y": 10},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Withdrawal Rate %\"\nFROM ip_rib_log\nWHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Withdrawal Rate % (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1000},{"color": "red","value": 10000}]},"unit": "short"}},
"gridPos": {"h": 5,"w": 6,"x": 12,"y": 10},
"id": 4,
"options": {"colorMode": "value","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT peer_hash_id) AS \"Active Peers\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '1 hour'",
"refId": "A"
}
],
"title": "Active Reporting Peers (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 500},{"color": "red","value": 2000}]},"unit": "short"}},
"gridPos": {"h": 5,"w": 6,"x": 18,"y": 10},
"id": 5,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Unique Prefixes Updated (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Unique Prefixes Updated (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Updates per peer over time. Learn: Peers should have similar update rates. A peer with dramatically more updates may be experiencing instability or receiving a full BGP table with frequent changes.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"drawStyle": "line","fillOpacity": 10,"lineWidth": 1,"spanNulls": false},"unit": "short"}
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 15},
"id": 6,
"options": {"legend": {"calcs": [],"displayMode": "list","placement": "right"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(s.interval_time,'30m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n SUM(s.advertise_avg + s.withdraw_avg) AS \"Updates\"\nFROM stats_peer_update_counts s\nJOIN bgp_peers p ON p.hash_id = s.peer_hash_id\nWHERE $__timeFilter(s.interval_time)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "Update Rate by Peer (30-min buckets)",
"type": "timeseries"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","churn"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "BGP Update Rate & Churn",
"uid": "obmp-learn-01",
"version": 1
}

File diff suppressed because it is too large


@ -0,0 +1,404 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Combined view of BMP control-plane data (from PostgreSQL) and gNMI data-plane telemetry (from InfluxDB). Correlate BGP peer state with interface traffic patterns.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Current BGP peer status from the OpenBMP PostgreSQL database. Shows peer address, name, and session state.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": true
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "peer_addr"
},
"properties": [
{
"id": "custom.width",
"value": 160
}
]
},
{
"matcher": {
"id": "byName",
"options": "name"
},
"properties": [
{
"id": "custom.width",
"value": 200
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "state"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n p.peer_addr,\n COALESCE(p.name, p.peer_addr::text) AS name,\n p.state,\n p.peer_as AS \"AS\",\n p.router_hash_id IS NOT NULL AS \"BMP Active\",\n p.timestamp AS \"Last State Change\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "BGP Peer Status",
"type": "table"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface traffic rates from gNMI streaming telemetry. Shows bytes per second for each interface across selected routers.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "Bps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)\n |> map(fn: (r) => ({r with _value: if r._value < 0.0 then 0.0 else r._value}))",
"refId": "A"
}
],
"title": "Interface Traffic",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "BGP update activity over time from the OpenBMP PostgreSQL database. Shows peer event transitions and update counts for correlation with traffic patterns.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "bars",
"fillOpacity": 50,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "normal"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"legend": {
"calcs": [
"sum"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp, '1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n COUNT(*) AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Update Activity",
"type": "timeseries"
},
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Combined BMP + Telemetry View\n\nThis dashboard integrates two complementary data sources to provide a unified network monitoring view:\n\n### Control Plane (BMP via PostgreSQL)\n- **BGP Peer Status** -- Real-time BGP session state from BMP (OpenBMP)\n- **BGP Update Activity** -- Session transitions and update events from `peer_event_log`\n\n### Data Plane (gNMI via InfluxDB)\n- **Interface Traffic** -- Streaming telemetry byte rates collected via gNMI at 10-second intervals\n\n### Correlation Use Cases\n- A BGP peer flap (control plane) should correlate with a traffic shift on affected interfaces (data plane)\n- Sustained high interface utilization (data plane) may precede BGP session resets due to congestion\n- Compare the number of active BGP peers with interface traffic to validate routing convergence",
"mode": "markdown"
},
"title": "About",
"type": "text"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Combined BMP + Telemetry View",
"uid": "obmp-telem-03",
"version": 1
}


@ -0,0 +1,491 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Interface error and drop counters collected via gNMI streaming telemetry. Helps identify interfaces with packet loss or physical layer issues.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
},
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> keep(columns: [\"name\"])\n |> distinct(column: \"name\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Interface",
"multi": true,
"name": "interface",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"name\", predicate: (r) => r._measurement == \"interface_counters\" and r.source =~ /${router:regex}/, start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface error counters over time: input errors, output errors, and CRC errors. A rising trend indicates physical or configuration issues.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"mean",
"max",
"last"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-errors\" or r._field == \"out-errors\" or r._field == \"in-fcs-errors\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Interface Errors",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface drop counters over time: input drops and output drops. Drops indicate congestion or queue overflow.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max",
"last"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-discards\" or r._field == \"out-discards\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Interface Drops",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Summary table showing the latest error and drop counter values per interface. Useful for quickly identifying problematic interfaces.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": true
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "in-errors"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "out-errors"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "in-discards"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "out-discards"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "in-errors"
}
]
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-errors\" or r._field == \"out-errors\" or r._field == \"in-fcs-errors\" or r._field == \"in-discards\" or r._field == \"out-discards\")\n |> toFloat()\n |> last()\n |> pivot(rowKey: [\"_time\"], columnKey: [\"_field\"], valueColumn: \"_value\")\n |> keep(columns: [\"source\", \"name\", \"in-errors\", \"out-errors\", \"in-fcs-errors\", \"in-discards\", \"out-discards\"])\n |> group()\n |> sort(columns: [\"in-errors\"], desc: true)",
"refId": "A"
}
],
"title": "Error Summary Table",
"type": "table"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Interface Errors & Drops",
"uid": "obmp-telem-02",
"version": 1
}
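
Review note on the counter panels above: the Interface Drops query relies on `derivative(unit: 1s, nonNegative: true)` to turn cumulative counters into per-second rates. The following is a minimal Python sketch of that idea, not the Flux implementation. It assumes that on a counter reset the previous value is treated as zero, which is how the Flux documentation describes `nonNegative`. The function name `non_negative_rate` and the sample layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical sample layout: (unix_seconds, cumulative_counter_value)
Sample = Tuple[float, float]


def non_negative_rate(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (timestamp, per-second rate) pairs for a cumulative counter."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            # Skip duplicate or out-of-order timestamps.
            continue
        # Assumption: a drop in the counter means it was reset, so the
        # previous value is treated as zero instead of yielding a negative rate.
        delta = v1 - v0 if v1 >= v0 else v1
        rates.append((t1, delta / dt))
    return rates


if __name__ == "__main__":
    # 10-second samples with a counter reset between t=10 and t=20.
    print(non_negative_rate([(0, 100), (10, 200), (20, 50), (30, 150)]))
```

With the 10-second sample interval these dashboards assume, a counter that resets between two samples produces a small positive rate rather than a large negative spike.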

@ -0,0 +1,385 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Interface utilization metrics collected via gNMI streaming telemetry from IOS-XR routers. Shows byte rates, packet rates, and top interfaces by traffic volume.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
},
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> keep(columns: [\"name\"])\n |> distinct(column: \"name\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Interface",
"multi": true,
"name": "interface",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"name\", predicate: (r) => r._measurement == \"interface_counters\" and r.source =~ /${router:regex}/, start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Rate of bytes received and sent per interface, computed as the derivative of cumulative counters. Unit: bytes per second.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "Bps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Input/Output Bytes Rate",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Rate of packets received and sent per interface, computed as the derivative of cumulative counters. Unit: packets per second.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "pps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-pkts\" or r._field == \"out-pkts\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Input/Output Packets Rate",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Top interfaces ranked by total bytes (received + sent) over the selected time range.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"fillOpacity": 80,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineWidth": 1,
"scaleDistribution": {
"type": "linear"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "decbytes"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"barRadius": 0,
"barWidth": 0.6,
"fullHighlight": false,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "horizontal",
"showValue": "auto",
"stacking": "none",
"tooltip": {
"mode": "single",
"sort": "none"
},
"xTickLabelRotation": 0
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> difference(nonNegative: true)\n |> group(columns: [\"source\", \"name\", \"_field\"])\n |> sum()\n |> group(columns: [\"source\", \"name\"])\n |> sum()\n |> group()\n |> sort(columns: [\"_value\"], desc: true)\n |> limit(n: 20)",
"refId": "A"
}
],
"title": "Top Interfaces by Traffic",
"type": "barchart"
},
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 4,
"w": 24,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Interface Utilization Dashboard\n\nThis dashboard displays real-time interface utilization metrics collected via **gNMI streaming telemetry** from IOS-XR routers.\n\n- **Data source:** InfluxDB (Telegraf gNMI input plugin)\n- **YANG model:** OpenConfig (`openconfig-interfaces`)\n- **Subscription path:** `/interfaces/interface/state/counters`\n- **Sample interval:** 10 seconds\n\nUse the **Router** and **Interface** template variables at the top to filter the view.",
"mode": "markdown"
},
"title": "About",
"type": "text"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Interface Utilization",
"uid": "obmp-telem-01",
"version": 1
}
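
Review note on the Top Interfaces by Traffic panel: its query sums values per field, then per `(source, name)` pair, and finally ranks the interfaces and keeps the first 20. A rough Python equivalent of those grouping steps is sketched below to make the intent easier to check. The function name `top_interfaces` and the row layout are hypothetical, and the sketch says nothing about how the per-sample values themselves are computed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical row layout: (source, name, field, value)
Row = Tuple[str, str, str, float]


def top_interfaces(rows: Iterable[Row], n: int = 20) -> List[Tuple[Tuple[str, str], float]]:
    """Sum in/out values per (source, name) and return the n largest totals."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for source, name, _field, value in rows:
        # Both directions are folded into one total per interface.
        totals[(source, name)] += value
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    rows = [
        ("r1", "Gi0/0/0/0", "in-octets", 100.0),
        ("r1", "Gi0/0/0/0", "out-octets", 50.0),
        ("r1", "Gi0/0/0/1", "in-octets", 400.0),
        ("r2", "Gi0/0/0/0", "out-octets", 10.0),
    ]
    print(top_interfaces(rows, n=2))
```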

@ -0,0 +1,112 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Kafka consumer-group lag for the OpenBMP ingestion path, sampled every 30s by the kafka-lag-monitor service. Use it to sanity-check ingestion under load: lag spikes during a BGP convergence storm and should drain back to ~0; the consumer member count rises when psql-app is scaled out.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total lag across all partitions at the latest sample.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 50000},{"color": "red","value": 1000000}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT sum(lag) AS \"Total Lag\" FROM kafka_consumer_lag WHERE group_id = '$group' AND ts = (SELECT max(ts) FROM kafka_consumer_lag WHERE group_id = '$group')","refId": "A"}],
"title": "Current Total Lag","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Active consumer members in the group at the latest sample. Rises when psql-app is scaled out.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 6,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT members AS \"Consumers\" FROM kafka_consumer_members WHERE group_id = '$group' ORDER BY ts DESC LIMIT 1","refId": "A"}],
"title": "Active Consumers","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Topic-partitions tracked for the group at the latest sample.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "purple","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"Partitions\" FROM kafka_consumer_lag WHERE group_id = '$group' AND ts = (SELECT max(ts) FROM kafka_consumer_lag WHERE group_id = '$group')","refId": "A"}],
"title": "Partitions Monitored","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Highest total lag observed in the selected time range.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 50000},{"color": "red","value": 1000000}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 18,"y": 0},
"id": 4,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT max(t.total) AS \"Peak Lag\" FROM (SELECT ts, sum(lag) AS total FROM kafka_consumer_lag WHERE group_id = '$group' AND $__timeFilter(ts) GROUP BY ts) t","refId": "A"}],
"title": "Peak Lag (range)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total consumer lag over time. A healthy ingestion path returns to near-zero after a burst; sustained growth means consumers cannot keep up.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 25,"lineInterpolation": "smooth","lineWidth": 1,"showPoints": "never","spanNulls": true},"unit": "short"},"overrides": []},
"gridPos": {"h": 12,"w": 12,"x": 0,"y": 4},
"id": 5,
"options": {"legend": {"calcs": ["max","last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT ts AS time, sum(lag) AS \"Total lag\" FROM kafka_consumer_lag WHERE group_id = '$group' AND $__timeFilter(ts) GROUP BY ts ORDER BY ts","refId": "A"}],
"title": "Total Consumer Lag","type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Active consumer members over time. Step changes correspond to psql-app scale events or rebalances.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 15,"lineInterpolation": "stepAfter","lineWidth": 2,"showPoints": "never","spanNulls": true},"unit": "short"},"overrides": []},
"gridPos": {"h": 12,"w": 12,"x": 12,"y": 4},
"id": 6,
"options": {"legend": {"calcs": ["min","max"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "single","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT ts AS time, members AS \"Consumers\" FROM kafka_consumer_members WHERE group_id = '$group' AND $__timeFilter(ts) ORDER BY ts","refId": "A"}],
"title": "Consumer Members","type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Lag broken down by topic. unicast_prefix and base_attribute carry the BGP route churn and dominate during a convergence storm.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 20,"lineInterpolation": "smooth","lineWidth": 1,"showPoints": "never","spanNulls": true,"stacking": {"group": "A","mode": "normal"}},"unit": "short"},"overrides": []},
"gridPos": {"h": 12,"w": 12,"x": 0,"y": 16},
"id": 7,
"options": {"legend": {"calcs": ["last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT ts AS time, topic AS metric, sum(lag) AS lag FROM kafka_consumer_lag WHERE group_id = '$group' AND $__timeFilter(ts) GROUP BY ts, topic ORDER BY ts","refId": "A"}],
"title": "Lag by Topic","type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Per-partition lag for openbmp.parsed.unicast_prefix. A single partition whose lag keeps growing while the others stay flat indicates a hot partition (skewed message keying); adding consumers gives it a dedicated thread but cannot split it.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"showPoints": "never","spanNulls": true},"unit": "short"},"overrides": []},
"gridPos": {"h": 12,"w": 12,"x": 12,"y": 16},
"id": 8,
"options": {"legend": {"calcs": ["max","last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT ts AS time, 'p' || partition AS metric, lag FROM kafka_consumer_lag WHERE group_id = '$group' AND topic = 'openbmp.parsed.unicast_prefix' AND $__timeFilter(ts) ORDER BY ts","refId": "A"}],
"title": "Lag by Partition (unicast_prefix)","type": "timeseries"
}
],
"refresh": "30s",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "telemetry", "kafka"],
"templating": {
"list": [
{"name": "group","type": "query","label": "Consumer Group","datasource": {"type": "postgres","uid": "obmp_postgres"},"query": "SELECT DISTINCT group_id FROM kafka_consumer_members ORDER BY 1","definition": "SELECT DISTINCT group_id FROM kafka_consumer_members ORDER BY 1","refresh": 1,"includeAll": false,"multi": false,"current": {"selected": true,"text": "obmp-psql-consumer","value": "obmp-psql-consumer"},"options": [],"sort": 1,"hide": 0}
]
},
"time": {"from": "now-3h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "Kafka Ingestion Lag",
"uid": "kafka-lag",
"version": 1,
"weekStart": ""
}
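
Review note on the lag stat panels: "Current Total Lag" sums lag over all partitions at the newest sample, while "Peak Lag (range)" first totals lag per sample and then takes the maximum. The Python sketch below mirrors those two aggregations so the difference is easy to see. The function names and the row layout are hypothetical stand-ins for rows of `kafka_consumer_lag` that have already been filtered to one consumer group.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical row layout: (ts, topic, partition, lag), one consumer group only
Row = Tuple[int, str, int, int]


def latest_total_lag(rows: Iterable[Row]) -> int:
    """Sum lag across all partitions at the newest timestamp."""
    rows = list(rows)
    if not rows:
        return 0
    latest = max(ts for ts, _topic, _partition, _lag in rows)
    return sum(lag for ts, _topic, _partition, lag in rows if ts == latest)


def peak_total_lag(rows: Iterable[Row]) -> int:
    """Total lag per timestamp, then the largest of those totals."""
    totals: Dict[int, int] = defaultdict(int)
    for ts, _topic, _partition, lag in rows:
        totals[ts] += lag
    return max(totals.values(), default=0)


if __name__ == "__main__":
    rows = [(1, "a", 0, 5), (1, "a", 1, 7), (2, "a", 0, 3), (2, "a", 1, 1)]
    print(latest_total_lag(rows), peak_total_lag(rows))
```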

@ -0,0 +1,98 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Real-time BGP churn rate from the obmp-churn-monitor fast-path consumer. This consumer reads Kafka with its own group and only counts announcements/withdrawals, so it stays current even when the main psql-app ingestion pipeline lags minutes behind during a churn storm. Use it alongside the Kafka Ingestion Lag dashboard: when lag is high, this dashboard still shows what is churning.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": true,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total churn events (announcements + withdrawals) in the last minute.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100000},{"color": "red","value": 1000000}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT COALESCE(sum(adds + dels),0) AS \"Churn (1m)\" FROM churn_metrics WHERE ts > now() - interval '1 minute'","refId": "A"}],
"title": "Churn Events (last min)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Route announcements in the last minute.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 6,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT COALESCE(sum(adds),0) AS \"Announcements\" FROM churn_metrics WHERE ts > now() - interval '1 minute'","refId": "A"}],
"title": "Announcements (last min)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Route withdrawals in the last minute.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 1}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT COALESCE(sum(dels),0) AS \"Withdrawals\" FROM churn_metrics WHERE ts > now() - interval '1 minute'","refId": "A"}],
"title": "Withdrawals (last min)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct BGP sessions with churn in the last minute.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "purple","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 18,"y": 0},
"id": 4,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"Sessions\" FROM (SELECT router_ip, peer_ip FROM churn_metrics WHERE ts > now() - interval '1 minute' AND (adds > 0 OR dels > 0) GROUP BY router_ip, peer_ip) s","refId": "A"}],
"title": "Churning Sessions","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP churn rate over time -- announcements vs withdrawals per minute. This stays live during a storm even while the Kafka Ingestion Lag dashboard shows the bulk pipeline backed up.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 25,"lineInterpolation": "smooth","lineWidth": 1,"showPoints": "never","spanNulls": true},"unit": "short"},"overrides": [{"matcher": {"id": "byName","options": "Withdrawals"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Announcements"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]},
"gridPos": {"h": 12,"w": 12,"x": 0,"y": 4},
"id": 5,
"options": {"legend": {"calcs": ["max","sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT $__timeGroupAlias(ts,'1m'), sum(adds) AS \"Announcements\", sum(dels) AS \"Withdrawals\" FROM churn_metrics WHERE $__timeFilter(ts) GROUP BY 1 ORDER BY 1","refId": "A"}],
"title": "Churn Rate (per minute)","type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Churn per minute broken down by the BMP router reporting it.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 20,"lineInterpolation": "smooth","lineWidth": 1,"showPoints": "never","spanNulls": true,"stacking": {"group": "A","mode": "normal"}},"unit": "short"},"overrides": []},
"gridPos": {"h": 12,"w": 12,"x": 12,"y": 4},
"id": 6,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT $__timeGroupAlias(ts,'1m'), COALESCE(host(router_ip),'(unknown)') AS metric, sum(adds + dels) AS churn FROM churn_metrics WHERE $__timeFilter(ts) GROUP BY 1, router_ip ORDER BY 1","refId": "A"}],
"title": "Churn by Router","type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Busiest BGP sessions by churn over the dashboard time range.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "text","value": null},{"color": "orange","value": 1}]}}]}]},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 16},
"id": 7,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Total Churn"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT host(router_ip) AS \"Router\", host(peer_ip) AS \"Peer\", peer_asn AS \"Peer AS\", sum(adds) AS \"Announces\", sum(dels) AS \"Withdraws\", sum(adds + dels) AS \"Total Churn\" FROM churn_metrics WHERE $__timeFilter(ts) GROUP BY router_ip, peer_ip, peer_asn ORDER BY \"Total Churn\" DESC LIMIT 20","refId": "A"}],
"title": "Top Churning Sessions","type": "table"
}
],
"refresh": "10s",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "telemetry", "bgp"],
"templating": {"list": []},
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "Live BGP Churn",
"uid": "live-churn",
"version": 1,
"weekStart": ""
}
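
Review note on the "Churn Rate (per minute)" panel: the query buckets `churn_metrics` rows into one-minute windows with `$__timeGroupAlias(ts,'1m')` and sums `adds` and `dels` per window. A minimal Python sketch of that bucketing follows, assuming timestamps are integer Unix seconds. The function name `churn_per_minute` and the row layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical row layout: (unix_seconds, adds, dels)
Row = Tuple[int, int, int]


def churn_per_minute(rows: Iterable[Row]) -> List[Tuple[int, int, int]]:
    """Sum announcements and withdrawals into 60-second buckets."""
    buckets: Dict[int, List[int]] = defaultdict(lambda: [0, 0])
    for ts, adds, dels in rows:
        minute = ts - ts % 60  # floor the timestamp to the start of its minute
        buckets[minute][0] += adds
        buckets[minute][1] += dels
    # Return buckets ordered by time, like ORDER BY 1 in the panel query.
    return [(minute, a, d) for minute, (a, d) in sorted(buckets.items())]


if __name__ == "__main__":
    print(churn_per_minute([(0, 1, 0), (59, 2, 1), (60, 5, 5), (125, 0, 3)]))
```

Minutes with no rows are simply absent from the result, which is why the panel sets `spanNulls` to connect the line across gaps.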

@ -0,0 +1,78 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-container CPU, memory, and I/O for the OpenBMP stack — collected by the Telegraf docker input. Watch memory % to catch a container approaching its mem_limit before it OOM-crashes.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Memory usage as a percentage of each container's mem_limit. Sustained values near 100% precede an OOM kill.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "percent","min": 0,"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}},
"overrides": []
},
"gridPos": {"h": 14,"w": 12,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["last","max"],"displayMode": "table","placement": "bottom","showLegend": true,"sortBy": "Max","sortDesc": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage_percent\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container Memory %",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "CPU usage per container (cpu-total). Can exceed 100% — that is multiple cores.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "percent","min": 0},
"overrides": []
},
"gridPos": {"h": 14,"w": 12,"x": 12,"y": 0},
"id": 2,
"options": {"legend": {"calcs": ["last","max"],"displayMode": "table","placement": "bottom","showLegend": true,"sortBy": "Max","sortDesc": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_cpu\" and r._field == \"usage_percent\" and r.cpu == \"cpu-total\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container CPU %",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Absolute memory usage per container.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "bytes","min": 0},
"overrides": []
},
"gridPos": {"h": 14,"w": 12,"x": 0,"y": 14},
"id": 3,
"options": {"legend": {"calcs": ["last","max"],"displayMode": "table","placement": "bottom","showLegend": true,"sortBy": "Max","sortDesc": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container Memory Usage",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Current memory pressure per container. Anything in orange/red is close to its mem_limit.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"},"unit": "percent"},
"overrides": [{"matcher": {"id": "byName","options": "Memory %"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "max","value": 100},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}}]}]
},
"gridPos": {"h": 14,"w": 12,"x": 12,"y": 14},
"id": 4,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Memory %"}]},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: -5m)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage_percent\")\n |> last()\n |> keep(columns: [\"container_name\", \"_value\"])\n |> group()\n |> rename(columns: {_value: \"Memory %\", container_name: \"Container\"})\n |> sort(columns: [\"Memory %\"], desc: true)","refId": "A"}],
"title": "Current Memory % by Container",
"type": "table"
}
],
"refresh": "30s",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "telemetry", "resources"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Stack Resources",
"uid": "obmp-stack-resources",
"version": 1
}


@ -0,0 +1,107 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Disk-space, PostgreSQL database/table growth, and GoBGP global-feed health. Disk and DB metrics come from Telegraf -> InfluxDB; feed health is read live from PostgreSQL. Watch this when the full-table feed is ingesting — the RIB grows fast.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Current size of the openbmp PostgreSQL database.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "decbytes","thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "value","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"postgresql_db_size\" and r._field == \"bytes\")\n |> last()\n |> keep(columns: [\"_time\", \"_value\"])","refId": "A"}],
"title": "openbmp Database Size","type": "stat"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Highest filesystem utilisation across monitored host volumes. Orange >80%, red >95%.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "percent","min": 0,"max": 100,"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 8,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: -10m)\n |> filter(fn: (r) => r._measurement == \"disk\" and r._field == \"used_percent\")\n |> last()\n |> max()\n |> keep(columns: [\"_value\"])","refId": "A"}],
"title": "Busiest Filesystem","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes currently held by the GoBGP global-table feed peer.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 16,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"GoBGP Feed Routes\" FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name = 'GoBGP' AND r.iswithdrawn = false","refId": "A"}],
"title": "GoBGP Feed Routes","type": "stat"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "openbmp database size over time. A steady climb is expected while the global feed ingests; a plateau means it has converged.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 15,"lineInterpolation": "smooth","lineWidth": 2,"pointSize": 5,"showPoints": "never","spanNulls": true},"unit": "decbytes","min": 0},"overrides": []},
"gridPos": {"h": 8,"w": 12,"x": 0,"y": 4},
"id": 4,
"options": {"legend": {"calcs": ["last","max"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "single","sort": "none"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"postgresql_db_size\" and r._field == \"bytes\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\"])","refId": "A"}],
"title": "Database Size Over Time","type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Filesystem utilisation per host volume. Threshold lines at 80% and 95%.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "linear","lineWidth": 2,"pointSize": 5,"showPoints": "never","spanNulls": true,"thresholdsStyle": {"mode": "line"}},"unit": "percent","min": 0,"max": 100,"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}},"overrides": []},
"gridPos": {"h": 8,"w": 12,"x": 12,"y": 4},
"id": 5,
"options": {"legend": {"calcs": ["last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"disk\" and r._field == \"used_percent\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"path\"])\n |> group(columns: [\"path\"])","refId": "A"}],
"title": "Filesystem Usage %","type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Largest tables in the openbmp database (total relation size, incl. indexes + TOAST).",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"},"unit": "decbytes"},"overrides": [{"matcher": {"id": "byName","options": "Size"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "color","value": {"mode": "continuous-BlPu"}}]}]},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 12},
"id": 6,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Size"}]},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: -15m)\n |> filter(fn: (r) => r._measurement == \"postgresql_table_size\" and r._field == \"bytes\")\n |> last()\n |> keep(columns: [\"tablename\", \"_value\"])\n |> group()\n |> rename(columns: {_value: \"Size\", tablename: \"Table\"})\n |> sort(columns: [\"Size\"], desc: true)","refId": "A"}],
"title": "Largest Tables","type": "table"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Current free space and utilisation per host filesystem.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Used %"},"properties": [{"id": "unit","value": "percent"},{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "max","value": 100},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}}]},{"matcher": {"id": "byName","options": "Free"},"properties": [{"id": "unit","value": "decbytes"}]}]},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 12},
"id": 7,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Used %"}]},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "free = from(bucket: \"telemetry\")\n |> range(start: -15m)\n |> filter(fn: (r) => r._measurement == \"disk\" and r._field == \"free\")\n |> last()\n |> keep(columns: [\"path\", \"_value\"])\n |> rename(columns: {_value: \"Free\"})\npct = from(bucket: \"telemetry\")\n |> range(start: -15m)\n |> filter(fn: (r) => r._measurement == \"disk\" and r._field == \"used_percent\")\n |> last()\n |> keep(columns: [\"path\", \"_value\"])\n |> rename(columns: {_value: \"Used %\"})\njoin(tables: {f: free, p: pct}, on: [\"path\"])\n |> rename(columns: {path: \"Filesystem\"})\n |> group()","refId": "A"}],
"title": "Filesystem Free Space","type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP sessions of the GoBGP global-table feed and their state. Both the IPv4 and IPv6 sessions to AS57355 should read 'up'.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"type": "value","options": {"up": {"color": "green","index": 0},"down": {"color": "red","index": 1}}}]}]}]},
"gridPos": {"h": 6,"w": 24,"x": 0,"y": 21},
"id": 8,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "Peer"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT host(vp.peerip) AS \"Peer\",\n vp.peerasn AS \"Peer AS\",\n vp.peer_state AS \"State\",\n (SELECT count(*) FROM ip_rib r WHERE r.peer_hash_id = vp.peer_hash_id AND r.iswithdrawn = false) AS \"Routes\"\nFROM v_peers vp\nWHERE vp.routername = 'GoBGP'\nORDER BY vp.peerip","refId": "A"}],
"title": "GoBGP Feed — BGP Sessions","type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "telemetry", "storage"],
"templating": {"list": []},
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Storage & Feed Health",
"uid": "obmp-storage-health",
"version": 1,
"weekStart": ""
}


@ -24,25 +24,47 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 3,
"id": null,
"iteration": 1654876929746,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"aliasColors": {},
"breakPoint": "50%",
"combine": {
"label": "Others",
"threshold": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fontSize": "80%",
"format": "none",
"description": "IPv4 vs IPv6 prefix count advertised by this ASN.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 5,
@ -50,24 +72,50 @@
"y": 0
},
"id": 6,
"legend": {
"show": true,
"values": true
},
"legendType": "Under graph",
"links": [],
"maxDataPoints": 3,
"nullPointMode": "connected",
"pieType": "pie",
"strokeWidth": 1,
"options": {
"displayLabels": [
"value"
],
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "bottom",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n max(timestamp) as time,\n count(*) as \"ipv4\"\nFROM\n global_ip_rib\nWHERE\n recv_origin_as = [[asn_num]]\n and family(prefix) = 4\nGROUP BY prefix\n",
"refId": "A"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n max(timestamp) as time,\n count(*) as \"ipv6\"\nFROM\n global_ip_rib\nWHERE\n recv_origin_as = [[asn_num]]\n and family(prefix) = 6\nGROUP BY prefix\n",
@ -75,8 +123,7 @@
}
],
"title": "Advertised IP Addresses",
"type": "grafana-piechart-panel",
"valueName": "total"
"type": "piechart"
},
{
"datasource": {
@ -175,99 +222,91 @@
"type": "stat"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"description": "IPv4/IPv6 prefixes originated by this ASN over time, with RPKI/IRR coverage (from stats_ip_origins).",
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 15,
"x": 9,
"y": 0
},
"hiddenSeries": false,
"id": 14,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"min",
"max",
"mean"
],
"displayMode": "table",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "8.5.4",
"pointradius": 5,
"points": true,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n $__time(interval_time),\n v4_prefixes,v6_prefixes,v4_with_rpki,v6_with_rpki,v4_with_irr,v6_with_irr\nFROM\n stats_ip_origins\nWHERE\n $__timeFilter(interval_time) and asn = [[asn_num]]\nORDER BY interval_time asc\n",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": "24h",
"timeRegions": [],
"title": "Originating Prefix Trend",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"decimals": 0,
"format": "none",
"logBase": 1,
"show": true
},
{
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@ -984,32 +1023,31 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "714",
"value": "714"
},
"hide": 0,
"includeAll": false,
"label": "ASN",
"multi": false,
"name": "asn_num",
"type": "textbox",
"label": "Origin AS",
"description": "Enter an origin AS number \u2014 every panel shows that AS's prefixes, upstreams, and downstreams from the BMP RIB.",
"query": "13335",
"current": {
"text": "13335",
"value": "13335"
},
"options": [
{
"selected": true,
"text": "109",
"value": "109"
"text": "13335",
"value": "13335",
"selected": true
}
],
"query": "109",
"queryValue": "714",
"skipUrlSync": false,
"type": "custom"
"hide": 0,
"skipUrlSync": false
}
]
},


@ -24,8 +24,10 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 4,
"links": [],
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": false,
"panels": [
{
@ -182,7 +184,25 @@
]
}
},
"overrides": []
"overrides": [
{
"matcher": {"id": "byName","options": "name"},
"properties": [
{"id": "links","value": [{"title": "Open Router Detail","url": "/d/obmp-router-detail/router-detail?var-router_hash=${__data.fields[\"hash_id\"]}"}]}
]
},
{
"matcher": {"id": "byName","options": "hash_id"},
"properties": [{"id": "custom.hidden","value": true}]
},
{
"matcher": {"id": "byName","options": "state"},
"properties": [
{"id": "custom.displayMode","value": "color-background"},
{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}
]
}
]
},
"gridPos": {
"h": 11,
@ -215,7 +235,7 @@
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select max(r.timestamp) as timestamp,r.name,max(ip_address) as ip_address,max(r.state) as state,\n count(*) as peers,max(description) as description, CASE WHEN max(r.state) = 'up' THEN 1 ELSE 0 END as stateBool\n from routers r\n JOIN bgp_peers p on (r.hash_id = p.router_hash_id)\n GROUP BY r.name;",
"rawSql": "select max(r.hash_id::text) as hash_id,max(r.timestamp) as timestamp,r.name,max(ip_address) as ip_address,max(r.state) as state,\n count(*) as peers,max(description) as description, CASE WHEN max(r.state) = 'up' THEN 1 ELSE 0 END as stateBool\n from routers r\n JOIN bgp_peers p on (r.hash_id = p.router_hash_id)\n GROUP BY r.name;",
"refId": "A",
"select": [
[
@ -397,7 +417,25 @@
]
}
},
"overrides": []
"overrides": [
{
"matcher": {"id": "byName","options": "PeerName"},
"properties": [
{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}
]
},
{
"matcher": {"id": "byName","options": "peer_hash_id"},
"properties": [{"id": "custom.hidden","value": true}]
},
{
"matcher": {"id": "byName","options": "State"},
"properties": [
{"id": "custom.displayMode","value": "color-background"},
{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}
]
}
]
},
"gridPos": {
"h": 14,
@ -435,7 +473,7 @@
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": " SELECT\n max(RouterName) as \"RouterName\",\n max(PeerName) as \"PeerName\",\n max(PeerIP) as \"PeerIP\",\n max(PeerASN) as \"PeerASN\",\n max(peer_state) as \"State\",\n max(LastModified) as \"LastModified\",\n max(v4_prefixes) as \"IPv4 Prefixes\",\n max(v6_prefixes) as \"IPv6 Prefixes\",\n CASE WHEN max(peer_state) = 'up' THEN 1 ELSE 0 END as stateBool\nFROM v_peers p\n LEFT JOIN stats_peer_rib s ON (p.peer_hash_id = s.peer_hash_id\n AND s.interval_time >= now() - interval '20 minutes')\nGROUP BY p.peer_hash_id;\n",
"rawSql": " SELECT\n p.peer_hash_id as peer_hash_id,\n max(RouterName) as \"RouterName\",\n max(PeerName) as \"PeerName\",\n max(PeerIP) as \"PeerIP\",\n max(PeerASN) as \"PeerASN\",\n max(peer_state) as \"State\",\n max(LastModified) as \"LastModified\",\n max(v4_prefixes) as \"IPv4 Prefixes\",\n max(v6_prefixes) as \"IPv6 Prefixes\",\n CASE WHEN max(peer_state) = 'up' THEN 1 ELSE 0 END as stateBool\nFROM v_peers p\n LEFT JOIN stats_peer_rib s ON (p.peer_hash_id = s.peer_hash_id\n AND s.interval_time >= now() - interval '20 minutes')\nGROUP BY p.peer_hash_id;\n",
"refId": "A",
"select": [
[
@ -464,7 +502,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": []


@ -0,0 +1,536 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "BGP peer session health, uptime, and flap analysis. Teaches session stability and how to diagnose flapping peers.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A healthy BGP mesh shows all peers UP continuously. Any gap in the UP state represents a session flap \u2014 investigate the reset reason.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"fillOpacity": 70,
"lineWidth": 0,
"spanNulls": false
},
"mappings": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
}
}
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"alignValue": "left",
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"mergeValues": true,
"rowHeight": 0.9,
"showValue": "auto",
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2",
"refId": "A"
}
],
"title": "Peer Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Current state of all BGP peers. Learn: 'bmp_reason' tells you why BMP reporting stopped. 'bgp_err_code' shows BGP NOTIFICATION error codes.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "State"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "mappings",
"value": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "Peer"
},
"properties": [
{
"id": "custom.width",
"value": 200
}
]
},
{
"matcher": {
"id": "byName",
"options": "AS"
},
"properties": [
{
"id": "custom.width",
"value": 80
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 8
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "State"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last State Change\",\n p.error_text AS \"Last Error\",\n p.local_hold_time AS \"Hold Time\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "Current Peer State",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Flap count = number of times a peer went from UP to DOWN. A peer flapping more than 2 times per hour needs investigation.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Flap Count"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 5
}
]
}
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Flap Count"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n COUNT(CASE WHEN e.state = 'down' THEN 1 END) AS \"Flap Count\",\n MIN(e.timestamp) AS \"First Event\",\n MAX(e.timestamp) AS \"Last Event\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY p.name, p.peer_addr, p.peer_as\nORDER BY \"Flap Count\" DESC",
"refId": "A"
}
],
"title": "Peer Flap Analysis",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "yellow",
"value": 50
},
{
"color": "green",
"value": 90
}
]
},
"unit": "percent",
"max": 100,
"min": 0
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"showThresholdLabels": false,
"showThresholdMarkers": true,
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Mesh Health %\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Overall Peer Mesh Health",
"type": "gauge"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short",
"mappings": [
{
"options": {
"0": {
"color": "red",
"index": 0,
"text": "DOWN"
}
},
"type": "value"
}
]
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 8,
"y": 30
},
"id": 5,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) AS \"Peers UP\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Peers Currently UP",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 5
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 30
},
"id": 6,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(CASE WHEN state = 'down' THEN 1 END) AS \"Flap Events (24h)\"\nFROM peer_event_log\nWHERE timestamp > NOW() - INTERVAL '24 hours' AND state = 'down'",
"refId": "A"
}
],
"title": "Flap Events (24h)",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"peers",
"flap",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Peer Session Health & Flap Analysis",
"uid": "obmp-learn-02",
"version": 1
}


@ -24,9 +24,21 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 5,
"id": null,
"iteration": 1654877090626,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -49,45 +61,112 @@
"type": "text"
},
{
"circleMaxSize": "15",
"circleMinSize": 2,
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"esMetric": "Count",
"description": "Geolocation of the matched prefix (from the geo_ip table).",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"hideEmpty": false,
"hideZero": false,
"id": 17,
"initialZoom": "1",
"locationData": "table",
"mapCenter": "(0°, 0°)",
"mapCenterLatitude": 0,
"mapCenterLongitude": 0,
"maxDataPoints": 1,
"mouseWheelZoom": false,
"showLegend": false,
"stickyLabels": false,
"tableQueryOptions": {
"geohashField": "geohash",
"labelField": "name",
"latitudeField": "latitude",
"longitudeField": "longitude",
"metricField": "value",
"queryType": "coordinates"
"options": {
"basemap": {
"config": {},
"name": "Layer 0",
"type": "default"
},
"controls": {
"mouseWheelZoom": false,
"showAttribution": true,
"showDebug": false,
"showMeasure": false,
"showScale": false,
"showZoom": true
},
"layers": [
{
"config": {
"showLegend": false,
"style": {
"color": {
"fixed": "red"
},
"opacity": 0.7,
"rotation": {
"fixed": 0,
"max": 360,
"min": -360,
"mode": "mod"
},
"size": {
"fixed": 8,
"max": 15,
"min": 2
},
"symbol": {
"fixed": "img/icons/marker/circle.svg",
"mode": "fixed"
},
"textConfig": {
"fontSize": 12,
"offsetX": 0,
"offsetY": 0,
"textAlign": "center",
"textBaseline": "middle"
}
}
},
"location": {
"latitude": "latitude",
"longitude": "longitude",
"mode": "coords"
},
"name": "Prefix Location",
"tooltip": true,
"type": "markers"
}
],
"tooltip": {
"mode": "details"
},
"view": {
"allLayers": true,
"id": "zero",
"lat": 0,
"lon": 0,
"zoom": 1
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@@ -123,12 +202,8 @@
]
}
],
"thresholds": "0,10",
"title": "Prefix Location",
"type": "grafana-worldmap-panel",
"unitPlural": "",
"unitSingle": "",
"valueName": "current"
"type": "geomap"
},
{
"datasource": {
@@ -317,12 +392,52 @@
"type": "piechart"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "origin_as"
},
"properties": [
{
"id": "links",
"value": [
{
"title": "Open ASN View for AS ${__value.raw}",
"url": "/d/asnview-agg/asn-view?var-asn_num=${__value.raw}",
"targetBlank": true
}
]
}
]
}
]
},
"gridPos": {
"h": 6,
"w": 24,
@@ -331,53 +446,17 @@
},
"id": 12,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
"options": {
"footer": {
"countRows": false,
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"mappingType": 1,
"pattern": "raw_output",
"preserveFormat": true,
"sanitize": false,
"thresholds": [],
"type": "string",
"unit": "short"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
@@ -412,16 +491,36 @@
}
],
"title": "ASN Info",
"transform": "table",
"type": "table-old"
"type": "table"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 24,
@@ -430,75 +529,17 @@
},
"id": 13,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
"options": {
"footer": {
"countRows": false,
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colorMode": "cell",
"colors": [
"#cca300",
"#e24d42",
"#9ac48a"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"mappingType": 1,
"pattern": "irr_origin_as",
"thresholds": [
"0",
"1"
],
"type": "number",
"unit": "none"
},
{
"alias": "",
"align": "auto",
"colorMode": "cell",
"colors": [
"#cca300",
"#e24d42",
"#9ac48a"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"mappingType": 1,
"pattern": "rpki_origin_as",
"thresholds": [
"0",
"1"
],
"type": "number",
"unit": "none"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
@@ -533,8 +574,7 @@
}
],
"title": "Prefix Info",
"transform": "table",
"type": "table-old"
"type": "table"
},
{
"datasource": {
@@ -761,7 +801,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": [


@@ -0,0 +1,200 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-peer drilldown — BGP session identity, state history, prefix counts, update/withdraw rate, recent events and negotiated capabilities for a single BGP session. The selector is router-qualified ('router -> peer'): prefix counts are routes RECEIVED from the selected peer (Adj-RIB-In). In a route-reflector mesh pick 'client -> core-loopback' to see a client's full received table; 'core -> client-loopback' shows only the client's originated routes.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Current BGP session state for this peer.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, CASE WHEN peer_state = 'up' THEN 1 ELSE 0 END AS \"Peer State\"\nFROM v_peers WHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Peer State",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "IPv4 prefixes from this peer (latest stats_peer_rib interval).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 4,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE((SELECT v4_prefixes FROM stats_peer_rib WHERE peer_hash_id = '$peer_hash'::uuid ORDER BY interval_time DESC LIMIT 1),0) AS \"IPv4 Prefixes\"","refId": "A"}],
"title": "IPv4 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "IPv6 prefixes from this peer (latest stats_peer_rib interval).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 8,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE((SELECT v6_prefixes FROM stats_peer_rib WHERE peer_hash_id = '$peer_hash'::uuid ORDER BY interval_time DESC LIMIT 1),0) AS \"IPv6 Prefixes\"","refId": "A"}],
"title": "IPv6 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Updates received from this peer in the last hour (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 12,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(updates),0) AS \"Updates (1h)\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND interval_time > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Updates (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Withdraws received from this peer in the last hour (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 16,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(withdraws),0) AS \"Withdraws (1h)\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND interval_time > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Withdraws (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Session down-events for this peer in the last 24 hours.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 20,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (24h)\"\nFROM peer_event_log\nWHERE peer_hash_id = '$peer_hash'::uuid AND state = 'down' AND timestamp > NOW() - INTERVAL '24 hours'","refId": "A"}],
"title": "Flap Events (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Identity and session parameters for the selected peer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 5,"w": 24,"x": 0,"y": 4},
"id": 7,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n routername AS \"Router\",\n peername AS \"Peer\",\n host(peerip) AS \"Address\",\n peerasn AS \"Peer AS\",\n as_name AS \"AS Name\",\n peer_state AS \"State\",\n peerholdtime AS \"Hold Time\",\n table_name AS \"Table\",\n lastmodified AS \"Last Change\",\n lastdownmessage AS \"Last Down Message\"\nFROM v_peers\nWHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Peer Info",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Session state over the selected range. Any transition to DOWN is a flap.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 7,"w": 24,"x": 0,"y": 9},
"id": 8,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom","showLegend": false},"mergeValues": true,"rowHeight": 0.9,"showValue": "auto","tooltip": {"mode": "single"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n 'Session' AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nWHERE e.peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY 1","refId": "A"}],
"title": "Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate for this peer (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 16},
"id": 9,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "Update / Withdraw Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefix count from this peer over time (from stats_peer_rib).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": true,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"}
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 16},
"id": 10,
"options": {"legend": {"calcs": ["last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n MAX(v4_prefixes) AS \"IPv4 Prefixes\",\n MAX(v6_prefixes) AS \"IPv6 Prefixes\"\nFROM stats_peer_rib\nWHERE peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "Prefix Count Trend",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP session state changes for this peer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 25},
"id": 11,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n e.timestamp AS \"Time\",\n e.state AS \"State\",\n e.bmp_reason AS \"BMP Reason\",\n e.bgp_err_code AS \"BGP Err Code\",\n e.bgp_err_subcode AS \"BGP Err Subcode\",\n e.error_text AS \"Reason\"\nFROM peer_event_log e\nWHERE e.peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY e.timestamp DESC\nLIMIT 100","refId": "A"}],
"title": "Recent Peer Events",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP capabilities negotiated on this session.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto","cellOptions": {"type": "auto","wrapText": true}}},
"overrides": [
{"matcher": {"id": "byName","options": "Sent Capabilities"},"properties": [{"id": "custom.width","value": 600}]},
{"matcher": {"id": "byName","options": "Received Capabilities"},"properties": [{"id": "custom.width","value": 600}]}
]
},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 34},
"id": 12,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n sentcapabilities AS \"Sent Capabilities\",\n recvcapabilities AS \"Received Capabilities\"\nFROM v_peers\nWHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Negotiated Capabilities",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","operations","peer"],
"templating": {
"list": [
{
"current": {},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "select routername || ' -> ' || peername as __text, peer_hash_id as __value from v_peers where length(peername) > 0 order by 1",
"hide": 0,
"includeAll": false,
"label": "Router -> Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "select routername || ' -> ' || peername as __text, peer_hash_id as __value from v_peers where length(peername) > 0 order by 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Peer Detail",
"uid": "obmp-peer-detail",
"version": 1
}
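
The `gridPos` values in the Peer Detail dashboard above can be sanity-checked before import. The following is a minimal sketch, not part of this change set: `overlaps` and `layout_problems` are hypothetical helper names, and the sample list only copies the `id` and `gridPos` of each panel from the JSON above. It assumes Grafana's 24-column grid and reports duplicate panel ids, panels that run past the right edge, and overlapping rectangles.

```python
from itertools import combinations

GRID_COLUMNS = 24  # width of Grafana's dashboard grid, in columns

# id and gridPos copied from the Peer Detail dashboard panels above.
PEER_DETAIL_PANELS = [
    {"id": 1, "gridPos": {"h": 4, "w": 4, "x": 0, "y": 0}},
    {"id": 2, "gridPos": {"h": 4, "w": 4, "x": 4, "y": 0}},
    {"id": 3, "gridPos": {"h": 4, "w": 4, "x": 8, "y": 0}},
    {"id": 4, "gridPos": {"h": 4, "w": 4, "x": 12, "y": 0}},
    {"id": 5, "gridPos": {"h": 4, "w": 4, "x": 16, "y": 0}},
    {"id": 6, "gridPos": {"h": 4, "w": 4, "x": 20, "y": 0}},
    {"id": 7, "gridPos": {"h": 5, "w": 24, "x": 0, "y": 4}},
    {"id": 8, "gridPos": {"h": 7, "w": 24, "x": 0, "y": 9}},
    {"id": 9, "gridPos": {"h": 9, "w": 12, "x": 0, "y": 16}},
    {"id": 10, "gridPos": {"h": 9, "w": 12, "x": 12, "y": 16}},
    {"id": 11, "gridPos": {"h": 9, "w": 24, "x": 0, "y": 25}},
    {"id": 12, "gridPos": {"h": 8, "w": 24, "x": 0, "y": 34}},
]


def overlaps(a, b):
    """Return True if two gridPos rectangles share any grid cell."""
    return (
        a["x"] < b["x"] + b["w"]
        and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"]
        and b["y"] < a["y"] + a["h"]
    )


def layout_problems(panels):
    """Collect human-readable layout problems for a list of panels."""
    problems = []
    seen_ids = set()
    for panel in panels:
        if panel["id"] in seen_ids:
            problems.append(f"duplicate panel id {panel['id']}")
        seen_ids.add(panel["id"])
        grid = panel["gridPos"]
        if grid["x"] + grid["w"] > GRID_COLUMNS:
            problems.append(
                f"panel {panel['id']} exceeds the {GRID_COLUMNS}-column grid"
            )
    for first, second in combinations(panels, 2):
        if overlaps(first["gridPos"], second["gridPos"]):
            problems.append(f"panels {first['id']} and {second['id']} overlap")
    return problems


if __name__ == "__main__":
    for problem in layout_problems(PEER_DETAIL_PANELS):
        print(problem)
```

Run against a full dashboard file, the same check would take `json.load(f)["panels"]` as input instead of the hand-copied list.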


@@ -0,0 +1,170 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-router drilldown — BMP state, peer health, prefix counts, update rate and recent session events for a single monitored router.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BMP session state for this router. Should be UP.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, CASE WHEN state = 'up' THEN 1 ELSE 0 END AS \"Router State\"\nFROM routers WHERE hash_id = '$router_hash'::uuid","refId": "A"}],
"title": "Router State",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router that are currently up (pre-policy Adj-RIB-In).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 4,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Up\"\nFROM bgp_peers\nWHERE router_hash_id = '$router_hash'::uuid AND isprepolicy = true AND state = 'up'","refId": "A"}],
"title": "Peers Up",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router that are not up. Investigate any non-zero value.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 8,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Down\"\nFROM bgp_peers\nWHERE router_hash_id = '$router_hash'::uuid AND isprepolicy = true AND state != 'up'","refId": "A"}],
"title": "Peers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total IPv4 prefixes across this router's peers (latest stats_peer_rib interval per peer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 12,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(s.v4_prefixes),0) AS \"IPv4 Prefixes\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v4_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true","refId": "A"}],
"title": "IPv4 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total IPv6 prefixes across this router's peers (latest stats_peer_rib interval per peer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 16,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(s.v6_prefixes),0) AS \"IPv6 Prefixes\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v6_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true","refId": "A"}],
"title": "IPv6 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peer session down-events on this router in the last hour.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 20,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (1h)\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND e.state = 'down' AND e.timestamp > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Flap Events (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Identity and BMP state for the selected router.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 5,"w": 24,"x": 0,"y": 4},
"id": 7,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n r.name AS \"Router\",\n host(r.ip_address) AS \"Mgmt IP\",\n host(r.bgp_id) AS \"BGP ID\",\n r.router_as AS \"AS\",\n r.state AS \"State\",\n r.timestamp AS \"Last Update\",\n r.description AS \"Description\"\nFROM routers r\nWHERE r.hash_id = '$router_hash'::uuid","refId": "A"}],
"title": "Router Info",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router with state, ASN and latest prefix counts. Click a peer to open Peer Detail.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}]},
{"matcher": {"id": "byName","options": "peer_hash_id"},"properties": [{"id": "custom.hidden","value": true}]}
]
},
"gridPos": {"h": 11,"w": 24,"x": 0,"y": 9},
"id": 8,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": false,"displayName": "State"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n p.hash_id AS peer_hash_id,\n COALESCE(NULLIF(p.name,''), host(p.peer_addr)) AS \"Peer\",\n host(p.peer_addr) AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n COALESCE(s.v4_prefixes,0) AS \"IPv4 Prefixes\",\n COALESCE(s.v6_prefixes,0) AS \"IPv6 Prefixes\",\n p.timestamp AS \"Last Change\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v4_prefixes, v6_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true\nORDER BY p.state, p.peer_addr","refId": "A"}],
"title": "Peers",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate across this router's peers (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 20},
"id": 9,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(c.interval_time,'5m'),\n SUM(c.updates) AS \"Updates\",\n SUM(c.withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer c\nJOIN bgp_peers p ON p.hash_id = c.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND $__timeFilter(c.interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "BGP Update Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP session state changes for this router's peers.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 29},
"id": 10,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n e.timestamp AS \"Time\",\n COALESCE(NULLIF(p.name,''), host(p.peer_addr)) AS \"Peer\",\n host(p.peer_addr) AS \"Address\",\n e.state AS \"State\",\n e.error_text AS \"Reason\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY e.timestamp DESC\nLIMIT 100","refId": "A"}],
"title": "Recent Peer Events",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","operations","router"],
"templating": {
"list": [
{
"current": {},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "select name as __text, hash_id as __value from routers where length(name) > 0",
"hide": 0,
"includeAll": false,
"label": "Router",
"multi": false,
"name": "router_hash",
"options": [],
"query": "select name as __text, hash_id as __value from routers where length(name) > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Router Detail",
"uid": "obmp-router-detail",
"version": 1
}
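
The "Peers" table in the Router Detail dashboard above links each row to Peer Detail by passing the hidden `peer_hash_id` column as `var-peer_hash`. The sketch below shows how the same drilldown URL could be built outside Grafana, for example in a runbook or alert message. It is illustrative only: `peer_detail_url` is a hypothetical helper, the root URL and the UUID are placeholder values, and the `/grafana/` subpath follows the `GF_SERVER_ROOT_URL` form described in the commit message.

```python
from urllib.parse import urlencode, urljoin

# Dashboard uid and slug as they appear in the data link above.
PEER_DETAIL_PATH = "d/obmp-peer-detail/peer-detail"


def peer_detail_url(root_url, peer_hash_id):
    """Build a Peer Detail link for one peer under the given Grafana root URL."""
    # Normalise to exactly one trailing slash so urljoin keeps the subpath.
    base = urljoin(root_url.rstrip("/") + "/", PEER_DETAIL_PATH)
    # var-<name> sets a dashboard template variable from the query string.
    return f"{base}?{urlencode({'var-peer_hash': peer_hash_id})}"


if __name__ == "__main__":
    # Placeholder root URL (documentation address range) and placeholder UUID.
    print(
        peer_detail_url(
            "http://192.0.2.10:3000/grafana/",
            "00000000-0000-0000-0000-000000000001",
        )
    )
```

Keeping the root URL as a parameter means the same helper serves both the local and the Authelia-fronted deployment without hardcoding a host.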


@@ -0,0 +1,117 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Explores the real Internet routing table pulled by the GoBGP feed (roadmap E3) — the eBGP-multihop session to the AS57355 route server, landed in ip_rib as the '$feed' peer. Use it as the comparison baseline for the Router Diff dashboard.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "datasource","uid": "grafana"},
"gridPos": {"h": 4,"w": 24,"x": 0,"y": 0},
"id": 10,
"options": {"code": {"language": "plaintext","showLineNumbers": false,"showMiniMap": false},"content": "## Global Internet Table\n\nThe real DFZ routing table received from the **AS57355** lab route server via the GoBGP feed, monitored over BMP and stored in `ip_rib` as the **$feed** peer. AS57355 is prepended to every AS path (it is the route server). This dashboard is the comparison baseline for **Router Diff** — select `$feed` there to diff a lab router against the real Internet. Counts grow as the table converges (~1M IPv4 + ~200k IPv6 once fully loaded).","mode": "markdown"},
"pluginVersion": "9.1.7",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct IPv4 prefixes in the global feed.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 0,"y": 4},
"id": 1,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"IPv4 Prefixes\" FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name = '$feed' AND r.iswithdrawn = false AND r.isipv4","refId": "A"}],
"title": "IPv4 Prefixes","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct IPv6 prefixes in the global feed. Requires the IPv6 session — see gobgp/README.md.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "purple","value": null}]},"unit": "short"},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 8,"y": 4},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"IPv6 Prefixes\" FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name = '$feed' AND r.iswithdrawn = false AND NOT r.isipv4","refId": "A"}],
"title": "IPv6 Prefixes","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct origin ASes across the global feed.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]},"unit": "short"},"overrides": []},
"gridPos": {"h": 4,"w": 8,"x": 16,"y": 4},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(DISTINCT r.origin_as) AS \"Origin ASes\" FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name = '$feed' AND r.iswithdrawn = false AND r.origin_as IS NOT NULL","refId": "A"}],
"title": "Origin ASes","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefix-length distribution. A real table peaks hard at /24 (IPv4) and /48 (IPv6).",
"fieldConfig": {"defaults": {"color": {"mode": "continuous-BlPu"},"custom": {"lineWidth": 1,"fillOpacity": 80,"axisPlacement": "auto"}},"overrides": []},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 8},
"id": 4,
"options": {"orientation": "vertical","showValue": "auto","xField": "Length","legend": {"showLegend": false},"tooltip": {"mode": "single"}},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT (CASE WHEN r.isipv4 THEN 'v4 /' ELSE 'v6 /' END || r.prefix_len) AS \"Length\",\n count(*) AS \"Prefixes\"\nFROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$feed' AND r.iswithdrawn = false\nGROUP BY r.isipv4, r.prefix_len\nORDER BY r.isipv4 DESC, r.prefix_len","refId": "A"}],
"title": "Prefix-Length Distribution","type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "ASes originating the most prefixes in the global table, with whois names.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Prefixes"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "color","value": {"mode": "continuous-BlPu"}}]}]},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 17},
"id": 5,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Prefixes"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT r.origin_as AS \"Origin AS\",\n COALESCE(NULLIF(ia.as_name,''),'AS' || r.origin_as) AS \"Name\",\n COALESCE(NULLIF(ia.country,''),'?') AS \"Country\",\n count(*) AS \"Prefixes\"\nFROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id\nLEFT JOIN info_asn ia ON ia.asn = r.origin_as\nWHERE rt.name = '$feed' AND r.iswithdrawn = false AND r.origin_as IS NOT NULL\nGROUP BY r.origin_as, ia.as_name, ia.country\nORDER BY count(*) DESC LIMIT 50","refId": "A"}],
"title": "Top Origin ASes by Prefix Count","type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefix-length count as a sortable table (companion to the bar chart).",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Prefixes"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "color","value": {"mode": "continuous-GrYlRd"}}]}]},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 17},
"id": 6,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Prefixes"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT CASE WHEN r.isipv4 THEN 'IPv4' ELSE 'IPv6' END AS \"AFI\",\n r.prefix_len AS \"Prefix Length\",\n count(*) AS \"Prefixes\"\nFROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$feed' AND r.iswithdrawn = false\nGROUP BY r.isipv4, r.prefix_len\nORDER BY count(*) DESC","refId": "A"}],
"title": "Prefix Lengths","type": "table"
},
{
"collapsed": false,
"gridPos": {"h": 1,"w": 24,"x": 0,"y": 26},
"id": 11,
"panels": [],
"title": "Prefix Lookup","type": "row"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Enter a prefix or CIDR block in the 'Search prefix' box (e.g. 8.8.8.0/24 or 1.0.0.0/8). Shows every route in the global table that overlaps it — the exact prefix plus more- and less-specifics.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": []},
"gridPos": {"h": 11,"w": 24,"x": 0,"y": 27},
"id": 7,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "Prefix"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT host(r.prefix) || '/' || r.prefix_len AS \"Prefix\",\n r.origin_as AS \"Origin AS\",\n COALESCE(NULLIF(ia.as_name,''),'AS' || r.origin_as) AS \"Origin Name\",\n host(ba.next_hop) AS \"Next Hop\",\n ba.as_path::text AS \"AS Path\",\n array_length(ba.as_path,1) AS \"Path Len\",\n r.timestamp AS \"Last Update\"\nFROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN info_asn ia ON ia.asn = r.origin_as\nWHERE rt.name = '$feed' AND r.iswithdrawn = false\n AND r.prefix && NULLIF('$search_prefix','')::inet\nORDER BY r.prefix, r.prefix_len LIMIT 500","refId": "A"}],
"title": "Routes Overlapping $search_prefix","type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "bgp", "global"],
"templating": {
"list": [
{"name": "feed","type": "custom","label": "Feed router","description": "The BMP router name of the global-table feed (the GoBGP container).","query": "GoBGP","current": {"text": "GoBGP","value": "GoBGP"},"options": [{"text": "GoBGP","value": "GoBGP","selected": true}],"hide": 0},
{"name": "search_prefix","type": "textbox","label": "Search prefix","description": "A prefix or CIDR block to look up in the global table (e.g. 8.8.8.0/24). Blank = no lookup.","query": "","current": {"text": "","value": ""},"options": [{"text": "","value": "","selected": true}],"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "Global Internet Table",
"uid": "global-table",
"version": 1,
"weekStart": ""
}
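
The Prefix Lookup panel above relies on PostgreSQL's `inet` overlap operator (`&&`), which matches when either network contains the other. As a rough illustration of that matching rule outside the database, here is a minimal Python sketch using the standard `ipaddress` module. The function name `overlaps` is hypothetical, and the blank-input handling only mirrors the `NULLIF('$search_prefix','')` guard in the query; it is not code from this repository.

```python
import ipaddress


def overlaps(route: str, search: str) -> bool:
    """Approximate the panel's `r.prefix && search::inet` filter."""
    if not search.strip():
        # A blank search box becomes NULL in the SQL, so no row matches.
        return False
    a = ipaddress.ip_network(route, strict=False)
    b = ipaddress.ip_network(search, strict=False)
    if a.version != b.version:
        # IPv4 and IPv6 networks never overlap.
        return False
    # True when one network contains the other (or they are equal).
    return a.overlaps(b)


if __name__ == "__main__":
    for candidate in ("8.0.0.0/8", "8.8.8.128/25", "8.8.9.0/24", ""):
        print(repr(candidate), overlaps("8.8.8.0/24", candidate))
```

A less-specific (`8.0.0.0/8`) and a more-specific (`8.8.8.128/25`) both match `8.8.8.0/24`, which is why the panel lists the exact prefix together with its covering and covered routes.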

@ -0,0 +1,466 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "AS path length distribution and analysis. Teaches how BGP AS paths reflect internet topology and how to detect anomalies like route leaks or AS path prepending.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Internet routes typically have AS paths of 2-5 hops. A /32 or /24 appearing with a 1-hop AS path from an unexpected ASN is a classic hijack indicator. Routes with 10+ hops may indicate prepending.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"barRadius": 0,
"barWidth": 0.7,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n ba.as_path_count::text AS \"AS Path Length (hops)\",\n COUNT(*) AS \"Prefix Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false\n AND r.isipv4 = true\n AND ba.as_path_count > 0\nGROUP BY ba.as_path_count\nORDER BY ba.as_path_count",
"refId": "A"
}
],
"title": "AS Path Length Distribution (Active IPv4 Routes)",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Average AS path length on the internet is ~4-5 hops. Your lab has shorter paths since ExaBGP is a single eBGP hop away.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 8
}
]
},
"unit": "short",
"decimals": 1
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 12,
"y": 0
},
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(AVG(ba.as_path_count)::numeric, 1) AS \"Avg AS Path Length\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count > 0",
"refId": "A"
}
],
"title": "Average AS Path Length",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes with a 1-hop AS path are either directly connected or possibly hijacked. In your lab, ExaBGP injects routes starting with AS 65100.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 20
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 18,
"y": 0
},
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(*) AS \"Direct (1-hop) Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count = 1",
"refId": "A"
}
],
"title": "1-Hop Routes (Direct/Possible Hijack)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: The longest paths show which prefixes cross the most ASes before reaching your network. AS path prepending intentionally lengthens a path to make that route less preferred.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "AS Path Length"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 10
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "AS Path"
},
"properties": [
{
"id": "custom.width",
"value": 400
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 4,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "AS Path Length"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.as_path_count AS \"AS Path Length\",\n ba.as_path::text AS \"AS Path\",\n ba.origin_as AS \"Origin AS\",\n ba.next_hop AS \"Next Hop\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nORDER BY ba.as_path_count DESC NULLS LAST\nLIMIT 30",
"refId": "A"
}
],
"title": "Longest AS Paths (Top 30)",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Origin AS is the rightmost ASN in the AS path \u2014 the network that first originated the prefix. Most internet prefixes are originated by their owning organization.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Route Count"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "custom.width",
"value": 200
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 12,
"x": 0,
"y": 20
},
"id": 5,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Route Count"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n ba.origin_as AS \"Origin AS\",\n COALESCE(ia.as_name, 'Unknown') AS \"AS Name\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN info_asn ia ON ia.asn = ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.origin_as, ia.as_name\nORDER BY COUNT(*) DESC\nLIMIT 20",
"refId": "A"
}
],
"title": "Top Origin ASNs by Route Count",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A transit AS appears in AS paths but is not the origin; the ASNs that appear most often are usually carriers. The most frequent transit ASNs in your lab correspond to simulated Tier-1 carriers (174=Cogent, 3356=Lumen, 1299=Arelion, formerly Telia, etc.)",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 12,
"w": 12,
"x": 12,
"y": 20
},
"id": 6,
"options": {
"barRadius": 0,
"barWidth": 0.7,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "horizontal",
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n asn_val::text AS \"Transit ASN\",\n COUNT(*) AS \"Appearances in AS Paths\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nCROSS JOIN LATERAL unnest(ba.as_path) AS asn_val\nWHERE r.iswithdrawn = false AND asn_val != ba.origin_as\nGROUP BY asn_val\nORDER BY COUNT(*) DESC\nLIMIT 15",
"refId": "A"
}
],
"title": "Most Common Transit ASNs",
"type": "barchart"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"as-path",
"topology",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "AS Path Analysis",
"uid": "obmp-learn-03",
"version": 1
}
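
The panel descriptions above note that very long AS paths may indicate prepending. One hedged way to tell prepending apart from a genuinely long path is to collapse consecutive repeats of the same ASN and compare lengths. The sketch below is illustrative only: the helper names `collapse_prepends` and `prepend_count` are hypothetical, and the input is assumed to be a list of ASNs like the `as_path` array the dashboards query.

```python
from itertools import groupby


def collapse_prepends(as_path):
    """Drop consecutive repeats of the same ASN, keeping path order."""
    return [asn for asn, _ in groupby(as_path)]


def prepend_count(as_path):
    """Number of hops that exist only because an ASN was repeated."""
    return len(as_path) - len(collapse_prepends(as_path))


if __name__ == "__main__":
    path = [65100, 174, 174, 174, 3356]
    print(collapse_prepends(path))  # distinct consecutive ASNs
    print(prepend_count(path))      # extra hops added by prepending
```

A path whose raw length is high but whose collapsed length is ordinary is most likely prepended rather than topologically distant.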

@ -0,0 +1,623 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Explore BGP path attributes: communities, MED, local-pref and how they influence routing policy decisions.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"content": "## BGP Path Attributes \u2014 What They Mean\n\n### BGP Communities (RFC 1997)\nCommunities are 32-bit tags attached to routes, written as **ASN:value** (e.g., `65000:100`). They carry policy signals between routers and ASes.\n\n**Well-known communities:**\n| Community | Name | Meaning |\n|-----------|------|---------|\n| `65535:65281` | NO_EXPORT | Do not advertise outside this AS or confederation |\n| `65535:65282` | NO_ADVERTISE | Do not advertise to any peer |\n| `65535:666` | BLACKHOLE | Drop traffic destined for this prefix (RFC 7999) |\n\nPrivate communities (e.g., `65001:200`) are operator-defined \u2014 they may encode region, customer tier, or traffic-engineering intent.\n\n### Local Preference (local-pref)\n- **Scope:** iBGP only \u2014 never sent to eBGP peers.\n- **Effect:** Higher local-pref wins. Default is **100**.\n- **Use case:** Prefer one upstream provider over another for all outbound traffic.\n\n### Multi-Exit Discriminator (MED)\n- **Scope:** Sent to directly connected eBGP peers to influence *inbound* traffic.\n- **Effect:** Lower MED wins (when comparing routes from the same AS).\n- **Use case:** Tell a peer which of your links to prefer when sending traffic to you.\n\n> **Tip:** Use the panels below to explore what communities and attributes are actually present in the current RIB. Run `inject.py attributes` to load routes with varied communities and MED values.",
"mode": "markdown"
},
"title": "BGP Attribute Reference \u2014 Communities, Local-Pref, MED",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Each row is a unique community string (format ASN:value) seen across all active routes. High route counts for a community mean many routes share that policy tag. Look for well-known communities: 65535:65281 (NO_EXPORT), 65535:65282 (NO_ADVERTISE), 65535:666 (BLACKHOLE).",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Routes Tagged"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 10
},
{
"color": "yellow",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 11,
"w": 12,
"x": 0,
"y": 8
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Routes Tagged"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n comm AS \"Community\",\n COUNT(*) AS \"Routes Tagged\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false AND ba.community_list IS NOT NULL\nGROUP BY comm\nORDER BY COUNT(*) DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Top BGP Communities in Current RIB",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes that carry explicit policy attributes: tagged with communities, or with a MED or local-pref value set. Examine the Communities column for operator-defined tags and the Local Pref column to see traffic engineering decisions.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Local Pref"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 101
},
{
"color": "red",
"value": 200
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "MED"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 11,
"w": 12,
"x": 12,
"y": 8
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix::text AS \"Prefix\",\n ba.origin_as AS \"Origin AS\",\n ba.community_list::text AS \"Communities\",\n ba.local_pref AS \"Local Pref\",\n ba.med AS \"MED\",\n ba.as_path_count AS \"Path Length\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND (ba.community_list IS NOT NULL OR ba.med IS NOT NULL OR ba.local_pref IS NOT NULL)\nORDER BY r.prefix\nLIMIT 100",
"refId": "A"
}
],
"title": "Routes with Notable Attributes",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: MED (Multi-Exit Discriminator) is used to influence inbound traffic from a directly connected AS. Lower MED is preferred. If most routes show 'Not Set', MED is not being used for traffic engineering. A single dominant MED value means a simple policy; many different values indicate fine-grained control.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 19
},
"id": 4,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": -30,
"xTickLabelSpacing": 100
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.med::text, 'Not Set') AS \"MED Value\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.med\nORDER BY ba.med NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "MED Value Distribution",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Local preference is an iBGP attribute \u2014 it never crosses AS boundaries. Default is 100. Routes with local-pref above 100 are preferred over the default path; below 100 they are used as last-resort. Non-100 values indicate active traffic-engineering policy. Run 'inject.py attributes' to inject routes with varied local-pref values.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 19
},
"id": 5,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": -30,
"xTickLabelSpacing": 100
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.local_pref::text, 'Not Set') AS \"Local Pref\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.local_pref\nORDER BY ba.local_pref DESC NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "Local Preference Value Distribution",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This count tells you how widely BGP communities are used in your network. A value of 0 means no community tagging \u2014 communities are an opt-in feature. Run 'inject.py attributes' to add routes with community strings.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 0,
"y": 28
},
"id": 6,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Routes with Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.community_list IS NOT NULL\n AND array_length(ba.community_list, 1) > 0",
"refId": "A"
}
],
"title": "Routes with Communities",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: The number of distinct community strings seen across all active routes. A diverse set indicates fine-grained policy tagging. A single value means one uniform policy tag is applied.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 1
},
{
"color": "yellow",
"value": 50
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 8,
"y": 28
},
"id": 7,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(DISTINCT comm) AS \"Unique Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false",
"refId": "A"
}
],
"title": "Unique Community Values",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes with a local-pref other than the default (100) have been explicitly policy-engineered. A high count here means your network actively uses local-pref to prefer specific paths. A value of 0 means all paths are at default preference.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
},
{
"color": "red",
"value": 1000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 16,
"y": 28
},
"id": 8,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Custom Local-Pref Routes\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.local_pref IS NOT NULL\n AND ba.local_pref != 100",
"refId": "A"
}
],
"title": "Routes with Non-Default Local-Pref",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"communities",
"attributes",
"policy",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "BGP Attribute Explorer",
"uid": "obmp-learn-06",
"version": 1
}
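
The reference panel above describes communities as 32-bit tags written as `ASN:value`. As a small illustration of that encoding, the following Python sketch splits a raw 32-bit value into its two 16-bit halves and labels the three well-known values named in the panel. The function names and the `WELL_KNOWN` table are hypothetical helpers for this example, not part of the dashboards or of `inject.py`.

```python
# Well-known community values (RFC 1997 and RFC 7999).
WELL_KNOWN = {
    (65535, 65281): "NO_EXPORT",     # 0xFFFFFF01
    (65535, 65282): "NO_ADVERTISE",  # 0xFFFFFF02
    (65535, 666): "BLACKHOLE",       # 0xFFFF029A
}


def split_community(raw):
    """Split a 32-bit community into (high 16 bits, low 16 bits)."""
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("community must fit in 32 bits")
    return raw >> 16, raw & 0xFFFF


def describe(raw):
    """Render a community as 'ASN:value', adding the name if well known."""
    high, low = split_community(raw)
    text = f"{high}:{low}"
    name = WELL_KNOWN.get((high, low))
    return f"{text} ({name})" if name else text


if __name__ == "__main__":
    for raw in (0xFFFFFF01, 0xFFFFFF02, 0xFFFF029A, (65001 << 16) | 200):
        print(describe(raw))
```

Operator-defined values such as `65001:200` fall through to the plain `ASN:value` form, since their meaning is local policy rather than a registered name.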

@ -0,0 +1,540 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Prefix stability analysis and route churn visualization. Teaches how to identify unstable routes and understand BGP churn.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This chart shows BGP advertisements and withdrawals bucketed per hour. A healthy network has steady low churn. Spikes in withdrawals indicate route instability events \u2014 link failures, iBGP reconvergence, or policy changes. Run 'inject.py churn' to generate synthetic churn data and observe it here.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "bars",
"fillOpacity": 60,
"lineWidth": 1,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Advertisements"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Withdrawals"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"sum",
"max"
],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'1h'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Advertisements vs Withdrawals Rate (per hour)",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A prefix with more than 30 updates in the selected time range (24 hours by default) is classed as Unstable: it is flapping or being re-announced frequently. The Stability column categorizes each prefix by update count (1 = Very Stable, 2-7 = Stable, 8-30 = Moderate, 31+ = Unstable). Run 'inject.py churn' to generate churn data and observe it here. Sort by 'Total Updates' to find the most problematic prefixes.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Stability"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "mappings",
"value": [
{
"options": {
"Very Stable": {
"color": "green",
"index": 0
},
"Stable": {
"color": "blue",
"index": 1
},
"Moderate": {
"color": "yellow",
"index": 2
},
"Unstable": {
"color": "red",
"index": 3
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "Total Updates"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 7
},
{
"color": "red",
"value": 30
}
]
}
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 9
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Total Updates"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n prefix::text AS \"Prefix\",\n COUNT(*) AS \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) AS \"Withdrawals\",\n SUM(CASE WHEN NOT iswithdrawn THEN 1 ELSE 0 END) AS \"Announcements\",\n MAX(timestamp) AS \"Last Change\",\n CASE\n WHEN COUNT(*) = 1 THEN 'Very Stable'\n WHEN COUNT(*) <= 7 THEN 'Stable'\n WHEN COUNT(*) <= 30 THEN 'Moderate'\n ELSE 'Unstable'\n END AS \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY \"Total Updates\" DESC\nLIMIT 100",
"refId": "A"
}
],
"title": "Top Churning Prefixes",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This bar chart shows how many prefixes fall into each stability tier. In a healthy network, the vast majority of prefixes should be 'Very Stable' (a single update during the window). A large 'Unstable' bar is a red flag. Run 'inject.py churn' to shift prefixes into the Unstable tier.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "fixed",
"fixedColor": "blue"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "1. Very Stable (1 update)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "2. Stable (2-7 updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "blue",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "3. Moderate (8-30 updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "yellow",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "4. Unstable (31+ updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 14,
"x": 0,
"y": 21
},
"id": 3,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN cnt = 1 THEN '1. Very Stable (1 update)'\n WHEN cnt <= 7 THEN '2. Stable (2-7 updates)'\n WHEN cnt <= 30 THEN '3. Moderate (8-30 updates)'\n ELSE '4. Unstable (31+ updates)'\n END AS \"Stability Tier\",\n COUNT(*) AS \"Prefix Count\"\nFROM (\n SELECT prefix, COUNT(*) AS cnt\n FROM ip_rib_log\n WHERE $__timeFilter(timestamp)\n GROUP BY prefix\n) sub\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Prefix Distribution by Stability Tier",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This is the single most churning prefix in the selected time range. If a prefix appears here repeatedly across time ranges, it may warrant investigation \u2014 check the AS path and peers announcing it.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
}
]
},
"unit": "string",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 10,
"x": 14,
"y": 21
},
"id": 4,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "/^Most Churned Prefix$/",
"values": false
},
"text": {
"titleSize": 14,
"valueSize": 18
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT prefix::text AS \"Most Churned Prefix\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY COUNT(*) DESC\nLIMIT 1",
"refId": "A"
}
],
"title": "Most Churned Prefix",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This counts how many distinct prefixes had at least one update event in the selected time window. During a normal steady state this number should be low. After a major routing event (e.g., upstream link failure) you may see thousands of prefixes change simultaneously.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 500
},
{
"color": "red",
"value": 2000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 4,
"w": 10,
"x": 14,
"y": 26
},
"id": 5,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Prefixes with Updates\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)",
"refId": "A"
}
],
"title": "Total Unique Prefixes with Updates",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"churn",
"stability",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Route Churn & Stability Score",
"uid": "obmp-learn-05",
"version": 1
}
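
The stability tiers used by the table and bar chart panels above can be sketched outside SQL. This is a minimal Python illustration, assuming the same cut-offs as the dashboard's `CASE` expressions (1, 2-7, 8-30, 31+ updates per prefix in the selected window); the function names `stability_tier` and `tier_histogram` are hypothetical and not part of the repository.

```python
from collections import Counter


def stability_tier(update_count: int) -> str:
    """Map a per-prefix update count to the dashboard's stability tier."""
    if update_count < 1:
        # A prefix only appears in the log query if it has at least one row.
        raise ValueError("update_count must be >= 1")
    if update_count == 1:
        return "Very Stable"
    if update_count <= 7:
        return "Stable"
    if update_count <= 30:
        return "Moderate"
    return "Unstable"


def tier_histogram(counts_by_prefix: dict[str, int]) -> dict[str, int]:
    """Count how many prefixes fall into each tier (the bar chart's data)."""
    return dict(Counter(stability_tier(n) for n in counts_by_prefix.values()))


if __name__ == "__main__":
    # Hypothetical update counts per prefix, for illustration only.
    sample = {"192.0.2.0/24": 1, "198.51.100.0/24": 5, "203.0.113.0/24": 42}
    print(tier_histogram(sample))
```

Keeping the cut-offs in one function mirrors the fact that both panels must agree on the tier boundaries; if the SQL thresholds change, both queries need the same update.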


@ -0,0 +1,405 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "RPKI (Resource Public Key Infrastructure) validation status. Teaches BGP routing security and how RPKI prevents prefix hijacks by validating route origin.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match \u2014 this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin \u2014 unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical \u2014 they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** \u2014 they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.",
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 10,
"w": 8,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match \u2014 this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin \u2014 unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical \u2014 they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** \u2014 they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.",
"mode": "markdown"
},
"pluginVersion": "9.1.7",
"title": "RPKI Learning Guide",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Total ROAs (Route Origin Authorizations) loaded from the RPKI validator. If 0, the cron job has not yet run.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "green",
"value": 100000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 8,
"y": 0
},
"id": 2,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI ROAs Loaded\" FROM rpki_validator",
"refId": "A"
}
],
"title": "RPKI ROAs Loaded",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes with a matching valid ROA \u2014 origin AS and prefix length both match.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 12,
"y": 0
},
"id": 3,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Valid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Valid Routes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes covered by a ROA that fail origin validation. High-priority investigation needed.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 1
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 16,
"y": 0
},
"id": 4,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI Invalid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n )\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: ExaBGP-injected routes (AS 65100) will be NotFound since they use synthetic ASNs not registered in RPKI. Real internet prefixes with valid ROAs will appear as Valid.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"mappings": []
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 10,
"x": 0,
"y": 10
},
"id": 5,
"options": {
"displayLabels": [
"percent",
"name"
],
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "donut",
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n ) THEN 'Valid'\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n ) THEN 'Invalid'\n ELSE 'NotFound'\n END AS \"RPKI Status\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "RPKI Validation Status Distribution",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes covered by a ROA that fail origin validation, listed alongside the origin AS that a covering ROA authorizes. These are the most security-critical routes: each one represents a potential hijack or misconfiguration.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Status"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "mappings",
"value": [
{
"options": {
"Invalid": {
"color": "red",
"index": 0
},
"Valid": {
"color": "green",
"index": 1
},
"NotFound": {
"color": "yellow",
"index": 2
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 14,
"x": 10,
"y": 10
},
"id": 6,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized Origin AS (ROA)\",\n rv.prefix_len_max AS \"ROA Max Length\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv2\n WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max\n )\nORDER BY r.prefix\nLIMIT 50",
"refId": "A"
}
],
"title": "RPKI Invalid Routes \u2014 Potential Hijacks",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"rpki",
"security",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "RPKI Validation Status",
"uid": "obmp-learn-04",
"version": 1
}
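
The three validation states described in the learning guide panel above can be sketched as a small function. This is a hedged Python illustration of route origin validation as the guide defines it (a covering ROA must match both origin AS and maximum prefix length), using only the standard `ipaddress` module; the `Roa` type and `validate_origin` name are hypothetical, and the sample ROA uses documentation address space rather than real data.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    """One ROA row: covering prefix, maximum allowed length, authorized AS."""
    prefix: str
    max_length: int
    origin_as: int


def validate_origin(route_prefix: str, origin_as: int, roas: list[Roa]) -> str:
    """Return 'Valid', 'Invalid', or 'NotFound' for a single route."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A match needs the same origin AS and a prefix length within maxLength.
        if roa.origin_as == origin_as and route.prefixlen <= roa.max_length:
            return "Valid"
    # Covered by some ROA but matched by none: Invalid. No covering ROA: NotFound.
    return "Invalid" if covered else "NotFound"


if __name__ == "__main__":
    roas = [Roa("192.0.2.0/24", 24, 64500)]
    for prefix, asn in [("192.0.2.0/24", 64500), ("192.0.2.0/25", 64500),
                        ("192.0.2.0/24", 64501), ("198.51.100.0/24", 64500)]:
        print(prefix, asn, validate_origin(prefix, asn, roas))
```

Note that a more-specific announcement from the authorized AS is still Invalid when it exceeds the ROA's maximum length, which is why the check cannot rely on an origin AS mismatch alone.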


@ -0,0 +1,465 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "BGP update and withdrawal rates over time. Teaches what normal BGP traffic looks like and how to detect route churn or instability.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A healthy network has far more advertisements than withdrawals. A withdrawal spike often signals a link failure or route flap.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "bars",
"fillOpacity": 60,
"lineWidth": 1,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Advertisements"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Withdrawals"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"sum",
"max"
],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'5m'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Updates Over Time \u2014 Advertisements vs Withdrawals",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
},
{
"color": "red",
"value": 1000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Total Updates (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Total Updates (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: The panel turns yellow at 20% and red at 50%. A withdrawal rate above 30% is unusual; above 50% may indicate a route leak or oscillation event.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 20
},
{
"color": "red",
"value": 50
}
]
},
"unit": "percent",
"max": 100
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 6,
"y": 10
},
"id": 3,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Withdrawal Rate %\"\nFROM ip_rib_log\nWHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Withdrawal Rate % (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1000
},
{
"color": "red",
"value": 10000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 12,
"y": 10
},
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT peer_hash_id) AS \"Active Peers\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '1 hour'",
"refId": "A"
}
],
"title": "Active Reporting Peers (1h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 500
},
{
"color": "red",
"value": 2000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 18,
"y": 10
},
"id": 5,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Unique Prefixes Updated (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Unique Prefixes Updated (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Updates per peer over time. Learn: Peers should have similar update rates. A peer with dramatically more updates may be experiencing instability or receiving a full BGP table with frequent changes.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"fillOpacity": 10,
"lineWidth": 1,
"spanNulls": false
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 15
},
"id": 6,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "right"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(s.interval_time,'30m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n SUM(s.advertise_avg + s.withdraw_avg) AS \"Updates\"\nFROM stats_peer_update_counts s\nJOIN bgp_peers p ON p.hash_id = s.peer_hash_id\nWHERE $__timeFilter(s.interval_time)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "Update Rate by Peer (30-min buckets)",
"type": "timeseries"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"churn",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "BGP Update Rate & Churn",
"uid": "obmp-learn-01",
"version": 1
}
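
The withdrawal-rate stat above divides withdrawals by total updates and guards against an empty window with `NULLIF`. A minimal Python sketch of the same arithmetic follows, assuming the panel's threshold steps (yellow at 20, red at 50); the function names are hypothetical, and Python's `round` on floats can differ from PostgreSQL's numeric rounding on exact half values.

```python
from typing import Optional


def withdrawal_rate_percent(withdrawals: int, total_updates: int) -> Optional[float]:
    """Percentage of updates that were withdrawals, or None for an empty window."""
    if total_updates == 0:
        # Mirrors NULLIF(COUNT(*), 0): no rows means no rate, not a division error.
        return None
    return round(100.0 * withdrawals / total_updates, 1)


def threshold_color(rate_percent: float) -> str:
    """Pick the color the stat panel's absolute thresholds would select."""
    if rate_percent >= 50:
        return "red"
    if rate_percent >= 20:
        return "yellow"
    return "green"


if __name__ == "__main__":
    rate = withdrawal_rate_percent(withdrawals=250, total_updates=1000)
    if rate is not None:
        print(rate, threshold_color(rate))
```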


@ -0,0 +1,102 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-router BGP policy diff: routes RECEIVED (BMP pre-policy Adj-RIB-In) vs KEPT (accepted into the BGP table, polled from the router) vs REJECTED, plus ADVERTISED counts and the bound inbound/outbound route-policy names. Kept/advertised data is collected by the obmp-rib-poller service over CLI+NETCONF because BMP on XRv9000 24.3.1 only carries pre-policy Adj-RIB-In. NOTE: Rejected = Received - Kept is everything BGP did not accept; inbound route-policy is one cause, alongside RR originator-id/cluster-list loop detection, AS-path loops and unreachable next-hops. Per-policy attribution is a future phase.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routers with poller data in the selected scope.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(DISTINCT rs.router_hash_id) AS \"Routers\" FROM router_rib_stats rs JOIN routers r ON r.hash_id = rs.router_hash_id WHERE r.name IN ($router)","refId": "A"}],
"title": "Routers Polled","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Neighbor address-families tracked (one row per router/neighbor/AF).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "purple","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 6,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"Neighbor AFs\" FROM router_rib_stats rs JOIN routers r ON r.hash_id = rs.router_hash_id WHERE r.name IN ($router)","refId": "A"}],
"title": "Neighbor AFs Tracked","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total prefixes received (BMP) minus kept (polled), summed where positive, across BMP-monitored neighbors. Includes inbound route-policy denies AND BGP loop/validation rejections (RR originator-id, AS-path loop, next-hop) -- not policy-only.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 1}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT COALESCE(sum(GREATEST(COALESCE(rcv.received,0) - COALESCE(rs.accepted_count,0), 0)),0) AS \"Rejected\" FROM router_rib_stats rs JOIN routers r ON r.hash_id = rs.router_hash_id LEFT JOIN (SELECT bp.router_hash_id, bp.peer_addr, CASE WHEN ir.isipv4 THEN 'ipv4' ELSE 'ipv6' END AS afi, count(*) AS received FROM ip_rib ir JOIN bgp_peers bp ON bp.hash_id = ir.peer_hash_id WHERE ir.iswithdrawn = false GROUP BY bp.router_hash_id, bp.peer_addr, afi) rcv ON rcv.router_hash_id = rs.router_hash_id AND rcv.peer_addr = rs.peer_addr AND rcv.afi = rs.afi WHERE r.name IN ($router)","refId": "A"}],
"title": "Total Rejected (Recv - Kept)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct route-policies (RPL) stored from the routers.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 18,"y": 0},
"id": 4,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"Policies\" FROM route_policies rp JOIN routers r ON r.hash_id = rp.router_hash_id WHERE r.name IN ($router)","refId": "A"}],
"title": "Route Policies Stored","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Per neighbor address-family: Received = BMP pre-policy Adj-RIB-In count (blank if the neighbor is not BMP-monitored). Kept = prefixes accepted into the BGP table (polled). Rejected = Received - Kept (inbound route-policy denies plus BGP loop/validation rejections). Advertised = Adj-RIB-Out size toward the neighbor. In-Policy/Out-Policy = the bound route-policy names.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Rejected"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "text","value": null},{"color": "orange","value": 1}]}}]}]},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 4},
"id": 5,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Rejected"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT r.name AS \"Router\", host(rs.peer_addr) AS \"Neighbor\", rs.peer_as AS \"Peer AS\", rs.afi AS \"AF\", rs.session_state AS \"State\", rcv.received AS \"Received (BMP)\", rs.accepted_count AS \"Kept\", rcv.received - rs.accepted_count AS \"Rejected\", rs.advertised_count AS \"Advertised\", bin.policy_name AS \"In-Policy\", bout.policy_name AS \"Out-Policy\", rs.polled_at AS \"Polled\" FROM router_rib_stats rs JOIN routers r ON r.hash_id = rs.router_hash_id LEFT JOIN neighbor_policy_bind bin ON bin.router_hash_id = rs.router_hash_id AND bin.peer_addr = rs.peer_addr AND bin.afi = rs.afi AND bin.direction = 'in' LEFT JOIN neighbor_policy_bind bout ON bout.router_hash_id = rs.router_hash_id AND bout.peer_addr = rs.peer_addr AND bout.afi = rs.afi AND bout.direction = 'out' LEFT JOIN (SELECT bp.router_hash_id, bp.peer_addr, CASE WHEN ir.isipv4 THEN 'ipv4' ELSE 'ipv6' END AS afi, count(*) AS received FROM ip_rib ir JOIN bgp_peers bp ON bp.hash_id = ir.peer_hash_id WHERE ir.iswithdrawn = false GROUP BY bp.router_hash_id, bp.peer_addr, afi) rcv ON rcv.router_hash_id = rs.router_hash_id AND rcv.peer_addr = rs.peer_addr AND rcv.afi = rs.afi WHERE r.name IN ($router) ORDER BY r.name, rs.peer_addr, rs.afi","refId": "A"}],
"title": "Per-Neighbor Policy Diff","type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefixes received (BMP) but not accepted into the BGP table, by router.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"lineWidth": 1,"fillOpacity": 80,"axisPlacement": "auto"}},"overrides": []},
"gridPos": {"h": 11,"w": 12,"x": 0,"y": 16},
"id": 6,
"options": {"orientation": "horizontal","showValue": "auto","xField": "Router","legend": {"showLegend": false},"tooltip": {"mode": "single"}},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT r.name AS \"Router\", sum(GREATEST(COALESCE(rcv.received,0) - COALESCE(rs.accepted_count,0), 0)) AS \"Rejected\" FROM router_rib_stats rs JOIN routers r ON r.hash_id = rs.router_hash_id LEFT JOIN (SELECT bp.router_hash_id, bp.peer_addr, CASE WHEN ir.isipv4 THEN 'ipv4' ELSE 'ipv6' END AS afi, count(*) AS received FROM ip_rib ir JOIN bgp_peers bp ON bp.hash_id = ir.peer_hash_id WHERE ir.iswithdrawn = false GROUP BY bp.router_hash_id, bp.peer_addr, afi) rcv ON rcv.router_hash_id = rs.router_hash_id AND rcv.peer_addr = rs.peer_addr AND rcv.afi = rs.afi WHERE r.name IN ($router) GROUP BY r.name ORDER BY r.name","refId": "A"}],
"title": "Rejected by Router","type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Full route-policy (RPL) bodies retrieved from the routers via NETCONF. The body is what the heuristic attribution engine would parse in a later phase.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": []},
"gridPos": {"h": 11,"w": 12,"x": 12,"y": 16},
"id": 7,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "Router"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT r.name AS \"Router\", rp.policy_name AS \"Policy\", rp.body AS \"RPL Body\", rp.retrieved_at AS \"Retrieved\" FROM route_policies rp JOIN routers r ON r.hash_id = rp.router_hash_id WHERE r.name IN ($router) ORDER BY r.name, rp.policy_name","refId": "A"}],
"title": "Route Policies (RPL)","type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "bgp", "policy"],
"templating": {
"list": [
{"name": "router","type": "query","label": "Router","datasource": {"type": "postgres","uid": "obmp_postgres"},"query": "SELECT name FROM routers ORDER BY name","definition": "SELECT name FROM routers ORDER BY name","refresh": 1,"includeAll": true,"multi": true,"current": {"selected": true,"text": ["All"],"value": ["$__all"]},"options": [],"sort": 1,"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "Policy Diff",
"uid": "policy-diff",
"version": 1,
"weekStart": ""
}
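
The "Rejected by Router" query in the Policy Diff dashboard above sums a clamped per-neighbor difference rather than a raw subtraction. A minimal Python sketch of that arithmetic follows, assuming rows shaped as `(router, received, kept)`; the function name and sample data are hypothetical and not part of the commit.

```python
from collections import defaultdict


def rejected_by_router(rows):
    """Sum per-neighbor rejections by router.

    rows: iterable of (router, received, kept); either count may be None.
    """
    totals = defaultdict(int)
    for router, received, kept in rows:
        # Mirrors COALESCE(x, 0): a missing count is treated as zero.
        diff = (received or 0) - (kept or 0)
        # Mirrors GREATEST(diff, 0): a neighbor never contributes a negative value.
        totals[router] += max(diff, 0)
    return dict(sorted(totals.items()))


# Hypothetical sample data for illustration only.
rows = [
    ("r1", 100, 90),   # 10 prefixes rejected
    ("r1", None, 50),  # neighbor not BMP-monitored: contributes 0, not -50
    ("r2", 40, 40),    # nothing rejected
]
print(rejected_by_router(rows))
```

The clamp matters because the per-neighbor table leaves Rejected blank when Received is missing, while a bar chart needs a number for every router.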


@ -25,7 +25,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 7,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -497,7 +509,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -25,7 +25,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 8,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -231,7 +243,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -26,7 +26,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 9,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -141,10 +153,6 @@
"type": "table"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
@ -152,46 +160,42 @@
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"links": [],
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"lineInterpolation": "smooth",
"lineWidth": 1,
"fillOpacity": 15,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto"
}
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 11,
"x": 0,
"y": 6
},
"hiddenSeries": false,
"id": 1,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"show": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
@ -222,43 +226,10 @@
]
}
],
"thresholds": [],
"timeRegions": [],
"title": "Prefix Advertisements & Withdrawals",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:289",
"format": "none",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:290",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
@ -266,49 +237,42 @@
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"links": [],
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"lineInterpolation": "smooth",
"lineWidth": 1,
"fillOpacity": 15,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto"
}
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 13,
"x": 11,
"y": 6
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"sort": "total",
"sortDesc": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
@ -338,39 +302,8 @@
]
}
],
"thresholds": [],
"timeRegions": [],
"title": "Changes by Peer",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:346",
"decimals": 0,
"format": "none",
"label": "",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:347",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@ -505,7 +438,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -0,0 +1,901 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Generic Router Diff. Compares the BGP routing tables (BMP Adj-RIB-In) of up to 4 selectable routers. Generalized from the 2-router RR Loc-RIB Diff dashboard. Router 1 and Router 2 are always compared; Router 3 and Router 4 are optional - set them to '-- none --' to compare just two.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 5,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"content": "## Router Diff\n\nCompares the BGP routing tables of up to **4 routers** via BMP (Adj-RIB-In). Select routers with the **Router 1-4** dropdowns. **Router 1** and **Router 2** are always compared; set **Router 3** / **Router 4** to `-- none --` to compare just two or three.\n\n- **Presence Matrix** — one row per prefix, one column per selected router, cell = best-path next-hop. Blank cell = prefix absent on that router.\n- **Divergence** — prefixes that are missing on some routers or whose best-path next-hop / AS-path disagree.\n- **Summary stats** — prefix count per router and total divergent prefixes.\n- **All Paths** — per-prefix drill-down across the selected routers (pick a prefix with the **Prefix** dropdown).",
"mode": "markdown"
},
"title": "Router Diff — Overview",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Active (non-withdrawn) prefixes on Router 1 for the selected AFI.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "blue",
"mode": "fixed"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 0,
"y": 5
},
"id": 10,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(DISTINCT r.prefix::text) AS \"prefixes\"\nFROM ip_rib r\nJOIN bgp_peers p ON p.hash_id = r.peer_hash_id\nJOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$router1' AND r.iswithdrawn = false\n AND ('$afi' = 'All' OR ('$afi' = 'IPv4' AND r.isipv4) OR ('$afi' = 'IPv6' AND NOT r.isipv4))",
"refId": "A"
}
],
"title": "$router1 Prefixes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Active (non-withdrawn) prefixes on Router 2 for the selected AFI.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "green",
"mode": "fixed"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 4,
"y": 5
},
"id": 11,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(DISTINCT r.prefix::text) AS \"prefixes\"\nFROM ip_rib r\nJOIN bgp_peers p ON p.hash_id = r.peer_hash_id\nJOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$router2' AND r.iswithdrawn = false\n AND ('$afi' = 'All' OR ('$afi' = 'IPv4' AND r.isipv4) OR ('$afi' = 'IPv6' AND NOT r.isipv4))",
"refId": "A"
}
],
"title": "$router2 Prefixes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Active prefixes on Router 3 for the selected AFI. Shows 0 when Router 3 is set to '-- none --'.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "purple",
"mode": "fixed"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "purple",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 8,
"y": 5
},
"id": 12,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(DISTINCT r.prefix::text) AS \"prefixes\"\nFROM ip_rib r\nJOIN bgp_peers p ON p.hash_id = r.peer_hash_id\nJOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$router3' AND r.iswithdrawn = false\n AND ('$afi' = 'All' OR ('$afi' = 'IPv4' AND r.isipv4) OR ('$afi' = 'IPv6' AND NOT r.isipv4))",
"refId": "A"
}
],
"title": "$router3 Prefixes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Active prefixes on Router 4 for the selected AFI. Shows 0 when Router 4 is set to '-- none --'.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "orange",
"mode": "fixed"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "orange",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 12,
"y": 5
},
"id": 13,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(DISTINCT r.prefix::text) AS \"prefixes\"\nFROM ip_rib r\nJOIN bgp_peers p ON p.hash_id = r.peer_hash_id\nJOIN routers rt ON rt.hash_id = p.router_hash_id\nWHERE rt.name = '$router4' AND r.iswithdrawn = false\n AND ('$afi' = 'All' OR ('$afi' = 'IPv4' AND r.isipv4) OR ('$afi' = 'IPv6' AND NOT r.isipv4))",
"refId": "A"
}
],
"title": "$router4 Prefixes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Distinct prefixes that diverge across the selected routers — either missing on some routers or with a different best-path next-hop / AS-path.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 25
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 16,
"y": 5
},
"id": 14,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "WITH params AS (\n SELECT '$router1'::text AS r1, '$router2'::text AS r2,\n '$router3'::text AS r3, '$router4'::text AS r4, '$afi'::text AS afi\n),\nbp AS (\n SELECT DISTINCT ON (rt.name, r.prefix, r.prefix_len)\n rt.name AS rname, r.prefix, r.prefix_len, ba.next_hop, ba.as_path\n FROM ip_rib r\n JOIN bgp_peers p ON p.hash_id = r.peer_hash_id\n JOIN routers rt ON rt.hash_id = p.router_hash_id\n JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\n CROSS JOIN params\n WHERE r.iswithdrawn = false\n AND rt.name IN (params.r1, params.r2, params.r3, params.r4)\n AND (params.afi = 'All' OR (params.afi = 'IPv4' AND r.isipv4) OR (params.afi = 'IPv6' AND NOT r.isipv4))\n ORDER BY rt.name, r.prefix, r.prefix_len, ba.local_pref DESC NULLS LAST\n),\nagg AS (\n SELECT bp.prefix, bp.prefix_len,\n COUNT(DISTINCT bp.rname) AS present_on,\n COUNT(DISTINCT host(bp.next_hop)) AS nh_variants,\n COUNT(DISTINCT bp.as_path) AS path_variants\n FROM bp GROUP BY bp.prefix, bp.prefix_len\n),\nsel AS (\n SELECT COUNT(*) AS n FROM (\n SELECT t.v FROM params, unnest(ARRAY[params.r1, params.r2, params.r3, params.r4]) AS t(v)\n WHERE t.v <> '-- none --'\n ) s\n)\nSELECT COUNT(*) AS \"divergent\"\nFROM agg\nWHERE agg.present_on < (SELECT n FROM sel) OR agg.nh_variants > 1 OR agg.path_variants > 1",
"refId": "A"
}
],
"title": "Divergent Prefixes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Number of routers currently selected (Router 1-4, excluding any set to '-- none --').",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "text",
"mode": "fixed"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "text",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 20,
"y": 5
},
"id": 15,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) AS \"routers\"\nFROM (\n SELECT t.v\n FROM unnest(ARRAY['$router1','$router2','$router3','$router4']) AS t(v)\n WHERE t.v <> '-- none --'\n) s",
"refId": "A"
}
],
"title": "Routers Selected",
"type": "stat"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 9
},
"id": 20,
"title": "Presence Matrix",
"type": "row"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "One row per prefix, one column per selected router. Each cell shows the best-path next-hop on that router; a blank cell means the prefix is absent there. Columns for routers set to '-- none --' are headed '-- none --' and stay empty. Filter by AFI with the dropdown.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Prefix"
},
"properties": [
{
"id": "custom.width",
"value": 200
}
]
},
{
"matcher": {
"id": "byName",
"options": "AFI"
},
"properties": [
{
"id": "custom.width",
"value": 70
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 24,
"x": 0,
"y": 10
},
"id": 21,
"options": {
"footer": {
"fields": "",
"reducer": [
"count"
],
"show": true
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "Prefix"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "WITH params AS (\n SELECT '$router1'::text AS r1, '$router2'::text AS r2,\n '$router3'::text AS r3, '$router4'::text AS r4, '$afi'::text AS afi\n),\nbp AS (\n SELECT DISTINCT ON (rt.name, r.prefix, r.prefix_len)\n rt.name AS rname, r.prefix, r.prefix_len, r.isipv4, ba.next_hop\n FROM ip_rib r\n JOIN bgp_peers p ON p.hash_id = r.peer_hash_id\n JOIN routers rt ON rt.hash_id = p.router_hash_id\n JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\n CROSS JOIN params\n WHERE r.iswithdrawn = false\n AND rt.name IN (params.r1, params.r2, params.r3, params.r4)\n AND (params.afi = 'All' OR (params.afi = 'IPv4' AND r.isipv4) OR (params.afi = 'IPv6' AND NOT r.isipv4))\n ORDER BY rt.name, r.prefix, r.prefix_len, ba.local_pref DESC NULLS LAST\n)\nSELECT\n bp.prefix::text AS \"Prefix\",\n CASE WHEN bp.isipv4 THEN 'IPv4' ELSE 'IPv6' END AS \"AFI\",\n MAX(CASE WHEN bp.rname = pr.r1 THEN host(bp.next_hop) END) AS \"$router1\",\n MAX(CASE WHEN bp.rname = pr.r2 THEN host(bp.next_hop) END) AS \"$router2\",\n MAX(CASE WHEN bp.rname = pr.r3 THEN host(bp.next_hop) END) AS \"$router3\",\n MAX(CASE WHEN bp.rname = pr.r4 THEN host(bp.next_hop) END) AS \"$router4\"\nFROM bp CROSS JOIN params pr\nGROUP BY bp.prefix, bp.prefix_len, bp.isipv4, pr.r1, pr.r2, pr.r3, pr.r4\nORDER BY \"Prefix\"",
"refId": "A"
}
],
"title": "Prefix Presence Matrix (cell = best-path next-hop)",
"type": "table"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 24
},
"id": 30,
"title": "Divergence",
"type": "row"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefixes where the selected routers disagree: either the prefix is present on some routers but not all, or the best-path next-hop / AS-path differs between routers. 'Present On' counts how many selected routers carry the prefix; 'Selected' is the total selected router count.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Divergence"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "mappings",
"value": [
{
"options": {
"AS-Path differs": {
"color": "orange",
"index": 1
},
"Missing on some": {
"color": "red",
"index": 0
},
"NH + AS-Path differ": {
"color": "red",
"index": 3
},
"Next-Hop differs": {
"color": "yellow",
"index": 2
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "Next-Hops"
},
"properties": [
{
"id": "custom.width",
"value": 320
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 24,
"x": 0,
"y": 25
},
"id": 31,
"options": {
"footer": {
"fields": "",
"reducer": [
"count"
],
"show": true
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "Prefix"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "WITH params AS (\n SELECT '$router1'::text AS r1, '$router2'::text AS r2,\n '$router3'::text AS r3, '$router4'::text AS r4, '$afi'::text AS afi\n),\nbp AS (\n SELECT DISTINCT ON (rt.name, r.prefix, r.prefix_len)\n rt.name AS rname, r.prefix, r.prefix_len, r.isipv4, ba.next_hop, ba.as_path\n FROM ip_rib r\n JOIN bgp_peers p ON p.hash_id = r.peer_hash_id\n JOIN routers rt ON rt.hash_id = p.router_hash_id\n JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\n CROSS JOIN params\n WHERE r.iswithdrawn = false\n AND rt.name IN (params.r1, params.r2, params.r3, params.r4)\n AND (params.afi = 'All' OR (params.afi = 'IPv4' AND r.isipv4) OR (params.afi = 'IPv6' AND NOT r.isipv4))\n ORDER BY rt.name, r.prefix, r.prefix_len, ba.local_pref DESC NULLS LAST\n),\nagg AS (\n SELECT bp.prefix, bp.prefix_len, bool_and(bp.isipv4) AS isipv4,\n COUNT(DISTINCT bp.rname) AS present_on,\n COUNT(DISTINCT host(bp.next_hop)) AS nh_variants,\n COUNT(DISTINCT bp.as_path) AS path_variants,\n string_agg(DISTINCT bp.rname || '=' || host(bp.next_hop), ', ' ORDER BY bp.rname || '=' || host(bp.next_hop)) AS nh_detail\n FROM bp GROUP BY bp.prefix, bp.prefix_len\n),\nsel AS (\n SELECT COUNT(*) AS n FROM (\n SELECT t.v FROM params, unnest(ARRAY[params.r1, params.r2, params.r3, params.r4]) AS t(v)\n WHERE t.v <> '-- none --'\n ) s\n)\nSELECT\n agg.prefix::text AS \"Prefix\",\n CASE WHEN agg.isipv4 THEN 'IPv4' ELSE 'IPv6' END AS \"AFI\",\n agg.present_on AS \"Present On\",\n (SELECT n FROM sel) AS \"Selected\",\n CASE\n WHEN agg.present_on < (SELECT n FROM sel) THEN 'Missing on some'\n WHEN agg.nh_variants > 1 AND agg.path_variants > 1 THEN 'NH + AS-Path differ'\n WHEN agg.nh_variants > 1 THEN 'Next-Hop differs'\n WHEN agg.path_variants > 1 THEN 'AS-Path differs'\n ELSE 'Consistent'\n END AS \"Divergence\",\n agg.nh_detail AS \"Next-Hops\"\nFROM agg\nWHERE agg.present_on < (SELECT n FROM sel) OR agg.nh_variants > 1 OR agg.path_variants > 1\nORDER BY \"Prefix\"",
"refId": "A"
}
],
"title": "Divergent Prefixes — Detail",
"type": "table"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 39
},
"id": 40,
"title": "Per-Prefix All Paths",
"type": "row"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Every path for the prefix chosen in the Prefix dropdown, across all selected routers. Use this to drill into a divergent prefix and see exactly which path each router holds and where it was learned from.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Router"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "color",
"value": {
"fixedColor": "blue",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 40
},
"id": 41,
"options": {
"footer": {
"fields": "",
"reducer": [
"count"
],
"show": true
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "Router"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n rt.name AS \"Router\",\n p.peer_addr::text AS \"Learned From\",\n host(ba.next_hop) AS \"Next Hop\",\n ba.as_path::text AS \"AS Path\",\n ba.origin_as AS \"Origin AS\",\n COALESCE(ba.local_pref, 0) AS \"Local Pref\",\n COALESCE(ba.med, 0) AS \"MED\",\n ba.community_list::text AS \"Communities\",\n ba.cluster_list::text AS \"Cluster List\",\n host(ba.originator_id) AS \"Originator ID\",\n r.labels AS \"Labels\",\n r.timestamp AS \"Last Update\"\nFROM ip_rib r\nJOIN bgp_peers p ON p.hash_id = r.peer_hash_id\nJOIN routers rt ON rt.hash_id = p.router_hash_id\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE rt.name IN ('$router1', '$router2', '$router3', '$router4')\n AND r.iswithdrawn = false\n AND r.prefix::text = '$prefix'\nORDER BY rt.name, p.peer_addr",
"refId": "A"
}
],
"title": "All Paths for $prefix",
"type": "table"
}
],
"refresh": "30s",
"schemaVersion": 36,
"tags": [
"obmp",
"obmp-nav",
"bgp",
"diff"
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"hide": 0,
"includeAll": false,
"label": "Router 1",
"multi": false,
"name": "router1",
"options": [],
"query": "SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
},
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"hide": 0,
"includeAll": false,
"label": "Router 2",
"multi": false,
"name": "router2",
"options": [],
"query": "SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
},
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT '-- none --' AS name UNION ALL SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"description": "Optional third router. Select '-- none --' to compare only two routers.",
"hide": 0,
"includeAll": false,
"label": "Router 3",
"multi": false,
"name": "router3",
"options": [],
"query": "SELECT '-- none --' AS name UNION ALL SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
},
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT '-- none --' AS name UNION ALL SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"description": "Optional fourth router. Select '-- none --' to compare fewer routers.",
"hide": 0,
"includeAll": false,
"label": "Router 4",
"multi": false,
"name": "router4",
"options": [],
"query": "SELECT '-- none --' AS name UNION ALL SELECT name FROM routers WHERE state = 'up' ORDER BY name",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
},
{
"current": {
"selected": true,
"text": "All",
"value": "All"
},
"hide": 0,
"includeAll": false,
"label": "AFI",
"multi": false,
"name": "afi",
"options": [
{
"selected": true,
"text": "All",
"value": "All"
},
{
"selected": false,
"text": "IPv4",
"value": "IPv4"
},
{
"selected": false,
"text": "IPv6",
"value": "IPv6"
}
],
"query": "All,IPv4,IPv6",
"skipUrlSync": false,
"type": "custom"
},
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT DISTINCT r.prefix::text FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name IN ('$router1', '$router2', '$router3', '$router4') AND r.iswithdrawn = false ORDER BY 1",
"hide": 0,
"includeAll": false,
"label": "Prefix",
"multi": false,
"name": "prefix",
"options": [],
"query": "SELECT DISTINCT r.prefix::text FROM ip_rib r JOIN bgp_peers p ON p.hash_id = r.peer_hash_id JOIN routers rt ON rt.hash_id = p.router_hash_id WHERE rt.name IN ('$router1', '$router2', '$router3', '$router4') AND r.iswithdrawn = false ORDER BY 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Router Diff",
"uid": "router-diff",
"version": 1
}
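
The Divergence column in the Router Diff dashboard above is decided by a fixed order of checks: presence first, then next-hop and AS-path variants. The following Python sketch shows that ordering for a single prefix. The function and variable names are hypothetical, and treating the four dropdown values as a set (so a router picked twice counts once) is an assumption of this sketch, not something the commit states.

```python
NONE = "-- none --"


def classify(selected, best_paths):
    """Classify one prefix across the selected routers.

    selected:   the four dropdown values (router names or NONE).
    best_paths: {router_name: (next_hop, as_path)} for this prefix,
                one entry per router that carries it.
    """
    # Assumption: duplicates in the selection collapse to one router.
    routers = {r for r in selected if r != NONE}
    present = {r: bp for r, bp in best_paths.items() if r in routers}
    next_hops = {nh for nh, _ in present.values()}
    as_paths = {path for _, path in present.values()}

    # Same precedence as the CASE expression in the dashboard query.
    if len(present) < len(routers):
        return "Missing on some"
    if len(next_hops) > 1 and len(as_paths) > 1:
        return "NH + AS-Path differ"
    if len(next_hops) > 1:
        return "Next-Hop differs"
    if len(as_paths) > 1:
        return "AS-Path differs"
    return "Consistent"


# Hypothetical example: two routers agree on the path, two slots unused.
sel = ["rr1", "rr2", NONE, NONE]
paths = {"rr1": ("10.0.0.1", "65001 65002"), "rr2": ("10.0.0.1", "65001 65002")}
print(classify(sel, paths))
```

Because presence is checked first, a prefix that is both missing on one router and carried with different next-hops elsewhere is reported only as "Missing on some".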


@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 11,
"iteration": 1654876675775,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -949,7 +961,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-tops"
"obmp-tops",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 12,
"iteration": 1654876366831,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -1268,7 +1280,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-tops"
"obmp-tops",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -0,0 +1,112 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "BGP EVPN routes monitored over BMP and stored in evpn_rib by the obmp-evpn-consumer (roadmap E5). Covers type-2 (MAC/IP advertisement) and type-3 (inclusive multicast); collector 2.2.3 mis-decodes type-5 (IP-prefix) so it is not shown. Scope with the RD/EVI variable.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Active EVPN routes (not withdrawn) in the selected RD scope.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"EVPN Routes\" FROM evpn_rib WHERE iswithdrawn = false AND ('$rd' = '-- all --' OR rd = '$rd')","refId": "A"}],
"title": "EVPN Routes","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Distinct route distinguishers (EVPN instances) in scope.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "purple","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 6,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(DISTINCT rd) AS \"EVIs (RDs)\" FROM evpn_rib WHERE iswithdrawn = false AND ('$rd' = '-- all --' OR rd = '$rd')","refId": "A"}],
"title": "EVIs (RDs)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Type-2 MAC/IP advertisement routes — learned MAC addresses.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(DISTINCT mac) AS \"MACs\" FROM evpn_rib WHERE iswithdrawn = false AND route_type = 2 AND mac IS NOT NULL AND ('$rd' = '-- all --' OR rd = '$rd')","refId": "A"}],
"title": "Learned MACs","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Type-3 inclusive-multicast routes — per-EVI broadcast/unknown-unicast/multicast flood endpoints.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"unit": "short","thresholds": {"mode": "absolute","steps": [{"color": "orange","value": null}]}},"overrides": []},
"gridPos": {"h": 4,"w": 6,"x": 18,"y": 0},
"id": 4,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT count(*) AS \"Multicast Routes\" FROM evpn_rib WHERE iswithdrawn = false AND route_type = 3 AND ('$rd' = '-- all --' OR rd = '$rd')","refId": "A"}],
"title": "Multicast (T3)","type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Route count by EVPN route type.",
"fieldConfig": {"defaults": {"color": {"mode": "palette-classic"},"custom": {"lineWidth": 1,"fillOpacity": 80,"axisPlacement": "auto"}},"overrides": []},
"gridPos": {"h": 8,"w": 8,"x": 0,"y": 4},
"id": 5,
"options": {"orientation": "horizontal","showValue": "auto","xField": "Type","legend": {"showLegend": false},"tooltip": {"mode": "single"}},
"pluginVersion": "9.1.7",
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT CASE route_type WHEN 1 THEN 'T1 Eth A-D' WHEN 2 THEN 'T2 MAC/IP' WHEN 3 THEN 'T3 Multicast' WHEN 4 THEN 'T4 Eth Segment' WHEN 5 THEN 'T5 IP-prefix' ELSE 'T' || route_type END AS \"Type\",\n count(*) AS \"Routes\"\nFROM evpn_rib\nWHERE iswithdrawn = false AND ('$rd' = '-- all --' OR rd = '$rd')\nGROUP BY route_type\nORDER BY route_type","refId": "A"}],
"title": "Routes by Type","type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Per-EVI summary — MAC/IP and multicast route counts and distinct MACs per route distinguisher.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": []},
"gridPos": {"h": 8,"w": 16,"x": 8,"y": 4},
"id": 6,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "RD / EVI"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT rd AS \"RD / EVI\",\n count(*) FILTER (WHERE route_type = 2) AS \"MAC/IP\",\n count(*) FILTER (WHERE route_type = 3) AS \"Multicast\",\n count(DISTINCT mac) FILTER (WHERE mac IS NOT NULL) AS \"Distinct MACs\"\nFROM evpn_rib\nWHERE iswithdrawn = false AND ('$rd' = '-- all --' OR rd = '$rd')\nGROUP BY rd\nORDER BY rd","refId": "A"}],
"title": "Per-EVI Summary","type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Type-2 MAC/IP advertisements — every MAC (and host IP) learned in the selected EVPN instances.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": []},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 12},
"id": 7,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "RD"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT rd AS \"RD\",\n eth_tag_id AS \"Eth Tag\",\n mac AS \"MAC\",\n host(ip) AS \"Host IP\",\n mpls_label1 AS \"VNI / Label\",\n array_to_string(ext_community_list, ', ') AS \"Route Targets\",\n eth_segment_id AS \"ESI\",\n timestamp AS \"Last Update\"\nFROM evpn_rib\nWHERE iswithdrawn = false AND route_type = 2 AND ('$rd' = '-- all --' OR rd = '$rd')\nORDER BY rd, mac","refId": "A"}],
"title": "MAC/IP Advertisements (Type-2)","type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Type-3 inclusive-multicast routes — the flood list per EVPN instance.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": []},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 22},
"id": 8,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "RD"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT rd AS \"RD\",\n eth_tag_id AS \"Eth Tag\",\n host(orig_router_ip) AS \"Originating Router\",\n array_to_string(ext_community_list, ', ') AS \"Route Targets\",\n timestamp AS \"Last Update\"\nFROM evpn_rib\nWHERE iswithdrawn = false AND route_type = 3 AND ('$rd' = '-- all --' OR rd = '$rd')\nORDER BY rd","refId": "A"}],
"title": "Inclusive Multicast (Type-3)","type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "bgp", "evpn"],
"templating": {
"list": [
{"name": "rd","type": "query","label": "RD / EVI","datasource": {"type": "postgres","uid": "obmp_postgres"},"query": "SELECT '-- all --' AS rd UNION SELECT DISTINCT rd FROM evpn_rib WHERE iswithdrawn = false ORDER BY rd","definition": "SELECT '-- all --' AS rd UNION SELECT DISTINCT rd FROM evpn_rib WHERE iswithdrawn = false ORDER BY rd","refresh": 1,"includeAll": false,"multi": false,"current": {"selected": true,"text": "-- all --","value": "-- all --"},"options": [],"sort": 1,"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "EVPN RIB",
"uid": "evpn-rib",
"version": 1,
"weekStart": ""
}


@@ -1,780 +0,0 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 19,
"iteration": 1654877653557,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in router's RIB.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 5,
"w": 6,
"x": 0,
"y": 0
},
"id": 9,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(r.timestamp))) as time,\n CASE WHEN v.router_hash_id is null THEN 'Not in Router RIB' ELSE 'In Router Rib' END as metric,\n 1 as value\nFROM routers r\n left join (select distinct router_hash_id\n from v_l3vpn_routes\n where prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by router_hash_id) v \n on (r.hash_id = v.router_hash_id)\nWHERE r.state = 'up'\nGROUP BY r.hash_id,v.router_hash_id\norder by time\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Router Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in peer RIBs.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 5,
"w": 6,
"x": 6,
"y": 0
},
"id": 10,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(p.timestamp))) as time,\n CASE WHEN v.peer_hash_id is null THEN 'Not in Peers RIB' ELSE 'In Peer RIB' END as metric,\n 1 as value\nFROM bgp_peers p\n left join (select peer_hash_id,isipv4\n from l3vpn_rib \n where prefix = '$prefix' and prefix != '0.0.0.0/0'\n AND ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by peer_hash_id,isipv4) v \n on (p.hash_id = v.peer_hash_id)\nWHERE p.isipv4 = CASE WHEN family('$prefix') = 4 THEN true ELSE false END\n AND p.state = 'up'\nGROUP BY p.hash_id,v.peer_hash_id,p.isipv4\norder by time\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Peer Visibility",
"type": "piechart"
},
{
"circleMaxSize": "15",
"circleMinSize": 2,
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"esMetric": "Count",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"hideEmpty": false,
"hideZero": false,
"id": 17,
"initialZoom": "1",
"locationData": "table",
"mapCenter": "(0°, 0°)",
"mapCenterLatitude": 0,
"mapCenterLongitude": 0,
"maxDataPoints": 1,
"mouseWheelZoom": false,
"showLegend": false,
"stickyLabels": false,
"tableQueryOptions": {
"geohashField": "geohash",
"labelField": "name",
"latitudeField": "latitude",
"longitudeField": "longitude",
"metricField": "value",
"queryType": "coordinates"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n 10 as value, latitude, longitude, stateprov as name\nFROM geo_ip\nWHERE\n ip && '$input'\nORDER BY ip desc limit 1",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"thresholds": "0,10",
"title": "Prefix Location",
"type": "grafana-worldmap-panel",
"unitPlural": "",
"unitSingle": "",
"valueName": "current"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 8
},
"id": 12,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"mappingType": 1,
"pattern": "raw_output",
"preserveFormat": true,
"sanitize": false,
"thresholds": [],
"type": "string",
"unit": "short"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct origin_as,i.as_name,org_id,org_name,remarks,address,city,state_prov,country,raw_output,source\n from l3vpn_rib r LEFT JOIN info_asn i ON (i.asn = r.origin_as)\n where r.prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and origin_as > 0\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "ASN Info",
"transform": "table",
"type": "table-old"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "locale"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "lastmodified"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "prefix"
},
"properties": [
{
"id": "displayName",
"value": "Prefix"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "Prefix History ",
"url": "/d/l3vpn-prefix-hist/prefix-history-by-prefix-l3vpn?orgId=1&var-input=${__value.text}&var-rd=$rd"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "origin_as"
},
"properties": [
{
"id": "displayName",
"value": "Origin"
},
{
"id": "unit",
"value": "none"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "ASN View",
"url": "/grafana/d/asnview/asn-view?orgId=1&var-asn_num=${__value.text}"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "iswithdrawn"
},
"properties": [
{
"id": "displayName",
"value": "Withdrawn"
},
{
"id": "unit",
"value": "bool"
},
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "custom.align",
"value": "auto"
},
{
"id": "color",
"value": {
"mode": "continuous-GrYlRd"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": 194
}
]
}
]
},
"gridPos": {
"h": 23,
"w": 24,
"x": 0,
"y": 14
},
"id": 3,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": []
},
"pluginVersion": "8.5.4",
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,extcommunities,communities,largecommunities\n from v_l3vpn_routes\n \t\twhere prefix && '$input' \n \t\t AND peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\tlimit 2000\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Looking Glass",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-l3vpn"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "80.0.0.2",
"value": "80.0.0.2"
},
"hide": 0,
"label": "Prefix/IP",
"name": "input",
"options": [
{
"selected": true,
"text": "80.0.0.2",
"value": "80.0.0.2"
}
],
"query": "80.0.0.2",
"queryValue": "50.227.215.188",
"skipUrlSync": false,
"type": "textbox"
},
{
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name as __text, hash_id as __value from routers where state = 'up'",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router_hash",
"options": [],
"query": "select name as __text, hash_id as __value from routers where state = 'up'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"hide": 0,
"includeAll": true,
"label": "Peer",
"multi": true,
"name": "peer_hash",
"options": [],
"query": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "prefix",
"options": [],
"query": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"description": "RD in the format of N:N. Set to - for all.",
"hide": 2,
"label": "RD",
"name": "rd",
"query": "-",
"skipUrlSync": false,
"type": "constant"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Looking Glass - L3VPN",
"uid": "jiQW6VB7k",
"version": 1,
"weekStart": ""
}


@@ -26,7 +26,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 20,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@@ -145,245 +157,192 @@
"type": "table"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 6
},
"hiddenSeries": false,
"id": 1,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"max",
"mean",
"sum"
],
"displayMode": "table",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n interval_time as time,\n sum(updates) as updates, sum(withdraws) as withdraws\nFROM stats_l3vpn_chg_byprefix s\nWHERE $__timeFilter(interval_time)\n AND peer_hash_id in ($peer_hash)\n ${prefix_clause:raw}\n\ngroup by interval_time\nORDER BY interval_time ASC\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"thresholds": [],
"timeRegions": [],
"title": "Prefix Advertisements & Withdrawals",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:289",
"format": "none",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:290",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 6
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"sort": "total",
"sortDesc": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"max",
"mean",
"sum"
],
"displayMode": "table",
"placement": "right",
"showLegend": true,
"sortBy": "Total",
"sortDesc": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n interval_time as time,\n sum(updates) + sum(withdraws) as value,\n left(PeerName,32) as metric\nFROM stats_l3vpn_chg_byprefix s\n JOIN v_peers p ON (s.peer_hash_id = p.peer_hash_id)\nWHERE $__timeFilter(interval_time)\n AND s.peer_hash_id in ($peer_hash)\n ${prefix_clause:raw}\n\nGROUP BY s.interval_time,peername\nORDER BY interval_time ASC\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"thresholds": [],
"timeRegions": [],
"title": "Changes by Peer",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:346",
"decimals": 0,
"format": "none",
"label": "",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:347",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@@ -537,9 +496,11 @@
}
],
"refresh": "",
"schemaVersion": 37,
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"l3vpn",
"obmp-l3vpn"
],
"templating": {
@@ -684,27 +645,28 @@
"type": "query"
},
{
"name": "rd",
"type": "query",
"label": "RD",
"description": "Route Distinguisher (VRF). Auto-discovered from l3vpn_rib; '-' = all VRFs.",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"query": "SELECT '-' AS rd UNION SELECT DISTINCT rd FROM l3vpn_rib WHERE iswithdrawn = false ORDER BY rd",
"definition": "SELECT '-' AS rd UNION SELECT DISTINCT rd FROM l3vpn_rib WHERE iswithdrawn = false ORDER BY rd",
"refresh": 1,
"includeAll": false,
"multi": false,
"current": {
"selected": false,
"selected": true,
"text": "-",
"value": "-"
},
"description": "RD in the format of N:N. Set to - for all.",
"options": [],
"sort": 1,
"hide": 0,
"includeAll": false,
"label": "RD",
"multi": false,
"name": "rd",
"options": [
{
"selected": true,
"text": "-",
"value": "-"
}
],
"query": "-",
"skipUrlSync": false,
"type": "custom"
"skipUrlSync": false
}
]
},
@@ -742,4 +704,4 @@
"uid": "l3vpn-prefix-hist",
"version": 2,
"weekStart": ""
}
}


@@ -21,14 +21,39 @@
}
]
},
"description": "L3VPN RIB browser combined with the per-prefix Looking Glass: route counts by RD, prefix visibility, geolocation and ASN ownership.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 21,
"iteration": 1654877634754,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 20,
"panels": [],
"title": "RIB Browser",
"type": "row"
},
{
"datasource": {
"type": "postgres",
@@ -76,7 +101,7 @@
"h": 8,
"w": 11,
"x": 0,
"y": 0
"y": 1
},
"id": 5,
"options": {
@@ -100,7 +125,7 @@
"xTickLabelRotation": 0,
"xTickLabelSpacing": 0
},
"pluginVersion": "8.3.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@@ -108,31 +133,9 @@
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select\n count(*) as count,\n rd\n from l3vpn_rib\n where\n peer_hash_id in ($peer_hash)\n and ('$rd' = '-' or rd = '$rd')\n and iswithdrawn = false\n group by rd\n",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Routes Advertised/Active",
@@ -208,9 +211,9 @@
},
"gridPos": {
"h": 8,
"w": 12,
"w": 13,
"x": 11,
"y": 0
"y": 1
},
"id": 6,
"options": {
@@ -234,7 +237,7 @@
"xTickLabelRotation": 0,
"xTickLabelSpacing": 0
},
"pluginVersion": "8.3.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@@ -242,31 +245,9 @@
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select\n count(*) as count,\n rd\n from l3vpn_rib\n where\n peer_hash_id in ($peer_hash)\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = true\n group by rd\n",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Routes Withdrawn/Inactive",
@@ -429,10 +410,10 @@
]
},
"gridPos": {
"h": 23,
"h": 18,
"w": 24,
"x": 0,
"y": 8
"y": 9
},
"id": 3,
"links": [],
@@ -447,39 +428,17 @@
"showHeader": true,
"sortBy": []
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,communities,extcommunities\n from v_l3vpn_routes\n \t\twhere \n \t\t peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\t AND (iswithdrawn in ($state))\n \t\tlimit $limit\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Looking Glass (RD = $rd)",
@@ -492,11 +451,555 @@
}
],
"type": "table"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 27
},
"id": 21,
"panels": [],
"title": "Looking Glass - Prefix Lookup ($input)",
"type": "row"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in router's RIB.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 0,
"y": 28
},
"id": 9,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(r.timestamp))) as time,\n CASE WHEN v.router_hash_id is null THEN 'Not in Router RIB' ELSE 'In Router Rib' END as metric,\n 1 as value\nFROM routers r\n left join (select distinct router_hash_id\n from v_l3vpn_routes\n where prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by router_hash_id) v \n on (r.hash_id = v.router_hash_id)\nWHERE r.state = 'up'\nGROUP BY r.hash_id,v.router_hash_id\norder by time\n\n",
"refId": "A"
}
],
"title": "Router Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in peer RIBs.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 6,
"y": 28
},
"id": 10,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(p.timestamp))) as time,\n CASE WHEN v.peer_hash_id is null THEN 'Not in Peers RIB' ELSE 'In Peer RIB' END as metric,\n 1 as value\nFROM bgp_peers p\n left join (select peer_hash_id,isipv4\n from l3vpn_rib \n where prefix = '$prefix' and prefix != '0.0.0.0/0'\n AND ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by peer_hash_id,isipv4) v \n on (p.hash_id = v.peer_hash_id)\nWHERE p.isipv4 = CASE WHEN family('$prefix') = 4 THEN true ELSE false END\n AND p.state = 'up'\nGROUP BY p.hash_id,v.peer_hash_id,p.isipv4\norder by time\n\n",
"refId": "A"
}
],
"title": "Peer Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Geolocation of the looked-up prefix.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 28
},
"id": 17,
"options": {
"basemap": {
"config": {},
"name": "Layer 0",
"type": "default"
},
"controls": {
"mouseWheelZoom": true,
"showAttribution": true,
"showDebug": false,
"showMeasure": false,
"showScale": false,
"showZoom": true
},
"layers": [
{
"config": {
"showLegend": false,
"style": {
"color": {
"fixed": "dark-orange"
},
"opacity": 0.4,
"rotation": {
"fixed": 0,
"max": 360,
"min": -360,
"mode": "mod"
},
"size": {
"fixed": 8,
"max": 15,
"min": 2
},
"symbol": {
"fixed": "img/icons/marker/circle.svg",
"mode": "fixed"
},
"textConfig": {
"fontSize": 12,
"offsetX": 0,
"offsetY": 0,
"textAlign": "center",
"textBaseline": "middle"
}
}
},
"location": {
"latitude": "latitude",
"longitude": "longitude",
"mode": "coords"
},
"name": "Prefix Location",
"tooltip": true,
"type": "markers"
}
],
"tooltip": {
"mode": "details"
},
"view": {
"id": "zero",
"lat": 0,
"lon": 0,
"zoom": 1
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT\n 10 as value, latitude, longitude, stateprov as name\nFROM geo_ip\nWHERE\n ip && '$input'\nORDER BY ip desc limit 1",
"refId": "A"
}
],
"title": "Prefix Location",
"type": "geomap"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Origin-AS ownership for the looked-up prefix.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 36
},
"id": 12,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "select distinct origin_as,i.as_name,org_id,org_name,remarks,address,city,state_prov,country,raw_output,source\n from l3vpn_rib r LEFT JOIN info_asn i ON (i.asn = r.origin_as)\n where r.prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and origin_as > 0\n",
"refId": "A"
}
],
"title": "ASN Info",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "locale"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "lastmodified"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "prefix"
},
"properties": [
{
"id": "displayName",
"value": "Prefix"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "Prefix History ",
"url": "/d/l3vpn-prefix-hist/prefix-history-by-prefix-l3vpn?orgId=1&var-input=${__value.text}&var-rd=$rd"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "origin_as"
},
"properties": [
{
"id": "displayName",
"value": "Origin"
},
{
"id": "unit",
"value": "none"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "ASN View",
"url": "/grafana/d/asnview/asn-view?orgId=1&var-asn_num=${__value.text}"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "iswithdrawn"
},
"properties": [
{
"id": "displayName",
"value": "Withdrawn"
},
{
"id": "unit",
"value": "bool"
},
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "custom.align",
"value": "auto"
},
{
"id": "color",
"value": {
"mode": "continuous-GrYlRd"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": 194
}
]
}
]
},
"gridPos": {
"h": 18,
"w": 24,
"x": 0,
"y": 42
},
"id": 13,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": []
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,extcommunities,communities,largecommunities\n from v_l3vpn_routes\n \t\twhere prefix && '$input' \n \t\t AND peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\tlimit 2000\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A"
}
],
"title": "Looking Glass (Prefix Lookup)",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"l3vpn",
"obmp-l3vpn"
],
"templating": {
@ -534,10 +1037,13 @@
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
@ -545,7 +1051,7 @@
},
"definition": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"hide": 0,
"includeAll": false,
"includeAll": true,
"label": "Peer",
"multi": true,
"name": "peer_hash",
@ -561,28 +1067,28 @@
"useTags": false
},
{
"name": "rd",
"type": "query",
"label": "RD",
"description": "Route Distinguisher (VRF). Auto-discovered from l3vpn_rib; '-' = all VRFs.",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"query": "SELECT '-' AS rd UNION SELECT DISTINCT rd FROM l3vpn_rib WHERE iswithdrawn = false ORDER BY rd",
"definition": "SELECT '-' AS rd UNION SELECT DISTINCT rd FROM l3vpn_rib WHERE iswithdrawn = false ORDER BY rd",
"refresh": 1,
"includeAll": false,
"multi": false,
"current": {
"selected": true,
"text": "-",
"value": "-"
},
"description": "RD in the format of N:N. Set to - for all.",
"options": [],
"sort": 1,
"hide": 0,
"includeAll": false,
"label": "RD",
"multi": false,
"name": "rd",
"options": [
{
"selected": true,
"text": "-",
"value": "-"
}
],
"query": "-",
"queryValue": "203:20",
"skipUrlSync": false,
"type": "custom"
"skipUrlSync": false
},
{
"current": {
@ -639,6 +1145,7 @@
},
"hide": 0,
"includeAll": true,
"label": "State",
"multi": true,
"name": "state",
"options": [
@ -662,6 +1169,51 @@
"queryValue": "",
"skipUrlSync": false,
"type": "custom"
},
{
"current": {
"selected": false,
"text": "80.0.0.2",
"value": "80.0.0.2"
},
"hide": 0,
"label": "Prefix/IP Lookup",
"name": "input",
"options": [
{
"selected": true,
"text": "80.0.0.2",
"value": "80.0.0.2"
}
],
"query": "80.0.0.2",
"queryValue": "50.227.215.188",
"skipUrlSync": false,
"type": "textbox"
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "prefix",
"options": [],
"query": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
@ -695,8 +1247,8 @@
]
},
"timezone": "",
"title": "L3VPN RIB Browser",
"title": "L3VPN RIB & Looking Glass",
"uid": "v-cdzIBnz",
"version": 1,
"version": 2,
"weekStart": ""
}
}

View File

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 14,
"iteration": 1654877691622,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -278,6 +290,8 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {

View File

@ -1,479 +0,0 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 15,
"iteration": 1654877712696,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 7,
"x": 0,
"y": 0
},
"id": 4,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as value, CASE WHEN iswithdrawn THEN 'WITHDRAWN' ELSE 'ACTIVE' END as metric\nfrom ls_links\nwhere local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'\ngroup by iswithdrawn\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Link States",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 7,
"y": 0
},
"id": 6,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as value, CASE WHEN mt_id = 2 THEN 'IPv6' ELSE 'IPv4' END as metric\nfrom ls_links\nwhere local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'\ngroup by mt_id\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Links by Type",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "timestamp"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "seq"
},
"properties": [
{
"id": "unit",
"value": "locale"
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"ACTIVE": {
"color": "semi-dark-green",
"index": 0
},
"WITHDRAWN": {
"color": "semi-dark-red",
"index": 1
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 24,
"x": 0,
"y": 7
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"targets": [
{
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT state,local_router_name,local_igp_routerid,remote_router_name,remote_igp_routerid,mt_id,igp_metric,protocol, timestamp, seq\n FROM v_ls_links\n WHERE local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "$local_node_name Links",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"transparent": true,
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-linkstate"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "yyz01-wxbb-crt01-lo0.webex.com",
"value": "367c22e4-57d9-2328-654b-96ea750e0267"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0\n ",
"hide": 0,
"includeAll": false,
"label": "BGP Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0\n ",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": false,
"text": "AMS10-WXBB-CRT02",
"value": "1ed1da6b-6f57-57aa-92f5-edda59049e9a"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name as __text, hash_id as __value from ls_nodes where peer_hash_id = '$peer_hash' and not igp_router_id ~ '\\..[1-9A-F]00$'",
"hide": 0,
"includeAll": false,
"label": "ISIS Node",
"multi": false,
"name": "local_node_hash_id",
"options": [],
"query": "select name as __text, hash_id as __value from ls_nodes where peer_hash_id = '$peer_hash' and not igp_router_id ~ '\\..[1-9A-F]00$'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 5,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": false,
"text": "AMS10-WXBB-CRT02",
"value": "AMS10-WXBB-CRT02"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name from ls_nodes where hash_id = '$local_node_hash_id' and peer_hash_id = '$peer_hash'",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "local_node_name",
"options": [],
"query": "select name from ls_nodes where hash_id = '$local_node_hash_id' and peer_hash_id = '$peer_hash'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
]
},
"timezone": "",
"title": "LS Links",
"uid": "MPqNG_sWz",
"version": 1,
"weekStart": ""
}

View File

@ -21,12 +21,24 @@
}
]
},
"description": "Combined BGP-LS node and link inventory for a selected BGP peer.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 16,
"iteration": 1654877745288,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -54,7 +66,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"color": "blue",
"value": null
}
]
@ -65,7 +77,7 @@
},
"gridPos": {
"h": 6,
"w": 3,
"w": 4,
"x": 0,
"y": 0
},
@ -84,36 +96,19 @@
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT count(*)\n FROM ls_nodes where peer_hash_id = '$peer_hash';",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Total Nodes",
@ -144,8 +139,8 @@
},
"gridPos": {
"h": 6,
"w": 7,
"x": 3,
"w": 10,
"x": 4,
"y": 0
},
"id": 8,
@ -173,33 +168,17 @@
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as count, \n CASE WHEN iswithdrawn THEN 'WITHDRAWN' ELSE 'ACTIVE' END as metric\nfrom ls_links\nwhere peer_hash_id = '$peer_hash'\ngroup by iswithdrawn\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Link States",
@ -230,8 +209,8 @@
},
"gridPos": {
"h": 6,
"w": 7,
"x": 10,
"w": 10,
"x": 14,
"y": 0
},
"id": 9,
@ -259,32 +238,17 @@
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as count, \n CASE WHEN mt_id = 2 THEN 'IPv6' ELSE 'IPv4' END as metric\nfrom ls_links \nwhere peer_hash_id = '$peer_hash'\ngroup by metric\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Links by Type",
@ -389,33 +353,17 @@
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT state, nodename, routerid, protocol, timestamp, seq\n FROM v_ls_nodes\n where peer_hash_id = '$peer_hash'\n\n ",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Backbone ISIS Nodes",
@ -434,24 +382,146 @@
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "timestamp"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "seq"
},
"properties": [
{
"id": "unit",
"value": "locale"
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"ACTIVE": {
"color": "semi-dark-green",
"index": 0
},
"WITHDRAWN": {
"color": "semi-dark-red",
"index": 1
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 2,
"h": 14,
"w": 24,
"x": 0,
"y": 19
},
"id": 2,
"options": {
"content": "\n\n",
"mode": "markdown"
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"type": "text"
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT state,local_router_name,local_igp_routerid,remote_router_name,remote_igp_routerid,mt_id,igp_metric,protocol, timestamp, seq\n FROM v_ls_links\n WHERE peer_hash_id = '$peer_hash'",
"refId": "A"
}
],
"title": "Backbone ISIS Links",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {
@ -504,8 +574,8 @@
]
},
"timezone": "",
"title": "LS Nodes",
"title": "LS Nodes & Links",
"uid": "dzdSWlyWz",
"version": 1,
"version": 2,
"weekStart": ""
}
}

View File

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 17,
"iteration": 1654877763755,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -265,6 +277,8 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {

View File

@ -1,216 +0,0 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 23,
"iteration": 1654877522167,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 28,
"w": 23,
"x": 0,
"y": 0
},
"id": 2,
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select local_node_hash_id as id,\n CASE WHEN max(local_router_name) = '' THEN max(local_igp_routerid) ELSE max(local_router_name) END as title,\n max(local_igp_routerid) as detail__routerid\n from v_ls_links\n where peer_hash_id = '$peer_hash'\n and local_igp_routerid like '%.0000' and remote_igp_routerid like '%.0000'\n and igp_metric <= 16777215\n and state in ($state)\n and (local_node_hash_id = '$local_node_hash' or remote_node_hash_id = '$local_node_hash')\n group by local_node_hash_id\n order by title;",
"refId": " nodes",
"select": [
[
{
"params": [
"amr_rx_hbhloss_pct"
],
"type": "column"
}
]
],
"table": "as_path_metrics",
"timeColumn": "start_timestamp",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select local_node_hash_id || '->' || remote_node_hash_id as id,\n local_node_hash_id as source,\n remote_node_hash_id as target,\n max(igp_metric)::int as mainstat,\n max(state) as secondarystat,\n max(remote_router_name) as detail__remote\n from v_ls_links\n where peer_hash_id = '$peer_hash'\n and local_igp_routerid like '%.0000' and remote_igp_routerid like '%.0000'\n and igp_metric <= 16777215\n and state in ($state)\n and (local_node_hash_id = '$local_node_hash' or remote_node_hash_id = '$local_node_hash')\ngroup by local_node_hash_id,remote_node_hash_id;\n ",
"refId": "edges",
"select": [
[
{
"params": [
"amr_rx_hbhloss_pct"
],
"type": "column"
}
]
],
"table": "as_path_metrics",
"timeColumn": "start_timestamp",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Topology",
"type": "nodeGraph"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-linkstate"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "yyz01-wxbb-crt01-lo0.webex.com",
"value": "367c22e4-57d9-2328-654b-96ea750e0267"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"hide": 0,
"includeAll": false,
"label": "BGP Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"current": {
"selected": true,
"text": "Active",
"value": "ACTIVE"
},
"hide": 0,
"includeAll": true,
"label": "State",
"multi": false,
"name": "state",
"options": [
{
"selected": false,
"text": "All",
"value": "$__all"
},
{
"selected": false,
"text": "Inactive",
"value": "WITHDRAWN"
},
{
"selected": true,
"text": "Active",
"value": "ACTIVE"
}
],
"query": "Inactive : WITHDRAWN, Active : ACTIVE",
"queryValue": "",
"skipUrlSync": false,
"type": "custom"
},
{
"current": {
"selected": false,
"text": "NRT02-WXBB-CRT01",
"value": "3e96d517-e4b8-7264-1479-2814e9691f10"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select local_router_name as __text, local_node_hash_id as __value \nfrom v_ls_links\nwhere peer_hash_id = '$peer_hash'\n and local_igp_routerid like '%.0000'\n and igp_metric <= 16777215\n and state in ($state)\ngroup by local_router_name,local_node_hash_id",
"hide": 0,
"includeAll": false,
"label": "Node",
"multi": false,
"name": "local_node_hash",
"options": [],
"query": "select local_router_name as __text, local_node_hash_id as __value \nfrom v_ls_links\nwhere peer_hash_id = '$peer_hash'\n and local_igp_routerid like '%.0000'\n and igp_metric <= 16777215\n and state in ($state)\ngroup by local_router_name,local_node_hash_id",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "LinkState Topology",
"uid": "SNOLrQlnz",
"version": 3,
"weekStart": ""
}

View File

@ -0,0 +1,761 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name as \"Local Router\", \n remote_router_name as \"Remote Router\",\n igp_metric as \"IGP Metric\",\n te_def_metric as \"TE Metric\",\n max_link_bw as \"Max BW (B/s)\",\n max_resv_bw as \"Max Reservable BW\",\n unreserved_bw as \"Unreserved BW\",\n admin_group as \"Admin Group\",\n protection_type as \"Protection\",\n srlg as \"SRLG\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY local_router_name, remote_router_name",
"refId": "A"
}
],
"title": "TE Link Capacity Map",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"barRadius": 0,
"barWidth": 0.97,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"showValue": "auto",
"stacking": "none",
"tooltip": {
"mode": "single",
"sort": "none"
},
"xTickLabelRotation": -45
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name || ' -> ' || remote_router_name as \"Link\",\n igp_metric as \"IGP Metric\",\n COALESCE(te_def_metric, igp_metric) as \"TE Metric\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY igp_metric DESC",
"refId": "A"
}
],
"title": "IGP Metric vs TE Metric Comparison",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 6,
"x": 12,
"y": 10
},
"id": 3,
"options": {
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COALESCE(admin_group::text, 'None') as \"Admin Group\",\n COUNT(*) as \"Link Count\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nGROUP BY admin_group\nORDER BY \"Link Count\" DESC",
"refId": "A"
}
],
"title": "Admin Group Distribution",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 6,
"x": 18,
"y": 10
},
"id": 4,
"options": {
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COALESCE(protection_type, 'None') as \"Protection Type\",\n COUNT(*) as \"Link Count\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nGROUP BY protection_type\nORDER BY \"Link Count\" DESC",
"refId": "A"
}
],
"title": "Link Protection Types",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 20
},
"id": 5,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT nodename as \"Node\",\n routerid as \"Router ID\",\n protocol as \"Protocol\",\n sr_capabilities as \"SR Capabilities (SRGB)\"\nFROM v_ls_nodes\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY nodename",
"refId": "A"
}
],
"title": "SR Node Capabilities",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 20
},
"id": 6,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT n.nodename as \"Node\",\n p.prefix::text as \"Prefix\",\n p.prefix_len as \"Len\",\n p.metric as \"Metric\",\n p.sr_prefix_sids as \"Prefix SID\",\n p.protocol::text as \"Protocol\"\nFROM ls_prefixes p\nJOIN ls_nodes n ON n.hash_id = p.local_node_hash_id \n AND n.peer_hash_id = p.peer_hash_id\nWHERE p.peer_hash_id = '$peer_hash' AND p.iswithdrawn = false\nORDER BY n.nodename, p.prefix",
"refId": "A"
}
],
"title": "SR Prefix SIDs",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 28
},
"id": 7,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name as \"Local\",\n remote_router_name as \"Remote\",\n sr_adjacency_sids as \"Adjacency SIDs\",\n peer_node_sid as \"Peer Node SID\",\n mpls_proto_mask::text as \"MPLS Proto\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY local_router_name, remote_router_name",
"refId": "A"
}
],
"title": "SR Adjacency SIDs",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 28
},
"id": 8,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT srlg as \"SRLG Value\",\n COUNT(*) as \"Link Count\",\n string_agg(DISTINCT local_router_name || ' -> ' || remote_router_name, ', ') as \"Links\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false \n AND srlg IS NOT NULL AND srlg != ''\nGROUP BY srlg\nORDER BY COUNT(*) DESC",
"refId": "A"
}
],
"title": "SRLG Groups",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 0,
"y": 36
},
"id": 9,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND te_def_metric IS NOT NULL",
"refId": "A"
}
],
"title": "Links with TE Metric",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 5,
"y": 36
},
"id": 10,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND max_link_bw IS NOT NULL AND max_link_bw > 0",
"refId": "A"
}
],
"title": "Links with Bandwidth",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 10,
"y": 36
},
"id": 11,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND srlg IS NOT NULL AND srlg != ''",
"refId": "A"
}
],
"title": "Links with SRLG",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 15,
"y": 36
},
"id": 12,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_nodes WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND sr_capabilities IS NOT NULL AND sr_capabilities != ''",
"refId": "A"
}
],
"title": "Nodes with SR",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 20,
"y": 36
},
"id": 13,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND sr_adjacency_sids IS NOT NULL AND sr_adjacency_sids != ''",
"refId": "A"
}
],
"title": "Links with Adj SID",
"type": "stat"
},
{
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 40
},
"id": 14,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Traffic Engineering & Segment Routing Analytics\n\nThis dashboard exposes TE and SR attributes from BGP-LS (RFC 7752) that OpenBMP collects but existing dashboards don't display.\n\n### TE Fields (from ls_links)\n- **admin_group**: Link color/affinity bitmap for RSVP-TE constraints\n- **max_link_bw / max_resv_bw**: Link capacity in bytes/sec\n- **unreserved_bw**: Available bandwidth per priority level\n- **te_def_metric**: TE metric (may differ from IGP metric)\n- **protection_type**: FRR protection (unprotected, shared, dedicated, etc.)\n- **srlg**: Shared Risk Link Group for diverse path computation\n\n### SR Fields\n- **sr_capabilities**: Node SRGB (Segment Routing Global Block) range\n- **sr_prefix_sids**: Prefix SID for SR-MPLS forwarding\n- **sr_adjacency_sids**: Adjacency SIDs for SR-TE path steering\n- **peer_node_sid**: BGP EPE SID (RFC 9086)\n\n### Notes\n- NULL values indicate the router is not advertising that TLV\n- To enable TE metrics on IOS-XR: `mpls traffic-eng` under IS-IS\n- To enable SR: `segment-routing mpls` under IS-IS with prefix-sid-map",
"mode": "markdown"
},
"title": "About This Dashboard",
"type": "text"
}
],
"schemaVersion": 39,
"tags": [
"obmp-learning",
"obmp",
"obmp-nav"
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"hide": 0,
"includeAll": false,
"label": "BGP Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "TE & Segment Routing Analytics",
"uid": "obmp-learn-08",
"version": 1
}


@ -0,0 +1,369 @@
{
"uid": "obmp-learn-09",
"title": "Topology Change & Anomaly Detection",
"tags": [
"obmp-learning",
"obmp",
"obmp-nav"
],
"editable": true,
"schemaVersion": 39,
"time": {
"from": "now-6h",
"to": "now"
},
"templating": {
"list": [
{
"name": "peer_hash",
"label": "BGP Peer",
"type": "query",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"refresh": 1,
"multi": false
}
]
},
"panels": [
{
"id": 1,
"title": "Link State Changes Over Time",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Links Up\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Links Down\"\nFROM ls_links_log\nWHERE $__timeFilter(timestamp) AND peer_hash_id = '$peer_hash'\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 2,
"title": "Node Changes Over Time",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Nodes Appeared\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Nodes Withdrawn\"\nFROM ls_nodes_log\nWHERE $__timeFilter(timestamp) AND peer_hash_id = '$peer_hash'\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 3,
"title": "BGP Peer Session Events",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(pel.timestamp, '5m') as time,\n SUM(CASE WHEN pel.state = 'up' THEN 1 ELSE 0 END) as \"Sessions Up\",\n SUM(CASE WHEN pel.state = 'down' THEN 1 ELSE 0 END) as \"Sessions Down\"\nFROM peer_event_log pel\nWHERE $__timeFilter(pel.timestamp)\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 4,
"title": "RIB Update Rate",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 5,
"title": "Origin AS Changes (Potential Hijacks)",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 16
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT DISTINCT ON (r1.prefix, r1.prefix_len)\n r1.prefix::text as \"Prefix\",\n r1.prefix_len as \"Len\",\n r1.origin_as as \"Current Origin AS\",\n r2.origin_as as \"Previous Origin AS\",\n r1.timestamp as \"Changed At\"\nFROM ip_rib_log r1\nJOIN ip_rib_log r2 ON r1.prefix = r2.prefix \n AND r1.prefix_len = r2.prefix_len\n AND r1.timestamp > r2.timestamp\nWHERE r1.origin_as != r2.origin_as\n AND $__timeFilter(r1.timestamp)\nORDER BY r1.prefix, r1.prefix_len, r1.timestamp DESC\nLIMIT 50",
"format": "table",
"refId": "A"
}
]
},
{
"id": 6,
"title": "Most Churned Prefixes",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 16
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT prefix::text as \"Prefix\",\n prefix_len as \"Len\",\n COUNT(*) as \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) as \"Withdrawals\",\n MIN(timestamp) as \"First Seen\",\n MAX(timestamp) as \"Last Change\",\n CASE \n WHEN COUNT(*) <= 2 THEN 'Stable'\n WHEN COUNT(*) <= 10 THEN 'Moderate'\n ELSE 'Unstable'\n END as \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix, prefix_len\nHAVING COUNT(*) > 1\nORDER BY COUNT(*) DESC\nLIMIT 30",
"format": "table",
"refId": "A"
}
]
},
{
"id": 7,
"title": "Recent Link State Changes",
"type": "table",
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT l.timestamp as \"Time\",\n CASE WHEN l.iswithdrawn THEN 'DOWN' ELSE 'UP' END as \"State\",\n ln.name as \"Local Node\",\n l.local_igp_router_id as \"Local IGP ID\",\n rn.name as \"Remote Node\",\n l.remote_igp_router_id as \"Remote IGP ID\",\n l.igp_metric as \"IGP Metric\",\n l.protocol::text as \"Protocol\"\nFROM ls_links_log l\nLEFT JOIN ls_nodes ln ON ln.hash_id = l.local_node_hash_id AND ln.peer_hash_id = l.peer_hash_id\nLEFT JOIN ls_nodes rn ON rn.hash_id = l.remote_node_hash_id AND rn.peer_hash_id = l.peer_hash_id\nWHERE $__timeFilter(l.timestamp) AND l.peer_hash_id = '$peer_hash'\nORDER BY l.timestamp DESC\nLIMIT 50",
"format": "table",
"refId": "A"
}
]
},
{
"id": 8,
"title": "Multi-Peer Route Consistency",
"type": "table",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 36
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT r.prefix::text as \"Prefix\",\n r.prefix_len as \"Len\",\n COUNT(DISTINCT r.peer_hash_id) as \"Peer Count\",\n COUNT(DISTINCT ba.origin_as) as \"Distinct Origins\",\n COUNT(DISTINCT ba.as_path_count) as \"Distinct Path Lengths\",\n string_agg(DISTINCT ba.origin_as::text, ', ') as \"Origin ASNs\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY r.prefix, r.prefix_len\nHAVING COUNT(DISTINCT ba.origin_as) > 1\nORDER BY COUNT(DISTINCT ba.origin_as) DESC\nLIMIT 30",
"format": "table",
"refId": "A"
}
]
},
{
"id": 9,
"title": "Active Peers",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 0,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM bgp_peers WHERE state = 'up'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 10,
"title": "Total LS Links",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 4,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false",
"format": "table",
"refId": "A"
}
]
},
{
"id": 11,
"title": "Total LS Nodes",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 8,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_nodes WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false",
"format": "table",
"refId": "A"
}
]
},
{
"id": 12,
"title": "RIB Updates (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 12,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 13,
"title": "Link Changes (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 16,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_links_log WHERE timestamp > NOW() - INTERVAL '24 hours' AND peer_hash_id = '$peer_hash'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 14,
"title": "Origin Changes (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 20,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(DISTINCT r1.prefix) FROM ip_rib_log r1\nJOIN ip_rib_log r2 ON r1.prefix = r2.prefix AND r1.prefix_len = r2.prefix_len AND r1.timestamp > r2.timestamp\nWHERE r1.origin_as != r2.origin_as AND r1.timestamp > NOW() - INTERVAL '24 hours'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 15,
"title": "About This Dashboard",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 36
},
"options": {
"mode": "markdown",
"content": "## Topology Change & Anomaly Detection\n\nThis dashboard provides heuristic analysis of BMP data to detect network anomalies:\n\n### What to Watch For\n- **Link flaps**: Rapid up/down cycles in the Link State Changes panel indicate instability\n- **Origin AS changes**: Could indicate a route hijack or legitimate migration\n- **Multi-origin prefixes**: Same prefix seen from different origin ASNs across peers\n- **Correlated events**: Peer session drops followed by mass withdrawals indicate convergence events\n\n### Testing with ExaBGP Scenarios\n1. Load `origin_shift` scenario to simulate origin AS changes\n2. Load `hijack_simulation` to see how shorter paths override legitimate routes\n3. Load/unload `churn` scenario repeatedly to generate instability patterns\n\n### Data Sources\n- **ls_links_log / ls_nodes_log**: TimescaleDB hypertables tracking all BGP-LS topology changes\n- **ip_rib_log**: All BGP RIB updates and withdrawals with timestamps\n- **peer_event_log**: BGP session state changes (up/down events)"
}
}
],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
]
}
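
Review note: the "Origin AS Changes" panels in this dashboard flag a prefix when two log rows for it carry different origin ASNs. The sketch below shows the same heuristic outside SQL, as a way to sanity-check what the panel should return. It is an illustration only: the function name `origin_changes` and the `(timestamp, prefix, origin_as)` tuple layout are assumptions made for this example, not part of the dashboard or the OpenBMP schema, and unlike the SQL self-join it compares each update only with the one immediately before it.

```python
def origin_changes(updates):
    """Return (prefix, previous_origin, current_origin, changed_at) tuples.

    `updates` is an iterable of (timestamp, prefix, origin_as) tuples in any
    order. This layout is hypothetical and chosen only for the example.
    """
    last_origin = {}  # prefix -> origin AS seen in the most recent update
    changes = []
    # Sorting by timestamp first lets us walk each prefix's history in order.
    for ts, prefix, origin in sorted(updates):
        prev = last_origin.get(prefix)
        if prev is not None and prev != origin:
            changes.append((prefix, prev, origin, ts))
        last_origin[prefix] = origin
    return changes


if __name__ == "__main__":
    sample = [
        (1, "203.0.113.0/24", 64500),
        (2, "203.0.113.0/24", 64500),
        (3, "203.0.113.0/24", 64511),  # origin moves from 64500 to 64511
        (1, "198.51.100.0/24", 64501),
    ]
    for change in origin_changes(sample):
        print(change)
```

A change reported here is only a candidate: as the dashboard text says, it may be a hijack or a legitimate migration.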


@ -0,0 +1,70 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "AS adjacency graph derived from consecutive AS pairs in observed AS_PATHs. Reads the mv_as_adjacency materialized view (full RIB, refreshed hourly by pg_cron) so panels load instantly. Edge label = how many times that adjacency appears. Raise 'Min occurrences' to thin the graph; set 'Focus AS' for a 1-hop view around one AS.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Each node is an AS (enriched from info_asn whois data); each edge is an adjacency seen in the AS_PATH data. Edge label is the occurrence count.",
"fieldConfig": {"defaults": {},"overrides": []},
"gridPos": {"h": 24,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"nodes": {"mainStatUnit": "","secondaryStatUnit": ""}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "WITH edges AS (\n SELECT asn_a, asn_b, occ FROM mv_as_adjacency\n WHERE occ >= $min_occ\n AND ('$focus_as' = '' OR asn_a::text = '$focus_as' OR asn_b::text = '$focus_as')\n ORDER BY occ DESC LIMIT 300\n),\nnlist AS (SELECT asn_a AS asn FROM edges UNION SELECT asn_b FROM edges),\ndeg AS (SELECT asn, COUNT(*) AS d FROM (SELECT asn_a AS asn FROM edges UNION ALL SELECT asn_b FROM edges) z GROUP BY asn)\nSELECT n.asn::text AS id,\n COALESCE(NULLIF(ia.as_name,''),'AS'||n.asn) AS title,\n 'AS ' || n.asn AS mainstat,\n COALESCE(NULLIF(ia.country,''),'?') || ' · ' || dg.d::text || ' links' AS secondarystat,\n COALESCE(NULLIF(ia.org_name,''),'—') AS detail__org,\n COALESCE(NULLIF(ia.country,''),'—') AS detail__country,\n dg.d::text AS detail__degree\nFROM nlist n\nLEFT JOIN info_asn ia ON ia.asn = n.asn\nLEFT JOIN deg dg ON dg.asn = n.asn\nORDER BY dg.d DESC",
"refId": "nodes"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT asn_a::text || '-' || asn_b::text AS id,\n asn_a::text AS source, asn_b::text AS target,\n occ AS mainstat,\n occ::text || ' paths' AS detail__occurrences\nFROM mv_as_adjacency\nWHERE occ >= $min_occ\n AND ('$focus_as' = '' OR asn_a::text = '$focus_as' OR asn_b::text = '$focus_as')\nORDER BY occ DESC LIMIT 300",
"refId": "edges"
}
],
"title": "AS Adjacency Graph",
"type": "nodeGraph"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "The strongest AS adjacencies, with whois names for both endpoints.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Occurrences"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "color","value": {"mode": "continuous-BlPu"}}]}]},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 24},
"id": 2,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Occurrences"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT e.asn_a AS \"AS A\",\n COALESCE(NULLIF(ax.as_name,''),'—') AS \"Name A\",\n e.asn_b AS \"AS B\",\n COALESCE(NULLIF(ay.as_name,''),'—') AS \"Name B\",\n e.occ AS \"Occurrences\"\nFROM (\n SELECT asn_a, asn_b, occ FROM mv_as_adjacency\n WHERE occ >= $min_occ\n AND ('$focus_as' = '' OR asn_a::text = '$focus_as' OR asn_b::text = '$focus_as')\n ORDER BY occ DESC LIMIT 300\n) e\nLEFT JOIN info_asn ax ON ax.asn = e.asn_a\nLEFT JOIN info_asn ay ON ay.asn = e.asn_b\nORDER BY e.occ DESC",
"refId": "A"
}
],
"title": "Top AS Adjacencies",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "obmp-maps", "asn"],
"templating": {
"list": [
{"name": "min_occ","type": "custom","label": "Min occurrences","description": "Only draw adjacencies seen at least this many times across the RIB. Raise it to thin a cluttered graph.","query": "2000,5000,10000,50000","current": {"text": "2000","value": "2000"},"options": [{"text": "2000","value": "2000","selected": true},{"text": "5000","value": "5000","selected": false},{"text": "10000","value": "10000","selected": false},{"text": "50000","value": "50000","selected": false}],"hide": 0},
{"name": "focus_as","type": "textbox","label": "Focus AS","description": "Optional. Enter an ASN to show only adjacencies touching that AS (1-hop view). Leave blank for the full graph.","query": "","current": {"text": "","value": ""},"options": [{"text": "","value": "","selected": true}],"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "AS Relationship Map",
"uid": "as-rel-map",
"version": 1,
"weekStart": ""
}
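
Review note: the dashboard description says the graph is "derived from consecutive AS pairs in observed AS_PATHs", with the edge label being the occurrence count. The definition of `mv_as_adjacency` is not part of this file, so the sketch below is only one plausible way to compute such counts. The function name `as_adjacency`, the choice to collapse AS prepending, and the choice to treat edges as undirected are all assumptions made for illustration.

```python
from collections import Counter


def as_adjacency(as_paths, min_occ=1):
    """Count adjacencies between consecutive ASNs across a set of AS_PATHs.

    Assumptions (not confirmed by the dashboard source): repeated ASNs from
    prepending are collapsed, and an edge is stored as (lower_asn, higher_asn).
    """
    counts = Counter()
    for path in as_paths:
        # Drop consecutive duplicates so prepending does not create self-edges.
        collapsed = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        for a, b in zip(collapsed, collapsed[1:]):
            counts[(min(a, b), max(a, b))] += 1
    # Mirror the 'Min occurrences' variable: keep only edges seen often enough.
    return {edge: n for edge, n in counts.items() if n >= min_occ}


if __name__ == "__main__":
    paths = [
        [64500, 64501, 64502],
        [64500, 64501, 64501, 64503],  # 64501 is prepended once
        [64502, 64501],
    ]
    print(as_adjacency(paths, min_occ=2))
```

Raising `min_occ` thins the result the same way the 'Min occurrences' variable thins the node graph.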


@ -0,0 +1,78 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "BGP peering topology — every monitored router as a node, every BGP session as an edge. Route reflectors (set the RR Loopbacks variable) show a red ring. Sessions to non-monitored neighbours (e.g. the ExaBGP injector) are listed in the table below.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routers and their BGP sessions. Node ring: red = Route Reflector (per the RR Loopbacks variable), blue = standard router. Edge label shows iBGP vs eBGP.",
"fieldConfig": {
"defaults": {},
"overrides": [
{"matcher": {"id": "byName","options": "arc__rr"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}},{"id": "displayName","value": "Route Reflector"}]},
{"matcher": {"id": "byName","options": "arc__std"},"properties": [{"id": "color","value": {"fixedColor": "blue","mode": "fixed"}},{"id": "displayName","value": "Router"}]}
]
},
"gridPos": {"h": 22,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"nodes": {"mainStatUnit": "","secondaryStatUnit": ""}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT vp.router_hash_id::text AS id,\n MAX(vp.routername) AS title,\n 'AS ' || MAX(vp.localasn)::text AS mainstat,\n COUNT(DISTINCT vp.peerbgpid) FILTER (WHERE vp.peer_state = 'up')::text || ' peers up' AS secondarystat,\n host(MAX(vp.localbgpid)) AS detail__bgp_id,\n CASE WHEN host(MAX(vp.localbgpid)) = ANY(ARRAY[${rr_loopbacks:singlequote}]::text[]) THEN 'Route Reflector' ELSE 'Router' END AS detail__role,\n CASE WHEN host(MAX(vp.localbgpid)) = ANY(ARRAY[${rr_loopbacks:singlequote}]::text[]) THEN 1 ELSE 0 END AS arc__rr,\n CASE WHEN host(MAX(vp.localbgpid)) = ANY(ARRAY[${rr_loopbacks:singlequote}]::text[]) THEN 0 ELSE 1 END AS arc__std\nFROM v_peers vp\nWHERE vp.localbgpid IS NOT NULL\nGROUP BY vp.router_hash_id\nORDER BY title",
"refId": "nodes"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "WITH rid AS (\n SELECT DISTINCT router_hash_id, host(localbgpid) AS bgpid\n FROM v_peers WHERE localbgpid IS NOT NULL\n),\nsess AS (\n SELECT\n LEAST(vp.router_hash_id, rid2.router_hash_id)::text AS a,\n GREATEST(vp.router_hash_id, rid2.router_hash_id)::text AS b,\n CASE WHEN vp.peerasn = vp.localasn THEN 'iBGP' ELSE 'eBGP' END AS kind,\n vp.peer_hash_id AS feed\n FROM v_peers vp\n JOIN rid rid2 ON rid2.bgpid = host(vp.peerbgpid) OR rid2.bgpid = host(vp.peerip)\n WHERE vp.peer_state = 'up' AND vp.router_hash_id <> rid2.router_hash_id\n)\nSELECT a || '-' || b AS id, a AS source, b AS target,\n COUNT(DISTINCT feed)::int AS mainstat,\n MAX(kind) AS secondarystat,\n MAX(kind) AS detail__session_type\nFROM sess GROUP BY a, b",
"refId": "edges"
}
],
"title": "BGP Peering Topology",
"type": "nodeGraph"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Up BGP sessions whose far end is not a monitored router (no node to draw an edge to) — e.g. eBGP sessions to the ExaBGP route injector or external peers.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "Type"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"eBGP": {"color": "orange","index": 0},"iBGP": {"color": "blue","index": 1}},"type": "value"}]}]}]
},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 22},
"id": 2,
"options": {"showHeader": true,"sortBy": [{"desc": false,"displayName": "Router"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT vp.routername AS \"Router\",\n host(vp.peerip) AS \"Peer\",\n vp.peerasn AS \"Peer AS\",\n CASE WHEN vp.peerasn = vp.localasn THEN 'iBGP' ELSE 'eBGP' END AS \"Type\",\n vp.peer_state AS \"State\"\nFROM v_peers vp\nWHERE vp.peer_state = 'up'\n AND NOT EXISTS (SELECT 1 FROM v_peers r WHERE host(r.localbgpid) = host(vp.peerbgpid) OR host(r.localbgpid) = host(vp.peerip))\nORDER BY vp.routername, vp.peerip",
"refId": "A"
}
],
"title": "Sessions to External / Non-Monitored Neighbours",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "obmp-maps", "bgp"],
"templating": {
"list": [
{"name": "rr_loopbacks","type": "custom","label": "RR Loopbacks","description": "Loopback / BGP router-id addresses of your route reflectors. Edit this list for your environment.","query": "10.10.255.0,10.10.255.20,10.11.255.0,10.11.255.20","multi": true,"includeAll": true,"current": {"text": ["All"],"value": ["$__all"]},"options": [{"text": "All","value": "$__all","selected": true},{"text": "10.10.255.0","value": "10.10.255.0","selected": false},{"text": "10.10.255.20","value": "10.10.255.20","selected": false},{"text": "10.11.255.0","value": "10.11.255.0","selected": false},{"text": "10.11.255.20","value": "10.11.255.20","selected": false}],"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "BGP Peer Map",
"uid": "bgp-peer-map",
"version": 1,
"weekStart": ""
}
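
Review note: the node and edge queries in this dashboard apply three small rules: a session is iBGP when the peer ASN equals the local ASN, a router is drawn as a route reflector when its BGP ID is in the RR Loopbacks list, and each edge gets one ID regardless of which end reported the session (the `LEAST`/`GREATEST` pair). The sketch below restates those rules for quick checking. The function names are hypothetical and exist only in this example.

```python
def session_type(local_asn, peer_asn):
    """iBGP when both ends share an ASN, eBGP otherwise."""
    return "iBGP" if local_asn == peer_asn else "eBGP"


def node_role(bgp_id, rr_loopbacks):
    """Mark a router as a route reflector if its BGP ID is in the configured list."""
    return "Route Reflector" if bgp_id in rr_loopbacks else "Router"


def edge_id(router_a, router_b):
    """Build one ID per router pair, independent of argument order."""
    low, high = sorted((router_a, router_b))
    return f"{low}-{high}"


if __name__ == "__main__":
    rrs = {"10.10.255.0", "10.10.255.20"}
    print(session_type(65000, 65000), session_type(65000, 65010))
    print(node_role("10.10.255.0", rrs), node_role("10.10.1.1", rrs))
    print(edge_id("r2", "r1"), edge_id("r1", "r2"))
```

Ordering the pair before building the ID is what keeps a session reported by both routers from being drawn as two edges.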


@ -0,0 +1,104 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Geographic view of the routes in the RIB — prefixes and origin ASes plotted by IP-geolocation (geo_ip). Shows the injected real-IP prefixes, not the lab fabric (lab routers use 10.x loopbacks that do not geolocate — see the BGP/IGP node-graph maps for the fabric). Manual refresh — the geo_ip containment join is heavy.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"gridPos": {"h": 4,"w": 24,"x": 0,"y": 0},
"id": 10,
"options": {"code": {"language": "plaintext","showLineNumbers": false,"showMiniMap": false},"content": "## Geographic Prefix & Origin-AS Map\n\nThese maps plot RIB routes by **IP geolocation** (`geo_ip`). The CML lab routers use `10.x` loopback addressing, which does **not** geolocate — so this dashboard visualises the **injected real-IP prefixes** and their **origin ASes**, not the lab fabric itself. For the lab topology see **BGP Peer Map** and **IGP / Link-State Topology Map**.\n\nQueries are bounded by the **Sample size** variable and run on **manual refresh** only — the `geo_ip` containment join scans ~20M rows.","mode": "markdown"},
"pluginVersion": "9.1.7",
"title": "",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Each marker is a RIB prefix placed at the location of a covering geo_ip block. Bounded by the Sample size variable.",
"fieldConfig": {"defaults": {"color": {"mode": "continuous-GrYlRd"},"custom": {"hideFrom": {"legend": false,"tooltip": false,"viz": false}}},"overrides": []},
"gridPos": {"h": 16,"w": 12,"x": 0,"y": 4},
"id": 1,
"options": {
"basemap": {"config": {},"name": "Layer 0","type": "default"},
"controls": {"mouseWheelZoom": true,"showAttribution": true,"showDebug": false,"showMeasure": false,"showScale": false,"showZoom": true},
"layers": [{"config": {"showLegend": false,"style": {"color": {"fixed": "blue"},"opacity": 0.4,"size": {"fixed": 4},"symbol": {"fixed": "img/icons/marker/circle.svg","mode": "fixed"}}},"location": {"latitude": "latitude","longitude": "longitude","mode": "coords"},"name": "Prefixes","tooltip": true,"type": "markers"}],
"tooltip": {"mode": "details"},
"view": {"id": "zero","lat": 25,"lon": 0,"zoom": 2}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "WITH p AS (\n SELECT DISTINCT prefix FROM ip_rib\n WHERE iswithdrawn = false AND isipv4 = true AND family(prefix) = 4\n AND NOT (prefix << '10.0.0.0/8')\n LIMIT $sample_limit\n)\nSELECT host(p.prefix) AS prefix, g.latitude, g.longitude\nFROM p CROSS JOIN LATERAL (\n SELECT latitude, longitude FROM geo_ip WHERE ip >>= p.prefix LIMIT 1\n) g",
"refId": "A"
}
],
"title": "RIB Prefix Geolocation",
"type": "geomap"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "One marker per origin AS, placed at the geolocation of one of its prefixes and enriched with whois name.",
"fieldConfig": {"defaults": {"color": {"mode": "continuous-BlPu"},"custom": {"hideFrom": {"legend": false,"tooltip": false,"viz": false}}},"overrides": []},
"gridPos": {"h": 16,"w": 12,"x": 12,"y": 4},
"id": 2,
"options": {
"basemap": {"config": {},"name": "Layer 0","type": "default"},
"controls": {"mouseWheelZoom": true,"showAttribution": true,"showDebug": false,"showMeasure": false,"showScale": false,"showZoom": true},
"layers": [{"config": {"showLegend": false,"style": {"color": {"fixed": "orange"},"opacity": 0.6,"size": {"fixed": 6},"symbol": {"fixed": "img/icons/marker/circle.svg","mode": "fixed"}}},"location": {"latitude": "latitude","longitude": "longitude","mode": "coords"},"name": "Origin ASes","tooltip": true,"type": "markers"}],
"tooltip": {"mode": "details"},
"view": {"id": "zero","lat": 25,"lon": 0,"zoom": 2}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "WITH oa AS (\n SELECT origin_as, MIN(prefix) AS prefix\n FROM ip_rib\n WHERE iswithdrawn = false AND isipv4 = true AND family(prefix) = 4\n AND NOT (prefix << '10.0.0.0/8') AND origin_as IS NOT NULL\n GROUP BY origin_as\n)\nSELECT 'AS' || oa.origin_as AS \"AS\",\n COALESCE(NULLIF(ia.as_name,''),'AS'||oa.origin_as) AS as_name,\n g.latitude, g.longitude\nFROM oa\nLEFT JOIN info_asn ia ON ia.asn = oa.origin_as\nCROSS JOIN LATERAL (\n SELECT latitude, longitude FROM geo_ip WHERE ip >>= oa.prefix LIMIT 1\n) g",
"refId": "A"
}
],
"title": "Origin-AS Geographic Distribution",
"type": "geomap"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Origin ASes grouped by whois country. ASes with no whois enrichment show as (unknown). Fast — no geo_ip join.",
"fieldConfig": {"defaults": {"custom": {"align": "auto","displayMode": "auto"}},"overrides": [{"matcher": {"id": "byName","options": "Origin ASes"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "color","value": {"mode": "continuous-BlPu"}}]}]},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 20},
"id": 3,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Origin ASes"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT COALESCE(NULLIF(ia.country,''),'(unknown)') AS \"Country\",\n COUNT(DISTINCT r.origin_as) AS \"Origin ASes\"\nFROM (SELECT DISTINCT origin_as FROM ip_rib WHERE iswithdrawn = false AND origin_as IS NOT NULL) r\nLEFT JOIN info_asn ia ON ia.asn = r.origin_as\nGROUP BY 1\nORDER BY 2 DESC",
"refId": "A"
}
],
"title": "Origin-AS by Country",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "obmp-maps", "geo"],
"templating": {
"list": [
{"name": "sample_limit","type": "custom","label": "Sample size","description": "Number of distinct prefixes to geolocate on the RIB Prefix map. Larger = slower (the geo_ip join scans ~20M rows).","query": "2000,5000,10000","current": {"text": "2000","value": "2000"},"options": [{"text": "2000","value": "2000","selected": true},{"text": "5000","value": "5000","selected": false},{"text": "10000","value": "10000","selected": false}],"hide": 0}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "Geographic Prefix Map",
"uid": "geo-prefix-map",
"version": 1,
"weekStart": ""
}


@ -0,0 +1,103 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "IGP link-state topology (BGP-LS) as a node graph. Scope with the BGP Peer feed, IGP protocol, and AS to keep the graph readable. Edge labels are IGP metric; node rings show Segment Routing capability.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Every BGP-LS node and link for the selected peer feed / protocol / AS. SR-capable nodes show a green ring.",
"fieldConfig": {
"defaults": {},
"overrides": [
{"matcher": {"id": "byName","options": "arc__sr"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}},{"id": "displayName","value": "SR-capable"}]},
{"matcher": {"id": "byName","options": "arc__plain"},"properties": [{"id": "color","value": {"fixedColor": "blue","mode": "fixed"}},{"id": "displayName","value": "No SR"}]}
]
},
"gridPos": {"h": 28,"w": 24,"x": 0,"y": 0},
"id": 2,
"options": {"nodes": {"mainStatUnit": "","secondaryStatUnit": ""}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT n.hash_id::text AS id,\n CASE WHEN COALESCE(n.name,'') = '' THEN n.igp_router_id ELSE n.name END AS title,\n n.router_id AS mainstat,\n n.protocol::text AS secondarystat,\n n.igp_router_id AS detail__igp_id,\n 'AS ' || n.asn AS detail__asn,\n COALESCE(NULLIF(n.sr_capabilities,''),'none') AS detail__sr_caps,\n CASE WHEN COALESCE(n.sr_capabilities,'') <> '' THEN 1 ELSE 0 END AS arc__sr,\n CASE WHEN COALESCE(n.sr_capabilities,'') = '' THEN 1 ELSE 0 END AS arc__plain\nFROM ls_nodes n\nWHERE n.iswithdrawn = false\n AND n.peer_hash_id = '$peer_hash'\n AND n.protocol::text IN ($protocol)\n AND ($asn = 0 OR n.asn = $asn)\nORDER BY title",
"refId": "nodes"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT l.local_node_hash_id::text || '->' || l.remote_node_hash_id::text AS id,\n l.local_node_hash_id::text AS source,\n l.remote_node_hash_id::text AS target,\n MAX(l.igp_metric)::bigint AS mainstat,\n MAX(l.protocol::text) AS secondarystat,\n MAX(l.te_def_metric)::text AS detail__te_metric,\n MAX(l.max_link_bw)::text AS detail__max_bw\nFROM ls_links l\nJOIN ls_nodes ln ON ln.hash_id = l.local_node_hash_id AND ln.peer_hash_id = l.peer_hash_id\nWHERE l.iswithdrawn = false\n AND l.peer_hash_id = '$peer_hash'\n AND l.protocol::text IN ($protocol)\n AND ($asn = 0 OR ln.asn = $asn)\nGROUP BY l.local_node_hash_id, l.remote_node_hash_id",
"refId": "edges"
}
],
"title": "IGP / Link-State Topology",
"type": "nodeGraph"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "obmp-maps", "linkstate"],
"templating": {
"list": [
{
"current": {},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "SELECT __text,__value FROM (select peername as __text, peer_hash_id as __value, count(*) as count from v_ls_nodes group by peername,peer_hash_id) d where count > 0",
"hide": 0,
"includeAll": false,
"label": "BGP Peer feed",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (select peername as __text, peer_hash_id as __value, count(*) as count from v_ls_nodes group by peername,peer_hash_id) d where count > 0",
"refresh": 1,
"sort": 1,
"type": "query"
},
{
"allValue": "",
"current": {"selected": true,"text": "All","value": "$__all"},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "SELECT DISTINCT protocol::text FROM ls_nodes WHERE iswithdrawn=false AND protocol IS NOT NULL AND protocol::text <> '' ORDER BY 1",
"hide": 0,
"includeAll": true,
"label": "IGP Protocol",
"multi": true,
"name": "protocol",
"options": [],
"query": "SELECT DISTINCT protocol::text FROM ls_nodes WHERE iswithdrawn=false AND protocol IS NOT NULL AND protocol::text <> '' ORDER BY 1",
"refresh": 1,
"sort": 1,
"type": "query"
},
{
"allValue": "0",
"current": {"selected": true,"text": "All","value": "$__all"},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "SELECT DISTINCT asn FROM ls_nodes WHERE iswithdrawn=false AND asn > 0 ORDER BY asn",
"hide": 0,
"includeAll": true,
"label": "AS",
"multi": false,
"name": "asn",
"options": [],
"query": "SELECT DISTINCT asn FROM ls_nodes WHERE iswithdrawn=false AND asn > 0 ORDER BY asn",
"refresh": 1,
"sort": 3,
"type": "query"
}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "IGP / Link-State Topology Map",
"uid": "SNOLrQlnz",
"version": 1,
"weekStart": ""
}


@ -0,0 +1,71 @@
# OpenBMP — Grafana contact points & notification policy provisioning
# Grafana 9.1.7 (apiVersion: 1)
#
# Defines WHERE alert notifications go (contact points) and WHICH alerts go
# there (the notification policy tree). Pairs with obmp-alerts.yaml in this
# directory.
#
# ----------------------------------------------------------------------
# OPERATOR REVIEW — this file ships with PLACEHOLDERS. Fill them in.
# ----------------------------------------------------------------------
# * The 'obmp-ops' contact point below has BOTH an email and a webhook
# receiver as examples. Delete whichever you do not use and fill in real
# values for the one you keep.
# * EMAIL requires Grafana SMTP to be configured (the [smtp] section of
# grafana.ini, or GF_SMTP_* env vars on the obmp-grafana container).
# Without working SMTP the email receiver silently fails.
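# * WEBHOOK url: point it at an endpoint that accepts Grafana's generic
# webhook JSON payload (an internal handler, an alert router, etc.).
# Slack incoming webhooks and the PagerDuty Events API expect their own
# payload formats; use Grafana's dedicated slack / pagerduty receiver
# types for those instead of type: webhook.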
# * After editing, restart Grafana and verify under
# Alerting > Contact points > (test).
# ----------------------------------------------------------------------
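# Sketch of the GF_SMTP_* route mentioned above, written as environment
# entries on the Grafana compose service. Assumptions: the relay host, the
# account, the sender address, and the SMTP_PASSWORD variable are
# placeholders, not values taken from this stack.
#
#   environment:
#     GF_SMTP_ENABLED: "true"
#     GF_SMTP_HOST: "smtp.example.net:587"         # placeholder relay host:port
#     GF_SMTP_USER: "grafana-alerts"               # placeholder account
#     GF_SMTP_PASSWORD: "${SMTP_PASSWORD}"         # hypothetical .env variable
#     GF_SMTP_FROM_ADDRESS: "grafana@example.net"  # placeholder sender
#
# These map onto the enabled / host / user / password / from_address keys
# of the [smtp] section in grafana.ini.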
apiVersion: 1
# --- Contact points ----------------------------------------------------
contactPoints:
- orgId: 1
name: obmp-ops
receivers:
# ---- Email receiver (requires Grafana SMTP configured) ----
- uid: obmp-ops-email
type: email
settings:
# REPLACE with the real NOC / on-call distribution address(es).
# Comma-separate multiple recipients.
addresses: noc@example.net
singleEmail: false
disableResolveMessage: false
# ---- Webhook receiver (internal handler / any endpoint that accepts Grafana's webhook payload) ----
# Delete this block if you only use email.
- uid: obmp-ops-webhook
type: webhook
settings:
# REPLACE with your real webhook endpoint.
url: https://hooks.example.net/services/REPLACE-ME
httpMethod: POST
disableResolveMessage: false
# --- Notification policy tree -----------------------------------------
# The root policy routes every alert from obmp-alerts.yaml to 'obmp-ops'.
# Sub-routes split by the `severity` label so critical alerts can page
# faster / repeat sooner than warnings.
policies:
- orgId: 1
receiver: obmp-ops
# Group alerts that share these labels into a single notification.
group_by: ['alertname', 'service']
# Timing for the default (warning-ish) path.
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
routes:
# Critical alerts (peer down, router BMP down): notify fast, repeat
# more often until resolved.
- receiver: obmp-ops
matchers:
- severity = critical
group_wait: 10s
group_interval: 2m
repeat_interval: 1h
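# Sketch only (not part of the shipped policy): a further sub-route under
# `routes:` could repeat less often for the RPKI rule, which
# obmp-alerts.yaml labels `service: routing-security`. The 12h value is an
# illustrative assumption, not a recommendation from this repo.
#
#     - receiver: obmp-ops
#       matchers:
#         - service = routing-security
#       repeat_interval: 12h
#
# Routes are evaluated in order, so the entry would sit after the critical
# route above.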


@ -0,0 +1,270 @@
# OpenBMP — Grafana unified-alerting rule provisioning
# Grafana 9.1.7 (apiVersion: 1)
#
# Provisioned alert rules for the OpenBMP BGP-monitoring stack. They query the
# PostgreSQL datasource (uid: obmp_postgres) and fire on BGP peer/router
# session loss, peer flap storms, and RPKI-invalid routes.
#
# ----------------------------------------------------------------------
# DEPLOYMENT
# ----------------------------------------------------------------------
# This file is read by Grafana from /etc/grafana/provisioning/alerting/.
# The compose stack bind-mounts ${OBMP_DATA_ROOT}/grafana/provisioning into
# the container, so copy this directory there and restart Grafana:
#
# cp -r obmp-grafana/provisioning/alerting ${OBMP_DATA_ROOT}/grafana/provisioning/
# docker compose -p obmp restart grafana
#
# Pair it with contact-points.yaml (in this directory) for notifications.
#
# ----------------------------------------------------------------------
# OPERATOR REVIEW — fields you should check before relying on these
# ----------------------------------------------------------------------
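# * folder: OBMP-Base (folderUid '1001' in the dashboard provider) reuses the
# existing dashboard folder so the rules have a home in the UI. The rule
# group below references the folder by name, so keep it in sync if the
# dashboard provider renames folder '1001'. Change it to a dedicated
# alerting folder if you prefer; the folder must already exist in Grafana.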
# * datasourceUid: obmp_postgres — confirmed correct for this stack.
# * Thresholds and `for:` durations below are reasonable starting points.
# Tune them against your production baseline (40 full-table routers will
# have a different normal flap/churn profile than the lab).
# * The reduce/threshold expression refIds (B, C) are internal to each rule;
# do not rename them without updating the matching `expression:` and
# `condition:` references.
# * Alert-rule provisioning YAML is intricate. These definitions are
# intentionally minimal and well-commented. After first load, open each
# rule in the Grafana UI (Alerting > Alert rules) and confirm it
# evaluates without error before depending on it for paging.
# ----------------------------------------------------------------------
apiVersion: 1
groups:
- orgId: 1
name: OpenBMP BGP Health
folder: OBMP-Base
# How often every rule in this group is evaluated.
interval: 1m
rules:
# ------------------------------------------------------------------
# (a) BGP peer down within the last 15 minutes
# ------------------------------------------------------------------
# bgp_peers.state is an enum ('up'/'down'); .timestamp is the last
# state-change time. A peer whose state is 'down' AND changed within
# the last 15 min indicates a recent session loss.
- uid: obmp-peer-down
title: BGP Peer Down (recent)
condition: C
for: 5m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM bgp_peers
WHERE state = 'down'
AND timestamp > (now() AT TIME ZONE 'utc') - interval '15 minutes';
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Fire when one or more peers went down in the last 15 min.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: critical
service: bmp
annotations:
summary: One or more BGP peers went down in the last 15 minutes
description: >
{{ $values.B }} BGP peer(s) are in state 'down' with a state
change within the last 15 minutes. Check the OBMP peer
inventory and the affected routers.
# ------------------------------------------------------------------
# (b) Peer flap storm — >5 down-events for one peer in 1 hour
# ------------------------------------------------------------------
# peer_event_log records every peer state transition. Counting 'down'
# events per peer over the last hour detects a flapping session even
# if the peer is currently 'up'. The inner query groups per peer; the
# outer takes the worst offender's count.
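#
# Triage sketch (not evaluated by the rule): the alert only carries the
# worst offender's count, so a query like the following, run by hand
# against the same table and columns, lists which peers crossed the
# threshold. The >5 cut-off mirrors the rule's threshold.
#
#   SELECT peer_hash_id, count(*) AS down_events
#   FROM peer_event_log
#   WHERE state = 'down'
#     AND timestamp > (now() AT TIME ZONE 'utc') - interval '1 hour'
#   GROUP BY peer_hash_id
#   HAVING count(*) > 5
#   ORDER BY down_events DESC;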
- uid: obmp-peer-flap-storm
title: BGP Peer Flap Storm
condition: C
for: 0m
data:
- refId: A
relativeTimeRange: { from: 3600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT coalesce(max(c), 0)::float8 AS value
FROM (
SELECT count(*) AS c
FROM peer_event_log
WHERE state = 'down'
AND timestamp > (now() AT TIME ZONE 'utc') - interval '1 hour'
GROUP BY peer_hash_id
) s;
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# >5 down-events for a single peer within 1h = flap storm.
conditions:
- evaluator: { type: gt, params: [5] }
labels:
severity: warning
service: bmp
annotations:
summary: A BGP peer is flapping (more than 5 resets in the last hour)
description: >
At least one peer has logged {{ $values.B }} 'down' events in
peer_event_log within the last hour. Investigate link/session
instability on the affected peer.
# ------------------------------------------------------------------
# (c) RPKI-invalid routes present
# ------------------------------------------------------------------
# ip_rib has no RPKI column on this schema, so validity is derived by
# joining against rpki_validator (ROA cache, refreshed by the psql-app
# RPKI cron). A route is "invalid" when a covering ROA exists for the
# prefix but NO ROA matches its origin AS.
#
# NOTE: rpki_validator is empty until ENABLE_RPKI=1 has run at least
# once (every ~2h). Until then this rule correctly reports 0.
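#
# Triage sketch (not evaluated by the rule): when the alert fires, a query
# like this lists the offending routes instead of counting them. It uses
# only tables and columns that appear in this file and treats a route as
# invalid when some ROA covers its prefix but no covering ROA matches both
# its origin AS and its prefix length. The LIMIT is an arbitrary assumption.
#
#   SELECT r.prefix, r.origin_as
#   FROM ip_rib r
#   WHERE r.iswithdrawn = false
#     AND r.origin_as IS NOT NULL
#     AND EXISTS (
#       SELECT 1 FROM rpki_validator v WHERE r.prefix <<= v.prefix
#     )
#     AND NOT EXISTS (
#       SELECT 1 FROM rpki_validator v2
#       WHERE r.prefix <<= v2.prefix
#         AND r.prefix_len <= v2.prefix_len_max
#         AND v2.origin_as = r.origin_as
#     )
#   ORDER BY r.origin_as, r.prefix
#   LIMIT 50;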
- uid: obmp-rpki-invalid
title: RPKI-Invalid Routes Present
condition: C
for: 10m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM ip_rib r
WHERE r.iswithdrawn = false
AND r.origin_as IS NOT NULL
AND EXISTS (
SELECT 1 FROM rpki_validator v
WHERE r.prefix <<= v.prefix
)
AND NOT EXISTS (
SELECT 1 FROM rpki_validator v2
WHERE r.prefix <<= v2.prefix
AND r.prefix_len BETWEEN masklen(v2.prefix) AND v2.prefix_len_max
AND v2.origin_as = r.origin_as
);
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Any RPKI-invalid route is worth surfacing. Raise the param
# (e.g. to 10) if you expect a steady-state baseline of
# invalids and only want to alert on spikes.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: warning
service: routing-security
annotations:
summary: RPKI-invalid routes are present in the RIB
description: >
{{ $values.B }} route(s) in ip_rib are RPKI-invalid (a covering
ROA exists but none matches the route's origin AS). Possible
mis-origination or hijack — review the RPKI Validation dashboard.
# ------------------------------------------------------------------
# (d) Router BMP session down
# ------------------------------------------------------------------
# routers.state is the BMP session state for each monitored router.
# 'down' means the router's BMP feed to the collector has dropped.
- uid: obmp-router-bmp-down
title: Router BMP Session Down
condition: C
for: 5m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM routers
WHERE state = 'down';
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Any router with a down BMP session.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: critical
service: bmp
annotations:
summary: One or more routers have a down BMP session
description: >
{{ $values.B }} router(s) are in BMP state 'down' — the
collector is no longer receiving BMP from them. Check the
router BMP config and reachability to the collector on port 5000.


@ -27,7 +27,7 @@ providers:
# <int> Org id. Default to 1
orgId: 1
# <string> name of the dashboard folder.
-folder: 'OBMP-Base'
+folder: 'OBMP-Operations'
# <string> folder UID. will be automatically generated if not specified
folderUid: '1001'
# <string> provider type. Default to 'file'
@ -47,7 +47,7 @@ providers:
# <int> Org id. Default to 1
orgId: 1
# <string> name of the dashboard folder.
-folder: 'OBMP-History'
+folder: 'OBMP-Routing'
# <string> folder UID. will be automatically generated if not specified
folderUid: '1002'
# <string> provider type. Default to 'file'
@ -63,26 +63,6 @@ providers:
path: /var/lib/grafana/dashboards/obmp/History-1002
# <bool> use folder names from filesystem to create folders in Grafana
foldersFromFilesStructure: false
-- name: 'OpenBMP-Tops'
-# <int> Org id. Default to 1
-orgId: 1
-# <string> name of the dashboard folder.
-folder: 'OBMP-Tops'
-# <string> folder UID. will be automatically generated if not specified
-folderUid: '1003'
-# <string> provider type. Default to 'file'
-type: file
-# <bool> disable dashboard deletion
-disableDeletion: false
-# <int> how often Grafana will scan for changed dashboards
-updateIntervalSeconds: 30
-# <bool> allow updating provisioned dashboards from the UI
-allowUiUpdates: true
-options:
-# <string, required> path to dashboard files on disk. Required when using the 'file' type
-path: /var/lib/grafana/dashboards/obmp/Tops-1003
-# <bool> use folder names from filesystem to create folders in Grafana
-foldersFromFilesStructure: false
- name: 'OpenBMP-LinkState'
# <int> Org id. Default to 1
orgId: 1
@ -125,7 +105,7 @@ providers:
foldersFromFilesStructure: false
- name: 'OpenBMP-Learning'
orgId: 1
-folder: 'OBMP-Learning'
+folder: 'OBMP-Reference'
folderUid: '2001'
type: file
disableDeletion: false
@ -133,4 +113,26 @@ providers:
allowUiUpdates: true
options:
path: /var/lib/grafana/dashboards/Learning
foldersFromFilesStructure: false
+- name: 'OpenBMP-Telemetry'
+orgId: 1
+folder: 'OBMP-Telemetry'
+folderUid: '3001'
+type: file
+disableDeletion: false
+updateIntervalSeconds: 30
+allowUiUpdates: true
+options:
+path: /var/lib/grafana/dashboards/Telemetry-3001
+foldersFromFilesStructure: false
+- name: 'OpenBMP-Maps'
+orgId: 1
+folder: 'OBMP-Maps'
+folderUid: '1006'
+type: file
+disableDeletion: false
+updateIntervalSeconds: 30
+allowUiUpdates: true
+options:
+path: /var/lib/grafana/dashboards/obmp/Maps-1006
+foldersFromFilesStructure: false


@ -0,0 +1,16 @@
apiVersion: 1
datasources:
- name: InfluxDB-Telemetry
uid: obmp_influxdb
type: influxdb
access: proxy
url: http://obmp-influxdb:8086
jsonData:
version: Flux
organization: openbmp
defaultBucket: telemetry
secureJsonData:
token: openbmp-telemetry-token
isDefault: false
editable: true
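# Sketch (an assumption, not part of this commit): Grafana expands
# $VAR / ${VAR} references in provisioning files, so the token above could
# be read from the Grafana container's environment instead of being
# committed in plain text. INFLUXDB_TOKEN is a hypothetical variable name
# and would have to be set on the Grafana service.
#
#   secureJsonData:
#     token: ${INFLUXDB_TOKEN}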

Some files were not shown because too many files have changed in this diff