Compare commits

24 Commits

Author SHA1 Message Date
sam
26dea47a55 Make the ASN View origin-AS selector a free-text input
asn_num was a fixed custom variable; converting it to a textbox lets an
operator look up any origin AS and see all of its RIB prefixes, upstreams,
and downstreams.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:22:21 -07:00
sam
9d74940614 Fix ExaBGP OOM, add container health checks and resource monitoring
RCA: the exabgp container was OOM-killed — its 512m mem_limit was far too
small for the full-table feature (900K route objects in memory). Raises the
limit to a parameterized 6g default (EXABGP_MEM_LIMIT).

Adds Docker healthchecks to 14 services (port/HTTP probes) so unhealthy
containers are visible. Adds a Telegraf docker input that collects per-
container CPU/memory/IO into InfluxDB, plus a "Stack Resources" dashboard —
so resource pressure is caught before it causes an OOM crash. telegraf runs
with an overridden entrypoint so it keeps root and can read the docker socket.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:03:52 -07:00
sam
482c0cdc01 Add ipv6 unicast to ExaBGP neighbor family
The IOS-XR routers negotiate IPv6 unicast capability, but the generated
exabgp.conf declared only ipv4 unicast — producing repeated "route family
(ipv6/unicast) is not configured" errors that crashed ExaBGP. Declaring
ipv6 unicast on the neighbor matches the routers' capabilities and stops
the crash-restart cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:40:32 -07:00
sam
6d3387dfe5 Add RR next-hop sanity check to the RR Loc-RIB Diff dashboard
Adds a panel that flags the next-hop-self-on-an-RR anti-pattern: reflected
routes (those carrying ORIGINATOR_ID) whose NEXT_HOP is an RR loopback while
the route was originated by a different router — meaning the RR rewrote
next-hop to itself and has been pulled into the forwarding path. RR-originated
routes and legitimately-imported eBGP routes (originator == next-hop) are
excluded. An editable rr_loopbacks template variable keeps it environment-
agnostic — useful for validating RR behavior during an IOS-XR to Junos
migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:18:22 -07:00
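The rule described in the commit above can be sketched in a few lines of Python. This is only an illustration of the predicate, not the dashboard's actual SQL: the function name `rr_rewrote_next_hop`, the dict-shaped route, and the example loopback addresses are all assumptions; only the `originator_id` and `next_hop` field names and the `rr_loopbacks` variable come from the source.

```python
from ipaddress import ip_address


def rr_rewrote_next_hop(route, rr_loopbacks):
    """Flag a reflected route whose next hop is an RR loopback
    although a different router originated it (hypothetical helper)."""
    originator = route.get("originator_id")
    if originator is None:
        # No ORIGINATOR_ID: the route was not reflected, so it is out of scope.
        return False
    next_hop = ip_address(route["next_hop"])
    if next_hop not in rr_loopbacks:
        # Next hop is not a route reflector, nothing to flag.
        return False
    # originator == next-hop means the RR originated or legitimately
    # imported the route; only a mismatch indicates a rewritten next hop.
    return ip_address(originator) != next_hop


# Example RR loopbacks (made-up addresses standing in for rr_loopbacks).
RR_LOOPBACKS = {ip_address("10.0.0.1"), ip_address("10.0.0.2")}
```

A route originated by 10.0.0.7 with next hop 10.0.0.1 would be flagged; the same route with next hop 10.0.0.7 would not.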
sam
a662496e53 Fix telemetry dashboard variables and parameterize gNMI targets
The telemetry dashboards' router/interface variables used a keep|distinct
Flux pattern that returned only one source; switch to schema.tagValues so all
streaming routers and interfaces are listed. Parameterize telegraf.conf gNMI
addresses and credentials via GNMI_ADDRESSES/GNMI_USERNAME/GNMI_PASSWORD so
the telemetry fleet can scale without editing the config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:10:57 -07:00
sam
0732ebfa07 Add production-readiness deliverables: security, backup, alerting
Adds a prioritized security-hardening checklist, a PostgreSQL logical-backup
script (pg-backup.sh) with a documented restore procedure, and Grafana
alerting provisioning (peer-down, flap-storm, RPKI-invalid, router-down rules
plus a contact-point template). The alerting YAML and contact points need
operator review before being relied on for paging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:55:03 -07:00
sam
7e3370b5a5 Rework Grafana dashboard information architecture
Reorganizes 31 dashboards into an operator-first structure with real
navigation. Adds Router Detail and Peer Detail drilldown dashboards; merges
LS Nodes+Links and the two L3VPN dashboards; modernizes all deprecated panels
(table-old/graph/worldmap). Every dashboard gets the obmp-nav dropdown so the
whole set is reachable from anywhere. Graduates the operational "Learning"
dashboards into Operations/Routing/LinkState folders, retires the Tops folder,
and relabels folders (Base->Operations, History->Routing, Learning->Reference).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:55:03 -07:00
sam
f430758992 Scope NOC Overview "Peers Down" panels to the dashboard time range
The scorecard and table counted every bgp_peers row in a down state,
including peers removed long ago (OpenBMP never prunes bgp_peers). They now
filter on the peer's last state-change timestamp via $__timeFilter, so the
panel reflects current/recent problems rather than all-time history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:29:59 -07:00
sam
f1558946ae Add production sizing guide for 40 full-table-edge routers
Documents compute, memory, and storage requirements for a production
deployment: ~100-150M NLRI estimate, 96-128 GB RAM, 16-32 vCPU, 3-5 TB NVMe,
a split-host architecture option, PostgreSQL tuning, and a BMP RIB-scope
recommendation (Adj-RIB-In only initially).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:06:25 -07:00
sam
960806fc06 Add NOC Overview dashboard and rebuild home as a navigation hub
NOC Overview is the new flagship operator landing dashboard — health
scorecards, peer session timeline, BGP update rate, and attention tables for
peers down, churning prefixes, RPKI invalids, and topology changes. All counts
come from stats_* aggregate tables so it stays fast at production scale.
OBMP-Home is rebuilt as a lightweight navigation hub pointing at NOC Overview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:04:37 -07:00
sam
4e9bd7cc5a Add container memory limits to all services
Sets mem_limit on every service to cap the OOM/swap-exhaustion risk (the lab
host had only 5 MiB swap free). The three heavy services (psql, kafka,
psql-app) read their limits from .env so production can raise them; the rest
use lab-appropriate fixed values. Total ~25 GB, leaving headroom on the 31 GB
lab host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:04:37 -07:00
sam
8ac156ce86 Add second-lab ExaBGP peering and bulk BMP config script
Generalizes exabgp/startup.sh to template BGP neighbors from an EXABGP_PEERS
list (ip:peer_as:description), so ExaBGP peers with multiple labs. Adds
cml/proxmox_bmp_config.py to apply the bmp server block to a lab's IOS-XR
routers over SSH (BMP config is not exposed via NETCONF YANG on current XR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 19:21:11 -07:00
sam
cf4e5b07c6 Add Compose profiles, setup.sh bootstrap, and config templates for portable deployment
Pins the Compose project name and splits services into core / test / auth
profiles so the BMP collector core can deploy standalone. Adds setup.sh
(idempotent bootstrap), .env.example, and repo-resident Authelia config
templates so a fresh host deploys without manual steps. Parameterizes
hardcoded host IP and domain; points the Grafana InfluxDB datasource at the
container name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 19:21:04 -07:00
sam
31286d5d3e Add platform roadmap: multi-lab CML integration and production deployment
Four-track roadmap covering configuration centralization (inventory.yaml),
CML API automation (virl2_client), production ISP deployment (multi-vendor
IOS-XR + Junos), and packaging for distribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:38 -07:00
sam
da49b3e462 Add CML integration: XRd and ExaBGP node/image definitions and build scripts
CML 2.9 node definitions for XRd Control-Plane (third RR) and ExaBGP route
injector as Docker-based CML nodes. Includes build scripts to export Docker
images as tars for CML import, with IOS-XR startup configs for IS-IS, BGP,
and BMP.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:30 -07:00
sam
541f018bc5 Add RR Loc-RIB diff dashboard and route diversity config
Dashboard compares Adj-RIB-In tables between two Route Reflectors via BMP,
showing missing prefixes, attribute diffs (next-hop, AS path), and per-client
consistency. Route diversity script deploys 29 prefixes across R9K-01-07 via
NETCONF to create verifiable next-hop differences between RRs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:19 -07:00
sam
45f4c9859d Add Authelia auth gateway, portal landing page, and subpath routing
Adds Authelia (forward-auth) and nginx portal container for single-endpoint
authenticated access via Caddy reverse proxy. Configures Grafana auth proxy
for header-based auto-login. Updates Vue UI base paths and API routes for
/exabgp/ and /traffic/ subpath serving. Adds traffic-gen responder container
on dedicated Docker network.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:23:09 -07:00
sam
422b98d555 Fix telemetry dashboards: update Flux queries and InfluxDB datasource URL
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:58 -07:00
sam
d691b512f9 Add full internet table injection with background worker and progress tracking
Generates realistic IPv4 routing tables (1K-900K prefixes) with DFZ-like
prefix length distribution, varied AS paths, and transit ASN diversity.
Background injection with progress API, CLI follow mode, and Vue UI
component with preset sizes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:51 -07:00
sam
1f0936763b Add traffic generator improvements: mode switching, ping, responder echo, RFC2544 fixes
Adds sender/responder mode switching via API, QuickPing component, echo-mode
responder with dedicated container, improved flow state sync, and RFC2544
test runner enhancements. Includes UI improvements across all traffic-gen
components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-15 14:22:41 -07:00
sam
c28c9b2527 Fix gNMI telemetry: OpenConfig paths, json_ietf encoding, SSH config
- Switch Telegraf from native IOS-XR YANG paths to OpenConfig
  (openconfig-interfaces:interfaces/interface/state/counters)
- Use json_ietf encoding instead of proto (IOS-XR 24.3.1 compat)
- Target only CORE-01/CORE-02 (R9K routers blocked by CML mgmt net)
- Update all 3 Grafana dashboard queries to match OpenConfig field
  names (in-octets, out-octets, in-pkts, out-pkts, in-errors, etc.)
- Rewrite gnmi_grpc_config.py to use SSH/CLI via paramiko instead of
  NETCONF (IOS-XR 24.3.1 rejects NETCONF gRPC edit-config)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 16:19:16 -07:00
sam
6b45f124f0 Remove __pycache__ from tracking and add to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:40:14 -07:00
sam
dcebf15bb3 Add Phase 4: gNMI streaming telemetry and traffic generator
- gNMI integration: NETCONF script to enable gRPC on all 9 routers,
  Telegraf container with gnmi input plugin, InfluxDB for time-series
  storage, 3 Grafana telemetry dashboards (utilization, errors, combined)
- Traffic generator: Scapy-based dual-mode container (sender/responder)
  with Flask API, RFC 2544 test suite (throughput, latency, frame-loss,
  back-to-back), Vue 3 web UI with flow builder, test runner, real-time
  stats monitor, and results export
- docker-compose.yml updated with influxdb, telegraf, traffic-gen,
  traffic-gen-ui services
- Full documentation in DOCS.md sections 15-16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:29:44 -07:00
sam
f23e222bc0 Add Phase 3: TE/SR analytics, anomaly detection, DB schema reference
- 4 new Grafana dashboards:
  - Database Schema Map (obmp-learn-07): interactive schema reference
    with live row counts, relationship diagrams, column details
  - TE & Segment Routing Analytics (obmp-learn-08): exposes BGP-LS TE/SR
    fields (bandwidth, admin groups, SRLG, SR SIDs, protection types)
  - Topology Change & Anomaly Detection (obmp-learn-09): link state
    change tracking, origin AS hijack detection, convergence timeline
  - Link Utilization & TE Thought Experiment (obmp-learn-10): capacity
    data from BGP-LS + streaming telemetry integration guide
- DB_SCHEMA.md: standalone database reference (33 tables, 11 views)
- 3 new ExaBGP scenarios: te_community_steering, origin_shift, path_diversity
- Updated DOCS.md with Phase 3 dashboards and scenarios

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 13:31:03 -07:00
104 changed files with 18166 additions and 3257 deletions

71
.env.example Normal file

@@ -0,0 +1,71 @@
# OpenBMP stack configuration — copy to .env and fill in.
# cp .env.example .env && $EDITOR .env && ./setup.sh
# The real .env is git-ignored and never committed.
# ---------------------------------------------------------------------------
# Core deployment
# ---------------------------------------------------------------------------
# Host path for all persistent data (postgres, kafka, grafana, authelia, ...).
OBMP_DATA_ROOT=/var/openbmp
# IP of this host that routers and external clients connect to
# (Kafka external listener, BMP source, ExaBGP peering).
HOST_IP=changeme
# Public domain fronting Grafana / Authelia / portal (TLS terminates upstream).
OBMP_DOMAIN=changeme.example.com
# Authelia session-cookie domain — the parent domain of OBMP_DOMAIN so the
# cookie is valid across subpaths/subdomains.
OBMP_COOKIE_DOMAIN=example.com
# Container memory limits. Lab defaults shown; raise for production
# (see docs/production-sizing.md). psql-app's limit must exceed its MEM heap.
PSQL_MEM_LIMIT=6g
PSQL_APP_MEM_LIMIT=4g
KAFKA_MEM_LIMIT=4g
# ExaBGP — the full-table feature holds up to 900K route objects in memory.
EXABGP_MEM_LIMIT=6g
# gNMI streaming telemetry (telegraf, test profile). GNMI_ADDRESSES is a
# quoted, comma-separated host:port list — add a router here once gNMI/grpc
# is enabled on it and the management path is reachable.
GNMI_ADDRESSES="10.100.0.100:57400", "10.100.0.200:57400"
GNMI_USERNAME=changeme
GNMI_PASSWORD=changeme
# ---------------------------------------------------------------------------
# ExaBGP route injector (test profile)
# ---------------------------------------------------------------------------
EXABGP_LOCAL_IP=changeme
EXABGP_LOCAL_AS=65100
EXABGP_API_PORT=5050
# Semicolon-separated peer list, each entry "ip:peer_as:description".
EXABGP_PEERS=10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02
# ---------------------------------------------------------------------------
# CML lab API + IOS-XR NETCONF (used by cml/ automation scripts)
# ---------------------------------------------------------------------------
PROX-CML_URL=http://changeme
PROX-CML_USERNAME=changeme
PROX-CML_PASSWORD=changeme
# Default IOS-XR NETCONF credentials, plus the admin-tier override for routers
# that use a separate account.
IOSXR_NETCONF_USER=changeme
IOSXR_NETCONF_PASS=changeme
IOSXR_NETCONF_ADMIN_USER=changeme
IOSXR_NETCONF_ADMIN_PASS=changeme
# ---------------------------------------------------------------------------
# Integrations
# ---------------------------------------------------------------------------
GITEA_API_KEY=changeme
# ---------------------------------------------------------------------------
# Authelia secrets — leave BLANK; setup.sh generates them with openssl on a
# fresh host and appends them here. Existing values are never overwritten.
# ---------------------------------------------------------------------------
AUTHELIA_SESSION_SECRET=
AUTHELIA_JWT_SECRET=
AUTHELIA_STORAGE_ENCRYPTION_KEY=
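
The `EXABGP_PEERS` format in the file above (semicolon-separated entries, each `ip:peer_as:description`) can be illustrated with a small parser. This is a sketch only: the repository templates neighbors in `exabgp/startup.sh`, which is not shown here, so the `Peer` type and `parse_exabgp_peers` function are hypothetical. It assumes IPv4 peer addresses, since an IPv6 address would contain extra colons.

```python
from typing import NamedTuple


class Peer(NamedTuple):
    ip: str
    peer_as: int
    description: str


def parse_exabgp_peers(value: str) -> list[Peer]:
    """Parse 'ip:peer_as:description;...' into Peer tuples (hypothetical helper)."""
    peers = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        # Split at most twice so a description may itself contain ':'.
        ip, peer_as, description = entry.split(":", 2)
        peers.append(Peer(ip, int(peer_as), description))
    return peers


peers = parse_exabgp_peers(
    "10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02"
)
```

Converting `peer_as` to `int` makes a malformed entry fail early instead of producing a broken neighbor block.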

2
.gitignore vendored

@@ -2,4 +2,6 @@
*.log
.env
.claude/
__pycache__/
*.pyc

387
DB_SCHEMA.md Normal file

@@ -0,0 +1,387 @@
# OpenBMP Database Schema Reference
PostgreSQL database `openbmp` with TimescaleDB extension for time-series data.
## Entity Relationship Diagram
```
collectors
└── routers (collector_hash_id)
    └── bgp_peers (router_hash_id)
        ├── ip_rib (peer_hash_id) ──► base_attrs (base_attr_hash_id)
        ├── ip_rib_log (peer_hash_id)
        ├── l3vpn_rib (peer_hash_id) ──► base_attrs
        ├── ls_nodes (peer_hash_id)
        ├── ls_links (peer_hash_id) ──► ls_nodes (local/remote_node_hash_id)
        ├── ls_prefixes (peer_hash_id) ──► ls_nodes (local_node_hash_id)
        ├── peer_event_log (peer_hash_id)
        ├── stat_reports (peer_hash_id)
        └── stats_* tables (peer_hash_id)
ip_rib.prefix ◄──► global_ip_rib.prefix (aggregated view)
    ├── rpki_origin_as ◄── rpki_validator
    └── irr_origin_as ◄── info_route
base_attrs.origin_as ──► info_asn.asn (ASN enrichment)
routers.geo_ip_start ──► geo_ip.ip (geolocation)
```
---
## BMP Core Tables
### routers
BMP-monitored routers (one row per monitored device).
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| name | varchar(200) | Router hostname |
| ip_address | inet | Router management IP |
| router_as | bigint | Router ASN |
| bgp_id | inet | BGP router-id |
| collector_hash_id | uuid | FK to collectors |
| state | opstate | up / down |
| timestamp | timestamp | Last update time |
| description | varchar(255) | Router description |
| init_data | text | BMP init message data |
| term_reason_code | int | BMP termination reason |
### collectors
BMP collector instances.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| admin_id | varchar(64) | Admin identifier |
| name | varchar(200) | Collector name |
| ip_address | varchar(40) | Collector IP |
| state | opstate | up / down |
| router_count | smallint | Number of monitored routers |
### bgp_peers
BGP sessions per router (one row per peer per router).
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key (composite with router_hash_id) |
| router_hash_id | uuid | FK to routers |
| peer_addr | inet | Peer IP address |
| peer_as | bigint | Peer ASN |
| peer_bgp_id | inet | Peer BGP router-id |
| name | varchar(200) | Peer name |
| state | opstate | up / down |
| isl3vpnpeer | boolean | L3VPN peer flag |
| isipv4 | boolean | IPv4 peer |
| isprepolicy | boolean | Pre-policy RIB |
| islocrib | boolean | Local RIB |
| local_ip | inet | Local IP |
| local_asn | bigint | Local ASN |
| local_hold_time | smallint | Local hold time |
| remote_hold_time | smallint | Remote hold time |
| sent_capabilities | varchar(4096) | BGP capabilities sent |
| recv_capabilities | varchar(4096) | BGP capabilities received |
| table_name | varchar(255) | VRF/table name |
### peer_event_log (TimescaleDB)
Historical BGP session state changes.
| Column | Type | Description |
|--------|------|-------------|
| id | bigint | Event sequence |
| peer_hash_id | uuid | FK to bgp_peers |
| state | opstate | up / down |
| timestamp | timestamp | Event time (partition key) |
| bmp_reason | smallint | BMP reason code |
| bgp_err_code | smallint | BGP error code |
| bgp_err_subcode | smallint | BGP error subcode |
| error_text | varchar(255) | Error description |
---
## BGP Path Attributes
### base_attrs
BGP path attributes shared across routes.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Primary key |
| peer_hash_id | uuid | FK to bgp_peers |
| origin | varchar(16) | IGP / EGP / Incomplete |
| as_path | bigint[] | AS path array |
| as_path_count | smallint | AS path length |
| origin_as | bigint | Origin ASN |
| next_hop | inet | BGP next-hop |
| med | bigint | Multi-Exit Discriminator |
| local_pref | bigint | Local preference |
| community_list | varchar(15)[] | Standard communities |
| ext_community_list | varchar(50)[] | Extended communities (RT, etc.) |
| large_community_list | varchar(40)[] | Large communities (RFC 8092) |
| cluster_list | varchar(40)[] | Route reflector cluster list |
| isatomicagg | boolean | Atomic aggregate flag |
| originator_id | inet | RR originator ID |
| aggregator | varchar(64) | Aggregator |
**Indexes**: GIN on as_path, community_list, ext_community_list, large_community_list
---
## IP RIB Tables
### ip_rib
Current IPv4/IPv6 unicast routing table.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Route hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| base_attr_hash_id | uuid | FK to base_attrs |
| prefix | inet | IP prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| isipv4 | boolean | IPv4 flag |
| iswithdrawn | boolean | Withdrawn flag |
| labels | varchar(255) | MPLS labels |
| path_id | bigint | Add-Path ID |
| isprepolicy | boolean | Pre-policy flag |
| isadjribin | boolean | Adj-RIB-In flag |
| timestamp | timestamp | Last update |
| first_added_timestamp | timestamp | First seen |
### ip_rib_log (TimescaleDB)
Historical RIB changes — every advertisement and withdrawal.
| Column | Type | Description |
|--------|------|-------------|
| id | bigint | Change event ID |
| peer_hash_id | uuid | FK to bgp_peers |
| base_attr_hash_id | uuid | FK to base_attrs |
| prefix | inet | IP prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| iswithdrawn | boolean | Withdrawal flag |
| timestamp | timestamp | Event time (partition key) |
### global_ip_rib
Aggregated prefix summary across all peers.
| Column | Type | Description |
|--------|------|-------------|
| prefix | inet | IP prefix (composite PK) |
| prefix_len | smallint | Prefix length |
| recv_origin_as | bigint | Received origin AS |
| rpki_origin_as | bigint | RPKI-validated origin AS |
| irr_origin_as | bigint | IRR-registered origin AS |
| irr_source | varchar(32) | IRR source (RADB, RIPE, etc.) |
| num_peers | int | Total advertising peers |
| iswithdrawn | boolean | Withdrawn flag |
---
## L3VPN Tables
### l3vpn_rib
L3VPN (RFC 4364) routes with Route Distinguisher.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Route hash |
| peer_hash_id | uuid | FK to bgp_peers |
| base_attr_hash_id | uuid | FK to base_attrs |
| rd | varchar(128) | Route Distinguisher |
| prefix | inet | VPN prefix |
| prefix_len | smallint | Prefix length |
| origin_as | bigint | Origin ASN |
| labels | varchar(255) | MPLS VPN labels |
| ext_community_list | varchar(50)[] | Route Targets |
| path_id | bigint | Add-Path ID |
| iswithdrawn | boolean | Withdrawn flag |
### l3vpn_rib_log (TimescaleDB)
Historical L3VPN route changes.
---
## Link-State Tables (BGP-LS / RFC 7752)
### ls_nodes
IS-IS / OSPF node information from BGP-LS.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Node hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| base_attr_hash_id | uuid | FK to base_attrs |
| asn | bigint | Node ASN |
| bgp_ls_id | bigint | BGP-LS Identifier |
| igp_router_id | varchar(46) | IGP Router ID |
| router_id | varchar(46) | BGP Router ID |
| protocol | ls_proto | IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3 |
| isis_area_id | varchar(46) | IS-IS area |
| ospf_area_id | varchar(16) | OSPF area |
| name | varchar(255) | Node hostname |
| flags | varchar(20) | Node flags |
| mt_ids | varchar(128) | Multi-Topology IDs |
| **sr_capabilities** | **varchar(255)** | **SR Global Block (SRGB) ranges** |
| iswithdrawn | boolean | Withdrawn flag |
### ls_links
IS-IS / OSPF links with full TE and SR attributes.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Link hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| local_node_hash_id | uuid | FK to ls_nodes (local end) |
| remote_node_hash_id | uuid | FK to ls_nodes (remote end) |
| local_router_id | varchar(46) | Local BGP Router ID |
| remote_router_id | varchar(46) | Remote BGP Router ID |
| local_igp_router_id | varchar(46) | Local IGP Router ID |
| remote_igp_router_id | varchar(46) | Remote IGP Router ID |
| interface_addr | inet | Local interface IP |
| neighbor_addr | inet | Remote interface IP |
| igp_metric | bigint | IGP metric |
| protocol | ls_proto | IGP protocol |
| mt_id | int | Multi-Topology ID |
| local_link_id | bigint | Local link identifier |
| remote_link_id | bigint | Remote link identifier |
| name | varchar(255) | Link name |
| **admin_group** | **bigint** | **TE admin group / link color bitmap** |
| **max_link_bw** | **bigint** | **Maximum link bandwidth (bytes/sec)** |
| **max_resv_bw** | **bigint** | **Maximum reservable bandwidth** |
| **unreserved_bw** | **varchar(128)** | **Unreserved BW per priority (8 values)** |
| **te_def_metric** | **bigint** | **TE default metric (for CSPF)** |
| **protection_type** | **varchar(60)** | **Link protection (FRR type)** |
| **mpls_proto_mask** | **ls_mpls_proto_mask** | **MPLS protocol support flags** |
| **srlg** | **varchar(128)** | **Shared Risk Link Group** |
| **peer_node_sid** | **varchar(128)** | **SR Peer Node SID (EPE, RFC 9086)** |
| **sr_adjacency_sids** | **varchar(255)** | **SR Adjacency SIDs** |
| iswithdrawn | boolean | Withdrawn flag |
**Bold** = TE/SR fields available via BGP-LS but not used by default dashboards.
### ls_prefixes
IS-IS / OSPF prefix information.
| Column | Type | Description |
|--------|------|-------------|
| hash_id | uuid | Prefix hash |
| peer_hash_id | uuid | FK to bgp_peers (composite PK) |
| local_node_hash_id | uuid | FK to ls_nodes |
| prefix | inet | Advertised prefix |
| prefix_len | smallint | Prefix length |
| protocol | ls_proto | IGP protocol |
| metric | bigint | Prefix metric |
| mt_id | int | Multi-Topology ID |
| ospf_route_type | ospf_route_type | Intra/Inter/Ext-1/Ext-2/NSSA |
| igp_flags | varchar(20) | IGP flags |
| route_tag | bigint | Route tag |
| **sr_prefix_sids** | **varchar(255)** | **SR Prefix SIDs (node SIDs)** |
| iswithdrawn | boolean | Withdrawn flag |
### ls_nodes_log, ls_links_log, ls_prefixes_log (TimescaleDB)
Historical link-state changes. Same columns as parent tables plus `id` (bigint) and timestamp as partition key.
---
## Statistics Tables (TimescaleDB)
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| **stat_reports** | BMP stat messages per peer | prefixes_rejected, known_dup_prefixes, num_routes_adj_rib_in, num_routes_local_rib |
| **stats_chg_byprefix** | Per-prefix update/withdrawal counts | interval_time, prefix, updates, withdraws |
| **stats_chg_byasn** | Per-ASN update/withdrawal counts | interval_time, origin_as, updates, withdraws |
| **stats_chg_bypeer** | Per-peer update/withdrawal counts | interval_time, updates, withdraws |
| **stats_peer_rib** | Per-peer RIB size over time | interval_time, v4_prefixes, v6_prefixes |
| **stats_peer_update_counts** | Update rate statistics | interval_time, advertise_avg/min/max, withdraw_avg/min/max |
| **stats_ip_origins** | Per-ASN IP prefix counts | interval_time, asn, v4_prefixes, v6_prefixes, v4_with_rpki, v4_with_irr |
| **stats_l3vpn_chg_byprefix** | L3VPN per-prefix stats | interval_time, rd, prefix, updates, withdraws |
| **stats_l3vpn_chg_bypeer** | L3VPN per-peer stats | interval_time, updates, withdraws |
| **stats_l3vpn_chg_byrd** | L3VPN per-RD stats | interval_time, rd, updates, withdraws |
---
## Reference & Enrichment Tables
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| **rpki_validator** | RPKI ROAs | prefix, prefix_len, prefix_len_max, origin_as |
| **info_asn** | ASN WHOIS/IRR data | asn, as_name, org_name, country, source |
| **info_route** | Route IRR data | prefix, origin_as, descr, source |
| **geo_ip** | IP geolocation (DB-IP) | ip, country, city, latitude, longitude, isp_name |
| **pdb_exchange_peers** | PeeringDB IXP peering | ix_name, peer_name, peer_asn, speed, peer_ipv4/ipv6 |
---
## Views
| View | Joins | Purpose |
|------|-------|---------|
| **v_peers** | bgp_peers + routers + info_asn | Complete peer info with router name and ASN details |
| **v_ip_routes** | ip_rib + bgp_peers + base_attrs + routers | Full route detail with path attributes |
| **v_ip_routes_geo** | v_ip_routes + geo_ip | Routes with geolocation |
| **v_ip_routes_history** | ip_rib_log + base_attrs + bgp_peers + routers | Historical route changes with attributes |
| **v_l3vpn_routes** | l3vpn_rib + bgp_peers + base_attrs + routers | L3VPN routes with path attributes |
| **v_l3vpn_routes_history** | l3vpn_rib_log + base_attrs + bgp_peers + routers | Historical L3VPN changes |
| **v_ls_nodes** | ls_nodes + base_attrs + bgp_peers + routers | Link-state nodes with peer/router info |
| **v_ls_links** | ls_links + ls_nodes(x2) + routers | Links with local/remote node names + all TE/SR fields |
| **v_ls_prefixes** | ls_prefixes + ls_nodes + routers | LS prefixes with originating node info |
---
## Custom Enum Types
| Type | Values |
|------|--------|
| **opstate** | up, down |
| **ls_proto** | IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3, Direct, Static |
| **ospf_route_type** | Intra, Inter, Ext-1, Ext-2, NSSA-1, NSSA-2 |
| **ls_mpls_proto_mask** | MPLS protocol bitmask |
| **user_role** | admin, oper |
---
## Key Query Patterns
### Get all active routes with full attributes
```sql
SELECT r.prefix, r.prefix_len, ba.origin_as, ba.as_path,
ba.med, ba.local_pref, ba.community_list, ba.next_hop
FROM ip_rib r
JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id
WHERE r.iswithdrawn = false AND r.isipv4 = true
```
### Get link-state topology with TE attributes
```sql
SELECT local_router_name, remote_router_name,
igp_metric, te_def_metric, max_link_bw, admin_group, srlg,
sr_adjacency_sids
FROM v_ls_links
WHERE peer_hash_id = '<peer_hash>' AND iswithdrawn = false
```
### Time-series RIB changes
```sql
SELECT date_trunc('minute', timestamp) as time,
SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as ads,
SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as withdrawals
FROM ip_rib_log
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY 1 ORDER BY 1
```
### RPKI validation status
```sql
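-- RFC 6811 semantics: a route is Valid if any covering ROA matches its
-- origin AS and allows its prefix length, Invalid if it is covered but no
-- ROA matches, and NotFound if no ROA covers it.
SELECT status, COUNT(*)
FROM (
    SELECT CASE
        WHEN COUNT(rv.prefix) = 0 THEN 'NotFound'
        WHEN bool_or(rv.origin_as = r.origin_as
                     AND r.prefix_len <= rv.prefix_len_max) THEN 'Valid'
        ELSE 'Invalid'
    END AS status
    FROM ip_rib r
    -- <<= matches ROAs whose prefix equals or contains the route's prefix
    LEFT JOIN rpki_validator rv ON r.prefix <<= rv.prefix
    WHERE r.iswithdrawn = false
    GROUP BY r.hash_id, r.peer_hash_id
) per_route
GROUP BY status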
```
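The same three-way classification can be sketched outside the database. The following is a minimal illustration of RFC 6811 origin validation, assuming ROAs are supplied as `(prefix, max_len, origin_as)` tuples mirroring the `prefix`, `prefix_len_max`, and `origin_as` columns of `rpki_validator`. The function name `rpki_status` and the sample ROA are hypothetical.

```python
from ipaddress import ip_network


def rpki_status(prefix, origin_as, roas):
    """Classify a route as 'Valid', 'Invalid', or 'NotFound' (RFC 6811).

    roas is an iterable of (roa_prefix, max_len, roa_origin_as) tuples.
    """
    route = ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_as in roas:
        roa_net = ip_network(roa_prefix)
        # A ROA covers the route only if it is the same address family
        # and the route's prefix is equal to or inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering ROA matches when the origin AS agrees and the route
        # is no more specific than the ROA's maximum length.
        if roa_as == origin_as and route.prefixlen <= max_len:
            return "Valid"
    return "Invalid" if covered else "NotFound"


# Sample ROA using documentation address space and a private-use ASN.
SAMPLE_ROAS = [("192.0.2.0/24", 24, 64500)]
```

With the sample ROA, `192.0.2.0/24` from AS 64500 is Valid, while `192.0.2.0/25` from the same AS is Invalid because it is more specific than the ROA's maximum length.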

309
DOCS.md

@@ -16,6 +16,8 @@
12. [Troubleshooting](#12-troubleshooting)
13. [Data Retention](#13-data-retention)
14. [Environment Variables Reference](#14-environment-variables-reference)
15. [gNMI Streaming Telemetry (Phase 4)](#15-gnmi-streaming-telemetry-phase-4)
16. [Traffic Generator (Phase 4)](#16-traffic-generator-phase-4)
---
@@ -28,7 +30,7 @@ This is a **BGP Monitoring Platform (BMP) lab stack** deployed via Docker Compos
- Receives BMP (BGP Monitoring Protocol, RFC 7854) telemetry from routers on TCP port 5000
- Streams BMP data through Kafka into a TimescaleDB/PostgreSQL database
- Provides **23 Grafana dashboards** (17 operational + 6 learning-focused) for real-time and historical BGP analysis
- Provides **30 Grafana dashboards** (17 operational + 6 learning + 4 advanced analytics + 3 streaming telemetry) for real-time and historical BGP analysis
- Includes an **ExaBGP route injector** that peers with the two CORE routers and injects synthetic BGP routes, enabling testing of BGP policy, route propagation, and Grafana dashboards without needing internet connectivity
- Provides a **Vue 3 web UI** at `:5001` for point-and-click scenario management, live route tables, and peer monitoring
@@ -64,7 +66,7 @@ IOS-XR Routers (9x, AS 65020)
PostgreSQL 14 + TimescaleDB
|
+---------> obmp-grafana (grafana/grafana:9.1.7) :3000
| 23 dashboards, PostgreSQL datasource
| 30 dashboards, PostgreSQL + InfluxDB datasources
+---------> obmp-whois (openbmp/whois:2.2.0) :4300
WHOIS query server backed by the DB
@@ -73,6 +75,24 @@ ExaBGP (obmp-exabgp, built locally)
Peers eBGP to CORE-01 and CORE-02 (AS 65100 -> AS 65020)
HTTP API on :5050 — inject/withdraw routes on demand
Routes propagate via iBGP mesh to all 9 routers -> BMP -> DB -> Grafana
gNMI Streaming Telemetry (Phase 4):
IOS-XR Routers (gRPC :57400)
|
v
obmp-telegraf (telegraf:1.28 + gnmi plugin)
|
v
obmp-influxdb (influxdb:2.7) :8086
|
v
obmp-grafana (InfluxDB datasource -> Telemetry dashboards)
Traffic Generator (Phase 4):
obmp-traffic-gen (python:3.11 + Scapy + Flask) :5051
Dual-mode: sender (generate traffic) / responder (echo/log)
RFC 2544 testing, custom packet flows
obmp-traffic-gen-ui (Vue 3 + NGINX) :5002
```
### Container Summary
@ -87,7 +107,11 @@ ExaBGP (obmp-exabgp, built locally)
| obmp-grafana | grafana/grafana:9.1.7 | 3000 | Visualization |
| obmp-whois | openbmp/whois:2.2.0 | 4300 | WHOIS query server |
| obmp-exabgp | local build | 5050 (host net) | BGP route injector |
| obmp-exabgp-ui | local build | 5001 (host net) | Vue 3 web control panel |
| obmp-exabgp-ui | local build | 5001 (host net) | Route injector web UI |
| obmp-influxdb | influxdb:2.7 | 8086 | Time-series DB for telemetry |
| obmp-telegraf | local build | - (host net) | gNMI telemetry collector |
| obmp-traffic-gen | local build | 5051 (host net) | Scapy traffic generator |
| obmp-traffic-gen-ui | local build | 5002 (host net) | Traffic generator web UI |
---
@ -103,6 +127,37 @@ ExaBGP (obmp-exabgp, built locally)
## 4. Initial Setup (First Time)
### 4.0 Quick deploy (recommended)
`setup.sh` bootstraps a fresh host — it creates the data directories, syncs
Grafana provisioning, generates Authelia secrets, and renders config. It is
idempotent and safe to re-run.
```bash
git clone <this-repo-url>
cd obmp-docker
cp .env.example .env
$EDITOR .env # set HOST_IP, OBMP_DOMAIN, OBMP_COOKIE_DOMAIN, credentials
./setup.sh
docker compose up -d # BMP collector core only
docker compose --profile test --profile auth up -d # full stack (lab tools + auth)
```
The stack uses Docker Compose **profiles**:
| Command | Brings up |
|---------|-----------|
| `docker compose up -d` | Collector core only — zookeeper, kafka, collector, psql, psql-app, grafana, whois |
| `docker compose --profile test up -d` | Core **+** ExaBGP, traffic generator, telegraf, influxdb |
| `docker compose --profile auth up -d` | Core **+** Authelia gateway and portal |
| `docker compose --profile test --profile auth up -d` | Everything |
The bare `docker compose up` is the shippable standalone BMP collector — it has
no dependency on the lab/test tooling.
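As an illustration of how the profile flags in the table compose, here is a minimal Python sketch that builds the `docker compose` argument list for a given set of profiles. The helper name `compose_up_command` is hypothetical and not part of the repo; only the profile names `test` and `auth` come from the table above.

```python
import shlex

# Profile names are taken from the table above; the helper is illustrative only.
KNOWN_PROFILES = ("test", "auth")


def compose_up_command(profiles=()):
    """Return the `docker compose ... up -d` argv for the requested profiles."""
    unknown = [p for p in profiles if p not in KNOWN_PROFILES]
    if unknown:
        raise ValueError(f"unknown profile(s): {', '.join(unknown)}")
    argv = ["docker", "compose"]
    for profile in profiles:
        argv += ["--profile", profile]
    argv += ["up", "-d"]
    return argv


if __name__ == "__main__":
    # Print the four combinations listed in the table.
    for selection in [(), ("test",), ("auth",), ("test", "auth")]:
        print(shlex.join(compose_up_command(selection)))
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues; an empty profile tuple yields the bare collector-core command.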
The sections below (4.1–4.6) document the equivalent **manual** steps if you
prefer not to use `setup.sh`.
### 4.1 Clone the repository
```bash
@ -224,6 +279,22 @@ See `exabgp/iosxr_bgp_config.md` for a Python/ncclient script that pushes all of
Credentials: `username=webui`, `password=cisco`, port 830.
### 5.6 Bulk BMP config (`cml/proxmox_bmp_config.py`)
To point a whole lab of IOS-XR routers at the BMP collector at once,
`cml/proxmox_bmp_config.py` applies the `bmp server 1` block over SSH (IOS-XR
BMP config is not exposed via NETCONF YANG on current releases). It is
idempotent.
```bash
pip install paramiko
python3 cml/proxmox_bmp_config.py # all routers in the inventory
python3 cml/proxmox_bmp_config.py r9k-05 # a single router (smoke test)
```
Edit the `ROUTERS` list at the top of the script for your inventory and the
`COLLECTOR_HOST` constant for the collector address.
---
## 6. Starting and Stopping
@ -312,6 +383,9 @@ python3 inject.py scenarios
| `convergence_test` | 10 | Prefixes for timing BGP convergence — announce then check ip_rib_log timestamps |
| `route_leak` | 10 | Real prefixes re-announced with short AS paths — simulates a route leak (community 65100:999) |
| `hijack_simulation` | 10 | Prefixes claimed directly by AS 65100 — simulates a prefix hijack (community 65100:hijack) |
| `te_community_steering` | 15 | Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green) |
| `origin_shift` | 5 | Prefixes with changed origin AS — simulates origin migration for anomaly detection |
| `path_diversity` | 10 | Same prefixes with different AS paths/MEDs — demonstrates best-path selection |
### 7.4 Load a scenario
@ -495,6 +569,23 @@ Six learning-focused dashboards in a separate folder, designed to teach BGP conc
> **RPKI note:** The `rpki_validator` table is populated by a cron job in `psql-app` every 2 hours. Dashboard `obmp-learn-04` will show zero counts until the cron runs — check `ENABLE_RPKI=1` in `docker-compose.yml`.
### Advanced Analytics Dashboards (folder: `OBMP-Learning`)
Four advanced dashboards that go beyond basic BMP monitoring, unlocking TE/SR data and providing heuristic analysis.
| Dashboard | UID | What it provides |
|-----------|-----|-----------------|
| Database Schema Map | `obmp-learn-07` | Interactive schema reference — live table row counts, entity relationships, column details for all 33 tables and 11 views |
| TE & Segment Routing Analytics | `obmp-learn-08` | Exposes TE/SR fields from BGP-LS: link bandwidth, admin groups, SRLG, SR SIDs, adjacency SIDs, protection types |
| Topology Change & Anomaly Detection | `obmp-learn-09` | Heuristic analysis: link state changes over time, origin AS hijack detection, convergence timeline, route consistency |
| Link Utilization & TE Thought Experiment | `obmp-learn-10` | BGP-LS capacity data (bandwidth, TE metrics) + integration guide for streaming telemetry (gNMI/MDT) |
> **TE/SR data note:** Some TE fields (admin_group, max_link_bw, srlg, sr_adjacency_sids) may be NULL if routers don't advertise those TLVs. Enable `mpls traffic-eng` under IS-IS and `segment-routing mpls` for full data.
### Database Schema Reference
A standalone database schema reference is also available at `DB_SCHEMA.md` in the repo root. It documents all 33 tables, 11 views, TE/SR columns, enum types, and common query patterns.
---
## 10. Sanity Checks
@ -810,3 +901,215 @@ Adjust in `docker-compose.yml` under the `psql-app` service environment block.
| Variable | Default | Description |
|----------|---------|-------------|
| `EXABGP_API` | `http://localhost:5050` | ExaBGP API base URL |
---
## 15. gNMI Streaming Telemetry (Phase 4)
### Overview
gNMI (gRPC Network Management Interface) adds **data-plane visibility** alongside BMP's control-plane monitoring. Telegraf collects real-time interface counters from all 9 IOS-XR routers via gNMI subscriptions and stores them in InfluxDB. Grafana queries InfluxDB for telemetry dashboards.
### Architecture
```
IOS-XR Routers (9x, gRPC port 57400)
|
gNMI subscriptions (10s sample)
|
v
obmp-telegraf (telegraf:1.28 + gnmi input plugin)
host networking → reaches routers on 10.100.0.x
|
v
obmp-influxdb (influxdb:2.7, port 8086)
bucket: "telemetry", org: "openbmp"
|
v
obmp-grafana (InfluxDB datasource, Flux queries)
3 dashboards in OBMP-Telemetry folder
```
### Enabling gRPC on Routers
The routers need gRPC enabled before Telegraf can collect telemetry. A NETCONF script is provided:
```bash
# From the host (requires ncclient: pip install ncclient)
cd /home/user/obmp-docker/gnmi
python3 gnmi_grpc_config.py
```
This connects to all 9 routers via NETCONF (port 830, credentials webui/cisco) and pushes:
```
grpc
port 57400
no-tls
```
**Verify on router:**
```
show grpc status
```
Expected: gRPC listening on port 57400.
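The same check can be approximated from the host before starting Telegraf. The sketch below only tests that the TCP port accepts a connection; it does not speak gRPC or gNMI, so a success means "something is listening", not "telemetry works". The function name and the two example addresses are assumptions for illustration; substitute your own router management IPs.

```python
import socket


def grpc_port_open(host, port=57400, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Example addresses only; replace with the routers in your lab.
    for router in ("10.100.0.1", "10.100.0.100"):
        state = "open" if grpc_port_open(router) else "closed/unreachable"
        print(f"{router}:57400 {state}")
```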
### Telemetry Data Collected
Telegraf subscribes to two IOS-XR YANG paths at 10-second intervals:
| Subscription | YANG Path | Data |
|-------------|-----------|------|
| interface_counters | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters` | bytes/packets in/out, errors, drops, CRC |
| interface_rates | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/data-rate` | bits/sec in/out, packet rate |
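The `generic-counters` path reports cumulative counters, so a rate has to be derived from two samples. A minimal sketch of that arithmetic follows, assuming a 10-second sample interval (as configured above) and a 64-bit byte counter; the function name is hypothetical, and a counter that was cleared between samples would be misread as a wrap.

```python
def bits_per_second(prev_bytes, curr_bytes, interval_s=10.0, counter_bits=64):
    """Convert two cumulative byte-counter samples into an average bit rate."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_bytes - prev_bytes
    if delta < 0:
        # Assumption: the counter wrapped exactly once between the samples.
        delta += 2 ** counter_bits
    return delta * 8 / interval_s


if __name__ == "__main__":
    # 5,120,000 bytes in 10 s is 4,096,000 bit/s.
    print(bits_per_second(1_000_000, 6_120_000))
```

In Grafana the equivalent is usually done in the Flux query itself (for example with `derivative()`), so this is mainly useful for sanity-checking raw values.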
### InfluxDB Access
- **URL:** `http://localhost:8086`
- **Org:** `openbmp`
- **Bucket:** `telemetry`
- **Token:** `openbmp-telemetry-token`
- **Retention:** 30 days
### Grafana Telemetry Dashboards
Three dashboards in the **OBMP-Telemetry** folder:
| Dashboard | UID | Description |
|-----------|-----|-------------|
| Interface Utilization | obmp-telem-01 | Input/output bytes rate, packets rate, top interfaces by throughput |
| Interface Errors | obmp-telem-02 | CRC errors, input/output errors, drops, overruns |
| Combined BMP + Telemetry | obmp-telem-03 | Mixed datasource — BGP peer status (PostgreSQL) alongside interface counters (InfluxDB) |
All dashboards have `$router` and `$interface` template variables for filtering.
### Troubleshooting gNMI
```bash
# Check Telegraf logs for gNMI connection status
docker logs obmp-telegraf --tail 50
# Verify InfluxDB has data
curl -s -X POST "http://localhost:8086/api/v2/query?org=openbmp" \
  -H "Authorization: Token openbmp-telemetry-token" \
  -H "Content-Type: application/vnd.flux" \
  --data 'from(bucket:"telemetry") |> range(start: -5m) |> limit(n:5)'
# Check InfluxDB health
curl http://localhost:8086/health
```
---
## 16. Traffic Generator (Phase 4)
### Overview
A portable, containerized traffic generator with a web UI for RFC 2544 testing and custom packet flows. Built with Scapy + Flask (backend) and Vue 3 + NGINX (frontend). The container supports **dual-mode operation**: sender (generate traffic) or responder (receive/echo packets).
### Accessing the UI
- **Web UI:** `http://localhost:5002`
- **API:** `http://localhost:5051`
### Dual-Mode Operation
Set via `TRAFFIC_GEN_MODE` environment variable in `docker-compose.yml`:
| Mode | Description |
|------|-------------|
| `sender` (default) | Generates traffic, runs RFC 2544 tests, sends custom flows |
| `responder` | Listens for incoming test packets, echoes/timestamps them, reports receive stats |
**Typical deployment:** One instance as `sender` on the host, optionally a second instance as `responder` on another endpoint. Without a responder, the sender uses ICMP echo for latency measurement (routers respond natively).
### Creating Flows
Use the **Flow Builder** panel (left sidebar) in the UI:
| Field | Default | Description |
|-------|---------|-------------|
| Name | - | Human-readable flow name |
| Destination IP | `10.100.0.100` | Target router IP |
| Source IP | `10.40.40.202` | Host IP |
| Protocol | UDP | UDP, TCP, or ICMP |
| Source Port | 50000 | (UDP/TCP only) |
| Destination Port | 5001 | (UDP/TCP only) |
| Frame Size | 512 | Packet size in bytes |
| Rate (pps) | 1000 | Packets per second |
| Duration | 30 | Seconds (0 = infinite) |
| DSCP | 0 | Differentiated Services Code Point |
After creating a flow, use the **Flows** tab to Start/Stop/Delete flows.
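To get a feel for what a flow will put on the wire, the offered load follows directly from frame size and rate. The sketch below assumes that "Frame Size" is the full frame length in bytes and ignores preamble and inter-frame gap; the function name is illustrative, not part of the tool.

```python
def offered_load(frame_size_bytes, rate_pps, duration_s):
    """Return (bits_per_second, total_packets) for a constant-rate flow.

    duration_s == 0 means "run until stopped", so total_packets is None.
    """
    bps = frame_size_bytes * 8 * rate_pps
    total_packets = rate_pps * duration_s if duration_s > 0 else None
    return bps, total_packets


if __name__ == "__main__":
    # Defaults from the table: 512-byte frames, 1000 pps, 30 s.
    bps, packets = offered_load(512, 1000, 30)
    print(f"{bps / 1e6:.3f} Mbit/s, {packets} packets")
```

With the defaults this works out to about 4.1 Mbit/s and 30,000 packets, which is a useful baseline when reading the interface dashboards.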
### RFC 2544 Testing
Use the **Tests** tab to configure and run RFC 2544 tests:
| Test Type | Description |
|-----------|-------------|
| **Throughput** | Binary search for maximum zero-loss forwarding rate |
| **Latency** | Measure round-trip time at determined throughput rate |
| **Frame Loss** | Loss percentage vs. offered load curve |
| **Back-to-Back** | Maximum burst length at line rate with zero loss |
**Parameters:**
- **Base Flow:** Select a previously created flow as the test template
- **Frame Sizes:** Standard sizes: 64, 128, 256, 512, 1024, 1280, 1518 bytes
- **Trial Duration:** Per-frame-size test duration (5–300 sec)
- **Max Rate (pps):** Upper bound for binary search
- **Acceptable Loss %:** Threshold for pass/fail
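The throughput test's binary search can be sketched as follows. This is not the traffic generator's actual implementation; it is a minimal illustration that assumes loss does not decrease as the offered rate increases, and it replaces the real timed trial with a simulated device (`simulated_loss_pct`, a hypothetical stand-in with a 7300 pps capacity).

```python
def find_max_rate(measure_loss_pct, max_rate_pps, acceptable_loss_pct=0.0):
    """Binary-search the highest rate (pps) whose loss is within the threshold.

    `measure_loss_pct(rate)` runs one trial and returns the loss percentage.
    Returns 0 if even 1 pps exceeds the threshold.
    """
    low, high = 1, max_rate_pps
    best = 0
    while low <= high:
        rate = (low + high) // 2
        if measure_loss_pct(rate) <= acceptable_loss_pct:
            best = rate       # this rate passes; try higher
            low = rate + 1
        else:
            high = rate - 1   # this rate fails; try lower
    return best


def simulated_loss_pct(rate_pps, capacity_pps=7300):
    """Stand-in for a real trial: everything above capacity is dropped."""
    if rate_pps <= capacity_pps:
        return 0.0
    return (rate_pps - capacity_pps) / rate_pps * 100


if __name__ == "__main__":
    print(find_max_rate(simulated_loss_pct, max_rate_pps=10_000))
```

Each probe costs one full trial duration, so the number of iterations (about log2 of the max rate) multiplied by the number of frame sizes gives a rough estimate of total test time.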
### Quick Presets
Six built-in presets are available in the **Tests** tab:
| Preset | Description |
|--------|-------------|
| quick_icmp | ICMP ping to CORE-01 at 10 pps |
| udp_flood_small | 64-byte UDP at 5000 pps |
| udp_flood_large | 1518-byte UDP at 1000 pps |
| rfc2544_throughput | Full throughput test with standard frame sizes |
| rfc2544_latency | Latency measurement with standard frame sizes |
| tcp_session | TCP flow at 500 pps |
### API Reference
| Method | Path | Description |
|--------|------|-------------|
| GET | `/healthz` | Health check + engine status |
| GET | `/interfaces` | Available network interfaces |
| GET | `/mode` | Current mode (sender/responder) |
| GET/POST | `/flows` | List / create flows |
| GET/PUT/DELETE | `/flows/<id>` | Get / update / delete flow |
| POST | `/flows/<id>/start` | Start sending |
| POST | `/flows/<id>/stop` | Stop sending |
| GET | `/flows/<id>/stats` | Real-time stats for a flow |
| GET/POST | `/tests` | List / create RFC 2544 tests |
| GET | `/tests/<id>` | Test details + results |
| POST | `/tests/<id>/start` | Start test execution |
| POST | `/tests/<id>/stop` | Abort test |
| GET | `/tests/<id>/results` | Exportable results |
| GET | `/presets` | Available test presets |
| POST | `/presets/<name>` | Create flow + test from preset |
| GET | `/stats/history` | Stats ring buffer (300 samples) |
| GET | `/responder/stats` | Responder-mode receive stats |
| POST | `/responder/reset` | Reset responder counters |
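A small standard-library client for these endpoints might look like the sketch below. The paths come from the table above, but the assumption that the API accepts and returns JSON, and every field name in the example payload, are hypothetical; check the backend source for the real schema.

```python
import json
import urllib.request

API_BASE = "http://localhost:5051"


def build_request(method, path, payload=None, base=API_BASE):
    """Build (but do not send) a request for the traffic generator API."""
    data = None
    headers = {}
    if payload is not None:
        # Assumption: request bodies are JSON.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base + path, data=data,
                                  headers=headers, method=method)


def call(method, path, payload=None, timeout=5.0):
    """Send the request and decode a JSON response body, if any."""
    req = build_request(method, path, payload)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read()
    return json.loads(body) if body else None


if __name__ == "__main__":
    print(call("GET", "/healthz"))
    # Field names below are placeholders, not the documented schema.
    print(call("POST", "/flows", {"name": "demo-udp", "protocol": "UDP"}))
```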
### Integration with gNMI Telemetry
The key value of combining the traffic generator with gNMI: **send traffic while watching real-time interface counters**.
1. Create a UDP flow targeting a router (e.g., R9K-01 at 10.100.0.1)
2. Open the Grafana **Interface Utilization** dashboard, select that router
3. Start the flow — gNMI counters show traffic appearing on the interface
4. Run an RFC 2544 throughput test — Grafana shows the stepped traffic pattern from binary search iterations
5. Compare Scapy-reported stats with gNMI-reported counters for cross-validation
The **Combined BMP + Telemetry** dashboard shows both control-plane (BMP BGP updates) and data-plane (gNMI interface counters) side by side, enabling correlation of BGP changes with traffic impact.
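Step 5 above can be sketched as a simple tolerance check. The interface also carries control-plane traffic (BGP, IS-IS), and the 10-second gNMI sample window rarely lines up exactly with the flow, so an exact match is not expected. The function name and the 2% default tolerance are assumptions for illustration.

```python
def counters_agree(sent_packets, counter_before, counter_after, tolerance_pct=2.0):
    """Compare the sender's packet count with an interface counter delta.

    Returns (within_tolerance, deviation_pct).
    """
    if sent_packets <= 0:
        raise ValueError("sent_packets must be positive")
    observed = counter_after - counter_before
    deviation_pct = abs(observed - sent_packets) * 100 / sent_packets
    return deviation_pct <= tolerance_pct, deviation_pct


if __name__ == "__main__":
    # Sender reports 30,000 packets; the router counter moved by 30,450.
    print(counters_agree(30_000, 1_200_000, 1_230_450))
```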
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `TRAFFIC_GEN_API_PORT` | `5051` | Flask API listen port |
| `TRAFFIC_GEN_MODE` | `sender` | Operating mode: `sender` or `responder` |
| `INFLUXDB_TOKEN` | `openbmp-telemetry-token` | InfluxDB auth token (Telegraf) |


@ -35,6 +35,15 @@ Each docker file contains a readme file, see below:
## Using Docker Compose to run everything
> **Quick start (recommended):** copy `.env.example` to `.env`, fill it in, and
> run `./setup.sh` — it creates the data directories, syncs Grafana
> provisioning, and generates Authelia secrets. Then:
> ```
> docker compose up -d # BMP collector core
> docker compose --profile test --profile auth up -d # full stack
> ```
> See [DOCS.md](DOCS.md) section 4 for details and the manual alternative below.
### Install Docker Compose
You will need docker-compose. You can install that via [Docker Compose](https://docs.docker.com/compose/install/)
instructions. Docker compose will run everything, including handling restarts of containers.


@ -0,0 +1,51 @@
---
# Authelia configuration template.
# setup.sh renders this to ${OBMP_DATA_ROOT}/authelia/configuration.yml,
# substituting the ${...} values from .env. Only rendered if the target
# file does not already exist — an existing deployment is never overwritten.
theme: dark
server:
address: 'tcp://0.0.0.0:9091/authelia'
endpoints:
authz:
forward-auth:
implementation: ForwardAuth
log:
level: info
totp:
issuer: openbmp
authentication_backend:
file:
path: /config/users_database.yml
password:
algorithm: bcrypt
iterations: 12
session:
name: authelia_session
secret: ${AUTHELIA_SESSION_SECRET}
expiration: 12h
inactivity: 6h
cookies:
- domain: ${OBMP_COOKIE_DOMAIN}
authelia_url: https://${OBMP_DOMAIN}/authelia
identity_validation:
reset_password:
jwt_secret: ${AUTHELIA_JWT_SECRET}
storage:
local:
path: /config/db.sqlite3
encryption_key: ${AUTHELIA_STORAGE_ENCRYPTION_KEY}
access_control:
default_policy: one_factor
notifier:
filesystem:
filename: /config/notification.txt


@ -0,0 +1,15 @@
---
# Authelia user database template.
# setup.sh copies this to ${OBMP_DATA_ROOT}/authelia/users_database.yml only
# if that file does not already exist. The bcrypt hash below is the default
# demo account (username: openbmp). Change it after first login, or generate
# a new hash with:
# docker run --rm authelia/authelia:4.38 \
# authelia crypto hash generate bcrypt --password '<new-password>'
users:
openbmp:
displayname: "OpenBMP Demo"
password: "$2b$12$KQiQo1bYWqadD51HlgfgO.M1JfVlA5qP2YVRoBMTPmWq6osPljUTW"
email: demo@apodacalab.com
groups:
- admins

53
cml/build-cml-image.sh Executable file

@ -0,0 +1,53 @@
#!/bin/bash
# Build the ExaBGP Docker image and export it for CML 2.9 import.
#
# Usage:
# ./cml/build-cml-image.sh
#
# Output:
# /tmp/obmp-exabgp.tar — upload this to CML via:
# Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads
#
# After upload, also import the node + image definition YAMLs:
# Tools > Node and Image Definitions > Import > cml/exabgp-node-definition.yaml
# Tools > Node and Image Definitions > Import > cml/exabgp-image-definition.yaml
set -e
cd "$(dirname "$0")/.."
echo "=== Building ExaBGP Docker image ==="
docker build -t obmp-exabgp:latest ./exabgp/
echo ""
echo "=== Exporting image to /tmp/obmp-exabgp.tar ==="
docker save -o /tmp/obmp-exabgp.tar obmp-exabgp:latest
echo ""
echo "=== Image details ==="
SIZE=$(du -h /tmp/obmp-exabgp.tar | cut -f1)
echo " File: /tmp/obmp-exabgp.tar ($SIZE)"
SHA=$(sha256sum /tmp/obmp-exabgp.tar | awk '{print $1}')
echo " SHA256: $SHA"
IMAGE_ID=$(docker image inspect obmp-exabgp:latest --format='{{.Id}}')
echo " Image ID: $IMAGE_ID"
echo ""
echo "=== Next steps ==="
echo "1. Update cml/exabgp-image-definition.yaml with:"
echo " sha256: $SHA"
echo ""
echo "2. Upload to CML:"
echo " a. Tools > Node and Image Definitions > Import"
echo " Upload: cml/exabgp-node-definition.yaml"
echo " b. Tools > Node and Image Definitions > Import"
echo " Upload: cml/exabgp-image-definition.yaml"
echo " c. Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads"
echo " Upload: /tmp/obmp-exabgp.tar"
echo ""
echo "3. In your CML lab topology:"
echo " a. Drag 'ExaBGP Route Injector' from the node palette"
echo " b. Draw links to CORE-01 and CORE-02"
echo " c. Edit the boot.sh in the node config to set correct IPs"
echo " d. Start the node"

62
cml/build-xrd-image.sh Executable file

@ -0,0 +1,62 @@
#!/bin/bash
# Export the XRd control-plane Docker image for CML 2.9 import.
#
# Usage:
# ./cml/build-xrd-image.sh
#
# The XRd image already exists locally (ios-xr/xrd-control-plane:25.1.1).
# This script just exports it to a .tar file for CML upload.
set -e
IMAGE="ios-xr/xrd-control-plane:25.1.1"
OUTPUT="/tmp/xrd-control-plane.tar"
echo "=== Verifying XRd image exists ==="
if ! docker image inspect "$IMAGE" >/dev/null 2>&1; then
echo "ERROR: Image $IMAGE not found locally."
echo "Check with: docker images | grep xrd"
exit 1
fi
echo " Image: $IMAGE"
SIZE=$(docker image inspect "$IMAGE" --format='{{.Size}}' | numfmt --to=iec 2>/dev/null || echo "unknown")
echo " Size: $SIZE"
echo ""
echo "=== Exporting image to $OUTPUT ==="
echo " (This may take a minute for ~1.3GB image...)"
docker save -o "$OUTPUT" "$IMAGE"
echo ""
echo "=== Export complete ==="
TAR_SIZE=$(du -h "$OUTPUT" | cut -f1)
echo " File: $OUTPUT ($TAR_SIZE)"
SHA=$(sha256sum "$OUTPUT" | awk '{print $1}')
echo " SHA256: $SHA"
echo ""
echo "=== Next steps ==="
echo "1. Update cml/xrd-image-definition.yaml with:"
echo " sha256: $SHA"
echo ""
echo "2. Upload to CML:"
echo " a. Tools > Node and Image Definitions > Import"
echo " Upload: cml/xrd-node-definition.yaml"
echo " b. Tools > Node and Image Definitions > Import"
echo " Upload: cml/xrd-image-definition.yaml"
echo " c. Tools > Node and Image Definitions > Image Definitions > Manage Image Uploads"
echo " Upload: $OUTPUT"
echo " (For large files, consider SCP to CML server instead)"
echo ""
echo "3. In your CML lab topology:"
echo " a. Drag 'XRd Control-Plane (IOS-XR)' from the node palette"
echo " b. Draw links to CORE-01 (→Gi0/0/0/0) and CORE-02 (→Gi0/0/0/1)"
echo " c. Edit xrd-startup.cfg if needed (IPs, BMP target, etc.)"
echo " d. Start the node (allow ~3-5 min for XRd boot)"
echo ""
echo "4. After boot, verify via XRd console:"
echo " show isis adjacency"
echo " show bgp summary"
echo " show bmp server 1"


@ -0,0 +1,10 @@
id: obmp-exabgp.latest
node_definition_id: obmp-exabgp
description: |-
OpenBMP ExaBGP Route Injector
Python 3.11 + ExaBGP + Flask API for BGP route injection testing.
label: ExaBGP Route Injector
disk_image: obmp-exabgp.tar
read_only: false
schema_version: 0.0.1
# sha256: <UPDATE after running: sha256sum /tmp/obmp-exabgp.tar>


@ -0,0 +1,112 @@
id: obmp-exabgp
boot:
timeout: 60
completed:
- "ExaBGP Route Injector"
uses_regex: false
sim:
linux_native:
libvirt_domain_driver: docker
driver: ubuntu
ram: 512
cpus: 1
cpu_limit: 100
video:
memory: 1
general:
nature: server
description: OpenBMP ExaBGP Route Injector (Docker container)
read_only: false
configuration:
generator:
driver: null
provisioning:
files:
- editable: false
name: config.json
content: |-
{
"docker": {
"image": "obmp-exabgp:latest",
"mounts": [
"type=bind,source=cfg/boot.sh,target=/cml-boot.sh"
],
"misc_args": [],
"env": [
"EXABGP_LOCAL_AS=65100",
"EXABGP_PEER_AS=65020",
"EXABGP_API_PORT=5050"
]
},
"shell": "/bin/bash",
"day0cmd": [ "/bin/bash", "/cml-boot.sh" ],
"busybox": false
}
- editable: true
name: boot.sh
content: |-
#!/bin/bash
# CML boot script for ExaBGP container
# Configures data-plane interfaces before starting ExaBGP
#
# Interface mapping (assigned by CML topology links):
# eth0 = first connected interface (data-plane link 1)
# eth1 = second connected interface (data-plane link 2)
# ...additional interfaces as connected in topology
#
# Edit the IPs below to match your topology addressing.
# These are examples using 10.120.x.x/30 point-to-point links.
# --- Data-plane interface configuration ---
# Link to CORE-01: ExaBGP=10.120.1.2/30, CORE-01=10.120.1.1/30
ip address add 10.120.1.2/30 dev eth0
ip link set dev eth0 up
# Link to CORE-02: ExaBGP=10.120.2.2/30, CORE-02=10.120.2.1/30
ip address add 10.120.2.2/30 dev eth1
ip link set dev eth1 up
# --- Set environment for ExaBGP peering ---
export EXABGP_LOCAL_IP=10.120.1.2
export EXABGP_PEER_1=10.120.1.1
export EXABGP_PEER_2=10.120.2.1
# --- Start ExaBGP ---
exec /bin/bash /exabgp/startup.sh
media_type: raw
volume_name: cfg
device:
interfaces:
serial_ports: 1
physical:
- eth0
- eth1
- eth2
- eth3
has_loopback_zero: false
default_count: 2
ui:
label_prefix: exabgp-
icon: server
label: ExaBGP Route Injector
visible: true
group: Others
description: |-
OpenBMP ExaBGP Route Injector
BGP route injection for OpenBMP testing.
AS 65100 (eBGP) peering with IOS-XR routers (AS 65020).
Flask API on port 5050 for route management.
inherited:
image:
ram: true
cpus: false
data_volume: false
boot_disk_size: false
cpu_limit: false
node:
ram: true
cpus: false
data_volume: false
boot_disk_size: false
cpu_limit: false
schema_version: 0.0.1

172
cml/proxmox_bmp_config.py Normal file

@ -0,0 +1,172 @@
#!/usr/bin/env python3
"""Apply the OpenBMP `bmp server 1` config to the Proxmox CML lab routers.
IOS-XR BMP configuration is not exposed via the device's NETCONF YANG schema
on this release, so this applies config over the SSH CLI. It is idempotent:
re-applying an identical block commits no changes.
PROX-R9K-03 was built without `bmp-activate` on its BGP neighbor-group; this
script adds it (the other 8 routers already have it from the re-addressing).
Usage:
pip install paramiko
python3 cml/proxmox_bmp_config.py # all 9 routers
python3 cml/proxmox_bmp_config.py r9k-05 # one router (smoke test)
Verify afterwards in OpenBMP:
docker exec -i obmp-psql psql -U openbmp -d openbmp \\
-c "SELECT name, ip_address, bgp_id, isconnected FROM routers ORDER BY name;"
"""
import sys
import time
import paramiko
# --- BMP collector ---------------------------------------------------------
COLLECTOR_HOST = "10.40.40.202"
COLLECTOR_PORT = "5000"
# `bmp server 1` block — flat formal form, identical to the ESXi lab.
# Each line is self-contained and applied at the (config)# prompt; a bare
# "bmp server 1" is deliberately omitted (it would drop into the bmp submode
# and the remaining flat lines would then be invalid).
BMP_LINES = [
f"bmp server 1 host {COLLECTOR_HOST} port {COLLECTOR_PORT}",
"bmp server 1 description OpenBMP-Collector",
"bmp server 1 update-source MgmtEth0/RP0/CPU0/0",
"bmp server 1 initial-delay 60",
"bmp server 1 stats-reporting-period 300",
"bmp server 1 initial-refresh delay 60 spread 30",
]
# Only PROX-R9K-03 needs this — its BMP-MONITORED neighbor-group was built
# without bmp-activate. AS 65021 is the Proxmox lab.
BMP_ACTIVATE_LINE = "router bgp 65021 neighbor-group BMP-MONITORED bmp-activate server 1"
# --- router inventory ------------------------------------------------------
# (name, mgmt_ip, user, password, needs_bmp_activate)
ROUTERS = [
("PROX-R9K-CORE-01", "10.100.1.100", "admin", "cisco", False),
("PROX-R9K-CORE-02", "10.100.1.200", "admin", "cisco", False),
("PROX-R9K-01", "10.100.1.1", "webui", "cisco", False),
("PROX-R9K-02", "10.100.1.2", "webui", "cisco", False),
("PROX-R9K-03", "10.100.1.3", "webui", "cisco", True),
("PROX-R9K-04", "10.100.1.4", "webui", "cisco", False),
("PROX-R9K-05", "10.100.1.5", "webui", "cisco", False),
("PROX-R9K-06", "10.100.1.6", "webui", "cisco", False),
("PROX-R9K-07", "10.100.1.7", "admin", "cisco", False),
]


def _drain(shell, settle=1.0, limit=15.0, until=None):
    """Read from the shell.

    If `until` is given, keep reading until that substring appears (or `limit`
    elapses). Otherwise return once output stops arriving for `settle` seconds.
    """
    out = ""
    start = time.time()
    while time.time() - start < limit:
        time.sleep(settle)
        if shell.recv_ready():
            out += shell.recv(65535).decode(errors="replace")
            if until and until in out:
                break
        elif until is None:
            break
        elif until in out:
            break
    return out


def apply_router(name, ip, user, pwd, needs_activate):
    """Apply the BMP config to one router. Returns True on success."""
    print(f"\n=== {name} ({ip}) ===")
    lines = list(BMP_LINES)
    if needs_activate:
        lines.append(BMP_ACTIVATE_LINE)

    try:
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(ip, username=user, password=pwd, timeout=15,
                    look_for_keys=False, allow_agent=False)
        shell = ssh.invoke_shell(width=220, height=1000)
        time.sleep(2)
        shell.recv(65535)  # banner

        # "(config)#" is the universal IOS-XR config-prompt suffix — used as
        # the wait marker so the device hostname is irrelevant.
        CFG = "(config)#"
        shell.send("terminal length 0\n")
        _drain(shell, 0.5, 5)

        # Enter config mode. IOS-XR may print an active-session banner first,
        # so wait specifically for the (config) prompt.
        shell.send("configure terminal\n")
        out = _drain(shell, 0.4, 15, until=CFG)
        if CFG not in out:
            print(f" FAIL: could not enter config mode\n {out[-200:]}")
            ssh.close()
            return False

        # Send config lines, paced.
        for line in lines:
            shell.send(line + "\n")
            time.sleep(0.4)
            _drain(shell, 0.3, 8, until=CFG)

        # Confirm the candidate actually holds changes before committing.
        # Match on "bmp" rather than "bmp server" so that a candidate holding
        # only the neighbor-group `bmp-activate server 1` line still counts.
        shell.send("show configuration\n")
        cand = _drain(shell, 0.3, 10, until=CFG)
        if "bmp" not in cand:
            print(" OK: no changes (config already present) — nothing to commit")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return True

        shell.send("commit\n")
        result = _drain(shell, 0.3, 25, until=CFG)
        if "fail" in result.lower() or "error" in result.lower():
            print(f" FAIL: commit error\n {result[-300:]}")
            shell.send("abort\n")
            _drain(shell, 0.5, 5)
            ssh.close()
            return False

        # Leave config mode and fully drain (settle-based, no marker) so the
        # verify output is clean — not contaminated by echoed config lines.
        shell.send("end\n")
        _drain(shell, 1.0, 10)
        shell.send("show run formal bmp\n")
        verify = _drain(shell, 1.0, 12)
        ok = f"host {COLLECTOR_HOST} port {COLLECTOR_PORT}" in verify
        print(f" {'OK' if ok else 'FAIL'}: bmp server 1 "
              f"{'present' if ok else 'NOT found'} in running config")
        ssh.close()
        return ok
    except Exception as e:
        print(f" FAIL: {e}")
        return False


def main():
    target = sys.argv[1].lower() if len(sys.argv) > 1 else None
    results = {}
    for name, ip, user, pwd, needs_activate in ROUTERS:
        if target and target not in name.lower():
            continue
        results[name] = apply_router(name, ip, user, pwd, needs_activate)

    print(f"\n{'='*48}\n SUMMARY")
    for name, ok in results.items():
        print(f" {name:22s} {'OK' if ok else 'FAILED'}")
    sys.exit(0 if all(results.values()) else 1)


if __name__ == "__main__":
    main()


@ -0,0 +1,10 @@
id: xrd-control-plane.25.1.1
node_definition_id: xrd-control-plane-rr
description: |-
Cisco XRd Control-Plane 25.1.1
IOS-XR containerized routing daemon for BGP/IS-IS/BMP workloads.
label: XRd Control-Plane 25.1.1
disk_image: xrd-control-plane.tar
read_only: false
schema_version: 0.0.1
# sha256: <UPDATE after running: sha256sum /tmp/xrd-control-plane.tar>


@ -0,0 +1,179 @@
id: xrd-control-plane-rr
boot:
timeout: 300
completed:
- "IOS XR RUN"
uses_regex: false
sim:
linux_native:
libvirt_domain_driver: docker
driver: ubuntu
ram: 2048
cpus: 2
cpu_limit: 100
video:
memory: 1
general:
nature: router
description: Cisco XRd Control-Plane - IOS-XR containerized routing daemon
read_only: false
configuration:
generator:
driver: null
provisioning:
files:
- editable: false
name: config.json
content: |-
{
"docker": {
"image": "ios-xr/xrd-control-plane:25.1.1",
"mounts": [
"type=bind,source=cfg/boot.sh,target=/cml-boot.sh",
"type=bind,source=cfg/xrd-startup.cfg,target=/etc/xrd/startup.cfg"
],
"misc_args": [
"--privileged"
],
"env": [
"XR_STARTUP_CFG=/etc/xrd/startup.cfg",
"XR_MGMT_INTERFACES=linux:eth0,chksum",
"XR_INTERFACES=linux:eth1,xr_name=Gi0/0/0/0;linux:eth2,xr_name=Gi0/0/0/1;linux:eth3,xr_name=Gi0/0/0/2;linux:eth4,xr_name=Gi0/0/0/3"
]
},
"shell": "/bin/bash",
"day0cmd": [ "/bin/bash", "/cml-boot.sh" ],
"busybox": false
}
- editable: true
name: boot.sh
content: |-
#!/bin/bash
# CML boot wrapper for XRd control-plane.
# XRd handles its own init — this script configures
# data-plane interfaces before XRd starts.
#
# Interface mapping (set via XR_INTERFACES env var):
# eth0 = MgmtEth0/RP0/CPU0/0 (CML mgmt)
# eth1 = Gi0/0/0/0 (data-plane link 1, e.g. to CORE-01)
# eth2 = Gi0/0/0/1 (data-plane link 2, e.g. to CORE-02)
# eth3+ = Gi0/0/0/2+ (additional links)
#
# Linux-level IP config is handled by XRd via startup.cfg.
# Just ensure interfaces are up.
for iface in eth0 eth1 eth2 eth3 eth4; do
[ -d /sys/class/net/$iface ] && ip link set dev $iface up
done
# XRd entrypoint
exec /usr/sbin/xrd
- editable: true
name: xrd-startup.cfg
content: |-
!! XRd Control-Plane - Third Route Reflector (RR3)
!! Peers with CORE-01 and CORE-02 as RR mesh (non-client iBGP)
!! Sends BMP to OpenBMP collector at 10.40.40.202:5000
!!
hostname XRd-RR3
!
interface Loopback0
ipv4 address 10.10.255.30 255.255.255.255
!
interface Gi0/0/0/0
description to-CORE-01
ipv4 address 10.120.3.2 255.255.255.252
no shutdown
!
interface Gi0/0/0/1
description to-CORE-02
ipv4 address 10.120.4.2 255.255.255.252
no shutdown
!
router isis 1
is-type level-2-only
net 49.0001.0100.1000.0030.00
address-family ipv4 unicast
metric-style wide
!
interface Loopback0
passive
address-family ipv4 unicast
!
!
interface Gi0/0/0/0
point-to-point
address-family ipv4 unicast
!
!
interface Gi0/0/0/1
point-to-point
address-family ipv4 unicast
!
!
!
router bgp 65020
bgp router-id 10.10.255.30
address-family ipv4 unicast
!
neighbor 10.10.255.0
remote-as 65020
update-source Loopback0
address-family ipv4 unicast
!
!
neighbor 10.10.255.20
remote-as 65020
update-source Loopback0
address-family ipv4 unicast
!
!
!
bmp server 1
host 10.40.40.202 port 5000
description OpenBMP
update-source Gi0/0/0/0
flapping-delay 60
initial-delay 5
stats-reporting-period 300
initial-refresh delay 30 spread 2
!
ssh server v2
end
media_type: raw
volume_name: cfg
device:
interfaces:
serial_ports: 1
physical:
- eth0
- eth1
- eth2
- eth3
- eth4
has_loopback_zero: false
default_count: 3
ui:
label_prefix: xrd-
icon: router
label: XRd Control-Plane (IOS-XR)
visible: true
group: Cisco
description: |-
Cisco XRd Control-Plane (IOS-XR 25.1.1)
Containerized IOS-XR routing daemon for control-plane workloads.
Full BGP, IS-IS, BMP, NETCONF support.
Configured as third Route Reflector (RR3) with BMP to OpenBMP.
inherited:
image:
ram: true
cpus: true
data_volume: false
boot_disk_size: false
cpu_limit: false
node:
ram: true
cpus: true
data_volume: false
boot_disk_size: false
cpu_limit: false
schema_version: 0.0.1

View File

@ -1,5 +1,5 @@
---
version: '3'
name: obmp
volumes:
data-volume:
driver_opts:
@ -17,7 +17,14 @@ services:
zookeeper:
restart: unless-stopped
container_name: obmp-zookeeper
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/2181'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
image: confluentinc/cp-zookeeper:7.1.1
mem_limit: 1g
volumes:
- ${OBMP_DATA_ROOT}/zk-data:/var/lib/zookeeper/data
- ${OBMP_DATA_ROOT}/zk-log:/var/lib/zookeeper/log
@ -28,7 +35,15 @@ services:
kafka:
restart: unless-stopped
container_name: obmp-kafka
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/9092'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 90s
image: confluentinc/cp-kafka:7.1.1
# Raise KAFKA_MEM_LIMIT for production (full-table initial dumps are bursty).
mem_limit: ${KAFKA_MEM_LIMIT:-4g}
# Change the mount point to where you want to store Kafka data.
# Normally 80GB or more
@ -45,7 +60,7 @@ services:
# Change/add listeners based on your FQDN that the host and other containers can access. You can use
# an IP address as well. By default, only within the compose/containers can Kafka be accessed
# using port 29092. Outside access can be enabled, but you should use an FQDN listener.
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://10.40.40.202:9092
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://${HOST_IP:-10.40.40.202}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
@ -84,7 +99,14 @@ services:
grafana:
restart: unless-stopped
container_name: obmp-grafana
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:3000/api/health || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
image: grafana/grafana:9.1.7
mem_limit: 1g
ports:
- "3000:3000"
volumes:
@ -92,7 +114,13 @@ services:
- ${OBMP_DATA_ROOT}/grafana/provisioning:/etc/grafana/provisioning/
environment:
- GF_SECURITY_ADMIN_PASSWORD=openbmp
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ENABLED=false
- GF_SERVER_ROOT_URL=https://${OBMP_DOMAIN:-bmp.apodacalab.com}/grafana/
- GF_SERVER_SERVE_FROM_SUB_PATH=true
- GF_AUTH_PROXY_ENABLED=true
- GF_AUTH_PROXY_HEADER_NAME=Remote-User
- GF_AUTH_PROXY_HEADER_PROPERTY=username
- GF_AUTH_PROXY_AUTO_SIGN_UP=true
- GF_USERS_HOME_PAGE=d/obmp-home/obmp-home
- GF_INSTALL_PLUGINS=agenty-flowcharting-panel,grafana-piechart-panel,grafana-worldmap-panel,grafana-simple-json-datasource,vonage-status-panel
@ -118,7 +146,15 @@ services:
psql:
restart: unless-stopped
container_name: obmp-psql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U openbmp -d openbmp"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
image: openbmp/postgres:2.2.1
# Raise PSQL_MEM_LIMIT for production (see docs/production-sizing.md).
mem_limit: ${PSQL_MEM_LIMIT:-6g}
privileged: true
shm_size: 1536m
sysctls:
@ -141,7 +177,14 @@ services:
collector:
restart: unless-stopped
container_name: obmp-collector
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5000'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
image: openbmp/collector:2.2.3
mem_limit: 2g
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -156,7 +199,12 @@ services:
psql-app:
restart: unless-stopped
container_name: obmp-psql-app
# No healthcheck — the consumer exposes no health port; Docker's
# restart-on-exit covers process death.
image: openbmp/psql-app:2.2.2
# mem_limit must exceed the MEM (JVM heap) env below. Raise both for
# production — see docs/production-sizing.md.
mem_limit: ${PSQL_APP_MEM_LIMIT:-4g}
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -200,22 +248,31 @@ services:
exabgp:
restart: unless-stopped
container_name: obmp-exabgp
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5050'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
profiles: ["test"]
# The full-table feature generates up to 900K route objects in memory;
# 512m OOM-killed it. Raise EXABGP_MEM_LIMIT in .env for larger tables.
mem_limit: ${EXABGP_MEM_LIMIT:-6g}
build:
context: ./exabgp
dockerfile: Dockerfile
# Host networking so ExaBGP can reach CML routers directly on port 179
network_mode: host
environment:
# IP on the host that CML routers can reach (matches Kafka external listener)
- EXABGP_LOCAL_IP=10.40.40.202
# ExaBGP presents as AS 65100 (eBGP peer to your AS 65020 lab)
- EXABGP_LOCAL_AS=65100
- EXABGP_PEER_AS=65020
# CORE routers to peer with — these propagate routes into the iBGP mesh
- EXABGP_PEER_1=10.100.0.100
- EXABGP_PEER_2=10.100.0.200
# IP on the host that CML routers reach (BGP peering source)
- EXABGP_LOCAL_IP=${HOST_IP:-10.40.40.202}
# ExaBGP presents as AS 65100 (eBGP peer to the lab route reflectors)
- EXABGP_LOCAL_AS=${EXABGP_LOCAL_AS:-65100}
# Peer list — ";"-separated entries of "ip:peer_as:description".
# Default covers both labs: AS 65020 (ESXi) and AS 65021 (Proxmox).
- EXABGP_PEERS=${EXABGP_PEERS:-10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02;10.100.1.100:65021:PROX-R9K-CORE-01;10.100.1.200:65021:PROX-R9K-CORE-02}
# Flask API port (also on host network)
- EXABGP_API_PORT=5050
- EXABGP_API_PORT=${EXABGP_API_PORT:-5050}
volumes:
# Mount scenarios dir so you can edit/add scenarios without rebuilding
- ./exabgp/scenarios:/exabgp/scenarios
@ -224,6 +281,14 @@ services:
exabgp-ui:
restart: unless-stopped
container_name: obmp-exabgp-ui
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:5001/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 256m
build:
context: ./exabgp-ui
dockerfile: Dockerfile
@ -231,10 +296,140 @@ services:
network_mode: host
# Serves on port 5001 (host network, defined in nginx.conf)
# --- Phase 4: gNMI Streaming Telemetry ---
influxdb:
restart: unless-stopped
container_name: obmp-influxdb
healthcheck:
test: ["CMD-SHELL", "curl -fsS http://localhost:8086/health || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
profiles: ["test"]
image: influxdb:2.7
mem_limit: 2g
ports:
- "8086:8086"
volumes:
- ${OBMP_DATA_ROOT}/influxdb:/var/lib/influxdb2
environment:
- DOCKER_INFLUXDB_INIT_MODE=setup
- DOCKER_INFLUXDB_INIT_USERNAME=openbmp
- DOCKER_INFLUXDB_INIT_PASSWORD=openbmp123
- DOCKER_INFLUXDB_INIT_ORG=openbmp
- DOCKER_INFLUXDB_INIT_BUCKET=telemetry
- DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=openbmp-telemetry-token
- DOCKER_INFLUXDB_INIT_RETENTION=30d
telegraf:
restart: unless-stopped
container_name: obmp-telegraf
profiles: ["test"]
mem_limit: 512m
build:
context: ./telegraf
dockerfile: Dockerfile
network_mode: host
# Run telegraf as root and override the image entrypoint (which otherwise
# drops back to the telegraf user) so [[inputs.docker]] can read the
# Docker daemon socket for container resource metrics.
user: root
entrypoint: ["telegraf"]
volumes:
- /var/run/docker.sock:/var/run/docker.sock
depends_on:
- influxdb
environment:
- INFLUXDB_TOKEN=openbmp-telemetry-token
# gNMI fleet — quoted, comma-separated host:port list. Default = the two
# ESXi CORE routers; extend via GNMI_ADDRESSES in .env for more routers.
- 'GNMI_ADDRESSES=${GNMI_ADDRESSES:-"10.100.0.100:57400", "10.100.0.200:57400"}'
- GNMI_USERNAME=${GNMI_USERNAME:-webui}
- GNMI_PASSWORD=${GNMI_PASSWORD:-cisco}
# --- Phase 4: Traffic Generator ---
traffic-gen:
restart: unless-stopped
container_name: obmp-traffic-gen
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5051'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 1g
build:
context: ./traffic-gen
dockerfile: Dockerfile
network_mode: host
cap_add:
- NET_RAW
- NET_ADMIN
environment:
- TRAFFIC_GEN_PORT=5051
- TRAFFIC_GEN_MODE=sender
- RESPONDER_URL=http://172.30.0.10:5053
traffic-gen-ui:
restart: unless-stopped
container_name: obmp-traffic-gen-ui
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:5002/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 256m
build:
context: ./traffic-gen-ui
dockerfile: Dockerfile
network_mode: host
# Serves on port 5002 (host network, defined in nginx.conf)
traffic-gen-responder:
restart: unless-stopped
container_name: obmp-traffic-gen-responder
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/5053'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
profiles: ["test"]
mem_limit: 1g
build:
context: ./traffic-gen
dockerfile: Dockerfile
cap_add:
- NET_RAW
- NET_ADMIN
environment:
- TRAFFIC_GEN_PORT=5053
- TRAFFIC_GEN_MODE=responder
- TRAFFIC_GEN_RESPONDER_MODE=echo
- TRAFFIC_GEN_INTERFACE=eth0
networks:
traffic-test-net:
ipv4_address: 172.30.0.10
ports:
- "5053:5053"
whois:
restart: unless-stopped
container_name: obmp-whois
healthcheck:
test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/localhost/43'"]
interval: 30s
timeout: 10s
retries: 3
start_period: 30s
image: openbmp/whois:2.2.0
mem_limit: 1g
sysctls:
- net.ipv4.tcp_keepalive_intvl=30
- net.ipv4.tcp_keepalive_probes=5
@ -249,3 +444,40 @@ services:
- POSTGRES_DB=openbmp
- POSTGRES_HOST=obmp-psql
- POSTGRES_PORT=5432
authelia:
restart: unless-stopped
container_name: obmp-authelia
profiles: ["auth"]
mem_limit: 256m
image: authelia/authelia:4.38
ports:
- "9091:9091"
volumes:
- ${OBMP_DATA_ROOT}/authelia:/config
environment:
- TZ=UTC
portal:
restart: unless-stopped
container_name: obmp-portal
healthcheck:
test: ["CMD-SHELL", "wget -q --spider http://localhost:80/ || exit 1"]
interval: 30s
timeout: 10s
retries: 3
start_period: 20s
profiles: ["auth"]
mem_limit: 128m
image: nginx:alpine
ports:
- "8080:80"
volumes:
- ./portal:/usr/share/nginx/html:ro
networks:
traffic-test-net:
driver: bridge
ipam:
config:
- subnet: 172.30.0.0/24

docs/ROADMAP.md Normal file

@ -0,0 +1,269 @@
# OpenBMP Platform Roadmap
## Context
This BMP monitoring platform is being developed against CML virtual labs (IOS-XR) and will be deployed into an ISP production network running IOS-XR and Juniper routers/route reflectors. The two tracks share a common foundation: configuration must be environment-agnostic so the same stack runs identically against virtual or production routers.
Currently, router IPs, AS numbers, and credentials are hardcoded across 8+ files, tightly coupling the stack to a single CML lab. This roadmap addresses both the multi-lab development workflow and production deployment.
---
## Track A: Configuration Centralization (Foundation for Both Tracks)
### A1. Create `inventory.yaml` — unified topology inventory
**File**: `inventory.yaml` (new)
Single source of truth for all environments. Structure:
```yaml
platform:
  host_ip: 10.40.40.202
  bmp_port: 5000
  exabgp_port: 5050
environments:
  cml-lab1:
    type: cml # cml | production
    description: "CML RR cluster - 9 IOS-XR virtual routers"
    cml_server: "https://10.40.40.174"
    cml_user: webui
    bgp_as: 65020
    netconf: { user: webui, password: cisco, port: 830 }
    exabgp:
      local_as: 65100
      peers:
        - { ip: 10.100.0.100, name: CORE-01, peer_as: 65020 }
        - { ip: 10.100.0.200, name: CORE-02, peer_as: 65020 }
    routers:
      CORE-01: { mgmt: 10.100.0.100, loopback: 10.10.255.0, role: rr, vendor: iosxr, gnmi: true }
      CORE-02: { mgmt: 10.100.0.200, loopback: 10.10.255.20, role: rr, vendor: iosxr, gnmi: true }
      R9K-01: { mgmt: 10.100.0.1, loopback: 10.10.255.1, role: client, vendor: iosxr }
      # ...
  cml-lab2:
    type: cml
    description: "Second CML Lab (TBD topology)"
    cml_server: "https://<lab2-ip>"
    routers: {}
  production:
    type: production
    description: "ISP production network"
    bgp_as: <prod-as>
    netconf: { user: <prod-user>, port: 830 }
    routers:
      # IOS-XR and Juniper RRs + routers
      PROD-RR1: { mgmt: x.x.x.x, role: rr, vendor: iosxr, gnmi: true }
      PROD-RR2: { mgmt: x.x.x.x, role: rr, vendor: junos }
      # ...
```
Key design decisions:
- `vendor: iosxr | junos` — drives NETCONF dialect, gNMI paths, and config templates
- `type: cml | production` — CML environments have `cml_server` for API automation; production does not
- Credentials in `inventory.yaml` (gitignored) or pulled from env vars
### A2. Create `config_loader.py` — Python inventory helper
**File**: `config_loader.py` (new)
Functions: `get_env(name)`, `get_all_routers()`, `get_routers_by_vendor(vendor)`, `get_exabgp_peers()`, `get_gnmi_targets()`, `get_routers_for_env(env_name)`
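A minimal sketch of what these helpers could look like, assuming the inventory layout from A1. Unlike the signatures listed above, each helper here takes the parsed inventory as an explicit first argument so it can be exercised without a file on disk. `load_inventory` assumes PyYAML is installed, and the default gNMI port of 57400 is an assumption for illustration, not something the inventory defines.

```python
"""Sketch of config_loader.py helpers over an already-parsed inventory dict."""
from typing import Any, Dict, List

Inventory = Dict[str, Any]


def load_inventory(path: str = "inventory.yaml") -> Inventory:
    """Parse the inventory file. Assumes PyYAML is available."""
    import yaml  # third-party; imported lazily so the helpers below work without it

    with open(path, "r", encoding="utf-8") as fh:
        return yaml.safe_load(fh)


def get_env(inv: Inventory, name: str) -> Dict[str, Any]:
    """Return one environment block; raises KeyError for an unknown name."""
    return inv["environments"][name]


def get_routers_for_env(inv: Inventory, env_name: str) -> Dict[str, Dict[str, Any]]:
    """Return the routers mapping of one environment (empty if none defined)."""
    return get_env(inv, env_name).get("routers") or {}


def get_all_routers(inv: Inventory) -> List[Dict[str, Any]]:
    """Flatten every environment's routers, tagging each with name and env."""
    routers = []
    for env_name, env in inv["environments"].items():
        for name, attrs in (env.get("routers") or {}).items():
            routers.append({"name": name, "env": env_name, **attrs})
    return routers


def get_routers_by_vendor(inv: Inventory, vendor: str) -> List[Dict[str, Any]]:
    """Return routers whose 'vendor' field matches (e.g. 'iosxr' or 'junos')."""
    return [r for r in get_all_routers(inv) if r.get("vendor") == vendor]


def get_exabgp_peers(inv: Inventory, env_name: str) -> List[Dict[str, Any]]:
    """Return the ExaBGP peer list of one environment (empty if none defined)."""
    return get_env(inv, env_name).get("exabgp", {}).get("peers", [])


def get_gnmi_targets(inv: Inventory, port: int = 57400) -> List[str]:
    """Return 'mgmt:port' strings for routers flagged gnmi: true."""
    return [f"{r['mgmt']}:{port}" for r in get_all_routers(inv) if r.get("gnmi")]
```

Keeping the helpers as pure functions over a dict means the scripts in A3 can call `load_inventory()` once and pass the result around, and unit tests can use a small literal dict instead of a real inventory file.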
### A3. Refactor hardcoded Python scripts
Replace `ROUTERS` dicts/lists with `config_loader` calls:
- `exabgp/route_diversity_config.py` (line 47)
- `exabgp/bgpls_config.py` (line 35)
- `gnmi/gnmi_grpc_config.py` (line 25)
### A4. Expand `.env` and parameterize `docker-compose.yml`
Add to `.env`:
```env
OBMP_DATA_ROOT=/var/openbmp
DOCKER_HOST_IP=10.40.40.202
EXABGP_LOCAL_IP=10.40.40.202
EXABGP_LOCAL_AS=65100
EXABGP_PEER_AS=65020
EXABGP_PEER_1=10.100.0.100
EXABGP_PEER_2=10.100.0.200
```
Replace hardcoded IPs in `docker-compose.yml` (Kafka listener, ExaBGP env vars).
### A5. Telegraf config parameterization
Replace hardcoded gNMI addresses in `telegraf/telegraf.conf` with env var substitution. Pass `GNMI_TARGETS` from docker-compose.yml.
### A6. Fix InfluxDB datasource URL
`obmp-grafana/provisioning/datasources/influxdb-ds.yml`: replace `http://10.40.40.202:8086` with `http://obmp-influxdb:8086`.
---
## Track B: Multi-Lab CML Development
### B1. Dynamic ExaBGP multi-peer support
**File**: `exabgp/startup.sh`
Accept `EXABGP_PEERS` env var (comma-separated `ip:as:description`), generate N neighbor blocks. Keep `PEER_1`/`PEER_2` fallback.
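The parsing and block generation can be sketched as follows. This is an illustration in Python rather than the actual `startup.sh` logic; the names `Peer`, `parse_peers`, and `render_neighbor` are hypothetical. It assumes IPv4 peer addresses (an IPv6 address would collide with the `:` field separator) and, as a convenience, accepts either `,` or `;` between entries. The rendered stanza uses ExaBGP's neighbor-block syntax with a plausible set of statements, not the project's real template.

```python
import re
from typing import List, NamedTuple


class Peer(NamedTuple):
    ip: str
    peer_as: int
    description: str


def parse_peers(raw: str) -> List[Peer]:
    """Parse 'ip:as:description' entries separated by ',' or ';'."""
    peers = []
    for entry in re.split(r"[;,]", raw):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing separators and blank entries
        parts = entry.split(":", 2)
        if len(parts) < 2:
            raise ValueError(f"expected ip:as[:description], got {entry!r}")
        description = parts[2] if len(parts) == 3 else ""
        peers.append(Peer(parts[0], int(parts[1]), description))
    return peers


def render_neighbor(peer: Peer, local_ip: str, local_as: int) -> str:
    """Render one ExaBGP neighbor stanza (illustrative statement set)."""
    return (
        f"neighbor {peer.ip} {{\n"
        f'    description "{peer.description}";\n'
        f"    router-id {local_ip};\n"
        f"    local-address {local_ip};\n"
        f"    local-as {local_as};\n"
        f"    peer-as {peer.peer_as};\n"
        f"    family {{\n"
        f"        ipv4 unicast;\n"
        f"        ipv6 unicast;\n"
        f"    }}\n"
        f"}}\n"
    )


def render_config(raw_peers: str, local_ip: str, local_as: int) -> str:
    """Generate N neighbor blocks, one per parsed peer."""
    return "\n".join(render_neighbor(p, local_ip, local_as) for p in parse_peers(raw_peers))
```

The `PEER_1`/`PEER_2` fallback could then be a matter of building the same string from the legacy variables when `EXABGP_PEERS` is unset, so only one code path renders the configuration.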
### B2. CML API client module
**File**: `cml/cml_client.py` (new)
Python module using `virl2_client` SDK:
- Connect to CML server (creds from `inventory.yaml`)
- Upload node/image definitions
- Import/export topology YAML
- Start/stop/destroy labs
- Get node status
### B3. Topology template system
**File**: `cml/templates/xrd_rr.j2` (new)
Jinja2 templates for XRd startup config. Parameterize: hostname, loopback, link IPs, IS-IS NET, BGP AS, neighbor IPs, BMP target.
### B4. CLI deployment tool
**File**: `cml/deploy.py` (new)
```bash
python3 cml/deploy.py --env cml-lab1 status
python3 cml/deploy.py --env cml-lab1 upload-images
python3 cml/deploy.py --env cml-lab2 create
python3 cml/deploy.py --env cml-lab2 start
python3 cml/deploy.py --env cml-lab2 destroy
```
### B5. Update build scripts with API push
`cml/build-cml-image.sh` and `cml/build-xrd-image.sh` get `--push <env-name>` flag.
---
## Track C: Production ISP Deployment
### C1. Multi-vendor NETCONF support
Current scripts assume IOS-XR NETCONF only. For Juniper RRs:
- `config_loader.py` provides `vendor` field per router
- NETCONF scripts branch on vendor for dialect differences (`device_params='iosxr'` vs `device_params='junos'`)
- Route diversity, BGP-LS config scripts get Junos templates alongside IOS-XR
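One way the vendor branch might look, assuming the scripts use `ncclient` and the inventory fields from A1. The helper name `netconf_connect_kwargs` is hypothetical. Note that `ncclient` expects `device_params` as a dict with a `name` key, so the shorthand above maps to `{"name": "iosxr"}` and `{"name": "junos"}`. The sketch only builds the keyword arguments and does not import `ncclient`, so it runs without a router.

```python
from typing import Any, Dict

# Inventory vendor value -> ncclient device handler selection.
_DEVICE_PARAMS = {
    "iosxr": {"name": "iosxr"},
    "junos": {"name": "junos"},
}


def netconf_connect_kwargs(router: Dict[str, Any], netconf: Dict[str, Any]) -> Dict[str, Any]:
    """Build keyword arguments for an ncclient connection to one router.

    `router` is one entry from the inventory's routers mapping and `netconf`
    is the environment's netconf block (user, optional password, port).
    """
    vendor = router.get("vendor")
    if vendor not in _DEVICE_PARAMS:
        raise ValueError(f"unsupported NETCONF vendor: {vendor!r}")
    kwargs = {
        "host": router["mgmt"],
        "port": netconf.get("port", 830),
        "username": netconf["user"],
        "device_params": _DEVICE_PARAMS[vendor],
    }
    if netconf.get("password") is not None:
        kwargs["password"] = netconf["password"]
    return kwargs
```

A script would then call something like `manager.connect(**netconf_connect_kwargs(router, env["netconf"]))` with `from ncclient import manager`, and select the Junos or IOS-XR config template from the same `vendor` value.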
### C2. Multi-vendor gNMI paths
Telegraf gNMI subscriptions currently use OpenConfig paths which work for both IOS-XR and Junos, but:
- Verify Juniper gNMI support on target hardware
- Add vendor-specific path overrides in `inventory.yaml` if needed
- Telegraf can subscribe to multiple targets with different configs via `[[inputs.gnmi]]` blocks
### C3. BMP considerations for production
- BMP collector (port 5000) accepts connections from any router — no changes needed
- Production routers need BMP config pushed (manual or via NETCONF automation)
- Consider: separate BMP server IDs per environment for dashboard filtering
- Juniper BMP config differs from IOS-XR — add Junos BMP config templates
### C4. Dashboard multi-environment awareness
- Add a Grafana template variable for environment filtering (by router name prefix or a tag)
- Consider a "Network Overview" dashboard that shows all environments side-by-side
- Existing dashboards work as-is — router dropdowns will show all BMP-reporting routers
### C5. Security hardening for production
- Move credentials out of `inventory.yaml` into environment variables or a secrets manager
- Authelia config: stronger passwords, TOTP enforcement, session timeouts
- PostgreSQL: restrict access, enable SSL
- Kafka: consider authentication if exposed beyond localhost
- BMP port: firewall to only accept connections from known router management IPs
### C6. Scalability considerations
- Monitor PostgreSQL disk usage and query performance with production-scale RIBs
- TimescaleDB compression policies for historical data (ip_rib_log, ls_*_log)
- Kafka topic partitioning if message throughput is high
- Consider read replicas or materialized views for heavy Grafana queries
---
## Track D: Packaging & Distribution
### D1. Configuration templates
- `inventory.yaml.example` — documented example with placeholder values
- `.env.example` — all environment variables with descriptions
### D2. Bootstrap script
`setup.sh` that:
- Creates required directories (`$OBMP_DATA_ROOT/authelia`, etc.)
- Copies example configs if originals don't exist
- Validates inventory.yaml syntax
- Generates Telegraf config from inventory
### D3. Published Docker images
Push custom images to a registry (Docker Hub or GHCR):
- `obmp-exabgp`
- `obmp-exabgp-ui`
- `obmp-traffic-gen`
- `obmp-traffic-gen-ui`
- `obmp-portal`
Replace `build:` with `image:` in docker-compose.yml (keep build as override).
### D4. Documentation
- `docs/quickstart.md` — 5-minute setup guide
- `docs/adding-a-lab.md` — how to add a CML lab environment
- `docs/production-deployment.md` — production hardening checklist
- `docs/architecture.md` — system diagram, data flow, port map
---
## Implementation Order
| Priority | Step | Track | Description |
|----------|------|-------|-------------|
| 1 | A1 | Foundation | Create `inventory.yaml` |
| 2 | A2 | Foundation | Create `config_loader.py` |
| 3 | A3 | Foundation | Refactor hardcoded Python scripts |
| 4 | A4 | Foundation | Parameterize `.env` + docker-compose |
| 5 | A5-A6 | Foundation | Telegraf + InfluxDB datasource fixes |
| 6 | B1 | CML Dev | Dynamic ExaBGP multi-peer |
| 7 | B2-B4 | CML Dev | CML API client + deploy CLI |
| 8 | C1 | Production | Multi-vendor NETCONF (Junos support) |
| 9 | C3 | Production | Junos BMP config templates |
| 10 | C5 | Production | Security hardening |
| 11 | D1-D2 | Packaging | Config templates + bootstrap script |
| 12 | D3 | Packaging | Publish Docker images to registry |
| 13 | D4 | Packaging | Documentation |
Steps 1-5 (Track A) unblock everything else. Steps 6-7 and 8-10 can proceed in parallel once the foundation is in place.
---
## Verification
1. **Config centralization**: Change a router IP in `inventory.yaml`, verify all scripts pick it up
2. **ExaBGP multi-peer**: Set 3+ peers, restart, verify BGP sessions establish
3. **CML API**: `deploy.py --env cml-lab1 status` connects and lists nodes
4. **BMP multi-source**: Router from lab 2 sends BMP, appears in `SELECT * FROM routers` and Grafana
5. **Junos support**: NETCONF script connects to a Juniper router, pushes config
6. **Production dry-run**: Point a test router from the ISP network at the collector, verify end-to-end
7. **Clean deploy**: Clone repo on a fresh host, run `setup.sh`, `docker compose up`, confirm stack starts
---
## Risks
- **Router name collisions**: Enforce unique hostnames across all environments
- **Address space overlap**: Each environment needs distinct management subnets
- **Juniper BMP differences**: Junos BMP implementation may differ in supported tables/TLVs — test early
- **Production scale**: 500K-route labs are slow; production full tables will stress PostgreSQL more
- **Credentials in inventory**: Must be gitignored; consider env var fallback for CI/CD
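The first two risks lend themselves to an automated check at inventory load time. The sketch below is one possible shape, assuming the A1 inventory layout; the function names are hypothetical. It treats hostnames as case-insensitive (an assumption) and, because the inventory records management addresses rather than subnets, it flags duplicate management IPs instead of true subnet overlap. Placeholder values such as `x.x.x.x` are skipped.

```python
import ipaddress
from collections import defaultdict
from typing import Any, Dict, List


def find_name_collisions(inv: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each router hostname used in more than one environment to those environments."""
    seen = defaultdict(list)
    for env_name, env in inv.get("environments", {}).items():
        for router_name in (env.get("routers") or {}):
            seen[router_name.lower()].append(env_name)
    return {name: envs for name, envs in seen.items() if len(envs) > 1}


def find_mgmt_collisions(inv: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each management IP used by more than one router to its 'env/router' owners."""
    seen = defaultdict(list)
    for env_name, env in inv.get("environments", {}).items():
        for router_name, attrs in (env.get("routers") or {}).items():
            try:
                addr = str(ipaddress.ip_address(str((attrs or {}).get("mgmt"))))
            except ValueError:
                continue  # missing or placeholder address, nothing to compare
            seen[addr].append(f"{env_name}/{router_name}")
    return {addr: owners for addr, owners in seen.items() if len(owners) > 1}
```

Running both checks in `setup.sh` or at the top of `config_loader.py` would turn a silent dashboard mix-up into an explicit startup error.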

docs/backup-restore.md Normal file

@ -0,0 +1,223 @@
# OpenBMP Backup & Restore
How to back up and restore the OpenBMP PostgreSQL database, what the backup
covers, and what it deliberately does not.
---
## What `scripts/pg-backup.sh` backs up
The script runs `pg_dump` inside the `obmp-psql` container and produces a
single timestamped, compressed, custom-format dump of the **entire `openbmp`
database**:
- All BMP/BGP operational tables — `routers`, `bgp_peers`, `ip_rib`,
`base_attrs`, `global_ip_rib`, `l3vpn_rib`, the `ls_*` link-state tables.
- All history / TimescaleDB hypertables — `ip_rib_log`, `peer_event_log`,
`stat_reports`, and the `stats_*` aggregate tables.
- Reference / enrichment data — `geo_ip`, `info_asn`, `info_route`,
`rpki_validator`, `pdb_exchange_peers`.
- Schema objects — table definitions, indexes, views, functions, triggers,
enum types, and the TimescaleDB hypertable configuration.
The dump is taken against a **live database**: `pg_dump` uses an MVCC
snapshot, so no downtime and no service stop is required. It is written
atomically (to a `.partial` file, renamed on success) so an interrupted run
never leaves a dump that looks valid but is truncated.
Output: `${OBMP_DATA_ROOT:-/var/openbmp}/backups/openbmp-YYYYMMDD-HHMMSS.dump`
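The write-then-rename pattern described above can be illustrated in a few lines. This is a Python sketch of the general technique, not the contents of `scripts/pg-backup.sh`; the function name `atomic_dump` is hypothetical. The command's output goes to a `.partial` file, which is renamed to its final name only if the command exits successfully.

```python
import os
import subprocess
from pathlib import Path
from typing import List


def atomic_dump(cmd: List[str], final_path: Path) -> Path:
    """Run cmd, capture stdout to '<final>.partial', rename on success only."""
    partial = final_path.with_name(final_path.name + ".partial")
    try:
        with open(partial, "wb") as out:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, stdout=out, check=True)
            out.flush()
            os.fsync(out.fileno())  # make sure the bytes are on disk before renaming
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(partial, final_path)
    except BaseException:
        partial.unlink(missing_ok=True)  # never leave a truncated file behind
        raise
    return final_path
```

For example, `atomic_dump(["docker", "exec", "obmp-psql", "pg_dump", "-U", "openbmp", "-Fc", "openbmp"], Path("/var/openbmp/backups/openbmp-test.dump"))` would produce a custom-format dump that only appears under its final name once `pg_dump` has finished cleanly.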
### TimescaleDB note
The OpenBMP database uses TimescaleDB hypertables (`ip_rib_log`,
`peer_event_log`, the `stats_*` tables, with compression policies).
**A `pg_dump` logical backup restores hypertables correctly** — the dump
captures the `_timescaledb_catalog` metadata, and on restore the hypertable
structure, chunks, and compression settings are recreated. No special flags
are needed for the dump. The only requirement is that the **restore target
has the TimescaleDB extension available** — which the `openbmp/postgres`
image provides, so restoring into a fresh `obmp-psql` works out of the box.
---
## Scheduling
Make the script executable once:
```bash
chmod +x scripts/pg-backup.sh
```
Add a cron entry (`crontab -e`) — daily at 02:30, logging to a file:
```cron
30 2 * * * OBMP_DATA_ROOT=/var/openbmp /home/user/obmp-docker/scripts/pg-backup.sh >> /var/openbmp/backups/pg-backup.log 2>&1
```
The cron user must be able to reach the Docker daemon — run it as a user in
the `docker` group, or as root. A systemd timer is an equally valid
alternative.
### Configuration
All settings are environment variables with sensible defaults:
| Variable | Default | Purpose |
|----------|---------|---------|
| `OBMP_DATA_ROOT` | `/var/openbmp` | Base data dir; backups go to `${OBMP_DATA_ROOT}/backups` |
| `OBMP_BACKUP_DIR` | (unset) | Explicit backup dir, overrides the default |
| `OBMP_PG_CONTAINER` | `obmp-psql` | Postgres container name |
| `OBMP_PG_DB` | `openbmp` | Database name |
| `OBMP_PG_USER` | `openbmp` | Database user |
| `OBMP_BACKUP_RETENTION_DAYS` | `14` | Dumps older than this are pruned each run |
Retention only prunes files matching the script's own `openbmp-*.dump`
naming pattern — nothing else in the directory is touched.
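The retention rule can be sketched as below. Again this is an illustration of the behavior described, not the script itself; `prune_old_dumps` is a hypothetical name, and using file modification time as the age is an assumption. Only files matching `openbmp-*.dump` are considered, so logs and off-pattern files in the same directory are left alone.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_old_dumps(backup_dir: Path, retention_days: int = 14,
                    now: Optional[float] = None) -> List[Path]:
    """Delete openbmp-*.dump files older than retention_days; return what was removed."""
    reference = time.time() if now is None else now
    cutoff = reference - retention_days * 86400  # seconds per day
    removed = []
    for dump in sorted(backup_dir.glob("openbmp-*.dump")):
        if dump.is_file() and dump.stat().st_mtime < cutoff:
            dump.unlink()
            removed.append(dump)
    return removed
```

The optional `now` argument exists only to make the cutoff testable without waiting for real time to pass.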
### Production recommendations
- **Copy dumps off-host.** A local backup does not survive host loss. Sync
the backup directory to object storage / a backup server (e.g. nightly
`rclone`, `restic`, or your existing ISP backup tooling).
- **Size the backup volume.** At production scale (~100–150M NLRIs) the
dump can be tens of GB even compressed. See `docs/production-sizing.md`.
- **Test restores periodically** — an untested backup is not a backup.
- For tighter RPO than once-daily logical dumps, consider PostgreSQL
continuous archiving / PITR (WAL archiving + `pg_basebackup`). That is out
of scope for this script but worth planning for a production deployment.
---
## Restore procedure
This restores a dump into a **fresh, empty** `obmp-psql` database. Restoring
over a populated database risks conflicts — start clean.
### 1. Stop the writers
Stop the services that write to the database so nothing races the restore:
```bash
docker compose -p obmp stop psql-app collector
```
Leave `obmp-psql` running.
### 2. Recreate an empty database
Drop and recreate the `openbmp` database inside the running container:
```bash
docker exec -i obmp-psql psql -U openbmp -d postgres <<'EOSQL'
DROP DATABASE IF EXISTS openbmp;
CREATE DATABASE openbmp OWNER openbmp;
EOSQL
```
> Restoring into a **brand-new container**? Bring `obmp-psql` up first and let
> it initialize, but **do not** create the `config/init_db` trigger file —
> the schema comes from the dump, not from psql-app's first-run migration.
### 3. Restore the dump
Copy the dump into the container and run `pg_restore`:
```bash
DUMP=/var/openbmp/backups/openbmp-YYYYMMDD-HHMMSS.dump
docker cp "${DUMP}" obmp-psql:/tmp/restore.dump
docker exec -i obmp-psql \
pg_restore -U openbmp -d openbmp --no-owner --no-privileges \
--jobs=4 /tmp/restore.dump
docker exec obmp-psql rm -f /tmp/restore.dump
```
- `--no-owner --no-privileges` — the dump was created with the same flags;
objects are recreated owned by the connecting role.
- `--jobs=4` — parallel restore; raise it on a many-core host to speed up the
large `ip_rib` / `ip_rib_log` tables. Custom-format dumps support this.
- Some non-fatal warnings (e.g. about the TimescaleDB extension or existing
objects) are normal. A non-zero exit with only warnings is usually fine —
inspect the output before assuming failure.
Alternatively, stream the restore without `docker cp`:
```bash
docker exec -i obmp-psql pg_restore -U openbmp -d openbmp \
--no-owner --no-privileges < "${DUMP}"
```
(Streaming via stdin disables `--jobs` parallelism — use `docker cp` for
large dumps.)
### 4. Verify
```bash
docker exec -i obmp-psql psql -U openbmp -d openbmp -c "
SELECT (SELECT count(*) FROM routers) AS routers,
(SELECT count(*) FROM bgp_peers) AS peers,
(SELECT count(*) FROM ip_rib) AS rib_rows;"
```
Confirm hypertables came back:
```bash
docker exec -i obmp-psql psql -U openbmp -d openbmp -c "
SELECT hypertable_name FROM timescaledb_information.hypertables;"
```
### 5. Restart the writers
```bash
docker compose -p obmp start collector psql-app
```
The collector reconnects to the routers' BMP sessions and psql-app resumes
consuming from Kafka. Live state catches up from the routers.
---
## What is NOT covered
This backup is **PostgreSQL only**. The following are out of scope and need
their own handling:
- **Kafka data is transient.** The `obmp-kafka` topics are a short-retention
pipeline buffer (`KAFKA_LOG_RETENTION_MINUTES: 720` — 12 hours). They are
not a system of record and do not need backing up. After a restore, routers
re-send BMP and the pipeline refills naturally.
- **InfluxDB telemetry has its own backup.** The gNMI streaming-telemetry
data lives in `obmp-influxdb` (bucket `telemetry`), not in PostgreSQL.
`pg_dump` does not touch it. Back it up separately with the Influx CLI:
```bash
# Backup
docker exec obmp-influxdb influx backup /var/lib/influxdb2/backup \
--token "$INFLUXDB_ADMIN_TOKEN"
docker cp obmp-influxdb:/var/lib/influxdb2/backup \
/var/openbmp/backups/influxdb-$(date +%Y%m%d)
# Restore
docker cp /var/openbmp/backups/influxdb-YYYYMMDD \
obmp-influxdb:/var/lib/influxdb2/restore
docker exec obmp-influxdb influx restore /var/lib/influxdb2/restore \
--token "$INFLUXDB_ADMIN_TOKEN"
```
Telemetry is also less critical than BMP data (30-day retention,
data-plane counters) — back it up if you need historical telemetry to
survive a host loss; otherwise the 30-day window simply re-fills.
- **Grafana** — dashboards and datasources are provisioned from files in the
repo (`obmp-grafana/provisioning/` and `obmp-grafana/dashboards/`), so they
are already version-controlled in git. The Grafana database under
`${OBMP_DATA_ROOT}/grafana` (users, preferences, manually-created
dashboards, alert state) is *not* covered by this script — back up that
directory separately if it holds anything not reproducible from the repo.
- **Configuration & secrets**: `.env`, `docker-compose.yml`, and the
`${OBMP_DATA_ROOT}/config` directory. Keep these in version control /
your secrets manager.

docs/production-sizing.md Normal file

@ -0,0 +1,96 @@
# OpenBMP Production Sizing — 40 Full-Table-Edge Routers
Sizing guidance for deploying the OpenBMP stack against a production ISP
network of **40 full-table-edge routers** with gNMI streaming telemetry.
Derived from the OpenBMP `psql-app` sizing guidance and measured lab behavior.
## Workload assumptions
| Parameter | Value |
|-----------|-------|
| Monitored routers | 40, full-table edge |
| BMP RIB scope | Adj-RIB-In (see recommendation below) |
| Full feeds per router | ~2–3 eBGP peers carrying the full DFZ |
| Routes per full feed | ~1.2M (≈1M IPv4 + ~0.2M IPv6) |
| **Estimated total NLRIs** | **~100–150M** in Adj-RIB-In |
| Telemetry | gNMI via Telegraf → InfluxDB, ~50–200 interfaces/router, 10 s interval |
| History retention | `ip_rib_log` 4 weeks, LS logs 4 months, `peer_event_log` 1 year |
The NLRI estimate (40 × ~2.5 feeds × 1.2M) places this deployment at the top
of the OpenBMP `psql-app` guidance tier (150M NLRIs → 64 GB heap).
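The arithmetic behind the estimate, worked through for the low, middle, and high feed counts. The constants come from the assumptions table; the function name is only for illustration.

```python
ROUTERS = 40
ROUTES_PER_FEED = 1_200_000  # ~1M IPv4 + ~0.2M IPv6


def total_nlris(feeds_per_router: float) -> float:
    """Adj-RIB-In NLRI count: routers x full feeds per router x routes per feed."""
    return ROUTERS * feeds_per_router * ROUTES_PER_FEED


for feeds in (2, 2.5, 3):
    print(f"{feeds} feeds/router -> {total_nlris(feeds) / 1e6:.0f}M NLRIs")
# 2 feeds/router -> 96M NLRIs
# 2.5 feeds/router -> 120M NLRIs
# 3 feeds/router -> 144M NLRIs
```

The 2 to 3 feed range therefore spans roughly 96M to 144M NLRIs, which is where the rounded 100–150M figure comes from.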
## BMP RIB scope — recommendation
**Deploy with Adj-RIB-In only.** It is the OpenBMP default, is what every
dashboard is built on, and captures the highest-value data — what each peer
advertises. Alternatives and their cost:
- **Loc-RIB** — adds a full post-best-path converged table per router
(~40 × 1.2M ≈ +48M NLRIs). Add later, selectively, only where best-path
analysis is needed; verify the IOS-XR release supports Loc-RIB BMP.
- **Adj-RIB-Out** — multiplies further (per advertised peer). Not recommended
for the initial deployment.
- **Post-policy Adj-RIB-In** — if inbound policy is restrictive this trims
volume meaningfully; with permissive import it is similar to pre-policy.
## Compute & memory
| Component | Lab today | Production target | Rationale |
|-----------|-----------|-------------------|-----------|
| **Total RAM** | 31 GB | **96–128 GB** | psql-app heap 48–64 GB + PostgreSQL shared_buffers/cache + Kafka 4–8 GB + InfluxDB + Grafana + collector |
| **CPU** | 8 cores | **16–32 vCPU** | PostgreSQL is CPU-bound under full-table churn; the lab psql already sustains ~287% (about 3 cores) at 18 routers |
| `psql-app` JVM heap (`MEM`) | 3 GB | **48–64 GB** | OpenBMP guidance: 4 GB ≈ 10M NLRIs, 64 GB ≈ 150M NLRIs |
| `psql-app` container `mem_limit` | 4 GB | **heap + ~8 GB** | Set `PSQL_APP_MEM_LIMIT` above the JVM heap |
| `psql` container `mem_limit` | 6 GB | **48–64 GB** | Set `PSQL_MEM_LIMIT`; PostgreSQL wants ~25% as `shared_buffers` and the rest for OS cache |
| `kafka` container `mem_limit` | 4 GB | **8–12 GB** | Set `KAFKA_MEM_LIMIT`; full-table initial dumps from 40 routers are bursty |
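As a rough cross-check of the heap row, the two guidance points (4 GB at 10M NLRIs, 64 GB at 150M NLRIs) can be joined with a straight line. Linear scaling between those points is an assumption made here for illustration; the guidance itself only gives the two endpoints. The ~8 GB container overhead is taken from the `mem_limit` row.

```python
LOW_NLRIS, LOW_HEAP_GB = 10e6, 4.0
HIGH_NLRIS, HIGH_HEAP_GB = 150e6, 64.0
CONTAINER_OVERHEAD_GB = 8.0  # "heap + ~8 GB" from the table above


def heap_gb(nlris: float) -> float:
    """Linearly interpolate JVM heap between the two guidance points (assumed linear)."""
    slope = (HIGH_HEAP_GB - LOW_HEAP_GB) / (HIGH_NLRIS - LOW_NLRIS)
    return LOW_HEAP_GB + (nlris - LOW_NLRIS) * slope


def mem_limit_gb(nlris: float) -> float:
    """Container mem_limit: interpolated heap plus fixed overhead."""
    return heap_gb(nlris) + CONTAINER_OVERHEAD_GB


for nlris in (100e6, 120e6, 150e6):
    print(f"{nlris / 1e6:.0f}M NLRIs: heap ~{heap_gb(nlris):.0f} GB, "
          f"mem_limit ~{mem_limit_gb(nlris):.0f} GB")
```

Under that assumption the 120M midpoint lands at about 51 GB of heap, inside the 48–64 GB target; measure real heap usage in production before fixing the final value.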
## Storage
| Store | Lab today | Production target | Notes |
|-------|-----------|-------------------|-------|
| **PostgreSQL** | 25 GB | **2–4 TB NVMe SSD** | `ip_rib` current state (~100–150M rows) + `ip_rib_log` history (4-week retention, the dominant grower) + `base_attrs` + `geo_ip` (~7 GB fixed). OpenBMP guidance: 500 GB main + 1 TB TimescaleDB; add headroom. |
| **Kafka** | 0.2 GB | **100–500 GB** | 12 h retention; sized for full-table initial-dump bursts × 40 routers |
| **InfluxDB (telemetry)** | minimal | **50–200 GB** | 40 routers × ~50–200 interfaces × 10 s gNMI × 30 d; compresses well |
| **Total** | n/a | **~3–5 TB fast NVMe** | Use NVMe; PostgreSQL random-IO under churn is the bottleneck on slow disks |
Put the PostgreSQL data directory and the TimescaleDB tablespace on NVMe.
`ip_rib_log` 4-week retention is the main storage tuning knob — revisit once
production update volume is measured.
## Architecture
A single host is viable only if large (**≥128 GB RAM, ≥32 vCPU, multi-TB
NVMe**). **Preferred: split services across hosts**
| Host | Services | Profile |
|------|----------|---------|
| **DB host** (heaviest) | postgres | — |
| **Pipeline host** | kafka, zookeeper, collector, psql-app | core |
| **Presentation host** | grafana, influxdb, telegraf, whois | core + telemetry |
Whichever layout: every service already carries a Compose `mem_limit` — raise
`PSQL_MEM_LIMIT` / `PSQL_APP_MEM_LIMIT` / `KAFKA_MEM_LIMIT` in `.env` for the
production hosts.
## PostgreSQL tuning
- `shared_buffers` ≈ 25% of host RAM; large `effective_cache_size`.
- Raise `work_mem` (dashboard aggregate queries) and `maintenance_work_mem`.
- `max_wal_size` already 10 GB — keep or raise for churn bursts.
- Enable parallel query (`max_parallel_workers_per_gather`).
- Aggressive autovacuum on churn tables (`ip_rib`, `base_attrs`, `ip_rib_log`)
— applied in the lab; persist these settings in production provisioning.
- TimescaleDB compression is already enabled on `ip_rib_log` and the `stats_*`
hypertables — keep it.
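The first bullet's ratio can be turned into concrete values for a given host. The sketch below is illustrative only: `pg_memory_settings` is a hypothetical helper, the 25% figure comes from the list above, and the 75% figure for `effective_cache_size` is a common rule of thumb assumed here, not something this document specifies.

```python
def pg_memory_settings(host_ram_gb: int) -> dict:
    """Derive starting values for two memory settings from host RAM."""
    shared_buffers_gb = host_ram_gb // 4            # ~25% of RAM (from the list above)
    effective_cache_size_gb = host_ram_gb * 3 // 4  # ~75% of RAM (assumed rule of thumb)
    return {
        "shared_buffers": f"{shared_buffers_gb}GB",
        "effective_cache_size": f"{effective_cache_size_gb}GB",
    }


settings = pg_memory_settings(64)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Treat the output as a starting point and adjust after measuring production query and churn behavior.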
## Reference bill of materials (single-host option)
| Resource | Spec |
|----------|------|
| CPU | 32 vCPU |
| RAM | 128 GB |
| Storage | 4 TB NVMe SSD |
| Network | 1 GbE+ to the routers' BMP source network |
For the split-host option, divide per the architecture table — the DB host
takes the bulk of RAM and all of the fast storage.

docs/security-hardening.md

@ -0,0 +1,488 @@
# OpenBMP Production Security Hardening
A prioritized checklist for hardening the OpenBMP Docker stack before exposing
it to a production ISP network of 40 full-table-edge routers. Work top to
bottom — items are ordered roughly by risk reduction per unit effort.
This document **recommends** changes. It does not modify `docker-compose.yml`
or any running service. Apply the changes in a maintenance window and test.
> Threat model in brief: the stack ingests BMP from production routers, stores
> the full DFZ in PostgreSQL, and exposes Grafana to operators. The crown
> jewels are (a) the database, (b) the Grafana admin plane, and (c) the BMP
> ingest port. Everything below protects one of those three.
---
## Priority 0 — Credentials (do this first)
Every service currently ships with the placeholder credential `openbmp` or a
closely related default, and all of them are committed in `docker-compose.yml`:
| Service | Setting | Current value |
|---------|---------|---------------|
| PostgreSQL | `POSTGRES_USER` / `POSTGRES_PASSWORD` | `openbmp` / `openbmp` |
| psql-app | `POSTGRES_PASSWORD` | `openbmp` |
| whois | `POSTGRES_PASSWORD` | `openbmp` |
| Grafana | `GF_SECURITY_ADMIN_PASSWORD` | `openbmp` |
| InfluxDB | `DOCKER_INFLUXDB_INIT_PASSWORD` | `openbmp123` |
| InfluxDB | `DOCKER_INFLUXDB_INIT_ADMIN_TOKEN` | `openbmp-telemetry-token` |
| Grafana datasource | `secureJsonData.password` | `openbmp` (in `openbmp-ds.yml`) |
### 0.1 Move every secret to `.env` (or a secrets manager)
`.env` is git-ignored. At a minimum, replace the hardcoded literals in
`docker-compose.yml` with `${VAR}` references and define them in `.env`:
```env
# .env — never commit this file
POSTGRES_PASSWORD=<long-random-string>
GF_SECURITY_ADMIN_PASSWORD=<long-random-string>
INFLUXDB_ADMIN_PASSWORD=<long-random-string>
INFLUXDB_ADMIN_TOKEN=<long-random-token>
```
```yaml
# docker-compose.yml (recommended edit — operator applies)
  grafana:
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GF_SECURITY_ADMIN_PASSWORD:?set in .env}
  psql:
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:?set in .env}
```
The `:?` form makes the stack fail fast if a secret is missing rather than
silently falling back to a default.
Generate strong values:
```bash
openssl rand -base64 32 # passwords
openssl rand -hex 32 # tokens
```
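Where `openssl` is not available, Python's standard `secrets` module produces values of the same strength. This is a minimal sketch of an alternative; the variable names are illustrative only.

```python
import secrets

# 32 random bytes, URL-safe base64 (no '+', '/' or '=' to quote in .env)
password = secrets.token_urlsafe(32)

# 32 random bytes as 64 hexadecimal characters
token = secrets.token_hex(32)

print(f"password: {password}")
print(f"token:    {token}")
```

The URL-safe alphabet avoids characters that sometimes need quoting in `.env` files or connection strings.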
### 0.2 For a real production deployment, use a secrets manager
`.env` on disk is better than committed literals, but it is still a
plaintext file readable by anyone in the `docker` group. For production:
- **Docker Compose secrets** (`secrets:` block, files mounted at
`/run/secrets/...`) — the lowest-friction upgrade; keep the secret files
outside the repo, `chmod 600`, owned by root.
- **HashiCorp Vault**, **AWS Secrets Manager**, **Bitwarden Secrets**, or your
existing ISP secret store — inject at deploy time via a wrapper that renders
`.env` from the vault and shreds it after `docker compose up`.
Whatever the choice: rotate every credential listed above on first production
deploy. They have been in git history as `openbmp` or similar defaults and
must be considered compromised.
### 0.3 Rotate the Grafana datasource password in lockstep
`obmp-grafana/provisioning/datasources/openbmp-ds.yml` carries
`secureJsonData.password`. It is read at Grafana start. When you change the
PostgreSQL password, update this file too (it supports `$__file{}` and
env-var expansion: `password: $POSTGRES_PASSWORD`) and restart Grafana.
---
## Priority 1 — Network exposure / firewalling
The host currently publishes these ports to `0.0.0.0`: 5000 (BMP), 5432
(PostgreSQL), 9092 (Kafka), 3000 (Grafana), 8086 (InfluxDB), 4300 (whois),
9091 (Authelia). Most should not be world-reachable.
### 1.1 BMP collector (port 5000) — restrict to router management subnets
The collector accepts a BMP session from any source. A rogue BMP feed can
inject bogus routers/peers/prefixes into the database. Firewall it to the
router management subnets only.
`nftables` example (preferred on modern hosts):
```nft
# /etc/nftables.conf — adjust subnets to your router management ranges
table inet obmp {
    chain input {
        type filter hook input priority 0; policy accept;
        # BMP ingest: only from router management subnets
        tcp dport 5000 ip saddr { 10.100.0.0/24, 10.100.1.0/24 } accept
        tcp dport 5000 drop
    }
}
```
`iptables` equivalent:
```bash
iptables -A INPUT -p tcp --dport 5000 -s 10.100.0.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -s 10.100.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 5000 -j DROP
```
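> Traffic to container-published ports is DNAT'd and forwarded rather than
> delivered to the host's `INPUT` chain, so `INPUT` rules do not filter it.
> Docker evaluates the `DOCKER-USER` chain first for such traffic; put the
> rules there so Docker does not bypass them: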
> ```bash
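> # Insert (-I) every rule: an appended (-A) DROP may land after an existing
> # catch-all RETURN in DOCKER-USER and never match. Insert the DROP first.
> iptables -I DOCKER-USER -p tcp --dport 5000 -j DROP
> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.1.0/24 -j RETURN
> iptables -I DOCKER-USER -p tcp --dport 5000 -s 10.100.0.0/24 -j RETURN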
> ```
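Before applying the firewall rules, it can help to check the routers' BMP source addresses against the intended allowlist. The sketch below uses the standard `ipaddress` module and the example subnets from the rules above; `bmp_source_allowed` is a hypothetical helper, not part of the stack.

```python
import ipaddress

# Example subnets from the firewall rules above; replace with real ranges.
ALLOWED_BMP_SOURCES = [
    ipaddress.ip_network("10.100.0.0/24"),
    ipaddress.ip_network("10.100.1.0/24"),
]


def bmp_source_allowed(src_ip: str) -> bool:
    """Return True if src_ip falls inside any allowed management subnet."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in ALLOWED_BMP_SOURCES)


for candidate in ("10.100.1.7", "10.100.2.1"):
    verdict = "allowed" if bmp_source_allowed(candidate) else "would be dropped"
    print(f"{candidate}: {verdict}")
```

Running every router's BMP source address through a check like this catches an allowlist that is too narrow before it cuts off a live feed.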
### 1.2 PostgreSQL (5432), Kafka (9092), InfluxDB (8086), whois (4300)
None of these need to be reachable from outside the stack:
- **PostgreSQL** — only `psql-app`, `whois`, and `grafana` connect, all on the
Compose network. Bind the published port to loopback only, or drop the
`ports:` mapping entirely:
```yaml
# docker-compose.yml — psql service
ports:
- "127.0.0.1:5432:5432" # localhost only; or remove entirely
```
- **Kafka 9092** — see Priority 2.
- **InfluxDB 8086** — only Grafana and Telegraf use it. Bind it to loopback
rather than dropping the mapping: Telegraf uses host networking and reaches
it via localhost, so it needs the loopback mapping, while Grafana reaches it
on the Compose network.
- **whois 4300** — expose only if you actually offer a public whois service;
otherwise bind to loopback.
For anything that genuinely must be reachable, restrict by source with the
firewall pattern from 1.1.
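The loopback-binding advice above can be spot-checked mechanically. The following is a rough sketch that inspects Compose short-syntax port strings and flags any that are not bound to `127.0.0.1`; `exposed_mappings` is a hypothetical helper, and it assumes IPv4 short syntax only (no long-form `ports:` entries or bracketed IPv6 addresses).

```python
def exposed_mappings(port_mappings: list) -> list:
    """Return short-syntax port mappings that are not bound to loopback."""
    exposed = []
    for mapping in port_mappings:
        parts = mapping.split(":")
        # "HOST_IP:HOST_PORT:CONTAINER_PORT" has three parts. With fewer parts
        # there is no host IP, so Docker publishes on all interfaces.
        host_ip = parts[0] if len(parts) == 3 else None
        if host_ip != "127.0.0.1":
            exposed.append(mapping)
    return exposed


mappings = ["127.0.0.1:5432:5432", "9092:9092", "0.0.0.0:3000:3000"]
print(exposed_mappings(mappings))
```

Feeding it the `ports:` entries of each service gives a quick list of mappings that still need attention.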
### 1.3 Grafana (3000) — keep it behind Authelia
Authelia already fronts Grafana (the `auth` profile + `GF_AUTH_PROXY_*`
settings). Make that the *only* path:
- Bind Grafana's published port to loopback: `127.0.0.1:3000:3000`, and let
the reverse proxy / Authelia terminate TLS and reach it internally.
- Do **not** leave port 3000 directly reachable — `GF_AUTH_PROXY_ENABLED=true`
trusts the `Remote-User` header, so any client that can reach 3000 directly
and set that header bypasses authentication entirely.
---
## Priority 2 — Kafka transport security
Kafka is currently **PLAINTEXT** and advertises a host-IP listener:
```yaml
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://${HOST_IP}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
```
The `obmp-kafka:29092` listener is internal to the Compose network and is the
only one the collector and psql-app use. The `PLAINTEXT_HOST://...:9092`
listener exists only for outside access and is not needed by the core stack.
**Recommended (simplest, most secure): remove the host listener.** If nothing
outside the Compose network consumes Kafka, drop the `9092` port mapping and
the `PLAINTEXT_HOST` advertised listener so Kafka is reachable only on the
internal Docker network:
```yaml
kafka:
  # remove the - "9092:9092" ports entry
  environment:
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
```
**If external Kafka access is genuinely required** (e.g. a separate analytics
consumer, or the split-host architecture in `production-sizing.md` where
Kafka and the DB are on different hosts), do **not** leave it PLAINTEXT on a
routed network. Enable SASL_SSL on the external listener:
```yaml
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://obmp-kafka:29092,SASL_SSL://${HOST_IP}:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,SASL_SSL:SASL_SSL
KAFKA_SASL_ENABLED_MECHANISMS: SCRAM-SHA-512
KAFKA_SSL_KEYSTORE_LOCATION: /etc/kafka/secrets/kafka.keystore.jks
KAFKA_SSL_KEYSTORE_PASSWORD: ${KAFKA_KEYSTORE_PASSWORD}
KAFKA_SSL_KEY_PASSWORD: ${KAFKA_KEY_PASSWORD}
KAFKA_SSL_TRUSTSTORE_LOCATION: /etc/kafka/secrets/kafka.truststore.jks
KAFKA_SSL_TRUSTSTORE_PASSWORD: ${KAFKA_TRUSTSTORE_PASSWORD}
```
Keep the internal `PLAINTEXT://obmp-kafka:29092` listener for the collector
and psql-app — intra-Compose traffic on a private bridge does not need TLS and
adding SASL there means re-configuring both clients. At minimum, never publish
a PLAINTEXT Kafka listener on an IP that routes beyond the host.
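The rule in the last sentence can be checked against the two Kafka settings shown earlier. This is an illustrative sketch only: `plaintext_external_listeners` is a hypothetical helper, `192.0.2.10` stands in for `${HOST_IP}`, and treating `obmp-kafka` as the only internal hostname is an assumption taken from the listener values above.

```python
def plaintext_external_listeners(advertised: str, protocol_map: str,
                                 internal_hosts=("obmp-kafka",)) -> list:
    """Return advertised listeners that are PLAINTEXT and not internal-only."""
    # "NAME:PROTOCOL,NAME:PROTOCOL" -> {"NAME": "PROTOCOL", ...}
    protocols = dict(item.split(":", 1) for item in protocol_map.split(","))
    flagged = []
    for listener in advertised.split(","):
        name, _, address = listener.partition("://")
        host = address.rsplit(":", 1)[0]
        if protocols.get(name) == "PLAINTEXT" and host not in internal_hosts:
            flagged.append(listener)
    return flagged


current = plaintext_external_listeners(
    "PLAINTEXT://obmp-kafka:29092,PLAINTEXT_HOST://192.0.2.10:9092",
    "PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT",
)
hardened = plaintext_external_listeners(
    "PLAINTEXT://obmp-kafka:29092,SASL_SSL://192.0.2.10:9092",
    "PLAINTEXT:PLAINTEXT,SASL_SSL:SASL_SSL",
)
print("current:", current)
print("hardened:", hardened)
```

An empty result means no PLAINTEXT listener is advertised on an address outside the Compose network.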
---
## Priority 3 — PostgreSQL hardening
### 3.1 Change the default `openbmp` / `openbmp` credentials
Covered in Priority 0. Note that `POSTGRES_USER`/`POSTGRES_PASSWORD` only take
effect when the data directory is initialized. To rotate on an existing
database, change the password in SQL and update every consumer:
```bash
docker exec -it obmp-psql psql -U openbmp -d openbmp \
-c "ALTER ROLE openbmp WITH PASSWORD '<new-strong-password>';"
```
Then update `POSTGRES_PASSWORD` for `psql-app` and `whois`, the
`secureJsonData.password` in `openbmp-ds.yml`, and restart those services.
### 3.2 Create a least-privilege role for Grafana
Grafana only needs to read. Do not let it connect as the owning role:
```sql
CREATE ROLE grafana_ro LOGIN PASSWORD '<strong-password>';
GRANT CONNECT ON DATABASE openbmp TO grafana_ro;
GRANT USAGE ON SCHEMA public TO grafana_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO grafana_ro;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO grafana_ro;
```
Point `openbmp-ds.yml` at `grafana_ro`. This contains a Grafana compromise to
read-only and blocks SQL-panel writes.
### 3.3 Restrict `pg_hba.conf`
The default OpenBMP image is permissive (`host all all all md5` or similar).
Tighten it so only the stack's own subnet can connect, and require
`scram-sha-256`:
```conf
# pg_hba.conf (inside the obmp-psql container / mounted)
# TYPE DATABASE USER ADDRESS METHOD
local all all scram-sha-256
host openbmp openbmp 172.16.0.0/12 scram-sha-256 # Docker bridge range
host openbmp grafana_ro 172.16.0.0/12 scram-sha-256
hostssl openbmp openbmp 0.0.0.0/0 scram-sha-256 # only if remote DB host
# reject everything else
host all all 0.0.0.0/0 reject
```
Identify the actual Compose network subnet with
`docker network inspect obmp_default` and scope `ADDRESS` to it. Reload with
`docker exec obmp-psql psql -U openbmp -c "SELECT pg_reload_conf();"`.
> `scram-sha-256` requires `password_encryption = scram-sha-256` in
> `postgresql.conf` and that passwords were set/rotated *after* that change.
### 3.4 Enable SSL/TLS
The Grafana datasource already requests `sslmode: "require"` — but the server
must actually present a certificate. In `postgresql.conf`:
```conf
ssl = on
ssl_cert_file = '/var/lib/postgresql/server.crt'
ssl_key_file = '/var/lib/postgresql/server.key'
```
Generate a cert (self-signed is acceptable for an internal DB; use your
internal CA if you have one):
```bash
openssl req -new -x509 -days 825 -nodes -text \
-out server.crt -keyout server.key -subj "/CN=obmp-psql"
chmod 600 server.key # PostgreSQL refuses a world-readable key
```
Mount both files into the container at the paths configured above, with the
key owned by the user PostgreSQL runs as (or by root). For the strongest
posture, move clients to `sslmode: verify-full` once a proper CA chain is in
place. This is most important if PostgreSQL runs on a separate host (the
split-host architecture in `production-sizing.md`) — intra-host Compose
traffic is lower-risk but TLS is still recommended.
### 3.5 Limit listen addresses
If PostgreSQL must accept connections from another host (split-host layout),
keep `listen_addresses` scoped — do not leave it at `*` if a single interface
suffices:
```conf
listen_addresses = 'localhost,172.18.0.1' # loopback + Docker bridge gateway
```
On a single-host deployment, drop the `5432` port mapping entirely (1.2) so
the listener is reachable only on the Compose network.
---
## Priority 4 — Drop `privileged: true` on the `psql` service
```yaml
psql:
  privileged: true # <-- remove or replace
  shm_size: 1536m
  sysctls:
    - net.ipv4.tcp_keepalive_intvl=30
    - net.ipv4.tcp_keepalive_probes=5
    - net.ipv4.tcp_keepalive_time=180
```
**Why it is a risk:** `privileged: true` gives the container *all* Linux
capabilities, disables seccomp/AppArmor confinement, and grants access to all
host devices. A compromise of PostgreSQL — the process most exposed to
untrusted route data — would then be a near-complete host compromise. This is
the single largest container-isolation gap in the stack.
**Why it is probably there:** PostgreSQL needs adequate shared memory and
benefits from the TCP keepalive `sysctls`. The compose file already sets
`shm_size: 1536m` and the `sysctls:` list explicitly — both of which Docker
applies *without* needing privileged mode. So `privileged: true` is most
likely a leftover, not a hard requirement.
**Recommended action — test without it:**
1. In a maintenance window, remove `privileged: true` and start the service.
2. Confirm PostgreSQL starts, the namespaced `sysctls` apply
(`docker exec obmp-psql sysctl net.ipv4.tcp_keepalive_time`), and shared
memory is honored (`docker exec obmp-psql df -h /dev/shm` should report the
configured `shm_size`; `/proc/meminfo` is not namespaced and shows host-wide
values). Also watch for `could not resize shared memory segment` errors in
the log.
3. If everything is healthy, leave it removed.
If a specific capability turns out to be needed, add only that one instead of
going fully privileged:
```yaml
psql:
  # privileged: true <-- removed
  shm_size: 1536m
  cap_drop:
    - ALL
  cap_add:
    - CHOWN
    - SETUID
    - SETGID
    - DAC_OVERRIDE # add only capabilities proven necessary by testing
  sysctls:
    - net.ipv4.tcp_keepalive_intvl=30
    - net.ipv4.tcp_keepalive_probes=5
    - net.ipv4.tcp_keepalive_time=180
```
The `sysctls:` block stays — those are namespaced and do not require
privileged mode.
---
## Priority 5 — Container hardening (defense in depth)
Apply across services after the higher-priority items. Test each service
individually — `read_only` in particular will surface paths a service writes
to that then need explicit `tmpfs` mounts.
### 5.1 `no-new-privileges`
Prevents a process inside a container from gaining privileges via setuid
binaries. Safe to apply to every service:
```yaml
security_opt:
- no-new-privileges:true
```
### 5.2 Drop capabilities
Most of these services need almost no Linux capabilities. Start from zero and
add back only what breaks:
```yaml
cap_drop:
- ALL
```
- `grafana`, `whois`, `portal`, `zookeeper` — typically run fine with
`cap_drop: [ALL]`.
- `collector`, `kafka`, `psql`, `psql-app` — drop ALL, then add back any
capability proven necessary (see Priority 4 for `psql`).
- `traffic-gen*` legitimately need `NET_RAW`/`NET_ADMIN` (Scapy) — leave those
`cap_add` entries; they are already minimal.
### 5.3 Read-only root filesystem
Make the root filesystem immutable where the service only writes to known
volumes:
```yaml
grafana:
  read_only: true
  tmpfs:
    - /tmp
  # /var/lib/grafana is already a bind mount, so writes go there, not to rootfs
portal:
  read_only: true # nginx:alpine static site; add tmpfs for nginx
  tmpfs:
    - /tmp
    - /var/cache/nginx
    - /var/run
```
`read_only` is straightforward for `grafana`, `portal`, and `whois`. It is
trickier for `psql`, `kafka`, and `zookeeper` (they write to data volumes but
also expect a writable rootfs in places) — test individually and add `tmpfs`
mounts for any write paths, or skip `read_only` for those and rely on
`cap_drop` + `no-new-privileges`.
### 5.4 Pin and scan images
Images are already version-pinned (`grafana:9.1.7`, `cp-kafka:7.1.1`,
`openbmp/postgres:2.2.1`, etc.) — good. Add periodic vulnerability scanning:
```bash
trivy image openbmp/postgres:2.2.1
trivy image grafana/grafana:9.1.7
```
Note Grafana 9.1.7 is old; review Grafana security advisories and plan an
upgrade path. Track CVEs for the pinned Confluent and OpenBMP images too.
### 5.5 Resource limits
Every service already has a `mem_limit`. For production also set `cpus:` (or
`deploy.resources.limits`) so a runaway query or ingest burst cannot starve
the host — this also mitigates local denial-of-service. See
`docs/production-sizing.md` for target values.
---
## Priority 6 — Authelia / access control
Authelia fronts Grafana (ROADMAP C5). For production:
- Enforce **TOTP / 2FA** for all operator accounts; do not allow `one_factor`
for the Grafana route.
- Set short session timeouts and an inactivity expiry in the Authelia config.
- Use strong, unique passwords; back the user store with your IdP / LDAP if
available rather than the file backend.
- Ensure Authelia's own secrets (`jwt_secret`, `session.secret`,
`storage.encryption_key`) are strong and stored as secrets, not literals.
- Confirm the reverse proxy strips any client-supplied `Remote-User` header
before Authelia sets it — otherwise the auth-proxy trust model is bypassable
(see 1.3).
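The header-stripping requirement in the last bullet amounts to the following logic, shown here as a small sketch rather than as proxy configuration. `strip_untrusted_identity` is a hypothetical helper; only `Remote-User` is named in this document, and the other `Remote-*` names are included on the assumption that the proxy forwards them as identity headers too.

```python
# Identity headers a client must never be allowed to supply itself.
# Only Remote-User is named in this document; the rest are assumed.
BLOCKED_HEADERS = {"remote-user", "remote-groups", "remote-name", "remote-email"}


def strip_untrusted_identity(headers: dict) -> dict:
    """Drop client-supplied identity headers (case-insensitive match)."""
    return {k: v for k, v in headers.items() if k.lower() not in BLOCKED_HEADERS}


incoming = {"Host": "grafana.example.net", "remote-user": "admin", "Accept": "*/*"}
cleaned = strip_untrusted_identity(incoming)
print(cleaned)
```

In a real deployment the reverse proxy performs this step before the authentication layer sets the trusted values; the sketch only shows what "strips" means in practice.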
---
## Quick checklist
- [ ] Rotate every default credential; remove literals from compose, move to `.env` / secrets manager
- [ ] Update `openbmp-ds.yml` datasource password to match
- [ ] Firewall BMP port 5000 to router management subnets (`DOCKER-USER` chain)
- [ ] Bind 5432 / 8086 / 4300 to loopback or drop the port mappings
- [ ] Bind Grafana 3000 to loopback; reach it only via Authelia
- [ ] Remove the Kafka `PLAINTEXT_HOST` listener + 9092 mapping (or enable SASL_SSL if external access needed)
- [ ] Create `grafana_ro` least-privilege DB role; repoint the datasource
- [ ] Tighten `pg_hba.conf`; require `scram-sha-256`
- [ ] Enable PostgreSQL `ssl = on` with a server certificate
- [ ] Test removing `privileged: true` from `psql`; replace with specific `cap_add` if needed
- [ ] Add `security_opt: [no-new-privileges:true]` to all services
- [ ] Add `cap_drop: [ALL]` and add back only required capabilities
- [ ] Add `read_only: true` + `tmpfs` to `grafana` / `portal` / `whois`
- [ ] Add `cpus:` limits per service
- [ ] Scan images with `trivy`; plan a Grafana upgrade off 9.1.7
- [ ] Enforce TOTP and short sessions in Authelia


@ -41,6 +41,7 @@
<AnnounceForm v-else-if="activeTab === 'inject'" @routes-changed="fetchRoutes" />
<PeerStatus v-else-if="activeTab === 'peers'" :peers="peers" />
<ChurnControl v-else-if="activeTab === 'churn'" />
<FullTable v-else-if="activeTab === 'full-table'" @routes-changed="fetchRoutes" />
</div>
</main>
</div>
@ -63,6 +64,7 @@ import RouteTable from './components/RouteTable.vue'
import AnnounceForm from './components/AnnounceForm.vue'
import PeerStatus from './components/PeerStatus.vue'
import ChurnControl from './components/ChurnControl.vue'
import FullTable from './components/FullTable.vue'
const health = ref(null)
const routes = ref([])
@ -75,6 +77,7 @@ const tabs = [
{ id: 'inject', label: 'Inject' },
{ id: 'peers', label: 'Peers' },
{ id: 'churn', label: 'Churn' },
{ id: 'full-table', label: 'Full Table' },
]
async function fetchHealth() {


@ -1,4 +1,4 @@
const BASE = '/api'
const BASE = '/exabgp/api'
async function req(method, path, body) {
const opts = { method, headers: { 'Content-Type': 'application/json' } }
@ -18,4 +18,7 @@ export const api = {
announce: payload => req('POST', '/announce', payload),
withdraw: prefixes => req('POST', '/withdraw', { prefixes }),
withdrawAll: () => req('POST', '/withdraw/all'),
fullTableStart: (count, batchSize) => req('POST', '/full-table/start', { count, batch_size: batchSize }),
fullTableStatus: () => req('GET', '/full-table/status'),
fullTableStop: () => req('POST', '/full-table/stop'),
}


@ -0,0 +1,477 @@
<template>
<div class="full-table">
<h2 class="section-title">Full Table Injection</h2>
<p class="section-desc">
Inject a realistic IPv4 routing table into ExaBGP for stress testing.
Routes are generated with varied AS paths, prefix lengths, and communities matching real DFZ distribution.
</p>
<div class="config-card">
<!-- Level selector -->
<div class="form-group">
<label>Table Size</label>
<div class="level-grid">
<button
v-for="level in levels"
:key="level.count"
class="level-btn"
:class="{ selected: selectedCount === level.count }"
:disabled="injecting"
@click="selectedCount = level.count"
>
<span class="level-count">{{ level.label }}</span>
<span class="level-desc">{{ level.desc }}</span>
</button>
</div>
</div>
<!-- Custom count -->
<div class="form-group">
<label>Custom Count</label>
<input
v-model.number="selectedCount"
type="number"
min="100"
max="950000"
step="1000"
:disabled="injecting"
class="mono-input"
/>
</div>
<!-- Action buttons -->
<div class="action-row">
<button v-if="!injecting" class="btn-start" @click="startInjection" :disabled="!selectedCount">
<span>&#9654;</span> Inject {{ formatNum(selectedCount) }} Routes
</button>
<button v-else class="btn-stop" @click="stopInjection">
<span>&#9632;</span> Stop Injection
</button>
<button
v-if="!injecting && lastCompleted"
class="btn-withdraw"
@click="withdrawAll"
:disabled="withdrawing"
>
{{ withdrawing ? 'Withdrawing...' : 'Withdraw All' }}
</button>
</div>
</div>
<!-- Status display -->
<div v-if="injecting || statusMsg" class="status-card">
<div class="status-header">
<span class="status-dot" :class="injecting ? 'dot-active' : 'dot-idle'"></span>
<span class="status-text">{{ statusMsg || 'Idle' }}</span>
</div>
<!-- Progress bar -->
<div v-if="state.total > 0" class="progress-section">
<div class="progress-labels">
<span>{{ formatNum(state.injected) }} / {{ formatNum(state.total) }}</span>
<span>{{ state.progress_pct || 0 }}%</span>
</div>
<div class="progress-track">
<div class="progress-fill" :style="{ width: (state.progress_pct || 0) + '%' }"></div>
</div>
</div>
<!-- Stats row -->
<div v-if="state.total > 0" class="stats-row">
<div class="stat-item">
<span class="stat-label">Rate</span>
<span class="stat-val">{{ formatNum(state.rate_pps || 0) }}/s</span>
</div>
<div class="stat-item">
<span class="stat-label">Elapsed</span>
<span class="stat-val">{{ state.elapsed_sec || 0 }}s</span>
</div>
<div class="stat-item">
<span class="stat-label">Active Routes</span>
<span class="stat-val">{{ formatNum(state.active_routes || 0) }}</span>
</div>
</div>
<!-- Error -->
<div v-if="state.error" class="inject-error">{{ state.error }}</div>
</div>
</div>
</template>
<script setup>
import { ref, onUnmounted } from 'vue'
import { api } from '../api.js'
const emit = defineEmits(['routes-changed'])
const levels = [
{ count: 1000, label: '1K', desc: 'Quick test' },
{ count: 10000, label: '10K', desc: 'Light load' },
{ count: 50000, label: '50K', desc: 'Medium load' },
{ count: 100000, label: '100K', desc: 'Stress test' },
{ count: 500000, label: '500K', desc: 'Heavy load' },
{ count: 900000, label: '900K', desc: 'Full DFZ' },
]
const selectedCount = ref(10000)
const injecting = ref(false)
const statusMsg = ref('')
const lastCompleted = ref(false)
const withdrawing = ref(false)
const state = ref({})
let pollTimer = null
function formatNum(n) {
if (n == null) return '0'
return Number(n).toLocaleString()
}
async function startInjection() {
try {
statusMsg.value = 'Starting injection...'
injecting.value = true
lastCompleted.value = false
state.value = {}
await api.fullTableStart(selectedCount.value, 1000)
startPolling()
} catch (e) {
statusMsg.value = `Start failed: ${e.message}`
injecting.value = false
}
}
async function stopInjection() {
try {
await api.fullTableStop()
statusMsg.value = 'Stop requested...'
} catch (e) {
statusMsg.value = `Stop failed: ${e.message}`
}
}
async function withdrawAll() {
withdrawing.value = true
try {
const data = await api.withdrawAll()
statusMsg.value = `Withdrew ${data.count} routes`
lastCompleted.value = false
state.value = {}
emit('routes-changed')
} catch (e) {
statusMsg.value = `Withdraw failed: ${e.message}`
} finally {
withdrawing.value = false
}
}
function startPolling() {
stopPolling()
pollStatus()
pollTimer = setInterval(pollStatus, 2000)
}
function stopPolling() {
if (pollTimer) {
clearInterval(pollTimer)
pollTimer = null
}
}
async function pollStatus() {
try {
const data = await api.fullTableStatus()
state.value = data
if (data.active) {
statusMsg.value = `Injecting: ${formatNum(data.injected)} / ${formatNum(data.total)} (${data.rate_pps || 0}/s)`
} else if (data.error) {
statusMsg.value = `Error: ${data.error}`
injecting.value = false
stopPolling()
} else if (data.injected > 0) {
statusMsg.value = `Complete: ${formatNum(data.injected)} routes in ${data.elapsed_sec}s (${data.rate_pps}/s)`
injecting.value = false
lastCompleted.value = true
stopPolling()
emit('routes-changed')
}
} catch (e) {
// keep polling
}
}
onUnmounted(() => {
stopPolling()
})
</script>
<style scoped>
.full-table {
display: flex;
flex-direction: column;
gap: 18px;
max-width: 680px;
}
.section-title {
font-size: 14px;
font-weight: 600;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.08em;
}
.section-desc {
color: var(--muted);
font-size: 13px;
line-height: 1.6;
margin-top: -8px;
}
.config-card {
background: var(--card-bg);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 20px;
display: flex;
flex-direction: column;
gap: 18px;
}
.form-group {
display: flex;
flex-direction: column;
gap: 6px;
}
label {
font-size: 12px;
font-weight: 600;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.05em;
}
.level-grid {
display: grid;
grid-template-columns: repeat(3, 1fr);
gap: 8px;
}
.level-btn {
display: flex;
flex-direction: column;
align-items: center;
gap: 2px;
padding: 12px 8px;
background: var(--bg);
border: 1px solid var(--border);
border-radius: var(--radius);
color: var(--text);
transition: all 0.15s;
}
.level-btn:hover:not(:disabled) {
border-color: var(--accent);
background: rgba(79, 156, 249, 0.08);
}
.level-btn.selected {
border-color: var(--accent);
background: rgba(79, 156, 249, 0.15);
box-shadow: 0 0 0 1px var(--accent);
}
.level-count {
font-size: 18px;
font-weight: 700;
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
color: var(--accent);
}
.level-desc {
font-size: 11px;
color: var(--muted);
}
.mono-input {
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
background: var(--bg);
color: var(--text);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 7px 10px;
font-size: 13px;
outline: none;
max-width: 200px;
}
.mono-input:focus {
border-color: var(--accent);
}
.action-row {
display: flex;
gap: 10px;
padding-top: 4px;
border-top: 1px solid var(--border);
}
.btn-start {
padding: 9px 22px;
background: rgba(72, 187, 120, 0.15);
color: #48bb78;
border: 1px solid rgba(72, 187, 120, 0.3);
font-weight: 700;
font-size: 14px;
display: flex;
align-items: center;
gap: 7px;
}
.btn-start:hover:not(:disabled) {
background: rgba(72, 187, 120, 0.25);
}
.btn-stop {
padding: 9px 22px;
background: rgba(252, 129, 129, 0.15);
color: #fc8181;
border: 1px solid rgba(252, 129, 129, 0.3);
font-weight: 700;
font-size: 14px;
display: flex;
align-items: center;
gap: 7px;
animation: pulse-border 1.5s ease-in-out infinite;
}
.btn-stop:hover {
background: rgba(252, 129, 129, 0.25);
}
.btn-withdraw {
padding: 9px 18px;
background: rgba(246, 173, 85, 0.15);
color: #f6ad55;
border: 1px solid rgba(246, 173, 85, 0.3);
font-weight: 600;
font-size: 13px;
}
.btn-withdraw:hover:not(:disabled) {
background: rgba(246, 173, 85, 0.25);
}
.status-card {
background: var(--card-bg);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 16px 18px;
display: flex;
flex-direction: column;
gap: 12px;
}
.status-header {
display: flex;
align-items: center;
gap: 10px;
}
.status-dot {
width: 10px;
height: 10px;
border-radius: 50%;
flex-shrink: 0;
}
.dot-active {
background: #48bb78;
box-shadow: 0 0 8px #48bb78;
animation: pulse-dot 1s ease-in-out infinite;
}
.dot-idle {
background: var(--muted);
}
.status-text {
font-size: 14px;
font-weight: 600;
color: var(--text);
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
}
.progress-section {
display: flex;
flex-direction: column;
gap: 5px;
}
.progress-labels {
display: flex;
justify-content: space-between;
font-size: 11px;
color: var(--muted);
}
.progress-track {
height: 6px;
background: var(--border);
border-radius: 3px;
overflow: hidden;
}
.progress-fill {
height: 100%;
background: var(--accent);
border-radius: 3px;
transition: width 0.5s ease;
}
.stats-row {
display: flex;
gap: 24px;
}
.stat-item {
display: flex;
flex-direction: column;
gap: 2px;
}
.stat-label {
font-size: 10px;
color: var(--muted);
text-transform: uppercase;
letter-spacing: 0.05em;
}
.stat-val {
font-size: 15px;
font-weight: 700;
color: var(--text);
font-family: 'Cascadia Code', 'Fira Code', 'Consolas', monospace;
}
.inject-error {
font-size: 12px;
color: #fc8181;
padding: 6px 10px;
background: rgba(252, 129, 129, 0.08);
border-radius: 4px;
border: 1px solid rgba(252, 129, 129, 0.2);
}
@keyframes pulse-dot {
0%, 100% { opacity: 1; }
50% { opacity: 0.5; }
}
@keyframes pulse-border {
0%, 100% { border-color: rgba(252, 129, 129, 0.3); }
50% { border-color: rgba(252, 129, 129, 0.6); }
}
</style>


@ -2,6 +2,7 @@ import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
export default defineConfig({
base: '/exabgp/',
plugins: [vue()],
server: {
proxy: {


@ -48,12 +48,16 @@ peer_states = {}
# ExaBGP command helpers
# ---------------------------------------------------------------------------
_quiet_mode = False
def _send(cmd: str):
"""Write a command to ExaBGP via stdout."""
with _stdout_lock:
sys.stdout.write(cmd + '\n')
sys.stdout.flush()
log.info('→ ExaBGP: %s', cmd)
if not _quiet_mode:
log.info('→ ExaBGP: %s', cmd)
def _build_announce(prefix, next_hop='self', as_path=None, communities=None, med=None, local_pref=None):
@ -162,7 +166,22 @@ def api_withdraw_all():
# ---------------------------------------------------------------------------
sys.path.insert(0, '/exabgp')
from scenarios import SCENARIOS
from scenarios import SCENARIOS, generate_full_internet
# ---------------------------------------------------------------------------
# Full-table background injection
# ---------------------------------------------------------------------------
_injection_state = {
'active': False,
'total': 0,
'injected': 0,
'elapsed_sec': 0,
'rate_pps': 0,
'error': None,
'stop_requested': False,
}
_injection_lock = threading.Lock()
@app.route('/scenarios', methods=['GET'])
@ -223,6 +242,131 @@ def get_peers():
return jsonify({'peers': peer_states})
# ---------------------------------------------------------------------------
# Full-table injection endpoints
# ---------------------------------------------------------------------------
def _injection_worker(count, batch_size):
    """Background thread: generate and inject full internet table."""
    global _quiet_mode
    try:
        _quiet_mode = True  # suppress per-route logging
        log.info('Generating %d full-table prefixes...', count)
        routes = generate_full_internet(count)
        with _injection_lock:
            _injection_state['total'] = len(routes)
        log.info('Generated %d routes, starting injection at batch_size=%d', len(routes), batch_size)

        start_time = time.time()
        for i, route in enumerate(routes):
            with _injection_lock:
                if _injection_state['stop_requested']:
                    log.info('Injection stopped by user at %d/%d', i, len(routes))
                    break

            prefix = route['prefix']
            announce_route(
                prefix,
                next_hop=route.get('next_hop', 'self'),
                as_path=route.get('as_path', []),
                communities=route.get('communities', []),
                med=route.get('med'),
                local_pref=route.get('local_pref'),
            )

            # Update progress periodically (every batch_size routes)
            if (i + 1) % batch_size == 0:
                elapsed = time.time() - start_time
                with _injection_lock:
                    _injection_state['injected'] = i + 1
                    _injection_state['elapsed_sec'] = round(elapsed, 1)
                    _injection_state['rate_pps'] = round((i + 1) / elapsed, 1) if elapsed > 0 else 0
                log.info('Injection progress: %d/%d (%.0f/s)',
                         i + 1, len(routes), (i + 1) / elapsed if elapsed > 0 else 0)

        elapsed = time.time() - start_time
        with _injection_lock:
            _injection_state['injected'] = min(i + 1, len(routes))
            _injection_state['elapsed_sec'] = round(elapsed, 1)
            _injection_state['rate_pps'] = round(_injection_state['injected'] / elapsed, 1) if elapsed > 0 else 0
            _injection_state['active'] = False
        log.info('Injection complete: %d routes in %.1fs (%.0f/s)',
                 _injection_state['injected'], elapsed,
                 _injection_state['injected'] / elapsed if elapsed > 0 else 0)
    except Exception as e:
        log.error('Injection error: %s', e)
        with _injection_lock:
            _injection_state['error'] = str(e)
            _injection_state['active'] = False
    finally:
        _quiet_mode = False
@app.route('/full-table/start', methods=['POST'])
def start_full_table():
"""Start background injection of a full IPv4 routing table.
POST body (all optional):
count: Number of prefixes (default 900000)
batch_size: Progress update interval (default 1000)
"""
with _injection_lock:
if _injection_state['active']:
return jsonify({
'error': 'Injection already in progress',
'state': dict(_injection_state),
}), 409
data = request.get_json(force=True) if request.data else {}
count = int(data.get('count', 900000))
batch_size = int(data.get('batch_size', 1000))
with _injection_lock:
_injection_state.update({
'active': True,
'total': count,
'injected': 0,
'elapsed_sec': 0,
'rate_pps': 0,
'error': None,
'stop_requested': False,
})
t = threading.Thread(target=_injection_worker, args=(count, batch_size), daemon=True)
t.start()
log.info('Started full-table injection: %d prefixes', count)
return jsonify({
'status': 'started',
'count': count,
'message': f'Generating and injecting {count} prefixes in background. GET /full-table/status to track progress.',
})
@app.route('/full-table/status', methods=['GET'])
def full_table_status():
"""Get current full-table injection progress."""
with _injection_lock:
state = dict(_injection_state)
if state['total'] > 0:
state['progress_pct'] = round(state['injected'] / state['total'] * 100, 1)
else:
state['progress_pct'] = 0
state['active_routes'] = len(active_routes)
return jsonify(state)
@app.route('/full-table/stop', methods=['POST'])
def stop_full_table():
"""Stop an in-progress full-table injection."""
with _injection_lock:
if not _injection_state['active']:
return jsonify({'error': 'No injection in progress'}), 400
_injection_state['stop_requested'] = True
return jsonify({'status': 'stop_requested', 'injected_so_far': _injection_state['injected']})
# ---------------------------------------------------------------------------
# ExaBGP event loop (main thread)
# ---------------------------------------------------------------------------
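
The start/stop bookkeeping for the background injection can be sketched in isolation. The following is a minimal sketch, not the project's code: `try_start`, `request_stop`, and `worker` are hypothetical names, and `announce` stands in for whatever actually sends a route. It assumes the "already running" check and the state reset should happen under a single lock acquisition, and that the reported count should reflect routes actually sent.

```python
import threading

# Shared progress state, guarded by one lock (names are illustrative only).
state = {'active': False, 'total': 0, 'injected': 0, 'stop_requested': False}
lock = threading.Lock()


def try_start(total):
    """Claim the injection slot atomically; False if one is already running."""
    with lock:
        if state['active']:
            return False
        state.update(active=True, total=total, injected=0, stop_requested=False)
        return True


def request_stop():
    """Ask a running worker to stop; False if nothing is running."""
    with lock:
        if not state['active']:
            return False
        state['stop_requested'] = True
        return True


def worker(routes, announce):
    """Send routes one by one, honouring the stop flag between routes."""
    sent = 0
    try:
        for route in routes:
            with lock:
                if state['stop_requested']:
                    break
            announce(route)  # called outside the lock so status reads never block on I/O
            sent += 1
    finally:
        with lock:
            state['injected'] = sent  # counts only routes actually announced
            state['active'] = False
```

In a Flask handler, `try_start` returning `False` would map to the 409 response, and `worker` would be the `target` of a daemon `threading.Thread`.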


@@ -12,6 +12,9 @@ Usage:
inject.py withdraw-all
inject.py scenario <name>
inject.py withdraw-scenario <name>
inject.py full-table [--count N] [--batch-size N] [--follow] # inject full IPv4 table (background)
inject.py full-table-status # show injection progress
inject.py full-table-stop # stop injection
inject.py churn [--count N] [--interval SEC] # cycle announce/withdraw for ip_rib_log population
inject.py monitor # live-refresh terminal view
@@ -29,8 +32,8 @@ import requests
API = os.environ.get('EXABGP_API', 'http://localhost:5050')
def _post(path, data=None):
r = requests.post(f'{API}{path}', json=data or {}, timeout=10)
def _post(path, data=None, timeout=10):
r = requests.post(f'{API}{path}', json=data or {}, timeout=timeout)
r.raise_for_status()
return r.json()
@@ -174,6 +177,101 @@ def cmd_withdraw_scenario(args):
print(f"Withdrew scenario '{args.name}': {data['count']} routes withdrawn")
def cmd_full_table(args):
"""Inject a full IPv4 routing table for stress testing."""
count = args.count
print(f"Starting full-table injection: {count} prefixes")
print("This generates routes in background. Use 'inject.py full-table-status' to track.\n")
data = _post('/full-table/start', {'count': count, 'batch_size': args.batch_size}, timeout=120)
print(data.get('message', 'Started'))
if args.follow:
print()
_follow_injection()
def cmd_full_table_status(args):
"""Show full-table injection progress."""
data = _get('/full-table/status')
active = data.get('active', False)
total = data.get('total', 0)
injected = data.get('injected', 0)
pct = data.get('progress_pct', 0)
rate = data.get('rate_pps', 0)
elapsed = data.get('elapsed_sec', 0)
error = data.get('error')
active_routes = data.get('active_routes', 0)
if error:
print(f"ERROR: {error}")
elif active:
bar_len = 40
filled = int(bar_len * pct / 100)
bar = '#' * filled + '-' * (bar_len - filled)
print(f"[{bar}] {pct:.1f}%")
print(f" Injected: {injected:,} / {total:,} ({rate:.0f} routes/s)")
print(f" Elapsed: {elapsed:.0f}s")
print(f" Active routes in ExaBGP: {active_routes:,}")
elif total > 0:
print(f"Injection complete: {injected:,} / {total:,} routes in {elapsed:.0f}s ({rate:.0f}/s)")
print(f"Active routes in ExaBGP: {active_routes:,}")
else:
print("No injection running or completed.")
print(f"Active routes: {active_routes:,}")
def cmd_full_table_stop(args):
"""Stop an in-progress full-table injection."""
try:
data = _post('/full-table/stop')
print(f"Stop requested. Injected so far: {data.get('injected_so_far', 0):,}")
except requests.exceptions.HTTPError as e:
if e.response.status_code == 400:
print("No injection in progress.")
else:
raise
def _follow_injection():
"""Poll injection status until complete."""
import shutil
lines_printed = 0
try:
while True:
data = _get('/full-table/status')
active = data.get('active', False)
total = data.get('total', 0)
injected = data.get('injected', 0)
pct = data.get('progress_pct', 0)
rate = data.get('rate_pps', 0)
elapsed = data.get('elapsed_sec', 0)
active_routes = data.get('active_routes', 0)
# Move cursor up to overwrite
if lines_printed > 0:
print(f"\033[{lines_printed}A", end='')
bar_len = 40
filled = int(bar_len * pct / 100)
bar = '#' * filled + '-' * (bar_len - filled)
output_lines = [
f" [{bar}] {pct:.1f}%",
f" Injected: {injected:,} / {total:,} ({rate:.0f} routes/s) elapsed: {elapsed:.0f}s",
f" Active routes: {active_routes:,}",
]
print('\n'.join(output_lines))
lines_printed = len(output_lines)
if not active:
print(f"\nDone! {injected:,} routes injected in {elapsed:.0f}s")
break
time.sleep(2)
except KeyboardInterrupt:
print("\n\nFollowing stopped (injection continues in background).")
def cmd_churn(args):
"""
Cycle announce/withdraw on the 'churn' scenario to generate ip_rib_log
@@ -236,6 +334,17 @@ def main():
p = sub.add_parser('withdraw-scenario', help='Withdraw a named scenario')
p.add_argument('name')
p = sub.add_parser('full-table', help='Inject full IPv4 routing table (background)')
p.add_argument('--count', type=int, default=900000, metavar='N',
help='Number of prefixes to inject (default: 900000)')
p.add_argument('--batch-size', type=int, default=1000, metavar='N',
help='Progress update interval (default: 1000)')
p.add_argument('--follow', '-f', action='store_true',
help='Follow progress until complete')
sub.add_parser('full-table-status', help='Show full-table injection progress')
sub.add_parser('full-table-stop', help='Stop full-table injection')
p = sub.add_parser('churn', help='Cycle announce/withdraw to populate ip_rib_log')
p.add_argument('--count', type=int, default=0, metavar='N',
help='Number of cycles (0 = infinite)')
@@ -255,6 +364,9 @@ def main():
'withdraw-all': cmd_withdraw_all,
'scenario': cmd_scenario,
'withdraw-scenario': cmd_withdraw_scenario,
'full-table': cmd_full_table,
'full-table-status': cmd_full_table_status,
'full-table-stop': cmd_full_table_stop,
'churn': cmd_churn,
}
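
The status and follow commands both derive a 40-character bar from `progress_pct`. A minimal sketch of that arithmetic follows; the helper names `render_bar` and `progress_pct` are hypothetical, and the clamping to 0..100 is an added assumption that the code above does not make.

```python
def progress_pct(injected, total):
    """Percentage complete, rounded to one decimal; 0.0 when total is unknown."""
    return round(injected / total * 100, 1) if total > 0 else 0.0


def render_bar(pct, bar_len=40):
    """Return a fixed-width text bar for a percentage, clamped to 0..100."""
    pct = max(0.0, min(100.0, float(pct)))
    filled = int(bar_len * pct / 100)  # truncates, so the bar never overshoots
    return '#' * filled + '-' * (bar_len - filled)
```

Clamping keeps the bar width constant even if a status payload ever reports a value outside the expected range.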


@@ -0,0 +1,658 @@
#!/usr/bin/env python3
"""
Route Diversity Configuration Script
=====================================
Adds loopbacks, static routes, route-policies, and BGP redistribution
to R9K-01 through R9K-07 to create locally-originated routes that
produce meaningful RR Loc-RIB diffs between CORE-01 and CORE-02.
IS-IS topology (natural asymmetry; no metric tuning needed):
CORE-01 <-> R9K-01, R9K-02, R9K-03, R9K-04, R9K-05
CORE-02 <-> R9K-05, R9K-06, R9K-07
R9K-04 <-> R9K-06 (cross-link)
R9K-05: dual-homed to both COREs
Overlapping prefixes from CORE-01-side and CORE-02-side routers produce
next-hop diffs because each RR picks the client with lowest IGP cost.
Address plan:
Loopbacks: 10.110.{router_id}.{1,2}/32 (unique per router)
Overlap LBs: 10.110.{100-103}.1/32 (shared across router pairs)
Static routes: 10.111.{router_id}.0/24 (unique per router)
Overlap statics: 10.111.{100-103}.0/24 (shared across router pairs)
Usage:
python3 route_diversity_config.py # apply all config
python3 route_diversity_config.py --verify-only # just check current state
python3 route_diversity_config.py --rollback # remove all added config
"""
from ncclient import manager
import xml.etree.ElementTree as ET
import sys
import argparse
# YANG namespaces
IFMGR_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ifmgr-cfg'
IPV4IO_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-io-cfg'
STATIC_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ip-static-cfg'
BGP_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-bgp-cfg'
ISIS_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-clns-isis-cfg'
RPL_NS = 'http://cisco.com/ns/yang/Cisco-IOS-XR-policy-repository-cfg'
# ──────────────────────────────────────────────────────────────────────
# Router definitions
# ──────────────────────────────────────────────────────────────────────
ROUTERS = {
'R9K-01': {
'mgmt': '10.100.0.1',
'loopbacks': [
('Loopback10', '10.110.1.1', '255.255.255.255'),
('Loopback11', '10.110.1.2', '255.255.255.255'),
('Loopback100', '10.110.100.1', '255.255.255.255'), # overlap with R9K-06
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-04, R9K-07
],
'statics': [
('10.111.1.0', 24, 100), # unique, tag=100 → LP=200
('10.111.100.0', 24, 100), # overlap with R9K-06 (tag=200)
('10.111.103.0', 24, 100), # overlap with R9K-04, R9K-07 (same tag)
],
},
'R9K-02': {
'mgmt': '10.100.0.2',
'loopbacks': [
('Loopback10', '10.110.2.1', '255.255.255.255'),
('Loopback11', '10.110.2.2', '255.255.255.255'),
('Loopback101', '10.110.101.1', '255.255.255.255'), # overlap with R9K-07
],
'statics': [
('10.111.2.0', 24, 100),
('10.111.101.0', 24, 100), # overlap with R9K-07 (tag=300)
],
},
'R9K-03': {
'mgmt': '10.100.0.3',
'loopbacks': [
('Loopback10', '10.110.3.1', '255.255.255.255'),
('Loopback11', '10.110.3.2', '255.255.255.255'),
('Loopback102', '10.110.102.1', '255.255.255.255'), # overlap with R9K-05
],
'statics': [
('10.111.3.0', 24, 100),
('10.111.102.0', 24, 100), # overlap with R9K-04 (tag=200)
],
},
'R9K-04': {
'mgmt': '10.100.0.4',
'loopbacks': [
('Loopback10', '10.110.4.1', '255.255.255.255'),
('Loopback11', '10.110.4.2', '255.255.255.255'),
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-01, R9K-07
],
'statics': [
('10.111.4.0', 24, 200), # tag=200 → LP=150
('10.111.102.0', 24, 200), # overlap with R9K-03 (tag=100)
('10.111.103.0', 24, 100), # overlap with R9K-01, R9K-07 (same tag)
],
},
'R9K-05': {
'mgmt': '10.100.0.5',
'loopbacks': [
('Loopback10', '10.110.5.1', '255.255.255.255'),
('Loopback11', '10.110.5.2', '255.255.255.255'),
('Loopback102', '10.110.102.1', '255.255.255.255'), # overlap with R9K-03
],
'statics': [
('10.111.5.0', 24, 100),
],
},
'R9K-06': {
'mgmt': '10.100.0.6',
'loopbacks': [
('Loopback10', '10.110.6.1', '255.255.255.255'),
('Loopback11', '10.110.6.2', '255.255.255.255'),
('Loopback100', '10.110.100.1', '255.255.255.255'), # overlap with R9K-01
],
'statics': [
('10.111.6.0', 24, 200), # tag=200 → LP=150
('10.111.100.0', 24, 200), # overlap with R9K-01 (tag=100)
],
},
'R9K-07': {
'mgmt': '10.100.0.7',
'loopbacks': [
('Loopback10', '10.110.7.1', '255.255.255.255'),
('Loopback11', '10.110.7.2', '255.255.255.255'),
('Loopback101', '10.110.101.1', '255.255.255.255'), # overlap with R9K-02
('Loopback103', '10.110.103.1', '255.255.255.255'), # overlap with R9K-01, R9K-04
],
'statics': [
('10.111.7.0', 24, 300), # tag=300 → LP=100
('10.111.101.0', 24, 300), # overlap with R9K-02 (tag=100)
('10.111.103.0', 24, 100), # overlap with R9K-01, R9K-04 (same tag)
],
},
}
# ──────────────────────────────────────────────────────────────────────
# Route-policy (RPL text blob)
# ──────────────────────────────────────────────────────────────────────
ROUTE_POLICY_NAME = 'REDIST-TO-BGP'
ROUTE_POLICY_BODY = """\
route-policy REDIST-TO-BGP
if tag is 100 then
set local-preference 200
set med 50
set community (65020:100) additive
pass
elseif tag is 200 then
set local-preference 150
set med 100
set community (65020:200) additive
pass
elseif tag is 300 then
set local-preference 100
set med 200
set community (65020:300) additive
pass
else
set local-preference 100
pass
endif
end-policy
"""
# ──────────────────────────────────────────────────────────────────────
# XML builders
# ──────────────────────────────────────────────────────────────────────
def loopback_xml(name, addr, mask):
"""Create a loopback interface with an IPv4 address."""
return f"""
<config>
<interface-configurations xmlns="{IFMGR_NS}">
<interface-configuration>
<active>act</active>
<interface-name>{name}</interface-name>
<interface-virtual/>
<ipv4-network xmlns="{IPV4IO_NS}">
<addresses>
<primary>
<address>{addr}</address>
<netmask>{mask}</netmask>
</primary>
</addresses>
</ipv4-network>
</interface-configuration>
</interface-configurations>
</config>
"""
def static_route_xml(prefix, prefix_len, tag):
"""Create a static route to Null0 with a tag."""
return f"""
<config>
<router-static xmlns="{STATIC_NS}">
<default-vrf>
<address-family>
<vrfipv4>
<vrf-unicast>
<vrf-prefixes>
<vrf-prefix>
<prefix>{prefix}</prefix>
<prefix-length>{prefix_len}</prefix-length>
<vrf-route>
<vrf-next-hop-table>
<vrf-next-hop-interface-name>
<interface-name>Null0</interface-name>
<tag>{tag}</tag>
</vrf-next-hop-interface-name>
</vrf-next-hop-table>
</vrf-route>
</vrf-prefix>
</vrf-prefixes>
</vrf-unicast>
</vrfipv4>
</address-family>
</default-vrf>
</router-static>
</config>
"""
def route_policy_xml(name, body):
"""Create/replace a route-policy (RPL text blob)."""
return f"""
<config>
<routing-policy xmlns="{RPL_NS}">
<route-policies>
<route-policy>
<route-policy-name>{name}</route-policy-name>
<rpl-route-policy>{body}</rpl-route-policy>
</route-policy>
</route-policies>
</routing-policy>
</config>
"""
def isis_passive_xml(intf_name):
"""Add a loopback to IS-IS instance 1 (passive by default for loopbacks)."""
return f"""
<config>
<isis xmlns="{ISIS_NS}">
<instances>
<instance>
<instance-name>1</instance-name>
<interfaces>
<interface>
<interface-name>{intf_name}</interface-name>
<running/>
<interface-afs>
<interface-af>
<af-name>ipv4</af-name>
<saf-name>unicast</saf-name>
<interface-af-data/>
</interface-af>
</interface-afs>
</interface>
</interfaces>
</instance>
</instances>
</isis>
</config>
"""
def bgp_redistribute_xml():
"""Configure redistribute connected + static with REDIST-TO-BGP policy."""
return f"""
<config>
<bgp xmlns="{BGP_NS}">
<instance>
<instance-name>default</instance-name>
<instance-as>
<as>0</as>
<four-byte-as>
<as>65020</as>
<bgp-running/>
<default-vrf>
<global>
<global-afs>
<global-af>
<af-name>ipv4-unicast</af-name>
<enable/>
<connected-routes>
<route-policy-name>{ROUTE_POLICY_NAME}</route-policy-name>
</connected-routes>
<static-routes>
<route-policy-name>{ROUTE_POLICY_NAME}</route-policy-name>
</static-routes>
</global-af>
</global-afs>
</global>
</default-vrf>
</four-byte-as>
</instance-as>
</instance>
</bgp>
</config>
"""
# ──────────────────────────────────────────────────────────────────────
# Rollback XML builders (delete operations)
# ──────────────────────────────────────────────────────────────────────
NC_NS = 'urn:ietf:params:xml:ns:netconf:base:1.0'
def delete_loopback_xml(name):
return f"""
<config>
<interface-configurations xmlns="{IFMGR_NS}">
<interface-configuration xmlns:nc="{NC_NS}" nc:operation="delete">
<active>act</active>
<interface-name>{name}</interface-name>
</interface-configuration>
</interface-configurations>
</config>
"""
def delete_static_route_xml(prefix, prefix_len):
return f"""
<config>
<router-static xmlns="{STATIC_NS}">
<default-vrf>
<address-family>
<vrfipv4>
<vrf-unicast>
<vrf-prefixes>
<vrf-prefix xmlns:nc="{NC_NS}" nc:operation="delete">
<prefix>{prefix}</prefix>
<prefix-length>{prefix_len}</prefix-length>
</vrf-prefix>
</vrf-prefixes>
</vrf-unicast>
</vrfipv4>
</address-family>
</default-vrf>
</router-static>
</config>
"""
def delete_bgp_redistribute_xml():
return f"""
<config>
<bgp xmlns="{BGP_NS}">
<instance>
<instance-name>default</instance-name>
<instance-as>
<as>0</as>
<four-byte-as>
<as>65020</as>
<bgp-running/>
<default-vrf>
<global>
<global-afs>
<global-af>
<af-name>ipv4-unicast</af-name>
<enable/>
<connected-routes xmlns:nc="{NC_NS}" nc:operation="delete"/>
<static-routes xmlns:nc="{NC_NS}" nc:operation="delete"/>
</global-af>
</global-afs>
</global>
</default-vrf>
</four-byte-as>
</instance-as>
</instance>
</bgp>
</config>
"""
def delete_isis_interface_xml(intf_name):
return f"""
<config>
<isis xmlns="{ISIS_NS}">
<instances>
<instance>
<instance-name>1</instance-name>
<interfaces>
<interface xmlns:nc="{NC_NS}" nc:operation="delete">
<interface-name>{intf_name}</interface-name>
</interface>
</interfaces>
</instance>
</instances>
</isis>
</config>
"""
def delete_route_policy_xml(name):
return f"""
<config>
<routing-policy xmlns="{RPL_NS}">
<route-policies>
<route-policy xmlns:nc="{NC_NS}" nc:operation="delete">
<route-policy-name>{name}</route-policy-name>
</route-policy>
</route-policies>
</routing-policy>
</config>
"""
# ──────────────────────────────────────────────────────────────────────
# Configuration functions
# ──────────────────────────────────────────────────────────────────────
def nc_connect(mgmt_ip):
"""Open NETCONF session."""
return manager.connect(
host=mgmt_ip,
port=830,
username='webui',
password='cisco',
hostkey_verify=False,
device_params={'name': 'iosxr'},
timeout=30,
)
def configure_router(label, cfg):
"""Apply full route-diversity config to a single router."""
mgmt_ip = cfg['mgmt']
print(f"\n{'─'*60}")
print(f" Configuring {label} ({mgmt_ip})")
print(f"{'─'*60}")
lb_names = [lb[0] for lb in cfg['loopbacks']]
static_prefixes = [f"{s[0]}/{s[1]}" for s in cfg['statics']]
print(f" Loopbacks: {', '.join(lb_names)}")
print(f" Statics: {', '.join(static_prefixes)}")
try:
with nc_connect(mgmt_ip) as m:
# Phase 1: Route-policy (must exist before BGP references it)
print(f" → Creating route-policy {ROUTE_POLICY_NAME}...")
m.edit_config(target='candidate', config=route_policy_xml(ROUTE_POLICY_NAME, ROUTE_POLICY_BODY))
# Phase 2: Loopback interfaces
for name, addr, mask in cfg['loopbacks']:
print(f" → Creating {name} ({addr})...")
m.edit_config(target='candidate', config=loopback_xml(name, addr, mask))
# Phase 3: Static routes
for prefix, plen, tag in cfg['statics']:
print(f" → Static {prefix}/{plen} → Null0 tag={tag}...")
m.edit_config(target='candidate', config=static_route_xml(prefix, plen, tag))
# Phase 4: IS-IS passive on new loopbacks
for name, _, _ in cfg['loopbacks']:
print(f" → IS-IS passive: {name}...")
m.edit_config(target='candidate', config=isis_passive_xml(name))
# Phase 5: BGP redistribution
print(f" → BGP redistribute connected + static...")
m.edit_config(target='candidate', config=bgp_redistribute_xml())
# Phase 6: Commit
print(f" → Committing...")
m.commit()
print(f" ✓ {label} done.")
return True
except Exception as e:
print(f" ✗ ERROR on {label}: {e}")
return False
def rollback_router(label, cfg):
"""Remove all route-diversity config from a single router."""
mgmt_ip = cfg['mgmt']
print(f"\n{'─'*60}")
print(f" Rolling back {label} ({mgmt_ip})")
print(f"{'─'*60}")
try:
with nc_connect(mgmt_ip) as m:
# Remove BGP redistribution first (references the policy)
print(f" → Removing BGP redistribute...")
try:
m.edit_config(target='candidate', config=delete_bgp_redistribute_xml())
except Exception as e:
print(f" (skip — may not exist: {e})")
# Remove IS-IS interfaces
for name, _, _ in cfg['loopbacks']:
print(f" → Removing IS-IS interface {name}...")
try:
m.edit_config(target='candidate', config=delete_isis_interface_xml(name))
except Exception as e:
print(f" (skip: {e})")
# Remove static routes
for prefix, plen, _ in cfg['statics']:
print(f" → Removing static {prefix}/{plen}...")
try:
m.edit_config(target='candidate', config=delete_static_route_xml(prefix, plen))
except Exception as e:
print(f" (skip: {e})")
# Remove loopbacks
for name, _, _ in cfg['loopbacks']:
print(f" → Removing {name}...")
try:
m.edit_config(target='candidate', config=delete_loopback_xml(name))
except Exception as e:
print(f" (skip: {e})")
# Remove route-policy
print(f" → Removing route-policy {ROUTE_POLICY_NAME}...")
try:
m.edit_config(target='candidate', config=delete_route_policy_xml(ROUTE_POLICY_NAME))
except Exception as e:
print(f" (skip: {e})")
print(f" → Committing rollback...")
m.commit()
print(f" ✓ {label} rolled back.")
return True
except Exception as e:
print(f" ✗ ERROR rolling back {label}: {e}")
return False
def verify_router(label, cfg):
"""Check if route-diversity config is present on a router."""
mgmt_ip = cfg['mgmt']
try:
with nc_connect(mgmt_ip) as m:
# Check loopbacks
filt_intf = f"""<filter>
<interface-configurations xmlns="{IFMGR_NS}"/>
</filter>"""
r_intf = str(m.get_config(source='running', filter=filt_intf))
found_lbs = []
for name, _, _ in cfg['loopbacks']:
if name in r_intf:
found_lbs.append(name)
# Check route-policy
filt_rpl = f"""<filter>
<routing-policy xmlns="{RPL_NS}"/>
</filter>"""
r_rpl = str(m.get_config(source='running', filter=filt_rpl))
has_policy = ROUTE_POLICY_NAME in r_rpl
# Check BGP redistribute
filt_bgp = f"""<filter>
<bgp xmlns="{BGP_NS}">
<instance><instance-name>default</instance-name></instance>
</bgp>
</filter>"""
r_bgp = str(m.get_config(source='running', filter=filt_bgp))
has_redist_connected = 'connected-routes' in r_bgp and ROUTE_POLICY_NAME in r_bgp
has_redist_static = 'static-routes' in r_bgp and ROUTE_POLICY_NAME in r_bgp
total_lbs = len(cfg['loopbacks'])
lb_str = f"{len(found_lbs)}/{total_lbs}"
pol = '✓' if has_policy else '✗'
rc = '✓' if has_redist_connected else '✗'
rs = '✓' if has_redist_static else '✗'
ok = len(found_lbs) == total_lbs and has_policy and has_redist_connected and has_redist_static
status = 'OK' if ok else 'INCOMPLETE'
print(f" {label:8s} LBs={lb_str:5s} Policy={pol} Redist-C={rc} Redist-S={rs} [{status}]")
except Exception as e:
print(f" {label:8s} verify error: {e}")
# ──────────────────────────────────────────────────────────────────────
# Main
# ──────────────────────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description='Route Diversity Configuration for RR Diff Analysis')
parser.add_argument('--verify-only', action='store_true', help='Only verify current state')
parser.add_argument('--rollback', action='store_true', help='Remove all added config')
args = parser.parse_args()
print("Route Diversity Configuration Script")
print("=" * 60)
print(f"Targets: {len(ROUTERS)} routers ({', '.join(ROUTERS.keys())})")
print()
if args.verify_only:
print("Verify-only mode")
print('-' * 60)
print(f" {'Router':8s} {'LBs':5s} {'Policy':6s} {'Redist-C':8s} {'Redist-S':8s} Status")
for label, cfg in ROUTERS.items():
verify_router(label, cfg)
return
if args.rollback:
print("ROLLBACK mode — removing all route-diversity config")
print('-' * 60)
results = []
for label, cfg in ROUTERS.items():
ok = rollback_router(label, cfg)
results.append((label, ok))
failed = [l for l, ok in results if not ok]
print()
if failed:
print(f"FAILED rollback: {', '.join(failed)}")
sys.exit(1)
else:
print("All routers rolled back successfully.")
return
# Apply mode
results = []
for label, cfg in ROUTERS.items():
ok = configure_router(label, cfg)
results.append((label, ok))
# Post-apply verification
print(f"\n{'='*60}")
print("Post-apply verification")
print('=' * 60)
print(f" {'Router':8s} {'LBs':5s} {'Policy':6s} {'Redist-C':8s} {'Redist-S':8s} Status")
for label, cfg in ROUTERS.items():
verify_router(label, cfg)
failed = [l for l, ok in results if not ok]
print()
if failed:
print(f"FAILED: {', '.join(failed)}")
sys.exit(1)
else:
total_lbs = sum(len(c['loopbacks']) for c in ROUTERS.values())
total_statics = sum(len(c['statics']) for c in ROUTERS.values())
print(f"All routers configured successfully.")
print(f" {total_lbs} loopbacks + {total_statics} static routes created")
print()
print("Wait ~60s for BGP convergence and BMP collection, then verify:")
print()
print(" # Check new prefixes in OpenBMP")
print(" docker exec -i obmp-psql psql -U openbmp -d openbmp -c \\")
print(" \"SELECT prefix::text, COUNT(*) FROM ip_rib")
print(" WHERE (prefix::text LIKE '10.110.%' OR prefix::text LIKE '10.111.%')")
print(" AND iswithdrawn = false GROUP BY prefix ORDER BY prefix;\"")
if __name__ == '__main__':
main()
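
To see which static prefixes should produce diffs before touching any router, the address plan can be checked offline. This is a minimal sketch under stated assumptions: `STATICS` is a two-router excerpt of the `ROUTERS` table above, `TAG_TO_LOCAL_PREF` restates the tag-to-local-preference mapping from `REDIST-TO-BGP` (with 100 as the fallback, matching the policy's `else` branch), and `overlapping_statics` is a hypothetical helper.

```python
from collections import defaultdict

# Mirrors the route-policy: tag 100 -> LP 200, tag 200 -> LP 150, tag 300 -> LP 100.
TAG_TO_LOCAL_PREF = {100: 200, 200: 150, 300: 100}

# Excerpt of the per-router static definitions: (prefix, prefix_len, tag).
STATICS = {
    'R9K-01': [('10.111.1.0', 24, 100), ('10.111.100.0', 24, 100)],
    'R9K-06': [('10.111.6.0', 24, 200), ('10.111.100.0', 24, 200)],
}


def overlapping_statics(statics):
    """Map each prefix originated by more than one router to (router, local_pref) pairs."""
    by_prefix = defaultdict(list)
    for router, routes in statics.items():
        for prefix, plen, tag in routes:
            local_pref = TAG_TO_LOCAL_PREF.get(tag, 100)
            by_prefix[f'{prefix}/{plen}'].append((router, local_pref))
    return {p: owners for p, owners in by_prefix.items() if len(owners) > 1}


print(overlapping_statics(STATICS))
# {'10.111.100.0/24': [('R9K-01', 200), ('R9K-06', 150)]}
```

Prefixes whose originators carry different local-preference values should converge on the same winner on both RRs, while prefixes with equal values (such as the `10.111.103.0/24` group) are the ones where IGP cost is expected to decide.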


@@ -363,10 +363,178 @@ _HIJACK_ROUTES = [
]
# ---------------------------------------------------------------------------
# Scenario: te_community_steering
# Routes tagged with TE communities representing different "colors" for
# community-based TE policy steering. Shows how communities drive path
# selection when routers apply route-policy based on community values.
# ---------------------------------------------------------------------------
_TE_COMMUNITY_ROUTES = [
# Red paths (community 65020:100) — high-priority, low-latency
_r('10.210.0.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.1.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.2.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.3.0/24', [65100, 65020], communities=['65020:100'], med=10),
_r('10.210.4.0/24', [65100, 65020], communities=['65020:100'], med=10),
# Blue paths (community 65020:200) — bulk transfer, cost-optimized
_r('10.220.0.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.1.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.2.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.3.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
_r('10.220.4.0/24', [65100, 65020, 3356], communities=['65020:200'], med=100),
# Green paths (community 65020:300) — backup/diverse paths
_r('10.230.0.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.1.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.2.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.3.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
_r('10.230.4.0/24', [65100, 65020, 1299, 6762], communities=['65020:300'], med=200),
]
# ---------------------------------------------------------------------------
# Scenario: origin_shift
# Simulates an origin AS change: prefixes initially associated with
# well-known origin ASNs are re-announced with a different origin.
# Use: load internet_sample first, then load origin_shift to see the
# origin_as column change in ip_rib_log (visible on Anomaly dashboard).
# ---------------------------------------------------------------------------
_ORIGIN_SHIFT_ROUTES = [
# These prefixes overlap with internet_sample but have different origin ASNs
_r('8.8.8.0/24', [65100, 64999], communities=['65100:origin-shift']), # was 15169 (Google)
_r('1.1.1.0/24', [65100, 64998], communities=['65100:origin-shift']), # was 13335 (Cloudflare)
_r('9.9.9.0/24', [65100, 64997], communities=['65100:origin-shift']), # was 19281 (Quad9)
_r('208.67.222.0/24', [65100, 64996], communities=['65100:origin-shift']), # was 36692 (OpenDNS)
_r('156.154.70.0/24', [65100, 64995], communities=['65100:origin-shift']), # was 19318 (Neustar)
]
# ---------------------------------------------------------------------------
# Scenario: path_diversity
# Multiple announcements of the same prefix with different AS paths,
# MEDs, and communities. Demonstrates best-path selection:
# - Shorter AS path wins (unless local-pref overrides)
# - Lower MED preferred among paths from same neighbor AS
# - Communities tag paths for policy identification
# ---------------------------------------------------------------------------
_PATH_DIVERSITY_ROUTES = [
# Prefix 1: 3 paths with varying length and MED
_r('10.250.0.0/24', [65100, 174], communities=['65100:path-a'], med=50),
_r('10.250.0.0/24', [65100, 174, 3356], communities=['65100:path-b'], med=100),
_r('10.250.0.0/24', [65100, 174, 3356, 15169], communities=['65100:path-c'], med=150),
# Prefix 2: paths with same length but different MED
_r('10.250.1.0/24', [65100, 1299, 15169], communities=['65100:low-med'], med=10),
_r('10.250.1.0/24', [65100, 3356, 15169], communities=['65100:high-med'], med=500),
# Prefix 3: local-pref override (higher local-pref wins over shorter path)
_r('10.250.2.0/24', [65100, 2914], communities=['65100:low-lp'], local_pref=50),
_r('10.250.2.0/24', [65100, 2914, 7018], communities=['65100:high-lp'], local_pref=200),
# Prefix 4: transit diversity
_r('10.250.3.0/24', [65100, 174, 32934], communities=['65100:via-cogent']),
_r('10.250.3.0/24', [65100, 3356, 32934], communities=['65100:via-lumen']),
_r('10.250.3.0/24', [65100, 2914, 32934], communities=['65100:via-ntt']),
]
# ---------------------------------------------------------------------------
# Registry
# ---------------------------------------------------------------------------
# ---------------------------------------------------------------------------
# Full Internet Table Generator
# Generates realistic-looking IPv4 prefixes across the routable address space
# with varied AS paths, prefix lengths, origins, and communities.
# Configurable count: 10K (quick test) to 900K+ (full table stress test).
# ---------------------------------------------------------------------------
# Well-known transit ASNs for realistic path construction
_TRANSIT_ASNS = [174, 701, 1299, 2914, 3257, 3356, 6461, 6762, 7018, 3491, 5400, 1239]
# Realistic origin ASNs (mix of large providers and small networks)
_ORIGIN_POOL = [
13335, 15169, 16509, 8075, 20940, 32934, 714, 54113, 13414, 7922,
36459, 46489, 14618, 16276, 24940, 47541, 35916, 49981, 9808, 4134,
4837, 9121, 12322, 3320, 6830, 5511, 1273, 6939, 4766, 9318,
23693, 38001, 45102, 58453, 10026, 18881, 28573, 7738, 26599, 8151,
11888, 17676, 4713, 7545, 9299, 50304, 51167, 60068, 41095, 34984,
]
# IANA-allocated first octets for routable IPv4 (subset for realism)
_ROUTABLE_FIRST_OCTETS = list(range(1, 56)) + list(range(57, 127)) + list(range(128, 224))
def generate_full_internet(count=900000):
"""Generate a realistic full IPv4 routing table.
Distributes prefixes across the IPv4 address space with realistic
prefix lengths (/8 through /24) and varied AS paths.
Args:
count: Number of prefixes to generate (default 900K).
Returns:
List of route dicts.
"""
import random
rng = random.Random(42) # deterministic for reproducibility
routes = []
generated = set()
# Prefix length distribution (approximates real DFZ):
# /24: ~55%, /23: ~8%, /22: ~7%, /21: ~5%, /20: ~5%,
# /19: ~4%, /18: ~3%, /17: ~2%, /16: ~5%, /15-/8: ~6%
prefix_len_weights = {
24: 55, 23: 8, 22: 7, 21: 5, 20: 5,
19: 4, 18: 3, 17: 2, 16: 5, 15: 2,
14: 1, 13: 1, 12: 1, 11: 0.5, 10: 0.3,
9: 0.1, 8: 0.1,
}
plen_choices = list(prefix_len_weights.keys())
plen_weights = list(prefix_len_weights.values())
# ASNs appended after the leading local AS 65100: 1: 5%, 2: 30%, 3: 40%, 4: 20%, 5: 5%
path_len_weights = [5, 30, 40, 20, 5]
while len(routes) < count:
# Pick a routable first octet uniformly at random
first = rng.choice(_ROUTABLE_FIRST_OCTETS)
plen = rng.choices(plen_choices, weights=plen_weights, k=1)[0]
# Generate random prefix within this /8
if plen <= 8:
prefix = f'{first}.0.0.0/{plen}'
elif plen <= 16:
second = rng.randint(0, 255) & (0xFF << (16 - plen))
prefix = f'{first}.{second}.0.0/{plen}'
elif plen <= 24:
second = rng.randint(0, 255)
third = rng.randint(0, 255) & (0xFF << (24 - plen))
prefix = f'{first}.{second}.{third}.0/{plen}'
else:
continue
if prefix in generated:
continue
generated.add(prefix)
# Build realistic AS path
path_len = rng.choices([1, 2, 3, 4, 5], weights=path_len_weights, k=1)[0]
origin = rng.choice(_ORIGIN_POOL) if rng.random() < 0.3 else (64512 + rng.randint(0, 65535 - 64512))
transits = rng.sample(_TRANSIT_ASNS, min(path_len - 1, len(_TRANSIT_ASNS)))
as_path = [65100] + transits[:path_len - 1] + [origin]
# Occasionally add communities (~20% of routes)
communities = []
if rng.random() < 0.2:
communities.append(f'65100:{rng.choice([100, 200, 300, 400, 500])}')
routes.append(_r(prefix, as_path, communities=communities or None))
return routes
SCENARIOS = {
'internet_sample': {
'description': 'Partial internet table (~80 IPv4 + 14 IPv6 prefixes with realistic AS paths)',
@ -404,4 +572,16 @@ SCENARIOS = {
'description': '10 prefixes announced as if directly originated by AS 65100 — simulates a prefix hijack (community 65100:hijack)',
'routes': _HIJACK_ROUTES,
},
'te_community_steering': {
'description': 'Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green)',
'routes': _TE_COMMUNITY_ROUTES,
},
'origin_shift': {
'description': '5 prefixes with changed origin AS — simulates origin migration/hijack for anomaly detection',
'routes': _ORIGIN_SHIFT_ROUTES,
},
'path_diversity': {
'description': 'Same prefixes with different AS paths and MEDs — demonstrates best-path selection and path diversity',
'routes': _PATH_DIVERSITY_ROUTES,
},
}
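
Review note on the prefix arithmetic in generate_full_internet: the masking step keeps only the network bits of the octet in which the prefix length ends, so each generated prefix should be aligned on its own boundary. The sketch below isolates that step. `align_octet` is a hypothetical helper name used only for illustration and is not part of the diff; the sample octet values are arbitrary.

```python
import ipaddress


def align_octet(value: int, plen: int, octet_end_bit: int) -> int:
    """Zero the host bits of a single octet.

    octet_end_bit is the bit position where the octet ends (16 for the
    second octet, 24 for the third), matching the shifts used in the diff.
    """
    host_bits = octet_end_bit - plen
    # 0xFF << host_bits also sets bits above bit 7, but value <= 255,
    # so the AND discards them.
    return value & (0xFF << host_bits)


# A /22 leaves 2 host bits in the third octet.
third = align_octet(203, 22, 24)
# A /12 leaves 4 host bits in the second octet.
second = align_octet(171, 12, 16)

# strict=True raises ValueError if any host bit is still set.
net_a = ipaddress.ip_network(f"10.1.{third}.0/22", strict=True)
net_b = ipaddress.ip_network(f"10.{second}.0.0/12", strict=True)
print(net_a, net_b)
```

Validating with `ipaddress.ip_network(..., strict=True)` is one way a test could confirm that no generated prefix carries host bits.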


@ -3,16 +3,23 @@ set -e
LOCAL_IP=${EXABGP_LOCAL_IP:-10.40.40.202}
LOCAL_AS=${EXABGP_LOCAL_AS:-65100}
PEER_AS=${EXABGP_PEER_AS:-65020}
PEER_1=${EXABGP_PEER_1:-10.100.0.100}
PEER_2=${EXABGP_PEER_2:-10.100.0.200}
API_PORT=${EXABGP_API_PORT:-5050}
# Peer list — ";"-separated entries of "ip:peer_as:description".
# Default reproduces the original single-lab (AS 65020) config.
EXABGP_PEERS=${EXABGP_PEERS:-10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02}
echo "================================================================"
echo " ExaBGP Route Injector"
echo " Local: ${LOCAL_IP} AS${LOCAL_AS}"
echo " Peers: ${PEER_1}, ${PEER_2} (AS${PEER_AS})"
echo " API: http://0.0.0.0:${API_PORT}"
echo " Peers:"
IFS=';' read -ra PEER_ENTRIES <<< "$EXABGP_PEERS"
for entry in "${PEER_ENTRIES[@]}"; do
[ -z "$entry" ] && continue
IFS=':' read -r p_ip p_as p_desc <<< "$entry"
echo " - ${p_ip} AS${p_as} (${p_desc})"
done
echo "================================================================"
# Generate ExaBGP 5.x env file — ExaBGP looks here based on pip install prefix
@ -22,41 +29,30 @@ sed -i 's/drop = true/drop = false/' /usr/local/etc/exabgp/exabgp.env
sed -i 's/cli = true/cli = false/' /usr/local/etc/exabgp/exabgp.env
sed -i "s/destination = 'stdout'/destination = 'stderr'/" /usr/local/etc/exabgp/exabgp.env
# Generate exabgp.conf from environment
# Generate exabgp.conf — one neighbor block per peer-list entry
cat > /tmp/exabgp.conf << EOF
process api {
run /usr/local/bin/python3 /exabgp/api/server.py;
encoder text;
}
EOF
neighbor ${PEER_1} {
for entry in "${PEER_ENTRIES[@]}"; do
[ -z "$entry" ] && continue
IFS=':' read -r p_ip p_as p_desc <<< "$entry"
cat >> /tmp/exabgp.conf << EOF
neighbor ${p_ip} {
router-id ${LOCAL_IP};
local-address ${LOCAL_IP};
local-as ${LOCAL_AS};
peer-as ${PEER_AS};
description "CML-R9K-CORE-01";
hold-time 90;
family {
ipv4 unicast;
}
api {
processes [ api ];
neighbor-changes;
}
}
neighbor ${PEER_2} {
router-id ${LOCAL_IP};
local-address ${LOCAL_IP};
local-as ${LOCAL_AS};
peer-as ${PEER_AS};
description "CML-R9K-CORE-02";
peer-as ${p_as};
description "${p_desc}";
hold-time 90;
family {
ipv4 unicast;
ipv6 unicast;
}
api {
@ -65,5 +61,6 @@ neighbor ${PEER_2} {
}
}
EOF
done
exec exabgp server /tmp/exabgp.conf
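
Review note on the EXABGP_PEERS format: each ";"-separated entry is "ip:peer_as:description", and the shell `read -r p_ip p_as p_desc` puts everything after the second ":" into the description. The sketch below reproduces that parsing in Python so the format can be unit-tested outside the container. `Peer` and `parse_peers` are hypothetical names, not part of the diff, and unlike the shell loop this version raises on an entry with fewer than three fields.

```python
from typing import List, NamedTuple


class Peer(NamedTuple):
    ip: str
    peer_as: int
    description: str


def parse_peers(raw: str) -> List[Peer]:
    """Parse a ';'-separated list of 'ip:peer_as:description' entries."""
    peers = []
    for entry in raw.split(";"):
        if not entry:
            continue  # tolerate a trailing or doubled ";"
        # maxsplit=2 mirrors the three-variable shell read: the last
        # field keeps any further ":" characters.
        ip, peer_as, description = entry.split(":", 2)
        peers.append(Peer(ip, int(peer_as), description))
    return peers


default = "10.100.0.100:65020:CML-R9K-CORE-01;10.100.0.200:65020:CML-R9K-CORE-02"
for peer in parse_peers(default):
    print(f"- {peer.ip} AS{peer.peer_as} ({peer.description})")
```

Because ":" is the field separator, an IPv6 peer address would be split at its first colon in both the shell loop and this sketch, so the format as written only suits IPv4 peer addresses.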

gnmi/gnmi_grpc_config.py Normal file

@ -0,0 +1,158 @@
#!/usr/bin/env python3
"""
gNMI gRPC Configuration Script
===============================
Enables gRPC dial-in telemetry on all 9 IOS-XR routers so that
Telegraf (or any gNMI collector) can subscribe to streaming
telemetry data.
What this script applies per router:
- gRPC server on port 57400 with TLS disabled
Uses SSH/CLI (paramiko) instead of NETCONF because IOS-XR 24.3.1
rejects the NETCONF edit-config for gRPC with "Need to enable GRPC first".
Router targets:
CORE-01 (10.100.0.100)
CORE-02 (10.100.0.200)
R9K-01 (10.100.0.1) through R9K-07 (10.100.0.7)
"""
import paramiko
import time
import sys
ROUTERS = [
    ('10.100.0.100', 'CORE-01'),
    ('10.100.0.200', 'CORE-02'),
    ('10.100.0.1', 'R9K-01'),
    ('10.100.0.2', 'R9K-02'),
    ('10.100.0.3', 'R9K-03'),
    ('10.100.0.4', 'R9K-04'),
    ('10.100.0.5', 'R9K-05'),
    ('10.100.0.6', 'R9K-06'),
    ('10.100.0.7', 'R9K-07'),
]
USERNAME = 'webui'
PASSWORD = 'cisco'
GRPC_PORT = 57400
CONFIG_COMMANDS = [
    'configure terminal',
    'grpc',
    f'port {GRPC_PORT}',
    'no-tls',
    'commit',
    'end',
]


def configure_router(mgmt_ip, label):
    """Apply gRPC configuration via SSH CLI."""
    print(f"\n{'-'*60}")
    print(f" Configuring {label} ({mgmt_ip})")
    print(f"{'-'*60}")
    print(f" Applying: gRPC port={GRPC_PORT} no-tls")
    try:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(mgmt_ip, username=USERNAME, password=PASSWORD, timeout=10)
        shell = client.invoke_shell()
        time.sleep(1)
        shell.recv(65535)  # clear banner
        # Accumulate output from every command so that an error reported
        # by an earlier step (e.g. commit) is not lost when 'end' runs.
        output = ''
        for cmd in CONFIG_COMMANDS:
            shell.send(cmd + '\n')
            time.sleep(1.5)
            output += shell.recv(65535).decode()
        client.close()
        if 'error' in output.lower() or 'fail' in output.lower():
            print(f" ✗ ERROR on {label}: {output.strip()}")
            return False
        print(f" ✓ {label} done.")
        return True
    except Exception as e:
        print(f" ✗ ERROR on {label}: {e}")
        return False


def verify_router(mgmt_ip, label):
    """Verify gRPC configuration via SSH."""
    try:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(mgmt_ip, username=USERNAME, password=PASSWORD, timeout=10)
        shell = client.invoke_shell()
        time.sleep(1)
        shell.recv(65535)  # clear banner
        shell.send('show running-config grpc\n')
        time.sleep(3)
        output = shell.recv(65535).decode()
        client.close()
        has_port = f'port {GRPC_PORT}' in output
        has_notls = 'no-tls' in output
        p = '✓' if has_port else '✗'
        t = '✓' if has_notls else '✗'
        status = 'OK' if (has_port and has_notls) else 'INCOMPLETE'
        print(f" {label:8s} grpc-port={p} no-tls={t} [{status}]")
        return has_port and has_notls
    except Exception as e:
        print(f" {label:8s} verify error: {e}")
        return False


def main():
    print("gNMI gRPC Configuration Script")
    print("================================")
    print(f"Targets: all {len(ROUTERS)} routers")
    print()
    results = []
    for mgmt_ip, label in ROUTERS:
        ok = configure_router(mgmt_ip, label)
        results.append((mgmt_ip, label, ok))
    # Verification pass
    print(f"\n{'='*60}")
    print("Post-apply verification")
    print('='*60)
    print(f" {'Router':8s} {'gRPC Port':9s} {'No-TLS':6s} Status")
    all_ok = True
    for mgmt_ip, label, applied_ok in results:
        if applied_ok:
            if not verify_router(mgmt_ip, label):
                all_ok = False
        else:
            print(f" {label:8s} skipped (apply failed)")
            all_ok = False
    failed = [label for _, label, ok in results if not ok]
    print()
    if failed:
        print(f"FAILED: {', '.join(failed)}")
        sys.exit(1)
    elif all_ok:
        print("All routers configured successfully.")
        print()
        print(f"gRPC is now listening on port {GRPC_PORT} (no TLS) on all routers.")
        print("Next: start Telegraf with gNMI input plugin to begin collecting telemetry.")
    else:
        print("Some routers may have incomplete configuration. Check output above.")
        sys.exit(1)


if __name__ == '__main__':
    main()


@ -1,237 +1,59 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "OpenBMP navigation hub. Start at the NOC Overview, then drill into the operational dashboards.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 1,
"links": [],
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 3,
"w": 24,
"x": 0,
"y": 0
},
"id": 6,
"links": [],
"gridPos": {"h": 6,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {
"content": "# OpenBMP\n\n*Select a dashboard*",
"content": "# OpenBMP\n\nBGP Monitoring Protocol analytics. **Start here → [NOC Overview](/d/obmp-noc-overview/noc-overview)** — the at-a-glance health view.\n\n| Tier | What it covers |\n|------|----------------|\n| **NOC Overview** | Is the network healthy right now? Routers, peers, flaps, churn, RPKI, topology |\n| **Operations** | Router & peer inventory, per-router and per-peer detail, session health |\n| **Routing** | Prefix explorer, top talkers & churn, AS-path, communities, RPKI security |\n| **Link State** | IGP topology, nodes, links, TE & Segment Routing |\n| **L3VPN** | VPNv4/VPNv6 RIB and prefix history |\n| **Telemetry** | gNMI interface utilization, errors, BMP+telemetry correlation |\n| **Reference** | Database schema map, RR Loc-RIB diff |\n\nUse the **OBMP Dashboards** dropdown (top-right) or the panels below to navigate.",
"mode": "markdown"
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 4,
"x": 0,
"y": 3
},
"id": 7,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-tops"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Tops",
"gridPos": {"h": 16,"w": 8,"x": 0,"y": 6},
"id": 2,
"options": {"maxItems": 100,"showHeadings": false,"showRecentlyViewed": false,"showSearch": true,"showStarred": false,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "All Dashboards",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 4,
"y": 3
},
"id": 8,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-base"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Base",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 9,
"y": 3
},
"id": 4,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-history"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Prefix History",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 14,
"y": 3
},
"id": 9,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-l3vpn"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "L3VPN",
"type": "dashlist"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"gridPos": {
"h": 18,
"w": 5,
"x": 19,
"y": 3
},
"gridPos": {"h": 16,"w": 8,"x": 8,"y": 6},
"id": 3,
"links": [],
"options": {
"maxItems": 41,
"query": "",
"showHeadings": true,
"showRecentlyViewed": false,
"showSearch": true,
"showStarred": false,
"tags": [
"obmp-linkstate"
]
},
"pluginVersion": "8.5.4",
"tags": [],
"title": "Link State",
"options": {"maxItems": 20,"showHeadings": false,"showRecentlyViewed": true,"showSearch": false,"showStarred": false,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "Recently Viewed",
"type": "dashlist"
},
{
"gridPos": {"h": 16,"w": 8,"x": 16,"y": 6},
"id": 4,
"options": {"maxItems": 20,"showHeadings": false,"showRecentlyViewed": false,"showSearch": false,"showStarred": true,"query": "","tags": []},
"pluginVersion": "9.1.7",
"title": "Starred",
"type": "dashlist"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"tags": ["obmp","obmp-nav"],
"templating": {"list": []},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "",
"title": "OBMP-Home",
"uid": "obmp-home",
"version": 1,
"version": 2,
"weekStart": ""
}
}


@ -0,0 +1,207 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Network-operations overview — answers 'is the network healthy right now?' at a glance. Counts come from stats_* aggregate tables so the dashboard stays fast at production scale.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": true,
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total routers reporting BMP to the collector.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Routers\" FROM routers","refId": "A"}],
"title": "Routers Monitored",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routers whose BMP session is not up. Should be 0.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 3,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Routers Down\" FROM routers WHERE state != 'up'","refId": "A"}],
"title": "Routers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers currently up (pre-policy Adj-RIB-In sessions).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 6,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Up\" FROM bgp_peers WHERE isprepolicy = true AND state = 'up'","refId": "A"}],
"title": "Peers Up",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers that went down within the selected time range. Investigate any non-zero value. (Removed/decommissioned peers fall outside the range and are not counted.)",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 9,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Down\" FROM bgp_peers WHERE isprepolicy = true AND state != 'up' AND $__timeFilter(timestamp)","refId": "A"}],
"title": "Peers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peer session down-events in the last hour. Sustained flapping needs investigation.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 12,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (1h)\" FROM peer_event_log WHERE state = 'down' AND timestamp > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Flap Events (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total BGP updates across all peers in the last 5 minutes (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 15,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(updates),0) AS \"RIB Updates (5m)\" FROM stats_chg_bypeer WHERE interval_time > NOW() - INTERVAL '5 minutes'","refId": "A"}],
"title": "RIB Updates (5m)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes whose origin AS conflicts with a covering ROA (RPKI-invalid). Potential hijacks or misconfigurations.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 18,"y": 0},
"id": 7,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"RPKI Invalid\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (SELECT 1 FROM rpki_validator rv WHERE rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as)\n AND NOT EXISTS (SELECT 1 FROM rpki_validator rv WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max)","refId": "A"}],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP-LS link and node changes in the last hour. A spike indicates topology instability.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 20}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 3,"x": 21,"y": 0},
"id": 8,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time,\n (SELECT count(*) FROM ls_links_log WHERE timestamp > NOW() - INTERVAL '1 hour')\n + (SELECT count(*) FROM ls_nodes_log WHERE timestamp > NOW() - INTERVAL '1 hour') AS \"LS Changes (1h)\"","refId": "A"}],
"title": "LS Topology Changes (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Per-peer session state over the selected range. Any gap is a flap.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 4},
"id": 9,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom","showLegend": false},"mergeValues": true,"rowHeight": 0.9,"showValue": "never","tooltip": {"mode": "single"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2","refId": "A"}],
"title": "Peer Session State",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate across all peers (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 4},
"id": 10,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer\nWHERE $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "BGP Update Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peers that went down within the selected time range. Empty is healthy. Widen the time range to see longer-standing issues. Click a peer to open Peer Detail.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 0,"text": "DOWN"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}]},
{"matcher": {"id": "byName","options": "peer_hash_id"},"properties": [{"id": "custom.hidden","value": true}]}
]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 13},
"id": 11,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Last Change"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n p.hash_id AS peer_hash_id,\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last Change\",\n p.error_text AS \"Reason\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true AND p.state != 'up' AND $__timeFilter(p.timestamp)\nORDER BY p.timestamp DESC","refId": "A"}],
"title": "Peers Down",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Most-churned prefixes in the last hour (from stats_chg_byprefix). Click a prefix to open Prefix Explorer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Total Changes"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 50},{"color": "red","value": 500}]}}]},
{"matcher": {"id": "byName","options": "Prefix"},"properties": [{"id": "links","value": [{"title": "Open in Prefix Explorer","url": "/d/prefix-hist/prefix-explorer?var-prefix=${__value.text}"}]}]}
]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 13},
"id": 12,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Total Changes"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n (host(prefix) || '/' || prefix_len) AS \"Prefix\",\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\",\n SUM(updates + withdraws) AS \"Total Changes\"\nFROM stats_chg_byprefix\nWHERE interval_time > NOW() - INTERVAL '1 hour'\nGROUP BY prefix, prefix_len\nORDER BY \"Total Changes\" DESC\nLIMIT 25","refId": "A"}],
"title": "Top Churning Prefixes (1h)",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes whose observed origin AS conflicts with a covering ROA — potential hijacks or leaks.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "Status"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"Invalid": {"color": "red","index": 0}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 22},
"id": 13,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized AS (ROA)\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (SELECT 1 FROM rpki_validator rv2 WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max)\nORDER BY r.prefix\nLIMIT 50","refId": "A"}],
"title": "RPKI Invalid Routes — Potential Hijacks",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP-LS link changes — topology churn over the selected range.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "Action"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"updated": {"color": "blue","index": 0},"withdrawn": {"color": "orange","index": 1}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 22},
"id": 14,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n timestamp AS \"Time\",\n COALESCE(interface_addr::text, '') AS \"Local\",\n COALESCE(neighbor_addr::text, '') AS \"Neighbor\",\n CASE WHEN iswithdrawn THEN 'withdrawn' ELSE 'updated' END AS \"Action\"\nFROM ls_links_log\nWHERE $__timeFilter(timestamp)\nORDER BY timestamp DESC\nLIMIT 50","refId": "A"}],
"title": "Recent LS Topology Changes",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","noc","overview"],
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "NOC Overview",
"uid": "obmp-noc-overview",
"version": 1
}
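
Review note on the two RPKI panels in this dashboard: their SQL flags a route when some covering ROA names a different origin AS and no covering ROA authorises the observed origin at that prefix length. The sketch below restates that predicate in Python so its edge cases can be checked without a database. `Roa` and `is_flagged_invalid` are hypothetical names, the ROA values are documentation-range examples, and the code mirrors the query as written rather than any validator implementation.

```python
import ipaddress
from typing import Iterable, NamedTuple


class Roa(NamedTuple):
    prefix: str      # corresponds to rpki_validator.prefix
    max_len: int     # corresponds to rpki_validator.prefix_len_max
    origin_as: int   # corresponds to rpki_validator.origin_as


def is_flagged_invalid(route_prefix: str, origin_as: int, roas: Iterable[Roa]) -> bool:
    """Mirror the panel query: EXISTS(conflicting ROA) AND NOT EXISTS(authorising ROA)."""
    route = ipaddress.ip_network(route_prefix)
    covering = []
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Same test as `rv.prefix >>= r.prefix` (contains or equals).
        if roa_net.version == route.version and route.subnet_of(roa_net):
            covering.append(roa)
    conflicting = any(r.origin_as != origin_as for r in covering)
    authorised = any(
        r.origin_as == origin_as and route.prefixlen <= r.max_len for r in covering
    )
    return conflicting and not authorised


roas = [Roa("192.0.2.0/24", 24, 64500)]
print(is_flagged_invalid("192.0.2.0/24", 64501, roas))     # different origin
print(is_flagged_invalid("192.0.2.0/24", 64500, roas))     # authorised origin
print(is_flagged_invalid("198.51.100.0/24", 64501, roas))  # no covering ROA
```

One consequence of this shape: a route that only exceeds the maximum length of a same-origin ROA, with no other-origin ROA covering it, is not flagged, because the first condition requires a ROA with a different origin AS.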


@ -0,0 +1,217 @@
{
"uid": "obmp-learn-07",
"title": "Database Schema Map",
"schemaVersion": 39,
"tags": [
"obmp-learning",
"obmp",
"obmp-nav",
"reference"
],
"editable": true,
"time": {
"from": "now-6h",
"to": "now"
},
"templating": {
"list": []
},
"panels": [
{
"id": 1,
"title": "Table Row Counts",
"type": "table",
"gridPos": {
"h": 12,
"w": 8,
"x": 0,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT 'routers' as table_name, count(*) as rows FROM routers\nUNION ALL SELECT 'collectors', count(*) FROM collectors\nUNION ALL SELECT 'bgp_peers', count(*) FROM bgp_peers\nUNION ALL SELECT 'peer_event_log', count(*) FROM peer_event_log\nUNION ALL SELECT 'base_attrs', count(*) FROM base_attrs\nUNION ALL SELECT 'ip_rib', count(*) FROM ip_rib\nUNION ALL SELECT 'ip_rib_log', count(*) FROM ip_rib_log\nUNION ALL SELECT 'l3vpn_rib', count(*) FROM l3vpn_rib\nUNION ALL SELECT 'global_ip_rib', count(*) FROM global_ip_rib\nUNION ALL SELECT 'ls_nodes', count(*) FROM ls_nodes\nUNION ALL SELECT 'ls_links', count(*) FROM ls_links\nUNION ALL SELECT 'ls_prefixes', count(*) FROM ls_prefixes\nUNION ALL SELECT 'ls_nodes_log', count(*) FROM ls_nodes_log\nUNION ALL SELECT 'ls_links_log', count(*) FROM ls_links_log\nUNION ALL SELECT 'ls_prefixes_log', count(*) FROM ls_prefixes_log\nUNION ALL SELECT 'rpki_validator', count(*) FROM rpki_validator\nUNION ALL SELECT 'info_asn', count(*) FROM info_asn\nUNION ALL SELECT 'info_route', count(*) FROM info_route\nUNION ALL SELECT 'stat_reports', count(*) FROM stat_reports\nUNION ALL SELECT 'geo_ip', count(*) FROM geo_ip\nORDER BY table_name",
"format": "table"
}
]
},
{
"id": 2,
"title": "Table Relationships",
"type": "text",
"gridPos": {
"h": 12,
"w": 8,
"x": 8,
"y": 0
},
"options": {
"mode": "markdown",
"content": "## Entity Relationships\n\n### BMP Core Chain\n```\ncollectors\n \u2514\u2500\u2500 routers (collector_hash_id)\n \u2514\u2500\u2500 bgp_peers (router_hash_id)\n \u251c\u2500\u2500 ip_rib (peer_hash_id)\n \u251c\u2500\u2500 ip_rib_log (peer_hash_id)\n \u251c\u2500\u2500 l3vpn_rib (peer_hash_id)\n \u251c\u2500\u2500 ls_nodes (peer_hash_id)\n \u251c\u2500\u2500 ls_links (peer_hash_id)\n \u251c\u2500\u2500 ls_prefixes (peer_hash_id)\n \u251c\u2500\u2500 peer_event_log (peer_hash_id)\n \u2514\u2500\u2500 stat_reports (peer_hash_id)\n```\n\n### Path Attributes\n```\nip_rib \u2500\u2500(base_attr_hash_id)\u2500\u2500\u25ba base_attrs\n \u2502 \u251c\u2500\u2500 as_path (bigint[])\n \u2502 \u251c\u2500\u2500 origin_as\n \u2502 \u251c\u2500\u2500 next_hop\n \u2502 \u251c\u2500\u2500 med / local_pref\n \u2502 \u251c\u2500\u2500 community_list[]\n \u2502 \u251c\u2500\u2500 ext_community_list[]\n \u2502 \u2514\u2500\u2500 large_community_list[]\n \u2502\n \u2514\u2500\u2500(prefix)\u2500\u2500\u25ba global_ip_rib\n \u251c\u2500\u2500 rpki_origin_as\n \u251c\u2500\u2500 irr_origin_as\n \u2514\u2500\u2500 num_peers\n```\n\n### Link-State Topology\n```\nls_nodes \u25c4\u2500\u2500 ls_links (local_node_hash_id, remote_node_hash_id)\nls_nodes \u25c4\u2500\u2500 ls_prefixes (local_node_hash_id)\n```\n\n### Reference Data\n```\nrpki_validator \u2500\u2500(prefix, origin_as)\u2500\u2500\u25ba validates ip_rib\ninfo_asn \u2500\u2500(asn)\u2500\u2500\u25ba enriches base_attrs.origin_as\ninfo_route \u2500\u2500(prefix)\u2500\u2500\u25ba enriches ip_rib.prefix\ngeo_ip \u2500\u2500(ip)\u2500\u2500\u25ba geolocates routers, peers\n```"
}
},
{
"id": 3,
"title": "BMP Core Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 0
},
"options": {
"mode": "markdown",
"content": "## BMP Core Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **routers** | BMP-monitored routers | hash_id, name, ip_address, router_as, state, bgp_id |\n| **collectors** | BMP collector instances | hash_id, admin_id, name, ip_address, router_count |\n| **bgp_peers** | BGP sessions per router | hash_id, router_hash_id, peer_addr, peer_as, state, isl3vpnpeer |\n| **peer_event_log** | Session state history (TimescaleDB) | peer_hash_id, state, timestamp, bmp_reason, bgp_err_code |\n| **stat_reports** | BMP statistics messages | peer_hash_id, prefixes_rejected, num_routes_adj_rib_in, num_routes_local_rib |\n| **users** | Access control | username, password, type (admin/oper) |"
}
},
{
"id": 4,
"title": "RIB & Path Attribute Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 8
},
"options": {
"mode": "markdown",
"content": "## RIB & Path Attribute Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **base_attrs** | BGP path attributes | hash_id, as_path[], as_path_count, origin_as, next_hop, med, local_pref, community_list[], ext_community_list[], large_community_list[], cluster_list, originator_id |\n| **ip_rib** | IPv4/IPv6 unicast RIB | hash_id, peer_hash_id, prefix, prefix_len, origin_as, iswithdrawn, labels, path_id |\n| **ip_rib_log** | RIB change history (TimescaleDB) | peer_hash_id, prefix, prefix_len, origin_as, iswithdrawn, timestamp |\n| **l3vpn_rib** | L3VPN/MPLS VPN routes | hash_id, peer_hash_id, rd, prefix, labels, ext_community_list[] |\n| **l3vpn_rib_log** | L3VPN change history (TimescaleDB) | peer_hash_id, rd, prefix, iswithdrawn, timestamp |\n| **global_ip_rib** | Aggregated prefix summary | prefix, recv_origin_as, rpki_origin_as, irr_origin_as, num_peers |"
}
},
{
"id": 5,
"title": "Link-State Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 12
},
"options": {
"mode": "markdown",
"content": "## Link-State Tables (BGP-LS / RFC 7752)\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **ls_nodes** | IS-IS/OSPF nodes | hash_id, peer_hash_id, igp_router_id, name, protocol, asn, sr_capabilities, isis_area_id |\n| **ls_links** | IS-IS/OSPF links + TE/SR | hash_id, local/remote_node_hash_id, interface_addr, neighbor_addr, igp_metric, **te_def_metric**, **max_link_bw**, **max_resv_bw**, **unreserved_bw**, **admin_group**, **srlg**, **sr_adjacency_sids**, **peer_node_sid**, **protection_type**, **mpls_proto_mask** |\n| **ls_prefixes** | IS-IS/OSPF prefixes | hash_id, local_node_hash_id, prefix, metric, sr_prefix_sids, igp_flags |\n| **ls_nodes_log** | Node change history (TimescaleDB) | Same as ls_nodes + timestamp |\n| **ls_links_log** | Link change history (TimescaleDB) | Same as ls_links + timestamp |\n| **ls_prefixes_log** | Prefix change history (TimescaleDB) | Same as ls_prefixes + timestamp |\n\n**Bold columns** = TE/SR fields not used by any existing dashboard"
}
},
{
"id": 6,
"title": "Statistics Tables",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 12
},
"options": {
"mode": "markdown",
"content": "## Statistics Tables (TimescaleDB Hypertables)\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **stat_reports** | BMP stat messages | peer_hash_id, prefixes_rejected, known_dup_prefixes, num_routes_adj_rib_in |\n| **stats_chg_byprefix** | Per-prefix churn stats | interval_time, peer_hash_id, prefix, updates, withdraws |\n| **stats_chg_byasn** | Per-ASN churn stats | interval_time, peer_hash_id, origin_as, updates, withdraws |\n| **stats_chg_bypeer** | Per-peer churn stats | interval_time, peer_hash_id, updates, withdraws |\n| **stats_peer_rib** | Per-peer RIB size | interval_time, peer_hash_id, v4_prefixes, v6_prefixes |\n| **stats_peer_update_counts** | Update rate statistics | interval_time, peer_hash_id, advertise_avg/min/max, withdraw_avg/min/max |\n| **stats_ip_origins** | Per-ASN prefix counts | interval_time, asn, v4_prefixes, v6_prefixes, v4_with_rpki, v4_with_irr |"
}
},
{
"id": 7,
"title": "Reference & Enrichment Tables",
"type": "text",
"gridPos": {
"h": 6,
"w": 12,
"x": 0,
"y": 20
},
"options": {
"mode": "markdown",
"content": "## Reference & Enrichment Tables\n\n| Table | Purpose | Key Columns |\n|-------|---------|-------------|\n| **rpki_validator** | RPKI ROAs | prefix, prefix_len, prefix_len_max, origin_as |\n| **info_asn** | ASN WHOIS/IRR data | asn, as_name, org_name, country, source |\n| **info_route** | Route IRR data | prefix, prefix_len, origin_as, descr, source |\n| **geo_ip** | IP geolocation (DB-IP) | ip, country, city, latitude, longitude, isp_name |\n| **pdb_exchange_peers** | PeeringDB IXP data | ix_name, peer_name, peer_asn, speed, peer_ipv4/ipv6 |"
}
},
{
"id": 8,
"title": "Views Quick Reference",
"type": "text",
"gridPos": {
"h": 6,
"w": 12,
"x": 12,
"y": 20
},
"options": {
"mode": "markdown",
"content": "## Database Views\n\n| View | Joins | Purpose |\n|------|-------|---------|\n| **v_peers** | bgp_peers + routers + info_asn | Complete peer info with router name and ASN details |\n| **v_ip_routes** | ip_rib + bgp_peers + base_attrs + routers | Full route detail with path attributes |\n| **v_ip_routes_geo** | v_ip_routes + geo_ip | Routes with geolocation |\n| **v_ip_routes_history** | ip_rib_log + base_attrs + bgp_peers + routers | Historical route changes with attributes |\n| **v_l3vpn_routes** | l3vpn_rib + bgp_peers + base_attrs + routers | L3VPN routes with path attributes |\n| **v_l3vpn_routes_history** | l3vpn_rib_log + base_attrs + bgp_peers + routers | Historical L3VPN changes |\n| **v_ls_nodes** | ls_nodes + base_attrs + bgp_peers + routers | Link-state nodes with peer/router info |\n| **v_ls_links** | ls_links + ls_nodes(x2) + routers | Links with local/remote node names + TE fields |\n| **v_ls_prefixes** | ls_prefixes + ls_nodes + routers | LS prefixes with originating node info |\n\n### Enum Types\n- **opstate**: up, down\n- **ls_proto**: IS-IS_L1, IS-IS_L2, OSPFv2, OSPFv3, Direct, Static\n- **ospf_route_type**: Intra, Inter, Ext-1, Ext-2, NSSA-1, NSSA-2\n- **ls_mpls_proto_mask**: MPLS protocol bitmask"
}
},
{
"id": 9,
"title": "LinkState Column Details",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT column_name, data_type, \n CASE \n WHEN column_name IN ('admin_group','max_link_bw','max_resv_bw','unreserved_bw','te_def_metric','protection_type','srlg','sr_adjacency_sids','peer_node_sid','mpls_proto_mask') THEN 'TE/SR'\n WHEN column_name IN ('hash_id','peer_hash_id','base_attr_hash_id','local_node_hash_id','remote_node_hash_id') THEN 'FK/Key'\n ELSE 'Core'\n END as category\nFROM information_schema.columns \nWHERE table_name = 'ls_links' AND table_schema = 'public'\nORDER BY ordinal_position",
"format": "table"
}
]
},
{
"id": 10,
"title": "ip_rib Column Details",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"refId": "A",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"rawSql": "SELECT column_name, data_type,\n CASE \n WHEN column_name IN ('hash_id','peer_hash_id','base_attr_hash_id') THEN 'FK/Key'\n ELSE 'Core'\n END as category\nFROM information_schema.columns \nWHERE table_name = 'ip_rib' AND table_schema = 'public'\nORDER BY ordinal_position",
"format": "table"
}
]
}
],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
]
}

@@ -1,160 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "AS path length distribution and analysis. Teaches how BGP AS paths reflect internet topology and how to detect anomalies like route leaks or AS path prepending.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Internet routes typically have AS paths of 2-5 hops. A /24 or /32 that appears with a 1-hop AS path from an unexpected ASN is a classic hijack indicator. Paths of 10+ hops may indicate prepending.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"gradientMode": "none","lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 10,"w": 12,"x": 0,"y": 0},
"id": 1,
"options": {"barRadius": 0,"barWidth": 0.7,"groupWidth": 0.7,"legend": {"calcs": [],"displayMode": "list","placement": "bottom"},"orientation": "auto","tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n ba.as_path_count AS \"AS Path Length (hops)\",\n COUNT(*) AS \"Prefix Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false\n AND r.isipv4 = true\n AND ba.as_path_count > 0\nGROUP BY ba.as_path_count\nORDER BY ba.as_path_count",
"refId": "A"
}
],
"title": "AS Path Length Distribution (Active IPv4 Routes)",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Average AS path length on the internet is ~4-5 hops. Your lab has shorter paths since ExaBGP is a single eBGP hop away.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 8}]},
"unit": "short",
"decimals": 1
}
},
"gridPos": {"h": 5,"w": 6,"x": 12,"y": 0},
"id": 2,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(AVG(ba.as_path_count)::numeric, 1) AS \"Avg AS Path Length\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count > 0",
"refId": "A"
}
],
"title": "Average AS Path Length",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A route with a 1-hop AS path was originated by the directly connected neighbor AS, or it may be a hijack. In your lab, ExaBGP injects routes whose AS path starts with AS 65100.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 20}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 6,"x": 18,"y": 0},
"id": 3,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(*) AS \"Direct (1-hop) Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count = 1",
"refId": "A"
}
],
"title": "1-Hop Routes (Direct/Possible Hijack)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: These are the routes that cross the most ASes before reaching your network. AS path prepending repeats an ASN to lengthen the path on purpose and make the route less preferred.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "AS Path Length"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 5},{"color": "red","value": 10}]}}]},
{"matcher": {"id": "byName","options": "AS Path"},"properties": [{"id": "custom.width","value": 400}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 10},
"id": 4,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "AS Path Length"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.as_path_count AS \"AS Path Length\",\n ba.as_path::text AS \"AS Path\",\n ba.origin_as AS \"Origin AS\",\n ba.next_hop AS \"Next Hop\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nORDER BY ba.as_path_count DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Longest AS Paths (Top 30)",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Origin AS is the rightmost ASN in the AS path — the network that first originated the prefix. Most internet prefixes are originated by their owning organization.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Route Count"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "custom.width","value": 200}]}
]
},
"gridPos": {"h": 12,"w": 12,"x": 0,"y": 20},
"id": 5,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Route Count"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n ba.origin_as AS \"Origin AS\",\n COALESCE(ia.as_name, 'Unknown') AS \"AS Name\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN info_asn ia ON ia.asn = ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.origin_as, ia.as_name\nORDER BY COUNT(*) DESC\nLIMIT 20",
"refId": "A"
}
],
"title": "Top Origin ASNs by Route Count",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A transit AS (appearing frequently in AS paths but not as origin) is a carrier. The most frequent transit ASNs in your lab correspond to simulated Tier-1 carriers (174=Cogent, 3356=Lumen, 1299=Telia, etc.)",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"fillOpacity": 80,"lineWidth": 0},"unit": "short"}
},
"gridPos": {"h": 12,"w": 12,"x": 12,"y": 20},
"id": 6,
"options": {"barRadius": 0,"barWidth": 0.7,"groupWidth": 0.7,"legend": {"calcs": [],"displayMode": "list","placement": "bottom"},"orientation": "horizontal","tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n asn_val AS \"Transit ASN\",\n COUNT(*) AS \"Appearances in AS Paths\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nCROSS JOIN LATERAL unnest(ba.as_path) AS asn_val\nWHERE r.iswithdrawn = false AND asn_val != ba.origin_as\nGROUP BY asn_val\nORDER BY COUNT(*) DESC\nLIMIT 15",
"refId": "A"
}
],
"title": "Most Common Transit ASNs",
"type": "barchart"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","as-path","topology"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "AS Path Analysis",
"uid": "obmp-learn-03",
"version": 1
}

@@ -1,201 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "Explore BGP path attributes: communities, MED, local-pref and how they influence routing policy decisions.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "datasource","uid": "grafana"},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {
"content": "## BGP Path Attributes: What They Mean\n\n### BGP Communities (RFC 1997)\nCommunities are 32-bit tags attached to routes, written as **ASN:value** (e.g., `65000:100`). They carry policy signals between routers and ASes.\n\n**Well-known communities:**\n| Community | Name | Meaning |\n|-----------|------|---------|\n| `65535:65281` | NO_EXPORT | Do not advertise outside this AS or confederation |\n| `65535:65282` | NO_ADVERTISE | Do not advertise to any peer |\n| `65535:666` | BLACKHOLE | Drop traffic destined for this prefix (RFC 7999) |\n\nPrivate communities (e.g., `65001:200`) are operator-defined; they may encode region, customer tier, or traffic-engineering intent.\n\n### Local Preference (local-pref)\n- **Scope:** iBGP only; never sent to eBGP peers.\n- **Effect:** Higher local-pref wins. Default is **100**.\n- **Use case:** Prefer one upstream provider over another for all outbound traffic.\n\n### Multi-Exit Discriminator (MED)\n- **Scope:** Sent to directly connected eBGP peers to influence *inbound* traffic.\n- **Effect:** Lower MED wins (when comparing routes from the same AS).\n- **Use case:** Tell a peer which of your links to prefer when sending traffic to you.\n\n> **Tip:** Use the panels below to explore what communities and attributes are actually present in the current RIB. Run `inject.py attributes` to load routes with varied communities and MED values.",
"mode": "markdown"
},
"title": "BGP Attribute Reference — Communities, Local-Pref, MED",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Each row is a unique community string (format ASN:value) seen across all active routes. High route counts for a community mean many routes share that policy tag. Look for well-known communities: 65535:65281 (NO_EXPORT), 65535:65282 (NO_ADVERTISE), 65535:666 (BLACKHOLE).",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Routes Tagged"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 10},{"color": "yellow","value": 100}]}}]}
]
},
"gridPos": {"h": 11,"w": 12,"x": 0,"y": 8},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Routes Tagged"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n comm AS \"Community\",\n COUNT(*) AS \"Routes Tagged\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false AND ba.community_list IS NOT NULL\nGROUP BY comm\nORDER BY COUNT(*) DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Top BGP Communities in Current RIB",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Routes that carry explicit policy attributes: a community list, a MED, or a local-pref value. The query matches any route where at least one of these is set, including a local-pref of 100. Examine the Communities column for operator-defined tags and the Local Pref column to see traffic engineering decisions.",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Local Pref"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 101},{"color": "red","value": 200}]}}]},
{"matcher": {"id": "byName","options": "MED"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100}]}}]}
]
},
"gridPos": {"h": 11,"w": 12,"x": 12,"y": 8},
"id": 3,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix::text AS \"Prefix\",\n ba.origin_as AS \"Origin AS\",\n ba.community_list::text AS \"Communities\",\n ba.local_pref AS \"Local Pref\",\n ba.med AS \"MED\",\n ba.as_path_count AS \"Path Length\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND (ba.community_list IS NOT NULL OR ba.med IS NOT NULL OR ba.local_pref IS NOT NULL)\nORDER BY r.prefix\nLIMIT 100",
"refId": "A"
}
],
"title": "Routes with Notable Attributes",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: MED (Multi-Exit Discriminator) is used to influence inbound traffic from a directly connected AS. Lower MED is preferred. If most routes show 'Not Set', MED is not being used for traffic engineering. A single dominant MED value means a simple policy; many different values indicate fine-grained control.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 19},
"id": 4,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": -30,"xTickLabelSpacing": 100},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.med::text, 'Not Set') AS \"MED Value\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.med\nORDER BY ba.med NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "MED Value Distribution",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Local preference is an iBGP attribute; it never crosses AS boundaries. Default is 100. Routes with local-pref above 100 are preferred over the default path; routes below 100 are used only as a last resort. Non-100 values indicate active traffic-engineering policy. Run 'inject.py attributes' to inject routes with varied local-pref values.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
}
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 19},
"id": 5,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": -30,"xTickLabelSpacing": 100},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.local_pref::text, 'Not Set') AS \"Local Pref\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.local_pref\nORDER BY ba.local_pref DESC NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "Local Preference Value Distribution",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This count tells you how widely BGP communities are used in your network. A value of 0 means no community tagging — communities are an opt-in feature. Run 'inject.py attributes' to add routes with community strings.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 1}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 0,"y": 28},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Routes with Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.community_list IS NOT NULL\n AND array_length(ba.community_list, 1) > 0",
"refId": "A"
}
],
"title": "Routes with Communities",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: The number of distinct community strings seen across all active routes. A diverse set indicates fine-grained policy tagging. A single value means one uniform policy tag is applied.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null},{"color": "green","value": 1},{"color": "yellow","value": 50}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 8,"y": 28},
"id": 7,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(DISTINCT comm) AS \"Unique Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false",
"refId": "A"
}
],
"title": "Unique Community Values",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Routes with a local-pref other than the default (100) have been explicitly policy-engineered. A high count here means your network actively uses local-pref to prefer specific paths. A value of 0 means all paths are at default preference.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100},{"color": "red","value": 1000}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 8,"x": 16,"y": 28},
"id": 8,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Custom Local-Pref Routes\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.local_pref IS NOT NULL\n AND ba.local_pref != 100",
"refId": "A"
}
],
"title": "Routes with Non-Default Local-Pref",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","communities","attributes","policy"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "BGP Attribute Explorer",
"uid": "obmp-learn-06",
"version": 1
}

@@ -1,152 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "Prefix stability analysis and route churn visualization. Teaches how to identify unstable routes and understand BGP churn.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This chart shows BGP advertisements and withdrawals bucketed per hour. A healthy network has steady low churn. Spikes in withdrawals indicate route instability events such as link failures, iBGP reconvergence, or policy changes. Run 'inject.py churn' to generate synthetic churn data and observe it here.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"drawStyle": "bars","fillOpacity": 60,"lineWidth": 1,"spanNulls": false,"stacking": {"group": "A","mode": "none"}},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "Advertisements"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "Withdrawals"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["sum","max"],"displayMode": "list","placement": "bottom"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'1h'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Advertisements vs Withdrawals Rate (per hour)",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A prefix with more than 30 updates in the selected time range is classified as unstable: it is flapping or being re-announced frequently. The Stability column categorizes each prefix (1 update = Very Stable, 2-7 = Stable, 8-30 = Moderate, 31+ = Unstable). Run 'inject.py churn' to generate churn data and observe it here. Sort by 'Total Updates' to find the most problematic prefixes.",
"fieldConfig": {
"defaults": {"color": {"mode": "thresholds"},"custom": {"align": "auto","displayMode": "auto"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}},
"overrides": [
{"matcher": {"id": "byName","options": "Stability"},"properties": [{"id": "custom.displayMode","value": "color-text"},{"id": "mappings","value": [{"options": {"Very Stable": {"color": "green","index": 0},"Stable": {"color": "blue","index": 1},"Moderate": {"color": "yellow","index": 2},"Unstable": {"color": "red","index": 3}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Total Updates"},"properties": [{"id": "custom.displayMode","value": "lcd-gauge"},{"id": "color","value": {"mode": "thresholds"}},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 7},{"color": "red","value": 30}]}}]}
]
},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 9},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Total Updates"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n prefix::text AS \"Prefix\",\n COUNT(*) AS \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) AS \"Withdrawals\",\n SUM(CASE WHEN NOT iswithdrawn THEN 1 ELSE 0 END) AS \"Announcements\",\n MAX(timestamp) AS \"Last Change\",\n CASE\n WHEN COUNT(*) = 1 THEN 'Very Stable'\n WHEN COUNT(*) <= 7 THEN 'Stable'\n WHEN COUNT(*) <= 30 THEN 'Moderate'\n ELSE 'Unstable'\n END AS \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY \"Total Updates\" DESC\nLIMIT 100",
"refId": "A"
}
],
"title": "Top Churning Prefixes",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This bar chart shows how many prefixes fall into each stability tier. In a healthy network, the vast majority of prefixes should be 'Very Stable' (exactly one update during the window). A large 'Unstable' bar is a red flag. Run 'inject.py churn' to shift prefixes into the Unstable tier.",
"fieldConfig": {
"defaults": {
"color": {"mode": "fixed","fixedColor": "blue"},
"custom": {"fillOpacity": 80,"lineWidth": 0},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "1. Very Stable (1 update)"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "2. Stable (2-7 updates)"},"properties": [{"id": "color","value": {"fixedColor": "blue","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "3. Moderate (8-30 updates)"},"properties": [{"id": "color","value": {"fixedColor": "yellow","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "4. Unstable (31+ updates)"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 9,"w": 14,"x": 0,"y": 21},
"id": 3,
"options": {"barRadius": 0.1,"barWidth": 0.6,"groupWidth": 0.7,"legend": {"displayMode": "list","placement": "bottom"},"orientation": "auto","text": {},"tooltip": {"mode": "single"},"xTickLabelRotation": 0,"xTickLabelSpacing": 200},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN cnt = 1 THEN '1. Very Stable (1 update)'\n WHEN cnt <= 7 THEN '2. Stable (2-7 updates)'\n WHEN cnt <= 30 THEN '3. Moderate (8-30 updates)'\n ELSE '4. Unstable (31+ updates)'\n END AS \"Stability Tier\",\n COUNT(*) AS \"Prefix Count\"\nFROM (\n SELECT prefix, COUNT(*) as cnt\n FROM ip_rib_log\n WHERE $__timeFilter(timestamp)\n GROUP BY prefix\n) sub\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Prefix Distribution by Stability Tier",
"type": "barchart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This is the prefix with the most updates in the selected time range. If the same prefix appears here across several time ranges, it may warrant investigation: check the AS path and the peers announcing it.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null}]},
"unit": "string",
"mappings": []
}
},
"gridPos": {"h": 5,"w": 10,"x": 14,"y": 21},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "center","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {"titleSize": 14,"valueSize": 18}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, prefix::text AS \"Most Churned Prefix\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY COUNT(*) DESC\nLIMIT 1",
"refId": "A"
}
],
"title": "Most Churned Prefix",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: This counts how many distinct prefixes had at least one update event in the selected time window. During a normal steady state this number should be low. After a major routing event (e.g., upstream link failure) you may see thousands of prefixes change simultaneously.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 500},{"color": "red","value": 2000}]},
"unit": "short",
"mappings": []
}
},
"gridPos": {"h": 4,"w": 10,"x": 14,"y": 26},
"id": 5,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Prefixes with Updates\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)",
"refId": "A"
}
],
"title": "Total Unique Prefixes with Updates",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","churn","stability"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Route Churn & Stability Score",
"uid": "obmp-learn-05",
"version": 1
}

@ -1,144 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "BGP peer session health, uptime, and flap analysis. Teaches session stability and how to diagnose flapping peers.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A healthy BGP mesh shows all peers UP continuously. Any gap in the UP state represents a session flap — investigate the reset reason.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom"},"mergeValues": true,"rowHeight": 0.9,"showValue": "auto","tooltip": {"mode": "single"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2",
"refId": "A"
}
],
"title": "Peer Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
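"description": "Current state of all BGP peers. Learn: in bgp_peers, 'bmp_reason' tells you why BMP reporting stopped and 'bgp_err_code' holds the BGP NOTIFICATION error code. This table selects neither column; it shows the accompanying 'error_text' as 'Last Error'.",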
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "custom.width","value": 200}]},
{"matcher": {"id": "byName","options": "AS"},"properties": [{"id": "custom.width","value": 80}]}
]
},
"gridPos": {"h": 12,"w": 24,"x": 0,"y": 8},
"id": 2,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": false,"displayName": "State"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last State Change\",\n p.error_text AS \"Last Error\",\n p.local_hold_time AS \"Hold Time\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "Current Peer State",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Flap count = number of times a peer went from UP to DOWN. A peer flapping more than 2 times per hour needs investigation.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Flap Count"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]}}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 20},
"id": 3,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Flap Count"}]},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n COUNT(CASE WHEN e.state = 'down' THEN 1 END) AS \"Flap Count\",\n MIN(e.timestamp) AS \"First Event\",\n MAX(e.timestamp) AS \"Last Event\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY p.name, p.peer_addr, p.peer_as\nORDER BY \"Flap Count\" DESC",
"refId": "A"
}
],
"title": "Peer Flap Analysis",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "yellow","value": 50},{"color": "green","value": 90}]},"unit": "percent","max": 100,"min": 0}},
"gridPos": {"h": 8,"w": 8,"x": 0,"y": 30},
"id": 4,
"options": {"orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"showThresholdLabels": false,"showThresholdMarkers": true,"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Mesh Health %\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Overall Peer Mesh Health",
"type": "gauge"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"unit": "short","mappings": [{"options": {"0": {"color": "red","index": 0,"text": "DOWN"}},"type": "value"}]}},
"gridPos": {"h": 8,"w": 8,"x": 8,"y": 30},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) AS \"Peers UP\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Peers Currently UP",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 8,"w": 8,"x": 16,"y": 30},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(CASE WHEN state = 'down' THEN 1 END) AS \"Flap Events (24h)\"\nFROM peer_event_log\nWHERE timestamp > NOW() - INTERVAL '24 hours' AND state = 'down'",
"refId": "A"
}
],
"title": "Flap Events (24h)",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","peers","flap"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Peer Session Health & Flap Analysis",
"uid": "obmp-learn-02",
"version": 1
}

@ -1,150 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "RPKI (Resource Public Key Infrastructure) validation status. Teaches BGP routing security and how RPKI prevents prefix hijacks by validating route origin.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match — this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin — unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical — they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** — they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.",
"datasource": {"type": "datasource","uid": "grafana"},
"gridPos": {"h": 10,"w": 8,"x": 0,"y": 0},
"id": 1,
"options": {"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|----------|\n| **Valid** | The route's origin AS matches a ROA for this prefix |\n| **Invalid** | A ROA exists but the origin AS or prefix length does NOT match — this route is potentially a hijack |\n| **NotFound** | No ROA exists for this prefix/origin — unprotected, can't be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical — they indicate either a misconfiguration or a prefix hijack\n- Routes with no RPKI data show as **NotFound** — they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron to run or check `ENABLE_RPKI=1` in docker-compose.yml.","mode": "markdown"},
"pluginVersion": "9.1.7",
"title": "RPKI Learning Guide",
"type": "text"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total ROAs (Route Origin Authorizations) loaded from the RPKI validator. If 0, the cron job has not yet run.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "yellow","value": 1},{"color": "green","value": 100000}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 8,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI ROAs Loaded\" FROM rpki_validator",
"refId": "A"
}
],
"title": "RPKI ROAs Loaded",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes with a matching valid ROA — origin AS and prefix length both match.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 12,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Valid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Valid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Routes covered by at least one ROA but matched by none: either the origin AS differs or the prefix is longer than the ROA's maximum length. High-priority investigation needed.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},
"unit": "short"
}
},
"gridPos": {"h": 5,"w": 4,"x": 16,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI Invalid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n )\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: ExaBGP-injected routes (AS 65100) will be NotFound since they use synthetic ASNs not registered in RPKI. Real internet prefixes with valid ROAs will appear as Valid.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"hideFrom": {"legend": false,"tooltip": false,"viz": false}},
"mappings": []
},
"overrides": []
},
"gridPos": {"h": 10,"w": 10,"x": 0,"y": 10},
"id": 5,
"options": {"displayLabels": ["percent","name"],"legend": {"displayMode": "list","placement": "bottom"},"pieType": "donut","tooltip": {"mode": "single"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n status AS \"RPKI Status\",\n COUNT(*) AS \"Route Count\"\nFROM (\n SELECT\n CASE\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n ) THEN 'Valid'\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n ) THEN 'Invalid'\n ELSE 'NotFound'\n END AS status\n FROM ip_rib r\n JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\n WHERE r.iswithdrawn = false AND r.isipv4 = true\n) s\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "RPKI Validation Status Distribution",
"type": "piechart"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefixes that have a ROA but the observed origin AS does not match. These are the most security-critical routes — each one represents a potential hijack or misconfiguration.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "Status"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"Invalid": {"color": "red","index": 0},"Valid": {"color": "green","index": 1},"NotFound": {"color": "yellow","index": 2}},"type": "value"}]}]}
]
},
"gridPos": {"h": 14,"w": 14,"x": 10,"y": 10},
"id": 6,
"options": {"footer": {"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized Origin AS (ROA)\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix AND rv.origin_as != ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv2\n WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max\n )\nORDER BY r.prefix\nLIMIT 50",
"refId": "A"
}
],
"title": "RPKI Invalid Routes — Potential Hijacks",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","rpki","security"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "RPKI Validation Status",
"uid": "obmp-learn-04",
"version": 1
}

@ -1,137 +0,0 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","target": {"limit": 100,"matchAny": false,"tags": [],"type": "dashboard"},"type": "dashboard"}]},
"description": "BGP update and withdrawal rates over time. Teaches what normal BGP traffic looks like and how to detect route churn or instability.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: A healthy network has far more advertisements than withdrawals. A withdrawal spike often signals a link failure or route flap.",
"fieldConfig": {
"defaults": {
"color": {"mode": "palette-classic"},
"custom": {"drawStyle": "bars","fillOpacity": 60,"lineWidth": 1,"spanNulls": false,"stacking": {"group": "A","mode": "none"}},
"unit": "short"
},
"overrides": [
{"matcher": {"id": "byName","options": "Advertisements"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]},
{"matcher": {"id": "byName","options": "Withdrawals"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]}
]
},
"gridPos": {"h": 10,"w": 24,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["sum","max"],"displayMode": "list","placement": "bottom"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'5m'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Updates Over Time — Advertisements vs Withdrawals",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 100},{"color": "red","value": 1000}]},"unit": "short","mappings": []}},
"gridPos": {"h": 5,"w": 6,"x": 0,"y": 10},
"id": 2,
"options": {"colorMode": "background","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Total Updates (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Total Updates (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Learn: Withdrawal rate above 30% is unusual. Above 50% may indicate a route leak or oscillation event. The panel turns yellow at 20% and red at 50%.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 20},{"color": "red","value": 50}]},"unit": "percent","max": 100}},
"gridPos": {"h": 5,"w": 6,"x": 6,"y": 10},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Withdrawal Rate %\"\nFROM ip_rib_log\nWHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Withdrawal Rate % (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1000},{"color": "red","value": 10000}]},"unit": "short"}},
"gridPos": {"h": 5,"w": 6,"x": 12,"y": 10},
"id": 4,
"options": {"colorMode": "value","graphMode": "area","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT peer_hash_id) AS \"Active Peers\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '1 hour'",
"refId": "A"
}
],
"title": "Active Reporting Peers (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 500},{"color": "red","value": 2000}]},"unit": "short"}},
"gridPos": {"h": 5,"w": 6,"x": 18,"y": 10},
"id": 5,
"options": {"colorMode": "value","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"text": {}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Unique Prefixes Updated (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Unique Prefixes Updated (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Updates per peer over time. Learn: Peers should have similar update rates. A peer with dramatically more updates may be experiencing instability or receiving a full BGP table with frequent changes.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"drawStyle": "line","fillOpacity": 10,"lineWidth": 1,"spanNulls": false},"unit": "short"}
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 15},
"id": 6,
"options": {"legend": {"calcs": [],"displayMode": "list","placement": "right"},"tooltip": {"mode": "multi"}},
"targets": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(s.interval_time,'30m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n SUM(s.advertise_avg + s.withdraw_avg) AS \"Updates\"\nFROM stats_peer_update_counts s\nJOIN bgp_peers p ON p.hash_id = s.peer_hash_id\nWHERE $__timeFilter(s.interval_time)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "Update Rate by Peer (30-min buckets)",
"type": "timeseries"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","learning","bgp","churn"],
"time": {"from": "now-24h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "BGP Update Rate & Churn",
"uid": "obmp-learn-01",
"version": 1
}

File diff suppressed because it is too large

@ -0,0 +1,404 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Combined view of BMP control-plane data (from PostgreSQL) and gNMI data-plane telemetry (from InfluxDB). Correlate BGP peer state with interface traffic patterns.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Current BGP peer status from the OpenBMP PostgreSQL database. Shows peer address, name, and session state.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": true
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "peer_addr"
},
"properties": [
{
"id": "custom.width",
"value": 160
}
]
},
{
"matcher": {
"id": "byName",
"options": "name"
},
"properties": [
{
"id": "custom.width",
"value": 200
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "state"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n p.peer_addr,\n COALESCE(p.name, p.peer_addr::text) AS name,\n p.state,\n p.peer_as AS \"AS\",\n p.router_hash_id IS NOT NULL AS \"BMP Active\",\n p.timestamp AS \"Last State Change\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "BGP Peer Status",
"type": "table"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface traffic rates from gNMI streaming telemetry. Shows bytes per second for each interface across selected routers.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "Bps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)\n |> map(fn: (r) => ({r with _value: if r._value < 0.0 then 0.0 else r._value}))",
"refId": "A"
}
],
"title": "Interface Traffic",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "BGP update activity over time from the OpenBMP PostgreSQL database. Shows peer event transitions and update counts for correlation with traffic patterns.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "bars",
"fillOpacity": 50,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "normal"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"legend": {
"calcs": [
"sum"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp, '1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n COUNT(*) AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Update Activity",
"type": "timeseries"
},
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Combined BMP + Telemetry View\n\nThis dashboard integrates two complementary data sources to provide a unified network monitoring view:\n\n### Control Plane (BMP via PostgreSQL)\n- **BGP Peer Status** -- Real-time BGP session state from BMP (OpenBMP)\n- **BGP Update Activity** -- Session transitions and update events from `peer_event_log`\n\n### Data Plane (gNMI via InfluxDB)\n- **Interface Traffic** -- Streaming telemetry byte rates collected via gNMI at 10-second intervals\n\n### Correlation Use Cases\n- A BGP peer flap (control plane) should correlate with a traffic shift on affected interfaces (data plane)\n- Sustained high interface utilization (data plane) may precede BGP session resets due to congestion\n- Compare the number of active BGP peers with interface traffic to validate routing convergence",
"mode": "markdown"
},
"title": "About",
"type": "text"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Combined BMP + Telemetry View",
"uid": "obmp-telem-03",
"version": 1
}

@@ -0,0 +1,491 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Interface error and drop counters collected via gNMI streaming telemetry. Helps identify interfaces with packet loss or physical layer issues.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
},
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> keep(columns: [\"name\"])\n |> distinct(column: \"name\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Interface",
"multi": true,
"name": "interface",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"name\", predicate: (r) => r._measurement == \"interface_counters\" and r.source =~ /${router:regex}/, start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface error counters over time: input errors, output errors, and CRC errors. A rising trend indicates physical or configuration issues.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"mean",
"max",
"last"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-errors\" or r._field == \"out-errors\" or r._field == \"in-fcs-errors\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Interface Errors",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Interface drop counters over time: input drops and output drops. Drops indicate congestion or queue overflow.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max",
"last"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-discards\" or r._field == \"out-discards\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)",
"refId": "A"
}
],
"title": "Interface Drops",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Summary table showing the latest error and drop counter values per interface. Useful for quickly identifying problematic interfaces.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": true
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "in-errors"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "out-errors"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "in-discards"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "out-discards"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "in-errors"
}
]
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-errors\" or r._field == \"out-errors\" or r._field == \"in-fcs-errors\" or r._field == \"in-discards\" or r._field == \"out-discards\")\n |> toFloat()\n |> last()\n |> pivot(rowKey: [\"_time\"], columnKey: [\"_field\"], valueColumn: \"_value\")\n |> keep(columns: [\"source\", \"name\", \"in-errors\", \"out-errors\", \"in-fcs-errors\", \"in-discards\", \"out-discards\"])\n |> sort(columns: [\"in-errors\"], desc: true)",
"refId": "A"
}
],
"title": "Error Summary Table",
"type": "table"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Interface Errors & Drops",
"uid": "obmp-telem-02",
"version": 1
}

@@ -0,0 +1,385 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Interface utilization metrics collected via gNMI streaming telemetry from IOS-XR routers. Shows byte rates, packet rates, and top interfaces by traffic volume.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> keep(columns: [\"source\"])\n |> distinct(column: \"source\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"source\", predicate: (r) => r._measurement == \"interface_counters\", start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
},
{
"current": {},
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"definition": "from(bucket: \"telemetry\")\n |> range(start: -1h)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> keep(columns: [\"name\"])\n |> distinct(column: \"name\")\n |> sort()",
"hide": 0,
"includeAll": true,
"label": "Interface",
"multi": true,
"name": "interface",
"options": [],
"query": "import \"influxdata/influxdb/schema\"\nschema.tagValues(bucket: \"telemetry\", tag: \"name\", predicate: (r) => r._measurement == \"interface_counters\" and r.source =~ /${router:regex}/, start: -1h)",
"refresh": 2,
"regex": "",
"type": "query"
}
]
},
"panels": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Rate of bytes received and sent per interface, computed as the derivative of cumulative counters. Unit: bytes per second.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "Bps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)\n |> map(fn: (r) => ({r with _value: if r._value < 0.0 then 0.0 else r._value}))",
"refId": "A"
}
],
"title": "Input/Output Bytes Rate",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Rate of packets received and sent per interface, computed as the derivative of cumulative counters. Unit: packets per second.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "never",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "pps"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"legend": {
"calcs": [
"mean",
"max"
],
"displayMode": "table",
"placement": "bottom"
},
"tooltip": {
"mode": "multi",
"sort": "desc"
}
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-pkts\" or r._field == \"out-pkts\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)\n |> map(fn: (r) => ({r with _value: if r._value < 0.0 then 0.0 else r._value}))",
"refId": "A"
}
],
"title": "Input/Output Packets Rate",
"type": "timeseries"
},
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"description": "Top interfaces ranked by total bytes (received + sent) over the selected time range.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisLabel": "",
"axisPlacement": "auto",
"fillOpacity": 80,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineWidth": 1,
"scaleDistribution": {
"type": "linear"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "decbytes"
}
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"barRadius": 0,
"barWidth": 0.6,
"fullHighlight": false,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "horizontal",
"showValue": "auto",
"stacking": "none",
"tooltip": {
"mode": "single",
"sort": "none"
},
"xTickLabelRotation": 0
},
"targets": [
{
"datasource": {
"type": "influxdb",
"uid": "obmp_influxdb"
},
"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"interface_counters\")\n |> filter(fn: (r) => r.source =~ /${router:regex}/)\n |> filter(fn: (r) => r.name =~ /${interface:regex}/)\n |> filter(fn: (r) => r._field == \"in-octets\" or r._field == \"out-octets\")\n |> toFloat()\n |> derivative(unit: 1s, nonNegative: true)\n |> group(columns: [\"source\", \"name\", \"_field\"])\n |> sum()\n |> group(columns: [\"source\", \"name\"])\n |> sum()\n |> group()\n |> sort(columns: [\"_value\"], desc: true)\n |> limit(n: 20)",
"refId": "A"
}
],
"title": "Top Interfaces by Traffic",
"type": "barchart"
},
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 4,
"w": 24,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Interface Utilization Dashboard\n\nThis dashboard displays real-time interface utilization metrics collected via **gNMI streaming telemetry** from IOS-XR routers.\n\n- **Data source:** InfluxDB (Telegraf gNMI input plugin)\n- **YANG model:** OpenConfig (`openconfig-interfaces`)\n- **Subscription path:** `/interfaces/interface/state/counters`\n- **Sample interval:** 10 seconds\n\nUse the **Router** and **Interface** template variables at the top to filter the view.",
"mode": "markdown"
},
"title": "About",
"type": "text"
}
],
"schemaVersion": 39,
"style": "dark",
"tags": [
"obmp-telemetry",
"obmp",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Interface Utilization",
"uid": "obmp-telem-01",
"version": 1
}
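
The byte-rate and packet-rate panels in this dashboard describe their values as the derivative of cumulative counters. The following is a minimal Python sketch of that idea, not the Flux implementation: `non_negative_rate` and `SAMPLES` are hypothetical names, and the choice to emit `None` when a counter goes backwards is an assumption made for illustration (Flux's own `nonNegative` handling of resets may differ).

```python
from typing import List, Optional, Tuple

# (unix_seconds, cumulative_counter_value), sorted by time.
# Hypothetical sample data: the counter resets between t=10 and t=20.
SAMPLES: List[Tuple[float, float]] = [(0, 100), (10, 200), (20, 50), (30, 150)]


def non_negative_rate(
    samples: List[Tuple[float, float]],
) -> List[Tuple[float, Optional[float]]]:
    """Convert cumulative counter samples into per-second rates.

    Each output point is stamped with the later sample's time. A negative
    delta (counter reset or wrap) is reported as None instead of a
    negative rate. This reset policy is an assumption of the sketch.
    """
    rates: List[Tuple[float, Optional[float]]] = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            # Skip duplicate or out-of-order timestamps.
            continue
        delta = v1 - v0
        rates.append((t1, delta / dt if delta >= 0 else None))
    return rates
```

Calling `non_negative_rate(SAMPLES)` yields one rate per consecutive pair of samples, so a 10-second sample interval produces one point every 10 seconds.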

@@ -0,0 +1,78 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-container CPU, memory, and I/O for the OpenBMP stack — collected by the Telegraf docker input. Watch memory % to catch a container approaching its mem_limit before it OOM-crashes.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}],
"liveNow": false,
"panels": [
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Memory usage as a percentage of each container's mem_limit. Sustained values near 100% precede an OOM kill.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "percent","min": 0,"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}},
"overrides": []
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 0},
"id": 1,
"options": {"legend": {"calcs": ["max"],"displayMode": "table","placement": "right","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage_percent\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container Memory %",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "CPU usage per container (cpu-total). Can exceed 100% — that is multiple cores.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "percent","min": 0},
"overrides": []
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 0},
"id": 2,
"options": {"legend": {"calcs": ["max"],"displayMode": "table","placement": "right","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_cpu\" and r._field == \"usage_percent\" and r.cpu == \"cpu-total\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container CPU %",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Absolute memory usage per container.",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisPlacement": "auto","drawStyle": "line","fillOpacity": 10,"lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"}},"unit": "bytes","min": 0},
"overrides": []
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 9},
"id": 3,
"options": {"legend": {"calcs": ["max"],"displayMode": "table","placement": "right","showLegend": true},"tooltip": {"mode": "multi","sort": "desc"}},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage\")\n |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)\n |> keep(columns: [\"_time\", \"_value\", \"container_name\"])\n |> group(columns: [\"container_name\"])","refId": "A"}],
"title": "Container Memory Usage",
"type": "timeseries"
},
{
"datasource": {"type": "influxdb","uid": "obmp_influxdb"},
"description": "Current memory pressure per container. Anything in orange/red is close to its mem_limit.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"},"unit": "percent"},
"overrides": [{"matcher": {"id": "byName","options": "Memory %"},"properties": [{"id": "custom.displayMode","value": "gradient-gauge"},{"id": "max","value": 100},{"id": "thresholds","value": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "orange","value": 80},{"color": "red","value": 95}]}}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 9},
"id": 4,
"options": {"showHeader": true,"sortBy": [{"desc": true,"displayName": "Memory %"}]},
"targets": [{"datasource": {"type": "influxdb","uid": "obmp_influxdb"},"query": "from(bucket: \"telemetry\")\n |> range(start: -5m)\n |> filter(fn: (r) => r._measurement == \"docker_container_mem\" and r._field == \"usage_percent\")\n |> last()\n |> keep(columns: [\"container_name\", \"_value\"])\n |> group()\n |> rename(columns: {_value: \"Memory %\", container_name: \"Container\"})\n |> sort(columns: [\"Memory %\"], desc: true)","refId": "A"}],
"title": "Current Memory % by Container",
"type": "table"
}
],
"refresh": "30s",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp", "obmp-nav", "telemetry", "resources"],
"time": {"from": "now-1h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Stack Resources",
"uid": "obmp-stack-resources",
"version": 1
}
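
The memory panels above use absolute threshold steps (green as the base, orange at 80, red at 95). As a rough illustration of how such steps map a value to a color, here is a small Python sketch; `threshold_color` and `MEM_STEPS` are hypothetical names, and the sketch assumes the steps are listed in ascending order with the base step first, as they are in the dashboard JSON.

```python
from typing import List, Optional, Tuple

# (lower_bound, color); None marks the base step that matches any value.
# Values mirror the "Container Memory %" thresholds in the dashboard JSON.
MEM_STEPS: List[Tuple[Optional[float], str]] = [
    (None, "green"),
    (80, "orange"),
    (95, "red"),
]


def threshold_color(value: float, steps: List[Tuple[Optional[float], str]]) -> str:
    """Return the color of the highest step whose bound is <= value.

    Assumes steps are sorted ascending and the first entry is the base step.
    """
    color = steps[0][1]
    for bound, name in steps:
        if bound is None or value >= bound:
            color = name
    return color
```

For example, a container at 97% of its `mem_limit` resolves to the red step, which is the range the dashboard description flags as close to an OOM kill.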

@@ -24,25 +24,47 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 3,
"id": null,
"iteration": 1654876929746,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"aliasColors": {},
"breakPoint": "50%",
"combine": {
"label": "Others",
"threshold": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fontSize": "80%",
"format": "none",
"description": "IPv4 vs IPv6 prefix count advertised by this ASN.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 5,
@@ -50,24 +72,50 @@
"y": 0
},
"id": 6,
"legend": {
"show": true,
"values": true
},
"legendType": "Under graph",
"links": [],
"maxDataPoints": 3,
"nullPointMode": "connected",
"pieType": "pie",
"strokeWidth": 1,
"options": {
"displayLabels": [
"value"
],
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "bottom",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n max(timestamp) as time,\n count(*) as \"ipv4\"\nFROM\n global_ip_rib\nWHERE\n recv_origin_as = [[asn_num]]\n and family(prefix) = 4\nGROUP BY prefix\n",
"refId": "A"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n max(timestamp) as time,\n count(*) as \"ipv6\"\nFROM\n global_ip_rib\nWHERE\n recv_origin_as = [[asn_num]]\n and family(prefix) = 6\nGROUP BY prefix\n",
@@ -75,8 +123,7 @@
}
],
"title": "Advertised IP Addresses",
"type": "grafana-piechart-panel",
"valueName": "total"
"type": "piechart"
},
{
"datasource": {
@@ -175,99 +222,91 @@
"type": "stat"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"description": "IPv4/IPv6 prefixes originated by this ASN over time, with RPKI/IRR coverage (from stats_ip_origins).",
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 8,
"w": 15,
"x": 9,
"y": 0
},
"hiddenSeries": false,
"id": 14,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": true,
"min": true,
"rightSide": true,
"show": true,
"total": false,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"min",
"max",
"mean"
],
"displayMode": "table",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "8.5.4",
"pointradius": 5,
"points": true,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"alias": "",
"format": "time_series",
"rawSql": "SELECT\n $__time(interval_time),\n v4_prefixes,v6_prefixes,v4_with_rpki,v6_with_rpki,v4_with_irr,v6_with_irr\nFROM\n stats_ip_origins\nWHERE\n $__timeFilter(interval_time) and asn = [[asn_num]]\nORDER BY interval_time asc\n",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": "24h",
"timeRegions": [],
"title": "Originating Prefix Trend",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"decimals": 0,
"format": "none",
"logBase": 1,
"show": true
},
{
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@@ -984,32 +1023,31 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "714",
"value": "714"
},
"hide": 0,
"includeAll": false,
"label": "ASN",
"multi": false,
"name": "asn_num",
"type": "textbox",
"label": "Origin AS",
"description": "Enter an origin AS number \u2014 every panel shows that AS's prefixes, upstreams, and downstreams from the BMP RIB.",
"query": "13335",
"current": {
"text": "13335",
"value": "13335"
},
"options": [
{
"selected": true,
"text": "109",
"value": "109"
"text": "13335",
"value": "13335",
"selected": true
}
],
"query": "109",
"queryValue": "714",
"skipUrlSync": false,
"type": "custom"
"hide": 0,
"skipUrlSync": false
}
]
},

@@ -24,8 +24,10 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 4,
"links": [],
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": false,
"panels": [
{
@@ -182,7 +184,25 @@
]
}
},
"overrides": []
"overrides": [
{
"matcher": {"id": "byName","options": "name"},
"properties": [
{"id": "links","value": [{"title": "Open Router Detail","url": "/d/obmp-router-detail/router-detail?var-router_hash=${__data.fields[\"hash_id\"]}"}]}
]
},
{
"matcher": {"id": "byName","options": "hash_id"},
"properties": [{"id": "custom.hidden","value": true}]
},
{
"matcher": {"id": "byName","options": "state"},
"properties": [
{"id": "custom.displayMode","value": "color-background"},
{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}
]
}
]
},
"gridPos": {
"h": 11,
@@ -215,7 +235,7 @@
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select max(r.timestamp) as timestamp,r.name,max(ip_address) as ip_address,max(r.state) as state,\n count(*) as peers,max(description) as description, CASE WHEN max(r.state) = 'up' THEN 1 ELSE 0 END as stateBool\n from routers r\n JOIN bgp_peers p on (r.hash_id = p.router_hash_id)\n GROUP BY r.name;",
"rawSql": "select max(r.hash_id::text) as hash_id,max(r.timestamp) as timestamp,r.name,max(ip_address) as ip_address,max(r.state) as state,\n count(*) as peers,max(description) as description, CASE WHEN max(r.state) = 'up' THEN 1 ELSE 0 END as stateBool\n from routers r\n JOIN bgp_peers p on (r.hash_id = p.router_hash_id)\n GROUP BY r.name;",
"refId": "A",
"select": [
[
@@ -397,7 +417,25 @@
]
}
},
"overrides": []
"overrides": [
{
"matcher": {"id": "byName","options": "PeerName"},
"properties": [
{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}
]
},
{
"matcher": {"id": "byName","options": "peer_hash_id"},
"properties": [{"id": "custom.hidden","value": true}]
},
{
"matcher": {"id": "byName","options": "State"},
"properties": [
{"id": "custom.displayMode","value": "color-background"},
{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}
]
}
]
},
"gridPos": {
"h": 14,
@@ -435,7 +473,7 @@
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": " SELECT\n max(RouterName) as \"RouterName\",\n max(PeerName) as \"PeerName\",\n max(PeerIP) as \"PeerIP\",\n max(PeerASN) as \"PeerASN\",\n max(peer_state) as \"State\",\n max(LastModified) as \"LastModified\",\n max(v4_prefixes) as \"IPv4 Prefixes\",\n max(v6_prefixes) as \"IPv6 Prefixes\",\n CASE WHEN max(peer_state) = 'up' THEN 1 ELSE 0 END as stateBool\nFROM v_peers p\n LEFT JOIN stats_peer_rib s ON (p.peer_hash_id = s.peer_hash_id\n AND s.interval_time >= now() - interval '20 minutes')\nGROUP BY p.peer_hash_id;\n",
"rawSql": " SELECT\n p.peer_hash_id as peer_hash_id,\n max(RouterName) as \"RouterName\",\n max(PeerName) as \"PeerName\",\n max(PeerIP) as \"PeerIP\",\n max(PeerASN) as \"PeerASN\",\n max(peer_state) as \"State\",\n max(LastModified) as \"LastModified\",\n max(v4_prefixes) as \"IPv4 Prefixes\",\n max(v6_prefixes) as \"IPv6 Prefixes\",\n CASE WHEN max(peer_state) = 'up' THEN 1 ELSE 0 END as stateBool\nFROM v_peers p\n LEFT JOIN stats_peer_rib s ON (p.peer_hash_id = s.peer_hash_id\n AND s.interval_time >= now() - interval '20 minutes')\nGROUP BY p.peer_hash_id;\n",
"refId": "A",
"select": [
[
@@ -464,7 +502,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": []

@@ -0,0 +1,536 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "BGP peer session health, uptime, and flap analysis. Teaches session stability and how to diagnose flapping peers.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A healthy BGP mesh shows all peers UP continuously. Any gap in the UP state represents a session flap \u2014 investigate the reset reason.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"fillOpacity": 70,
"lineWidth": 0,
"spanNulls": false
},
"mappings": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
}
}
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"alignValue": "left",
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"mergeValues": true,
"rowHeight": 0.9,
"showValue": "auto",
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nORDER BY 1, 2",
"refId": "A"
}
],
"title": "Peer Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Current state of all BGP peers. Learn: 'Last Error' shows the error text recorded for the peer, and 'Hold Time' shows its local BGP hold time.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "State"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "mappings",
"value": [
{
"options": {
"down": {
"color": "red",
"index": 1,
"text": "DOWN"
},
"up": {
"color": "green",
"index": 0,
"text": "UP"
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "Peer"
},
"properties": [
{
"id": "custom.width",
"value": 200
}
]
},
{
"matcher": {
"id": "byName",
"options": "AS"
},
"properties": [
{
"id": "custom.width",
"value": 80
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 8
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": false,
"displayName": "State"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n p.timestamp AS \"Last State Change\",\n p.error_text AS \"Last Error\",\n p.local_hold_time AS \"Hold Time\"\nFROM bgp_peers p\nWHERE p.isprepolicy = true\nORDER BY p.state, p.peer_addr",
"refId": "A"
}
],
"title": "Current Peer State",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Flap count = number of times a peer went from UP to DOWN. A peer flapping more than 2 times per hour needs investigation.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Flap Count"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 5
}
]
}
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 20
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Flap Count"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(p.name, p.peer_addr::text) AS \"Peer\",\n p.peer_addr AS \"Address\",\n p.peer_as AS \"AS\",\n COUNT(CASE WHEN e.state = 'down' THEN 1 END) AS \"Flap Count\",\n MIN(e.timestamp) AS \"First Event\",\n MAX(e.timestamp) AS \"Last Event\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE $__timeFilter(e.timestamp)\nGROUP BY p.name, p.peer_addr, p.peer_as\nORDER BY \"Flap Count\" DESC",
"refId": "A"
}
],
"title": "Peer Flap Analysis",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "yellow",
"value": 50
},
{
"color": "green",
"value": 90
}
]
},
"unit": "percent",
"max": 100,
"min": 0
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 0,
"y": 30
},
"id": 4,
"options": {
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"showThresholdLabels": false,
"showThresholdMarkers": true,
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Mesh Health %\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Overall Peer Mesh Health",
"type": "gauge"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short",
"mappings": [
{
"options": {
"0": {
"color": "red",
"index": 0,
"text": "DOWN"
}
},
"type": "value"
}
]
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 8,
"y": 30
},
"id": 5,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n SUM(CASE WHEN state = 'up' THEN 1 ELSE 0 END) AS \"Peers UP\"\nFROM bgp_peers WHERE isprepolicy = true",
"refId": "A"
}
],
"title": "Peers Currently UP",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "red",
"value": 5
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 8,
"w": 8,
"x": 16,
"y": 30
},
"id": 6,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(*) AS \"Flap Events (24h)\"\nFROM peer_event_log\nWHERE timestamp > NOW() - INTERVAL '24 hours' AND state = 'down'",
"refId": "A"
}
],
"title": "Flap Events (24h)",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"peers",
"flap",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Peer Session Health & Flap Analysis",
"uid": "obmp-learn-02",
"version": 1
}

@@ -24,9 +24,11 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 5,
"id": null,
"iteration": 1654877090626,
"links": [],
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"liveNow": false,
"panels": [
{
@@ -49,45 +51,53 @@
"type": "text"
},
{
"circleMaxSize": "15",
"circleMinSize": 2,
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"esMetric": "Count",
"description": "Geolocation of the matched prefix (from the geo_ip table).",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"hideFrom": {"legend": false,"tooltip": false,"viz": false}},
"mappings": [],
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"hideEmpty": false,
"hideZero": false,
"id": 17,
"initialZoom": "1",
"locationData": "table",
"mapCenter": "(0°, 0°)",
"mapCenterLatitude": 0,
"mapCenterLongitude": 0,
"maxDataPoints": 1,
"mouseWheelZoom": false,
"showLegend": false,
"stickyLabels": false,
"tableQueryOptions": {
"geohashField": "geohash",
"labelField": "name",
"latitudeField": "latitude",
"longitudeField": "longitude",
"metricField": "value",
"queryType": "coordinates"
"options": {
"basemap": {"config": {},"name": "Layer 0","type": "default"},
"controls": {"mouseWheelZoom": false,"showAttribution": true,"showDebug": false,"showMeasure": false,"showScale": false,"showZoom": true},
"layers": [
{
"config": {
"showLegend": false,
"style": {
"color": {"fixed": "red"},
"opacity": 0.7,
"rotation": {"fixed": 0,"max": 360,"min": -360,"mode": "mod"},
"size": {"fixed": 8,"max": 15,"min": 2},
"symbol": {"fixed": "img/icons/marker/circle.svg","mode": "fixed"},
"textConfig": {"fontSize": 12,"offsetX": 0,"offsetY": 0,"textAlign": "center","textBaseline": "middle"}
}
},
"location": {"latitude": "latitude","longitude": "longitude","mode": "coords"},
"name": "Prefix Location",
"tooltip": true,
"type": "markers"
}
],
"tooltip": {"mode": "details"},
"view": {"allLayers": true,"id": "zero","lat": 0,"lon": 0,"zoom": 1}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@@ -123,12 +133,8 @@
]
}
],
"thresholds": "0,10",
"title": "Prefix Location",
"type": "grafana-worldmap-panel",
"unitPlural": "",
"unitSingle": "",
"valueName": "current"
"type": "geomap"
},
{
"datasource": {
@@ -317,12 +323,19 @@
"type": "piechart"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"align": "auto","displayMode": "auto","inspect": false},
"mappings": [],
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}
},
"overrides": []
},
"gridPos": {
"h": 6,
"w": 24,
@@ -331,53 +344,10 @@
},
"id": 12,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
"options": {
"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},
"showHeader": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"mappingType": 1,
"pattern": "raw_output",
"preserveFormat": true,
"sanitize": false,
"thresholds": [],
"type": "string",
"unit": "short"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
@@ -412,16 +382,22 @@
}
],
"title": "ASN Info",
"transform": "table",
"type": "table-old"
"type": "table"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"align": "auto","displayMode": "auto","inspect": false},
"mappings": [],
"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null}]}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 24,
@@ -430,75 +406,10 @@
},
"id": 13,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
"options": {
"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},
"showHeader": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colorMode": "cell",
"colors": [
"#cca300",
"#e24d42",
"#9ac48a"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"mappingType": 1,
"pattern": "irr_origin_as",
"thresholds": [
"0",
"1"
],
"type": "number",
"unit": "none"
},
{
"alias": "",
"align": "auto",
"colorMode": "cell",
"colors": [
"#cca300",
"#e24d42",
"#9ac48a"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 0,
"mappingType": 1,
"pattern": "rpki_origin_as",
"thresholds": [
"0",
"1"
],
"type": "number",
"unit": "none"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
@@ -533,8 +444,7 @@
}
],
"title": "Prefix Info",
"transform": "table",
"type": "table-old"
"type": "table"
},
{
"datasource": {
@@ -761,7 +671,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-base"
"obmp",
"obmp-nav",
"operations"
],
"templating": {
"list": [

@@ -0,0 +1,200 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-peer drilldown — BGP session identity, state history, prefix counts, update/withdraw rate, recent events and negotiated capabilities for a single BGP peer.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Current BGP session state for this peer.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, CASE WHEN peer_state = 'up' THEN 1 ELSE 0 END AS \"Peer State\"\nFROM v_peers WHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Peer State",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "IPv4 prefixes from this peer (latest stats_peer_rib interval).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 4,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE((SELECT v4_prefixes FROM stats_peer_rib WHERE peer_hash_id = '$peer_hash'::uuid ORDER BY interval_time DESC LIMIT 1),0) AS \"IPv4 Prefixes\"","refId": "A"}],
"title": "IPv4 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "IPv6 prefixes from this peer (latest stats_peer_rib interval).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 8,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE((SELECT v6_prefixes FROM stats_peer_rib WHERE peer_hash_id = '$peer_hash'::uuid ORDER BY interval_time DESC LIMIT 1),0) AS \"IPv6 Prefixes\"","refId": "A"}],
"title": "IPv6 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Updates received from this peer in the last hour (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 12,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(updates),0) AS \"Updates (1h)\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND interval_time > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Updates (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Withdraws received from this peer in the last hour (from stats_chg_bypeer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 16,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(withdraws),0) AS \"Withdraws (1h)\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND interval_time > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Withdraws (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Session down-events for this peer in the last 24 hours.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 20,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (24h)\"\nFROM peer_event_log\nWHERE peer_hash_id = '$peer_hash'::uuid AND state = 'down' AND timestamp > NOW() - INTERVAL '24 hours'","refId": "A"}],
"title": "Flap Events (24h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Identity and session parameters for the selected peer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 5,"w": 24,"x": 0,"y": 4},
"id": 7,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n routername AS \"Router\",\n peername AS \"Peer\",\n host(peerip) AS \"Address\",\n peerasn AS \"Peer AS\",\n as_name AS \"AS Name\",\n peer_state AS \"State\",\n peerholdtime AS \"Hold Time\",\n table_name AS \"Table\",\n lastmodified AS \"Last Change\",\n lastdownmessage AS \"Last Down Message\"\nFROM v_peers\nWHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Peer Info",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Session state over the selected range. Any transition to DOWN is a flap.",
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"custom": {"fillOpacity": 70,"lineWidth": 0,"spanNulls": false},
"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],
"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]}
}
},
"gridPos": {"h": 7,"w": 24,"x": 0,"y": 9},
"id": 8,
"options": {"alignValue": "left","legend": {"displayMode": "list","placement": "bottom","showLegend": false},"mergeValues": true,"rowHeight": 0.9,"showValue": "auto","tooltip": {"mode": "single"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(e.timestamp,'1m'),\n 'Session' AS metric,\n CASE WHEN e.state = 'up' THEN 1 ELSE 0 END AS \"value\"\nFROM peer_event_log e\nWHERE e.peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY 1","refId": "A"}],
"title": "Session State Timeline",
"type": "state-timeline"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate for this peer (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 12,"x": 0,"y": 16},
"id": 9,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n SUM(updates) AS \"Updates\",\n SUM(withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer\nWHERE peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "Update / Withdraw Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Prefix count from this peer over time (from stats_peer_rib).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": true,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"}
},
"gridPos": {"h": 9,"w": 12,"x": 12,"y": 16},
"id": 10,
"options": {"legend": {"calcs": ["last"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(interval_time,'5m'),\n MAX(v4_prefixes) AS \"IPv4 Prefixes\",\n MAX(v6_prefixes) AS \"IPv6 Prefixes\"\nFROM stats_peer_rib\nWHERE peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "Prefix Count Trend",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP session state changes for this peer.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 25},
"id": 11,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n e.timestamp AS \"Time\",\n e.state AS \"State\",\n e.bmp_reason AS \"BMP Reason\",\n e.bgp_err_code AS \"BGP Err Code\",\n e.bgp_err_subcode AS \"BGP Err Subcode\",\n e.error_text AS \"Reason\"\nFROM peer_event_log e\nWHERE e.peer_hash_id = '$peer_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY e.timestamp DESC\nLIMIT 100","refId": "A"}],
"title": "Recent Peer Events",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP capabilities negotiated on this session.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto","cellOptions": {"type": "auto","wrapText": true}}},
"overrides": [
{"matcher": {"id": "byName","options": "Sent Capabilities"},"properties": [{"id": "custom.width","value": 600}]},
{"matcher": {"id": "byName","options": "Received Capabilities"},"properties": [{"id": "custom.width","value": 600}]}
]
},
"gridPos": {"h": 8,"w": 24,"x": 0,"y": 34},
"id": 12,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n sentcapabilities AS \"Sent Capabilities\",\n recvcapabilities AS \"Received Capabilities\"\nFROM v_peers\nWHERE peer_hash_id = '$peer_hash'::uuid","refId": "A"}],
"title": "Negotiated Capabilities",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","operations","peer"],
"templating": {
"list": [
{
"current": {},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "select peername as __text, peer_hash_id as __value from v_peers where length(peername) > 0",
"hide": 0,
"includeAll": false,
"label": "Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "select peername as __text, peer_hash_id as __value from v_peers where length(peername) > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Peer Detail",
"uid": "obmp-peer-detail",
"version": 1
}

@ -0,0 +1,170 @@
{
"annotations": {"list": [{"builtIn": 1,"datasource": {"type": "datasource","uid": "grafana"},"enable": true,"hide": true,"iconColor": "rgba(0, 211, 255, 1)","name": "Annotations & Alerts","type": "dashboard"}]},
"description": "Per-router drilldown — BMP state, peer health, prefix counts, update rate and recent session events for a single monitored router.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{"asDropdown": true,"icon": "external link","includeVars": true,"keepTime": true,"tags": ["obmp-nav"],"title": "OBMP Dashboards","type": "dashboards"}
],
"panels": [
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BMP session state for this router. Should be UP.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "red","value": null},{"color": "green","value": 1}]},"mappings": [{"options": {"0": {"color": "red","index": 1,"text": "DOWN"},"1": {"color": "green","index": 0,"text": "UP"}},"type": "value"}],"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 0,"y": 0},
"id": 1,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, CASE WHEN state = 'up' THEN 1 ELSE 0 END AS \"Router State\"\nFROM routers WHERE hash_id = '$router_hash'::uuid","refId": "A"}],
"title": "Router State",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router that are currently up (pre-policy Adj-RIB-In).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 4,"y": 0},
"id": 2,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Up\"\nFROM bgp_peers\nWHERE router_hash_id = '$router_hash'::uuid AND isprepolicy = true AND state = 'up'","refId": "A"}],
"title": "Peers Up",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router that are not up. Investigate any non-zero value.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "red","value": 1}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 8,"y": 0},
"id": 3,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Peers Down\"\nFROM bgp_peers\nWHERE router_hash_id = '$router_hash'::uuid AND isprepolicy = true AND state != 'up'","refId": "A"}],
"title": "Peers Down",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total IPv4 prefixes across this router's peers (latest stats_peer_rib interval per peer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 12,"y": 0},
"id": 4,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(s.v4_prefixes),0) AS \"IPv4 Prefixes\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v4_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true","refId": "A"}],
"title": "IPv4 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Total IPv6 prefixes across this router's peers (latest stats_peer_rib interval per peer).",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "blue","value": null}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 16,"y": 0},
"id": 5,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, COALESCE(SUM(s.v6_prefixes),0) AS \"IPv6 Prefixes\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v6_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true","refId": "A"}],
"title": "IPv6 Prefixes",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Peer session down-events on this router in the last hour.",
"fieldConfig": {"defaults": {"color": {"mode": "thresholds"},"thresholds": {"mode": "absolute","steps": [{"color": "green","value": null},{"color": "yellow","value": 1},{"color": "red","value": 5}]},"unit": "short"}},
"gridPos": {"h": 4,"w": 4,"x": 20,"y": 0},
"id": 6,
"options": {"colorMode": "background","graphMode": "none","justifyMode": "auto","orientation": "auto","reduceOptions": {"calcs": ["lastNotNull"],"fields": "","values": false},"textMode": "auto"},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT NOW() AS time, count(*) AS \"Flaps (1h)\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND e.state = 'down' AND e.timestamp > NOW() - INTERVAL '1 hour'","refId": "A"}],
"title": "Flap Events (1h)",
"type": "stat"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Identity and BMP state for the selected router.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 5,"w": 24,"x": 0,"y": 4},
"id": 7,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n r.name AS \"Router\",\n host(r.ip_address) AS \"Mgmt IP\",\n host(r.bgp_id) AS \"BGP ID\",\n r.router_as AS \"AS\",\n r.state AS \"State\",\n r.timestamp AS \"Last Update\",\n r.description AS \"Description\"\nFROM routers r\nWHERE r.hash_id = '$router_hash'::uuid","refId": "A"}],
"title": "Router Info",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP peers on this router with state, ASN and latest prefix counts. Click a peer to open Peer Detail.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [
{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]},
{"matcher": {"id": "byName","options": "Peer"},"properties": [{"id": "links","value": [{"title": "Open Peer Detail","url": "/d/obmp-peer-detail/peer-detail?var-peer_hash=${__data.fields[\"peer_hash_id\"]}"}]}]},
{"matcher": {"id": "byName","options": "peer_hash_id"},"properties": [{"id": "custom.hidden","value": true}]}
]
},
"gridPos": {"h": 11,"w": 24,"x": 0,"y": 9},
"id": 8,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": false,"displayName": "State"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n p.hash_id AS peer_hash_id,\n COALESCE(NULLIF(p.name,''), p.peer_addr::text) AS \"Peer\",\n host(p.peer_addr) AS \"Address\",\n p.peer_as AS \"AS\",\n p.state AS \"State\",\n COALESCE(s.v4_prefixes,0) AS \"IPv4 Prefixes\",\n COALESCE(s.v6_prefixes,0) AS \"IPv6 Prefixes\",\n p.timestamp AS \"Last Change\"\nFROM bgp_peers p\nLEFT JOIN LATERAL (SELECT v4_prefixes, v6_prefixes FROM stats_peer_rib sr WHERE sr.peer_hash_id = p.hash_id ORDER BY interval_time DESC LIMIT 1) s ON true\nWHERE p.router_hash_id = '$router_hash'::uuid AND p.isprepolicy = true\nORDER BY p.state, p.peer_addr","refId": "A"}],
"title": "Peers",
"type": "table"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "BGP update vs withdraw rate across this router's peers (from stats_chg_bypeer).",
"fieldConfig": {
"defaults": {"color": {"mode": "palette-classic"},"custom": {"axisCenteredZero": false,"axisColorMode": "text","axisLabel": "","axisPlacement": "auto","barAlignment": 0,"drawStyle": "line","fillOpacity": 20,"gradientMode": "none","lineInterpolation": "smooth","lineWidth": 1,"pointSize": 5,"scaleDistribution": {"type": "linear"},"showPoints": "never","spanNulls": false,"stacking": {"group": "A","mode": "none"},"thresholdsStyle": {"mode": "off"}},"unit": "short"},
"overrides": [{"matcher": {"id": "byName","options": "Withdraws"},"properties": [{"id": "color","value": {"fixedColor": "red","mode": "fixed"}}]},{"matcher": {"id": "byName","options": "Updates"},"properties": [{"id": "color","value": {"fixedColor": "green","mode": "fixed"}}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 20},
"id": 9,
"options": {"legend": {"calcs": ["sum"],"displayMode": "table","placement": "bottom","showLegend": true},"tooltip": {"mode": "multi","sort": "none"}},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "time_series","rawSql": "SELECT\n $__timeGroupAlias(c.interval_time,'5m'),\n SUM(c.updates) AS \"Updates\",\n SUM(c.withdraws) AS \"Withdraws\"\nFROM stats_chg_bypeer c\nJOIN bgp_peers p ON p.hash_id = c.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND $__timeFilter(c.interval_time)\nGROUP BY 1\nORDER BY 1","refId": "A"}],
"title": "BGP Update Rate",
"type": "timeseries"
},
{
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"description": "Recent BGP session state changes for this router's peers.",
"fieldConfig": {
"defaults": {"custom": {"align": "auto","displayMode": "auto"}},
"overrides": [{"matcher": {"id": "byName","options": "State"},"properties": [{"id": "custom.displayMode","value": "color-background"},{"id": "mappings","value": [{"options": {"down": {"color": "red","index": 1,"text": "DOWN"},"up": {"color": "green","index": 0,"text": "UP"}},"type": "value"}]}]}]
},
"gridPos": {"h": 9,"w": 24,"x": 0,"y": 29},
"id": 10,
"options": {"footer": {"countRows": false,"fields": "","reducer": ["sum"],"show": false},"showHeader": true,"sortBy": [{"desc": true,"displayName": "Time"}]},
"targets": [{"datasource": {"type": "postgres","uid": "obmp_postgres"},"format": "table","rawSql": "SELECT\n e.timestamp AS \"Time\",\n COALESCE(NULLIF(p.name,''), p.peer_addr::text) AS \"Peer\",\n host(p.peer_addr) AS \"Address\",\n e.state AS \"State\",\n e.error_text AS \"Reason\"\nFROM peer_event_log e\nJOIN bgp_peers p ON p.hash_id = e.peer_hash_id\nWHERE p.router_hash_id = '$router_hash'::uuid AND $__timeFilter(e.timestamp)\nORDER BY e.timestamp DESC\nLIMIT 100","refId": "A"}],
"title": "Recent Peer Events",
"type": "table"
}
],
"refresh": "1m",
"schemaVersion": 36,
"style": "dark",
"tags": ["obmp","obmp-nav","operations","router"],
"templating": {
"list": [
{
"current": {},
"datasource": {"type": "postgres","uid": "obmp_postgres"},
"definition": "select name as __text, hash_id as __value from routers where length(name) > 0",
"hide": 0,
"includeAll": false,
"label": "Router",
"multi": false,
"name": "router_hash",
"options": [],
"query": "select name as __text, hash_id as __value from routers where length(name) > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"type": "query"
}
]
},
"time": {"from": "now-6h","to": "now"},
"timepicker": {},
"timezone": "browser",
"title": "Router Detail",
"uid": "obmp-router-detail",
"version": 1
}

@ -0,0 +1,466 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "AS path length distribution and analysis. Teaches how BGP AS paths reflect internet topology and how to detect anomalies like route leaks or AS path prepending.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Internet routes typically have 2-5 hops. A /32 or /24 appearing with only 1-hop AS path from an unexpected ASN is a classic hijack indicator. Routes with 10+ hops may indicate prepending.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"gradientMode": "none",
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"barRadius": 0,
"barWidth": 0.7,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n ba.as_path_count AS \"AS Path Length (hops)\",\n COUNT(*) AS \"Prefix Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false\n AND r.isipv4 = true\n AND ba.as_path_count > 0\nGROUP BY ba.as_path_count\nORDER BY ba.as_path_count",
"refId": "A"
}
],
"title": "AS Path Length Distribution (Active IPv4 Routes)",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Average AS path length on the internet is ~4-5 hops. Your lab has shorter paths since ExaBGP is a single eBGP hop away.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 8
}
]
},
"unit": "short",
"decimals": 1
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 12,
"y": 0
},
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(AVG(ba.as_path_count)::numeric, 1) AS \"Avg AS Path Length\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count > 0",
"refId": "A"
}
],
"title": "Average AS Path Length",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes with only 1-hop AS path are directly connected or possibly hijacked. In your lab, ExaBGP injects routes starting with AS 65100.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 20
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 18,
"y": 0
},
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n COUNT(*) AS \"Direct (1-hop) Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true AND ba.as_path_count = 1",
"refId": "A"
}
],
"title": "1-Hop Routes (Direct/Possible Hijack)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: The longest paths reveal the most AS-level hops in your network. AS path prepending intentionally lengthens paths to make a route less preferred.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "AS Path Length"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 5
},
{
"color": "red",
"value": 10
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "AS Path"
},
"properties": [
{
"id": "custom.width",
"value": 400
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 10
},
"id": 4,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "AS Path Length"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.as_path_count AS \"AS Path Length\",\n ba.as_path::text AS \"AS Path\",\n ba.origin_as AS \"Origin AS\",\n ba.next_hop AS \"Next Hop\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nORDER BY ba.as_path_count DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Longest AS Paths (Top 30)",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Origin AS is the rightmost ASN in the AS path \u2014 the network that first originated the prefix. Most internet prefixes are originated by their owning organization.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Route Count"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "custom.width",
"value": 200
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 12,
"x": 0,
"y": 20
},
"id": 5,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Route Count"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n ba.origin_as AS \"Origin AS\",\n COALESCE(ia.as_name, 'Unknown') AS \"AS Name\",\n COUNT(*) AS \"Route Count\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nLEFT JOIN info_asn ia ON ia.asn = ba.origin_as\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.origin_as, ia.as_name\nORDER BY COUNT(*) DESC\nLIMIT 20",
"refId": "A"
}
],
"title": "Top Origin ASNs by Route Count",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A transit AS (appearing frequently in AS paths but not as origin) is a carrier. The most frequent transit ASNs in your lab correspond to simulated Tier-1 carriers (174=Cogent, 3356=Lumen, 1299=Telia, etc.)",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 12,
"w": 12,
"x": 12,
"y": 20
},
"id": 6,
"options": {
"barRadius": 0,
"barWidth": 0.7,
"groupWidth": 0.7,
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "bottom"
},
"orientation": "horizontal",
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n asn_val AS \"Transit ASN\",\n COUNT(*) AS \"Appearances in AS Paths\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nCROSS JOIN LATERAL unnest(ba.as_path) AS asn_val\nWHERE r.iswithdrawn = false AND asn_val != ba.origin_as\nGROUP BY asn_val\nORDER BY COUNT(*) DESC\nLIMIT 15",
"refId": "A"
}
],
"title": "Most Common Transit ASNs",
"type": "barchart"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"as-path",
"topology",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "AS Path Analysis",
"uid": "obmp-learn-03",
"version": 1
}

@ -0,0 +1,623 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Explore BGP path attributes: communities, MED, local-pref and how they influence routing policy decisions.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 8,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"content": "## BGP Path Attributes \u2014 What They Mean\n\n### BGP Communities (RFC 1997)\nCommunities are 32-bit tags attached to routes, written as **ASN:value** (e.g., `65000:100`). They carry policy signals between routers and ASes.\n\n**Well-known communities:**\n| Community | Decimal | Meaning |\n|-----------|---------|----------|\n| `65535:0` | NO_EXPORT | Do not advertise outside this AS or confederation |\n| `65535:1` | NO_ADVERTISE | Do not advertise to any peer |\n| `65535:666` | BLACKHOLE | Drop traffic destined for this prefix (RFC 7999) |\n\nPrivate communities (e.g., `65001:200`) are operator-defined \u2014 they may encode region, customer tier, or traffic-engineering intent.\n\n### Local Preference (local-pref)\n- **Scope:** iBGP only \u2014 never sent to eBGP peers.\n- **Effect:** Higher local-pref wins. Default is **100**.\n- **Use case:** Prefer one upstream provider over another for all outbound traffic.\n\n### Multi-Exit Discriminator (MED)\n- **Scope:** Sent to directly connected eBGP peers to influence *inbound* traffic.\n- **Effect:** Lower MED wins (when comparing routes from the same AS).\n- **Use case:** Tell a peer which of your links to prefer when sending traffic to you.\n\n> **Tip:** Use the panels below to explore what communities and attributes are actually present in the current RIB. Run `inject.py attributes` to load routes with varied communities and MED values.",
"mode": "markdown"
},
"title": "BGP Attribute Reference \u2014 Communities, Local-Pref, MED",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Each row is a unique community string (format ASN:value) seen across all active routes. High route counts for a community mean many routes share that policy tag. Look for well-known communities: 65535:0 (NO_EXPORT), 65535:1 (NO_ADVERTISE), 65535:666 (BLACKHOLE).",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Routes Tagged"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 10
},
{
"color": "yellow",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 11,
"w": 12,
"x": 0,
"y": 8
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Routes Tagged"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n comm AS \"Community\",\n COUNT(*) AS \"Routes Tagged\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false AND ba.community_list IS NOT NULL\nGROUP BY comm\nORDER BY COUNT(*) DESC\nLIMIT 30",
"refId": "A"
}
],
"title": "Top BGP Communities in Current RIB",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes with notable BGP attributes \u2014 tagged with communities or using non-default local-pref / MED values. These routes carry explicit policy information. Examine the Communities column for operator-defined tags and the Local Pref column to see traffic engineering decisions.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Local Pref"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 101
},
{
"color": "red",
"value": 200
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "MED"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
}
]
}
}
]
}
]
},
"gridPos": {
"h": 11,
"w": 12,
"x": 12,
"y": 8
},
"id": 3,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix::text AS \"Prefix\",\n ba.origin_as AS \"Origin AS\",\n ba.community_list::text AS \"Communities\",\n ba.local_pref AS \"Local Pref\",\n ba.med AS \"MED\",\n ba.as_path_count AS \"Path Length\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND (ba.community_list IS NOT NULL OR ba.med IS NOT NULL OR ba.local_pref IS NOT NULL)\nORDER BY r.prefix\nLIMIT 100",
"refId": "A"
}
],
"title": "Routes with Notable Attributes",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: MED (Multi-Exit Discriminator) is used to influence inbound traffic from a directly connected AS. Lower MED is preferred. If most routes show 'Not Set', MED is not being used for traffic engineering. A single dominant MED value means a simple policy; many different values indicate fine-grained control.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 12,
"x": 0,
"y": 19
},
"id": 4,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": -30,
"xTickLabelSpacing": 100
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.med::text, 'Not Set') AS \"MED Value\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.med\nORDER BY ba.med NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "MED Value Distribution",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Local preference is an iBGP attribute \u2014 it never crosses AS boundaries. Default is 100. Routes with local-pref above 100 are preferred over the default path; below 100 they are used as last-resort. Non-100 values indicate active traffic-engineering policy. Run 'inject.py attributes' to inject routes with varied local-pref values.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 12,
"x": 12,
"y": 19
},
"id": 5,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": -30,
"xTickLabelSpacing": 100
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n COALESCE(ba.local_pref::text, 'Not Set') AS \"Local Pref\",\n COUNT(*) AS \"Route Count\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY ba.local_pref\nORDER BY ba.local_pref DESC NULLS LAST\nLIMIT 20",
"refId": "A"
}
],
"title": "Local Preference Value Distribution",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This count tells you how widely BGP communities are used in your network. A value of 0 means no community tagging \u2014 communities are an opt-in feature. Run 'inject.py attributes' to add routes with community strings.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 0,
"y": 28
},
"id": 6,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Routes with Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.community_list IS NOT NULL\n AND array_length(ba.community_list, 1) > 0",
"refId": "A"
}
],
"title": "Routes with Communities",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: The number of distinct community strings seen across all active routes. A diverse set indicates fine-grained policy tagging. A single value means one uniform policy tag is applied.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "blue",
"value": null
},
{
"color": "green",
"value": 1
},
{
"color": "yellow",
"value": 50
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 8,
"y": 28
},
"id": 7,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(DISTINCT comm) AS \"Unique Communities\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nCROSS JOIN LATERAL unnest(ba.community_list) AS comm\nWHERE r.iswithdrawn = false",
"refId": "A"
}
],
"title": "Unique Community Values",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: Routes with a local-pref other than the default (100) have been explicitly policy-engineered. A high count here means your network actively uses local-pref to prefer specific paths. A value of 0 means all paths are at default preference.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
},
{
"color": "red",
"value": 1000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 8,
"x": 16,
"y": 28
},
"id": 8,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() as time, COUNT(*) AS \"Custom Local-Pref Routes\"\nFROM base_attrs ba\nJOIN ip_rib r ON r.base_attr_hash_id = ba.hash_id\nWHERE r.iswithdrawn = false\n AND ba.local_pref IS NOT NULL\n AND ba.local_pref != 100",
"refId": "A"
}
],
"title": "Routes with Non-Default Local-Pref",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"communities",
"attributes",
"policy",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "BGP Attribute Explorer",
"uid": "obmp-learn-06",
"version": 1
}


@ -0,0 +1,540 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "Prefix stability analysis and route churn visualization. Teaches how to identify unstable routes and understand BGP churn.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This chart shows BGP advertisements and withdrawals bucketed per hour. A healthy network has steady low churn. Spikes in withdrawals indicate route instability events such as link failures, iBGP reconvergence, or policy changes. Run 'inject.py churn' to generate synthetic churn data and observe it here.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "bars",
"fillOpacity": 60,
"lineWidth": 1,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Advertisements"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Withdrawals"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"sum",
"max"
],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'1h'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Advertisements vs Withdrawals Rate (per hour)",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A prefix with more than 30 updates in the selected time range (24 hours by default) is considered unstable: it is flapping or being re-announced frequently. The Stability column categorizes each prefix by its update count. Run 'inject.py churn' to generate churn data and observe it here. Sort by 'Total Updates' to find the most problematic prefixes.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Stability"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-text"
},
{
"id": "mappings",
"value": [
{
"options": {
"Very Stable": {
"color": "green",
"index": 0
},
"Stable": {
"color": "blue",
"index": 1
},
"Moderate": {
"color": "yellow",
"index": 2
},
"Unstable": {
"color": "red",
"index": 3
}
},
"type": "value"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "Total Updates"
},
"properties": [
{
"id": "custom.displayMode",
"value": "lcd-gauge"
},
{
"id": "color",
"value": {
"mode": "thresholds"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 7
},
{
"color": "red",
"value": 30
}
]
}
}
]
}
]
},
"gridPos": {
"h": 12,
"w": 24,
"x": 0,
"y": 9
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": [
{
"desc": true,
"displayName": "Total Updates"
}
]
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n prefix::text AS \"Prefix\",\n COUNT(*) AS \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) AS \"Withdrawals\",\n SUM(CASE WHEN NOT iswithdrawn THEN 1 ELSE 0 END) AS \"Announcements\",\n MAX(timestamp) AS \"Last Change\",\n CASE\n WHEN COUNT(*) = 1 THEN 'Very Stable'\n WHEN COUNT(*) <= 7 THEN 'Stable'\n WHEN COUNT(*) <= 30 THEN 'Moderate'\n ELSE 'Unstable'\n END AS \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY \"Total Updates\" DESC\nLIMIT 100",
"refId": "A"
}
],
"title": "Top Churning Prefixes",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This bar chart shows how many prefixes fall into each stability tier. In a healthy network, the vast majority of prefixes should be 'Very Stable' (only one update during the window). A large 'Unstable' bar is a red flag. Run 'inject.py churn' to shift prefixes into the Unstable tier.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "fixed",
"fixedColor": "blue"
},
"custom": {
"fillOpacity": 80,
"lineWidth": 0
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "1. Very Stable (1 update)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "2. Stable (2-7 updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "blue",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "3. Moderate (8-30 updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "yellow",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "4. Unstable (31+ updates)"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 9,
"w": 14,
"x": 0,
"y": 21
},
"id": 3,
"options": {
"barRadius": 0.1,
"barWidth": 0.6,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"text": {},
"tooltip": {
"mode": "single"
},
"xTickLabelRotation": 0,
"xTickLabelSpacing": 200
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n CASE\n WHEN cnt = 1 THEN '1. Very Stable (1 update)'\n WHEN cnt <= 7 THEN '2. Stable (2-7 updates)'\n WHEN cnt <= 30 THEN '3. Moderate (8-30 updates)'\n ELSE '4. Unstable (31+ updates)'\n END AS \"Stability Tier\",\n COUNT(*) AS \"Prefix Count\"\nFROM (\n SELECT prefix, COUNT(*) as cnt\n FROM ip_rib_log\n WHERE $__timeFilter(timestamp)\n GROUP BY prefix\n) sub\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "Prefix Distribution by Stability Tier",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This is the single most churning prefix in the selected time range. If a prefix appears here repeatedly across time ranges, it may warrant investigation \u2014 check the AS path and peers announcing it.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
}
]
},
"unit": "string",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 10,
"x": 14,
"y": 21
},
"id": 4,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "center",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "/^Most Churned Prefix$/",
"values": false
},
"text": {
"titleSize": 14,
"valueSize": 18
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT prefix::text AS \"Most Churned Prefix\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix\nORDER BY COUNT(*) DESC\nLIMIT 1",
"refId": "A"
}
],
"title": "Most Churned Prefix",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: This counts how many distinct prefixes had at least one update event in the selected time window. During a normal steady state this number should be low. After a major routing event (e.g., upstream link failure) you may see thousands of prefixes change simultaneously.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 500
},
{
"color": "red",
"value": 2000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 4,
"w": 10,
"x": 14,
"y": 26
},
"id": 5,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Prefixes with Updates\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)",
"refId": "A"
}
],
"title": "Total Unique Prefixes with Updates",
"type": "stat"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"churn",
"stability",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "Route Churn & Stability Score",
"uid": "obmp-learn-05",
"version": 1
}


@ -0,0 +1,405 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "RPKI (Resource Public Key Infrastructure) validation status. Teaches BGP routing security and how RPKI prevents prefix hijacks by validating route origin.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|---------|\n| **Valid** | A ROA covers this prefix, the origin AS matches, and the prefix length is within the ROA's maximum length |\n| **Invalid** | At least one ROA covers this prefix, but none matches the origin AS and prefix length: this route is potentially a hijack |\n| **NotFound** | No ROA covers this prefix, so the route is unprotected and cannot be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical: they indicate either a misconfiguration or a prefix hijack\n- Routes with no covering ROA show as **NotFound**; they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron job to run or check that `ENABLE_RPKI=1` is set in docker-compose.yml.",
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"gridPos": {
"h": 10,
"w": 8,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"content": "## What is RPKI?\n\nRPKI (Resource Public Key Infrastructure) is a cryptographic security framework for BGP routing. It lets IP address holders publish **Route Origin Authorizations (ROAs)** stating which ASNs are authorized to originate their prefixes.\n\n### RPKI Validation States\n| State | Meaning |\n|-------|---------|\n| **Valid** | A ROA covers this prefix, the origin AS matches, and the prefix length is within the ROA's maximum length |\n| **Invalid** | At least one ROA covers this prefix, but none matches the origin AS and prefix length: this route is potentially a hijack |\n| **NotFound** | No ROA covers this prefix, so the route is unprotected and cannot be validated |\n\n### How to read this dashboard\n- **Valid %** should be as high as possible (target: 100%)\n- **Invalid routes** are critical: they indicate either a misconfiguration or a prefix hijack\n- Routes with no covering ROA show as **NotFound**; they are not necessarily invalid, just unprotected\n\n> **Lab note:** The RPKI validator table is populated by a cron job in psql-app every 2 hours. If the table shows 0 rows, wait for the cron job to run or check that `ENABLE_RPKI=1` is set in docker-compose.yml.",
"mode": "markdown"
},
"pluginVersion": "9.1.7",
"title": "RPKI Learning Guide",
"type": "text"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Total ROAs (Route Origin Authorizations) loaded from the RPKI validator. If 0, the cron job has not yet run.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "yellow",
"value": 1
},
{
"color": "green",
"value": 100000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 8,
"y": 0
},
"id": 2,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI ROAs Loaded\" FROM rpki_validator",
"refId": "A"
}
],
"title": "RPKI ROAs Loaded",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes with a matching valid ROA \u2014 origin AS and prefix length both match.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "red",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 12,
"y": 0
},
"id": 3,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Valid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Valid Routes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes covered by at least one ROA where no ROA matches both the origin AS and the prefix length. High-priority investigation needed.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 1
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 4,
"x": 16,
"y": 0
},
"id": 4,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"RPKI Invalid Routes\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n )\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n )",
"refId": "A"
}
],
"title": "RPKI Invalid Routes",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: ExaBGP-injected routes (AS 65100) use synthetic ASNs not registered in RPKI, so they appear as NotFound unless their prefixes happen to be covered by a real ROA, in which case they appear as Invalid. Real internet prefixes with matching ROAs appear as Valid.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"mappings": []
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 10,
"x": 0,
"y": 10
},
"id": 5,
"options": {
"displayLabels": [
"percent",
"name"
],
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "donut",
"tooltip": {
"mode": "single"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n status AS \"RPKI Status\",\n COUNT(*) AS \"Route Count\"\nFROM (\n SELECT\n CASE\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix AND rv.origin_as = ba.origin_as AND r.prefix_len <= rv.prefix_len_max\n ) THEN 'Valid'\n WHEN EXISTS (\n SELECT 1 FROM rpki_validator rv\n WHERE rv.prefix >>= r.prefix\n ) THEN 'Invalid'\n ELSE 'NotFound'\n END AS status\n FROM ip_rib r\n JOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\n WHERE r.iswithdrawn = false AND r.isipv4 = true\n) s\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "RPKI Validation Status Distribution",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Routes covered by a ROA that does not match the observed origin AS or prefix length. These are the most security-critical routes: each one represents a potential hijack or misconfiguration.",
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Status"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "mappings",
"value": [
{
"options": {
"Invalid": {
"color": "red",
"index": 0
},
"Valid": {
"color": "green",
"index": 1
},
"NotFound": {
"color": "yellow",
"index": 2
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 14,
"x": 10,
"y": 10
},
"id": 6,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT\n r.prefix AS \"Prefix\",\n ba.origin_as AS \"Observed Origin AS\",\n rv.origin_as AS \"Authorized Origin AS (ROA)\",\n rv.prefix_len_max AS \"ROA Max Length\",\n 'Invalid' AS \"Status\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nJOIN rpki_validator rv ON rv.prefix >>= r.prefix\nWHERE r.iswithdrawn = false AND r.isipv4 = true\n AND NOT EXISTS (\n SELECT 1 FROM rpki_validator rv2\n WHERE rv2.prefix >>= r.prefix AND rv2.origin_as = ba.origin_as AND r.prefix_len <= rv2.prefix_len_max\n )\nORDER BY r.prefix\nLIMIT 50",
"refId": "A"
}
],
"title": "RPKI Invalid Routes \u2014 Potential Hijacks",
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"rpki",
"security",
"obmp-nav"
],
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "RPKI Validation Status",
"uid": "obmp-learn-04",
"version": 1
}


@ -0,0 +1,465 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"description": "BGP update and withdrawal rates over time. Teaches what normal BGP traffic looks like and how to detect route churn or instability.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A healthy network has far more advertisements than withdrawals. A withdrawal spike often signals a link failure or route flap.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "bars",
"fillOpacity": 60,
"lineWidth": 1,
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
}
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Advertisements"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "green",
"mode": "fixed"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Withdrawals"
},
"properties": [
{
"id": "color",
"value": {
"fixedColor": "red",
"mode": "fixed"
}
}
]
}
]
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"legend": {
"calcs": [
"sum",
"max"
],
"displayMode": "list",
"placement": "bottom"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(timestamp,'5m'),\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) AS \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) AS \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1\nORDER BY 1",
"refId": "A"
}
],
"title": "BGP Updates Over Time \u2014 Advertisements vs Withdrawals",
"type": "timeseries"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 100
},
{
"color": "red",
"value": 1000
}
]
},
"unit": "short",
"mappings": []
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(*) AS \"Total Updates (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Total Updates (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Learn: A withdrawal rate above 20% (yellow) is unusual. Above 50% (red), it may indicate a route leak or oscillation event.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 20
},
{
"color": "red",
"value": 50
}
]
},
"unit": "percent",
"max": 100
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 6,
"y": 10
},
"id": 3,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time,\n ROUND(100.0 * SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) / NULLIF(COUNT(*),0), 1) AS \"Withdrawal Rate %\"\nFROM ip_rib_log\nWHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Withdrawal Rate % (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 1000
},
{
"color": "red",
"value": 10000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 12,
"y": 10
},
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT peer_hash_id) AS \"Active Peers\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '1 hour'",
"refId": "A"
}
],
"title": "Active Reporting Peers (1h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "yellow",
"value": 500
},
{
"color": "red",
"value": 2000
}
]
},
"unit": "short"
}
},
"gridPos": {
"h": 5,
"w": 6,
"x": 18,
"y": 10
},
"id": 5,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT NOW() AS time, COUNT(DISTINCT prefix) AS \"Unique Prefixes Updated (24h)\" FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"refId": "A"
}
],
"title": "Unique Prefixes Updated (24h)",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Updates per peer over time. Learn: Peers should have similar update rates. A peer with dramatically more updates may be experiencing instability or receiving a full BGP table with frequent changes.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"fillOpacity": 10,
"lineWidth": 1,
"spanNulls": false
},
"unit": "short"
}
},
"gridPos": {
"h": 9,
"w": 24,
"x": 0,
"y": 15
},
"id": 6,
"options": {
"legend": {
"calcs": [],
"displayMode": "list",
"placement": "right"
},
"tooltip": {
"mode": "multi"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawSql": "SELECT\n $__timeGroupAlias(s.interval_time,'30m'),\n COALESCE(p.name, p.peer_addr::text) AS metric,\n SUM(s.advertise_avg + s.withdraw_avg) AS \"Updates\"\nFROM stats_peer_update_counts s\nJOIN bgp_peers p ON p.hash_id = s.peer_hash_id\nWHERE $__timeFilter(s.interval_time)\nGROUP BY 1, 2\nORDER BY 1",
"refId": "A"
}
],
"title": "Update Rate by Peer (30-min buckets)",
"type": "timeseries"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp",
"bgp",
"churn",
"obmp-nav"
],
"time": {
"from": "now-24h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "BGP Update Rate & Churn",
"uid": "obmp-learn-01",
"version": 1
}


@ -25,7 +25,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 7,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -497,7 +509,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -25,7 +25,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 8,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -231,7 +243,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [


@ -26,7 +26,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 9,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -141,10 +153,6 @@
"type": "table"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
@ -152,46 +160,42 @@
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"links": [],
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"lineInterpolation": "smooth",
"lineWidth": 1,
"fillOpacity": 15,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto"
}
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 11,
"x": 0,
"y": 6
},
"hiddenSeries": false,
"id": 1,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"show": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
@ -222,43 +226,10 @@
]
}
],
"thresholds": [],
"timeRegions": [],
"title": "Prefix Advertisements & Withdrawals",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:289",
"format": "none",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:290",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
@ -266,49 +237,42 @@
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"links": [],
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"lineInterpolation": "smooth",
"lineWidth": 1,
"fillOpacity": 15,
"showPoints": "never",
"spanNulls": false,
"axisPlacement": "auto"
}
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 13,
"x": 11,
"y": 6
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"sort": "total",
"sortDesc": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"displayMode": "list",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
@ -338,39 +302,8 @@
]
}
],
"thresholds": [],
"timeRegions": [],
"title": "Changes by Peer",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:346",
"decimals": 0,
"format": "none",
"label": "",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:347",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@ -505,7 +438,9 @@
"schemaVersion": 37,
"style": "dark",
"tags": [
"obmp-history"
"obmp-history",
"obmp",
"obmp-nav"
],
"templating": {
"list": [

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 11,
"iteration": 1654876675775,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -949,7 +961,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-tops"
"obmp-tops",
"obmp",
"obmp-nav"
],
"templating": {
"list": [

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 12,
"iteration": 1654876366831,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -1268,7 +1280,9 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-tops"
"obmp-tops",
"obmp",
"obmp-nav"
],
"templating": {
"list": [

@ -1,780 +0,0 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 19,
"iteration": 1654877653557,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in router's RIB.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 5,
"w": 6,
"x": 0,
"y": 0
},
"id": 9,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(r.timestamp))) as time,\n CASE WHEN v.router_hash_id is null THEN 'Not in Router RIB' ELSE 'In Router Rib' END as metric,\n 1 as value\nFROM routers r\n left join (select distinct router_hash_id\n from v_l3vpn_routes\n where prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by router_hash_id) v \n on (r.hash_id = v.router_hash_id)\nWHERE r.state = 'up'\nGROUP BY r.hash_id,v.router_hash_id\norder by time\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Router Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in peer RIB's",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 5,
"w": 6,
"x": 6,
"y": 0
},
"id": 10,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(p.timestamp))) as time,\n CASE WHEN v.peer_hash_id is null THEN 'Not in Peers RIB' ELSE 'In Peer RIB' END as metric,\n 1 as value\nFROM bgp_peers p\n left join (select peer_hash_id,isipv4\n from l3vpn_rib \n where prefix = '$prefix' and prefix != '0.0.0.0/0'\n AND ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by peer_hash_id,isipv4) v \n on (p.hash_id = v.peer_hash_id)\nWHERE p.isipv4 = CASE WHEN family('$prefix') = 4 THEN true ELSE false END\n AND p.state = 'up'\nGROUP BY p.hash_id,v.peer_hash_id,p.isipv4\norder by time\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Peer Visibility",
"type": "piechart"
},
{
"circleMaxSize": "15",
"circleMinSize": 2,
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"esMetric": "Count",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"hideEmpty": false,
"hideZero": false,
"id": 17,
"initialZoom": "1",
"locationData": "table",
"mapCenter": "(0°, 0°)",
"mapCenterLatitude": 0,
"mapCenterLongitude": 0,
"maxDataPoints": 1,
"mouseWheelZoom": false,
"showLegend": false,
"stickyLabels": false,
"tableQueryOptions": {
"geohashField": "geohash",
"labelField": "name",
"latitudeField": "latitude",
"longitudeField": "longitude",
"metricField": "value",
"queryType": "coordinates"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n 10 as value, latitude, longitude, stateprov as name\nFROM geo_ip\nWHERE\n ip && '$input'\nORDER BY ip desc limit 1",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"thresholds": "0,10",
"title": "Prefix Location",
"type": "grafana-worldmap-panel",
"unitPlural": "",
"unitSingle": "",
"valueName": "current"
},
{
"columns": [],
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fontSize": "100%",
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 8
},
"id": 12,
"links": [],
"scroll": true,
"showHeader": true,
"sort": {
"col": 0,
"desc": true
},
"styles": [
{
"alias": "Time",
"align": "auto",
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"pattern": "Time",
"type": "date"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"dateFormat": "YYYY-MM-DD HH:mm:ss",
"decimals": 2,
"mappingType": 1,
"pattern": "raw_output",
"preserveFormat": true,
"sanitize": false,
"thresholds": [],
"type": "string",
"unit": "short"
},
{
"alias": "",
"align": "auto",
"colors": [
"rgba(245, 54, 54, 0.9)",
"rgba(237, 129, 40, 0.89)",
"rgba(50, 172, 45, 0.97)"
],
"decimals": 2,
"pattern": "/.*/",
"thresholds": [],
"type": "string",
"unit": "short"
}
],
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct origin_as,i.as_name,org_id,org_name,remarks,address,city,state_prov,country,raw_output,source\n from l3vpn_rib r LEFT JOIN info_asn i ON (i.asn = r.origin_as)\n where r.prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and origin_as > 0\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "ASN Info",
"transform": "table",
"type": "table-old"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "locale"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "lastmodified"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "prefix"
},
"properties": [
{
"id": "displayName",
"value": "Prefix"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "Prefix History ",
"url": "/d/l3vpn-prefix-hist/prefix-history-by-prefix-l3vpn?orgId=1&var-input=${__value.text}&var-rd=$rd"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "origin_as"
},
"properties": [
{
"id": "displayName",
"value": "Origin"
},
{
"id": "unit",
"value": "none"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "ASN View",
"url": "/grafana/d/asnview/asn-view?orgId=1&var-asn_num=${__value.text}"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "iswithdrawn"
},
"properties": [
{
"id": "displayName",
"value": "Withdrawn"
},
{
"id": "unit",
"value": "bool"
},
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "custom.align",
"value": "auto"
},
{
"id": "color",
"value": {
"mode": "continuous-GrYlRd"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": 194
}
]
}
]
},
"gridPos": {
"h": 23,
"w": 24,
"x": 0,
"y": 14
},
"id": 3,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": []
},
"pluginVersion": "8.5.4",
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,extcommunities,communities,largecommunities\n from v_l3vpn_routes\n \t\twhere prefix && '$input' \n \t\t AND peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\tlimit 2000\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Looking Glass",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-l3vpn"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "80.0.0.2",
"value": "80.0.0.2"
},
"hide": 0,
"label": "Prefix/IP",
"name": "input",
"options": [
{
"selected": true,
"text": "80.0.0.2",
"value": "80.0.0.2"
}
],
"query": "80.0.0.2",
"queryValue": "50.227.215.188",
"skipUrlSync": false,
"type": "textbox"
},
{
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name as __text, hash_id as __value from routers where state = 'up'",
"hide": 0,
"includeAll": true,
"label": "Router",
"multi": true,
"name": "router_hash",
"options": [],
"query": "select name as __text, hash_id as __value from routers where state = 'up'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"hide": 0,
"includeAll": true,
"label": "Peer",
"multi": true,
"name": "peer_hash",
"options": [],
"query": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "prefix",
"options": [],
"query": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
},
{
"description": "RD in the format of N:N. Set to - for all.",
"hide": 2,
"label": "RD",
"name": "rd",
"query": "-",
"skipUrlSync": false,
"type": "constant"
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
],
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "",
"title": "Looking Glass - L3VPN",
"uid": "jiQW6VB7k",
"version": 1,
"weekStart": ""
}

@ -26,7 +26,19 @@
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 20,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -145,245 +157,192 @@
"type": "table"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 6
},
"hiddenSeries": false,
"id": 1,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"max",
"mean",
"sum"
],
"displayMode": "table",
"placement": "right",
"showLegend": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n interval_time as time,\n sum(updates) as updates, sum(withdraws) as withdraws\nFROM stats_l3vpn_chg_byprefix s\nWHERE $__timeFilter(interval_time)\n AND peer_hash_id in ($peer_hash)\n ${prefix_clause:raw}\n\ngroup by interval_time\nORDER BY interval_time ASC\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"thresholds": [],
"timeRegions": [],
"title": "Prefix Advertisements & Withdrawals",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:289",
"format": "none",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:290",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"decimals": 0,
"fieldConfig": {
"defaults": {
"links": []
"color": {
"mode": "palette-classic"
},
"custom": {
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"decimals": 0,
"links": [],
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 6
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": true,
"current": false,
"max": true,
"min": false,
"rightSide": true,
"show": true,
"sort": "total",
"sortDesc": true,
"total": true,
"values": true
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
"legend": {
"calcs": [
"max",
"mean",
"sum"
],
"displayMode": "table",
"placement": "right",
"showLegend": true,
"sortBy": "Total",
"sortDesc": true
},
"tooltip": {
"mode": "multi",
"sort": "none"
}
},
"percentage": false,
"pluginVersion": "9.1.7",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT\n interval_time as time,\n sum(updates) + sum(withdraws) as value,\n left(PeerName,32) as metric\nFROM stats_l3vpn_chg_byprefix s\n JOIN v_peers p ON (s.peer_hash_id = p.peer_hash_id)\nWHERE $__timeFilter(interval_time)\n AND s.peer_hash_id in ($peer_hash)\n ${prefix_clause:raw}\n\nGROUP BY s.interval_time,peername\nORDER BY interval_time ASC\n\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"thresholds": [],
"timeRegions": [],
"title": "Changes by Peer",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"mode": "time",
"show": true,
"values": []
},
"yaxes": [
{
"$$hashKey": "object:346",
"decimals": 0,
"format": "none",
"label": "",
"logBase": 1,
"show": true
},
{
"$$hashKey": "object:347",
"format": "short",
"logBase": 1,
"show": false
}
],
"yaxis": {
"align": false
}
"type": "timeseries"
},
{
"datasource": {
@ -537,9 +496,11 @@
}
],
"refresh": "",
"schemaVersion": 37,
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"l3vpn",
"obmp-l3vpn"
],
"templating": {

@ -21,14 +21,39 @@
}
]
},
"description": "L3VPN RIB browser combined with the per-prefix Looking Glass: route counts by RD, prefix visibility, geolocation and ASN ownership.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 21,
"iteration": 1654877634754,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 20,
"panels": [],
"title": "RIB Browser",
"type": "row"
},
{
"datasource": {
"type": "postgres",
@ -76,7 +101,7 @@
"h": 8,
"w": 11,
"x": 0,
"y": 0
"y": 1
},
"id": 5,
"options": {
@ -100,7 +125,7 @@
"xTickLabelRotation": 0,
"xTickLabelSpacing": 0
},
"pluginVersion": "8.3.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@ -108,31 +133,9 @@
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select\n count(*) as count,\n rd\n from l3vpn_rib\n where\n peer_hash_id in ($peer_hash)\n and ('$rd' = '-' or rd = '$rd')\n and iswithdrawn = false\n group by rd\n",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Routes Advertised/Active",
@ -208,9 +211,9 @@
},
"gridPos": {
"h": 8,
"w": 12,
"w": 13,
"x": 11,
"y": 0
"y": 1
},
"id": 6,
"options": {
@ -234,7 +237,7 @@
"xTickLabelRotation": 0,
"xTickLabelSpacing": 0
},
"pluginVersion": "8.3.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
@ -242,31 +245,9 @@
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select\n count(*) as count,\n rd\n from l3vpn_rib\n where\n peer_hash_id in ($peer_hash)\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = true\n group by rd\n",
"refId": "A",
"select": [
[
{
"params": [
"latitude"
],
"type": "column"
}
]
],
"table": "v_ip_routes_geo",
"timeColumn": "lastmodified",
"timeColumnType": "timestamp",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Routes Withdrawn/Inactive",
@ -429,10 +410,10 @@
]
},
"gridPos": {
"h": 23,
"h": 18,
"w": 24,
"x": 0,
"y": 8
"y": 9
},
"id": 3,
"links": [],
@ -447,39 +428,17 @@
"showHeader": true,
"sortBy": []
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"alias": "",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,communities,extcommunities\n from v_l3vpn_routes\n \t\twhere \n \t\t peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\t AND (iswithdrawn in ($state))\n \t\tlimit $limit\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Looking Glass (RD = $rd)",
@ -492,11 +451,555 @@
}
],
"type": "table"
},
{
"collapsed": false,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 27
},
"id": 21,
"panels": [],
"title": "Looking Glass - Prefix Lookup ($input)",
"type": "row"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in router's RIB.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 0,
"y": 28
},
"id": 9,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(r.timestamp))) as time,\n CASE WHEN v.router_hash_id is null THEN 'Not in Router RIB' ELSE 'In Router Rib' END as metric,\n 1 as value\nFROM routers r\n left join (select distinct router_hash_id\n from v_l3vpn_routes\n where prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by router_hash_id) v \n on (r.hash_id = v.router_hash_id)\nWHERE r.state = 'up'\nGROUP BY r.hash_id,v.router_hash_id\norder by time\n\n",
"refId": "A"
}
],
"title": "Router Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Prefix found in peer RIB's",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 6,
"x": 6,
"y": 28
},
"id": 10,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"rawQuery": true,
"rawSql": "SELECT\n floor(extract(epoch from max(p.timestamp))) as time,\n CASE WHEN v.peer_hash_id is null THEN 'Not in Peers RIB' ELSE 'In Peer RIB' END as metric,\n 1 as value\nFROM bgp_peers p\n left join (select peer_hash_id,isipv4\n from l3vpn_rib \n where prefix = '$prefix' and prefix != '0.0.0.0/0'\n AND ('$rd' = '-' OR rd = '$rd')\n and iswithdrawn = false group by peer_hash_id,isipv4) v \n on (p.hash_id = v.peer_hash_id)\nWHERE p.isipv4 = CASE WHEN family('$prefix') = 4 THEN true ELSE false END\n AND p.state = 'up'\nGROUP BY p.hash_id,v.peer_hash_id,p.isipv4\norder by time\n\n",
"refId": "A"
}
],
"title": "Peer Visibility",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Geolocation of the looked-up prefix.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 28
},
"id": 17,
"options": {
"basemap": {
"config": {},
"name": "Layer 0",
"type": "default"
},
"controls": {
"mouseWheelZoom": true,
"showAttribution": true,
"showDebug": false,
"showMeasure": false,
"showScale": false,
"showZoom": true
},
"layers": [
{
"config": {
"showLegend": false,
"style": {
"color": {
"fixed": "dark-orange"
},
"opacity": 0.4,
"rotation": {
"fixed": 0,
"max": 360,
"min": -360,
"mode": "mod"
},
"size": {
"fixed": 8,
"max": 15,
"min": 2
},
"symbol": {
"fixed": "img/icons/marker/circle.svg",
"mode": "fixed"
},
"textConfig": {
"fontSize": 12,
"offsetX": 0,
"offsetY": 0,
"textAlign": "center",
"textBaseline": "middle"
}
}
},
"location": {
"latitude": "latitude",
"longitude": "longitude",
"mode": "coords"
},
"name": "Prefix Location",
"tooltip": true,
"type": "markers"
}
],
"tooltip": {
"mode": "details"
},
"view": {
"id": "zero",
"lat": 0,
"lon": 0,
"zoom": 1
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT\n 10 as value, latitude, longitude, stateprov as name\nFROM geo_ip\nWHERE\n ip && '$input'\nORDER BY ip desc limit 1",
"refId": "A"
}
],
"title": "Prefix Location",
"type": "geomap"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"description": "Origin-AS ownership for the looked-up prefix.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 6,
"w": 24,
"x": 0,
"y": 36
},
"id": 12,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "select distinct origin_as,i.as_name,org_id,org_name,remarks,address,city,state_prov,country,raw_output,source\n from l3vpn_rib r LEFT JOIN info_asn i ON (i.asn = r.origin_as)\n where r.prefix = '$prefix'\n and ('$rd' = '-' OR rd = '$rd')\n and origin_as > 0\n",
"refId": "A"
}
],
"title": "ASN Info",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unit": "locale"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "lastmodified"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "prefix"
},
"properties": [
{
"id": "displayName",
"value": "Prefix"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "Prefix History ",
"url": "/d/l3vpn-prefix-hist/prefix-history-by-prefix-l3vpn?orgId=1&var-input=${__value.text}&var-rd=$rd"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "origin_as"
},
"properties": [
{
"id": "displayName",
"value": "Origin"
},
{
"id": "unit",
"value": "none"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "ASN View",
"url": "/grafana/d/asnview/asn-view?orgId=1&var-asn_num=${__value.text}"
}
]
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "iswithdrawn"
},
"properties": [
{
"id": "displayName",
"value": "Withdrawn"
},
{
"id": "unit",
"value": "bool"
},
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "custom.align",
"value": "auto"
},
{
"id": "color",
"value": {
"mode": "continuous-GrYlRd"
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": 194
}
]
}
]
},
"gridPos": {
"h": 18,
"w": 24,
"x": 0,
"y": 42
},
"id": 13,
"links": [],
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true,
"sortBy": []
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "select distinct ip.*, \n \tFIRST_VALUE(geo_ip.city) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as city,\n \tFIRST_VALUE(geo_ip.stateprov) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as stateprov,\n \tFIRST_VALUE(geo_ip.country) OVER (PARTITION BY ip.prefix ORDER BY geo_ip.ip DESC) as country,\n ls.local_router_name\n\tFROM (SELECT lastmodified,peername,rd,prefix,\n \tiswithdrawn,origin_as,med,localpref,nh,as_path,extcommunities,communities,largecommunities\n from v_l3vpn_routes\n \t\twhere prefix && '$input' \n \t\t AND peer_hash_id in ($peer_hash)\n \t\t AND ('$rd' = '-' OR rd = '$rd')\n \t\tlimit 2000\n \t) ip\n\t\tLEFT JOIN geo_ip on (geo_ip.ip >>= ip.prefix AND geo_ip.ip != '0.0.0.0/0')\n LEFT JOIN v_ls_prefixes ls ON (ls.prefix >>= ip.nh and length(ls.local_router_name) > 0)",
"refId": "A"
}
],
"title": "Looking Glass (Prefix Lookup)",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"l3vpn",
"obmp-l3vpn"
],
"templating": {
@ -534,10 +1037,13 @@
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
"selected": true,
"text": [
"All"
],
"value": [
"$__all"
]
},
"datasource": {
"type": "postgres",
@ -545,7 +1051,7 @@
},
"definition": "select peername as __text, peer_hash_id as __value from v_peers where router_hash_id in ($router_hash) and recvcapabilities like '% afi=1 safi=128 %';",
"hide": 0,
"includeAll": false,
"includeAll": true,
"label": "Peer",
"multi": true,
"name": "peer_hash",
@ -639,6 +1145,7 @@
},
"hide": 0,
"includeAll": true,
"label": "State",
"multi": true,
"name": "state",
"options": [
@ -662,6 +1169,51 @@
"queryValue": "",
"skipUrlSync": false,
"type": "custom"
},
{
"current": {
"selected": false,
"text": "80.0.0.2",
"value": "80.0.0.2"
},
"hide": 0,
"label": "Prefix/IP Lookup",
"name": "input",
"options": [
{
"selected": true,
"text": "80.0.0.2",
"value": "80.0.0.2"
}
],
"query": "80.0.0.2",
"queryValue": "50.227.215.188",
"skipUrlSync": false,
"type": "textbox"
},
{
"current": {
"isNone": true,
"selected": false,
"text": "None",
"value": ""
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "prefix",
"options": [],
"query": "select prefix from l3vpn_rib \nwhere prefix >>= '$input' and peer_hash_id in ($peer_hash) and ('$rd' = '-' OR rd = '$rd')\norder by prefix desc limit 1",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
@ -695,8 +1247,8 @@
]
},
"timezone": "",
"title": "L3VPN RIB Browser",
"title": "L3VPN RIB & Looking Glass",
"uid": "v-cdzIBnz",
"version": 1,
"version": 2,
"weekStart": ""
}
}

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 14,
"iteration": 1654877691622,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -278,6 +290,8 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {

@ -1,479 +0,0 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"target": {
"limit": 100,
"matchAny": false,
"tags": [],
"type": "dashboard"
},
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 15,
"iteration": 1654877712696,
"links": [],
"liveNow": false,
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 7,
"x": 0,
"y": 0
},
"id": 4,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value",
"percent"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"sum"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as value, CASE WHEN iswithdrawn THEN 'WITHDRAWN' ELSE 'ACTIVE' END as metric\nfrom ls_links\nwhere local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'\ngroup by iswithdrawn\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Link States",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
}
},
"decimals": 0,
"mappings": [],
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 7,
"w": 6,
"x": 7,
"y": 0
},
"id": 6,
"links": [],
"maxDataPoints": 3,
"options": {
"legend": {
"calcs": [],
"displayMode": "table",
"placement": "right",
"values": [
"value"
]
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as value, CASE WHEN mt_id = 2 THEN 'IPv6' ELSE 'IPv4' END as metric\nfrom ls_links\nwhere local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'\ngroup by mt_id\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "Links by Type",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "timestamp"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "seq"
},
"properties": [
{
"id": "unit",
"value": "locale"
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"ACTIVE": {
"color": "semi-dark-green",
"index": 0
},
"WITHDRAWN": {
"color": "semi-dark-red",
"index": 1
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 24,
"x": 0,
"y": 7
},
"id": 2,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"targets": [
{
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT state,local_router_name,local_igp_routerid,remote_router_name,remote_igp_routerid,mt_id,igp_metric,protocol, timestamp, seq\n FROM v_ls_links\n WHERE local_node_hash_id = '$local_node_hash_id'\n AND peer_hash_id = '$peer_hash'",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
}
],
"title": "$local_node_name Links",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"transparent": true,
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-linkstate"
],
"templating": {
"list": [
{
"current": {
"selected": false,
"text": "yyz01-wxbb-crt01-lo0.webex.com",
"value": "367c22e4-57d9-2328-654b-96ea750e0267"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0\n ",
"hide": 0,
"includeAll": false,
"label": "BGP Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0\n ",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": false,
"text": "AMS10-WXBB-CRT02",
"value": "1ed1da6b-6f57-57aa-92f5-edda59049e9a"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name as __text, hash_id as __value from ls_nodes where peer_hash_id = '$peer_hash' and not igp_router_id ~ '\\..[1-9A-F]00$'",
"hide": 0,
"includeAll": false,
"label": "ISIS Node",
"multi": false,
"name": "local_node_hash_id",
"options": [],
"query": "select name as __text, hash_id as __value from ls_nodes where peer_hash_id = '$peer_hash' and not igp_router_id ~ '\\..[1-9A-F]00$'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 5,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"current": {
"selected": false,
"text": "AMS10-WXBB-CRT02",
"value": "AMS10-WXBB-CRT02"
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "select name from ls_nodes where hash_id = '$local_node_hash_id' and peer_hash_id = '$peer_hash'",
"hide": 2,
"includeAll": false,
"multi": false,
"name": "local_node_name",
"options": [],
"query": "select name from ls_nodes where hash_id = '$local_node_hash_id' and peer_hash_id = '$peer_hash'",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {
"refresh_intervals": [
"5s",
"10s",
"30s",
"1m",
"5m",
"15m",
"30m",
"1h",
"2h",
"1d"
]
},
"timezone": "",
"title": "LS Links",
"uid": "MPqNG_sWz",
"version": 1,
"weekStart": ""
}

@ -21,12 +21,24 @@
}
]
},
"description": "Combined BGP-LS node and link inventory for a selected BGP peer.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": 16,
"iteration": 1654877745288,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -54,7 +66,7 @@
"mode": "absolute",
"steps": [
{
"color": "green",
"color": "blue",
"value": null
}
]
@ -65,7 +77,7 @@
},
"gridPos": {
"h": 6,
"w": 3,
"w": 4,
"x": 0,
"y": 0
},
@ -84,36 +96,19 @@
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT count(*)\n FROM ls_nodes where peer_hash_id = '$peer_hash';",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Total Nodes",
@ -144,8 +139,8 @@
},
"gridPos": {
"h": 6,
"w": 7,
"x": 3,
"w": 10,
"x": 4,
"y": 0
},
"id": 8,
@ -173,33 +168,17 @@
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"hide": false,
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as count, \n CASE WHEN iswithdrawn THEN 'WITHDRAWN' ELSE 'ACTIVE' END as metric\nfrom ls_links\nwhere peer_hash_id = '$peer_hash'\ngroup by iswithdrawn\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Link States",
@ -230,8 +209,8 @@
},
"gridPos": {
"h": 6,
"w": 7,
"x": 10,
"w": 10,
"x": 14,
"y": 0
},
"id": 9,
@ -259,32 +238,17 @@
"sort": "none"
}
},
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "time_series",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "select floor(extract(epoch from max(timestamp))) as time,\n count(*) as count, \n CASE WHEN mt_id = 2 THEN 'IPv6' ELSE 'IPv4' END as metric\nfrom ls_links \nwhere peer_hash_id = '$peer_hash'\ngroup by metric\norder by time\n",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Links by Type",
@ -389,33 +353,17 @@
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"group": [],
"metricColumn": "none",
"rawQuery": true,
"rawSql": "SELECT state, nodename, routerid, protocol, timestamp, seq\n FROM v_ls_nodes\n where peer_hash_id = '$peer_hash'\n\n ",
"refId": "A",
"select": [
[
{
"params": [
"value"
],
"type": "column"
}
]
],
"timeColumn": "time",
"where": [
{
"name": "$__timeFilter",
"params": [],
"type": "macro"
}
]
"refId": "A"
}
],
"title": "Backbone ISIS Nodes",
@ -434,24 +382,146 @@
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"displayMode": "auto",
"filterable": true,
"inspect": false
},
"decimals": 0,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "none"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "timestamp"
},
"properties": [
{
"id": "displayName",
"value": "Time"
},
{
"id": "unit",
"value": "time: YYYY-MM-DD HH:mm:ss.SSS"
},
{
"id": "custom.align"
}
]
},
{
"matcher": {
"id": "byName",
"options": "seq"
},
"properties": [
{
"id": "unit",
"value": "locale"
}
]
},
{
"matcher": {
"id": "byName",
"options": "state"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background-solid"
},
{
"id": "mappings",
"value": [
{
"options": {
"ACTIVE": {
"color": "semi-dark-green",
"index": 0
},
"WITHDRAWN": {
"color": "semi-dark-red",
"index": 1
}
},
"type": "value"
}
]
}
]
}
]
},
"gridPos": {
"h": 2,
"h": 14,
"w": 24,
"x": 0,
"y": 19
},
"id": 2,
"options": {
"content": "\n\n",
"mode": "markdown"
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "8.5.4",
"type": "text"
"pluginVersion": "9.1.7",
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawQuery": true,
"rawSql": "SELECT state,local_router_name,local_igp_routerid,remote_router_name,remote_igp_routerid,mt_id,igp_metric,protocol, timestamp, seq\n FROM v_ls_links\n WHERE peer_hash_id = '$peer_hash'",
"refId": "A"
}
],
"title": "Backbone ISIS Links",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
}
],
"type": "table"
}
],
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {
@ -504,8 +574,8 @@
]
},
"timezone": "",
"title": "LS Nodes",
"title": "LS Nodes & Links",
"uid": "dzdSWlyWz",
"version": 1,
"version": 2,
"weekStart": ""
}
}

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 17,
"iteration": 1654877763755,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -265,6 +277,8 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {

@ -26,7 +26,19 @@
"graphTooltip": 0,
"id": 23,
"iteration": 1654877522167,
"links": [],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"liveNow": false,
"panels": [
{
@ -116,6 +128,8 @@
"schemaVersion": 36,
"style": "dark",
"tags": [
"obmp-nav",
"linkstate",
"obmp-linkstate"
],
"templating": {

@ -0,0 +1,761 @@
{
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "datasource",
"uid": "grafana"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": null,
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
],
"panels": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 0
},
"id": 1,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name as \"Local Router\", \n remote_router_name as \"Remote Router\",\n igp_metric as \"IGP Metric\",\n te_def_metric as \"TE Metric\",\n max_link_bw as \"Max BW (B/s)\",\n max_resv_bw as \"Max Reservable BW\",\n unreserved_bw as \"Unreserved BW\",\n admin_group as \"Admin Group\",\n protection_type as \"Protection\",\n srlg as \"SRLG\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY local_router_name, remote_router_name",
"refId": "A"
}
],
"title": "TE Link Capacity Map",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 10
},
"id": 2,
"options": {
"barRadius": 0,
"barWidth": 0.97,
"groupWidth": 0.7,
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"orientation": "auto",
"showValue": "auto",
"stacking": "none",
"tooltip": {
"mode": "single",
"sort": "none"
},
"xTickLabelRotation": -45
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name || ' -> ' || remote_router_name as \"Link\",\n igp_metric as \"IGP Metric\",\n COALESCE(te_def_metric, igp_metric) as \"TE Metric\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY igp_metric DESC",
"refId": "A"
}
],
"title": "IGP Metric vs TE Metric Comparison",
"type": "barchart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 6,
"x": 12,
"y": 10
},
"id": 3,
"options": {
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COALESCE(admin_group::text, 'None') as \"Admin Group\",\n COUNT(*) as \"Link Count\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nGROUP BY admin_group\nORDER BY \"Link Count\" DESC",
"refId": "A"
}
],
"title": "Admin Group Distribution",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
}
},
"overrides": []
},
"gridPos": {
"h": 10,
"w": 6,
"x": 18,
"y": 10
},
"id": 4,
"options": {
"legend": {
"displayMode": "list",
"placement": "bottom"
},
"pieType": "pie",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": true
},
"tooltip": {
"mode": "single",
"sort": "none"
}
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COALESCE(protection_type, 'None') as \"Protection Type\",\n COUNT(*) as \"Link Count\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nGROUP BY protection_type\nORDER BY \"Link Count\" DESC",
"refId": "A"
}
],
"title": "Link Protection Types",
"type": "piechart"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 20
},
"id": 5,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT nodename as \"Node\",\n routerid as \"Router ID\",\n protocol as \"Protocol\",\n sr_capabilities as \"SR Capabilities (SRGB)\"\nFROM v_ls_nodes\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY nodename",
"refId": "A"
}
],
"title": "SR Node Capabilities",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 20
},
"id": 6,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT n.nodename as \"Node\",\n p.prefix::text as \"Prefix\",\n p.prefix_len as \"Len\",\n p.metric as \"Metric\",\n p.sr_prefix_sids as \"Prefix SID\",\n p.protocol::text as \"Protocol\"\nFROM ls_prefixes p\nJOIN ls_nodes n ON n.hash_id = p.local_node_hash_id \n AND n.peer_hash_id = p.peer_hash_id\nWHERE p.peer_hash_id = '$peer_hash' AND p.iswithdrawn = false\nORDER BY n.nodename, p.prefix",
"refId": "A"
}
],
"title": "SR Prefix SIDs",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 28
},
"id": 7,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT local_router_name as \"Local\",\n remote_router_name as \"Remote\",\n sr_adjacency_sids as \"Adjacency SIDs\",\n peer_node_sid as \"Peer Node SID\",\n mpls_proto_mask::text as \"MPLS Proto\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false\nORDER BY local_router_name, remote_router_name",
"refId": "A"
}
],
"title": "SR Adjacency SIDs",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"custom": {
"align": "auto",
"displayMode": "auto"
}
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 28
},
"id": 8,
"options": {
"footer": {
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT srlg as \"SRLG Value\",\n COUNT(*) as \"Link Count\",\n string_agg(DISTINCT local_router_name || ' -> ' || remote_router_name, ', ') as \"Links\"\nFROM v_ls_links\nWHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false \n AND srlg IS NOT NULL AND srlg != ''\nGROUP BY srlg\nORDER BY COUNT(*) DESC",
"refId": "A"
}
],
"title": "SRLG Groups",
"type": "table"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 0,
"y": 36
},
"id": 9,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND te_def_metric IS NOT NULL",
"refId": "A"
}
],
"title": "Links with TE Metric",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 5,
"y": 36
},
"id": 10,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND max_link_bw IS NOT NULL AND max_link_bw > 0",
"refId": "A"
}
],
"title": "Links with Bandwidth",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 10,
"y": 36
},
"id": 11,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND srlg IS NOT NULL AND srlg != ''",
"refId": "A"
}
],
"title": "Links with SRLG",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 5,
"x": 15,
"y": 36
},
"id": 12,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_nodes WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND sr_capabilities IS NOT NULL AND sr_capabilities != ''",
"refId": "A"
}
],
"title": "Nodes with SR",
"type": "stat"
},
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
}
},
"overrides": []
},
"gridPos": {
"h": 4,
"w": 4,
"x": 20,
"y": 36
},
"id": 13,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"textMode": "auto"
},
"targets": [
{
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"format": "table",
"rawSql": "SELECT COUNT(*) FROM v_ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false AND sr_adjacency_sids IS NOT NULL AND sr_adjacency_sids != ''",
"refId": "A"
}
],
"title": "Links with Adj SID",
"type": "stat"
},
{
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 40
},
"id": 14,
"options": {
"code": {
"language": "plaintext",
"showLineNumbers": false,
"showMiniMap": false
},
"content": "## Traffic Engineering & Segment Routing Analytics\n\nThis dashboard exposes TE and SR attributes from BGP-LS (RFC 7752) that OpenBMP collects but existing dashboards don't display.\n\n### TE Fields (from ls_links)\n- **admin_group**: Link color/affinity bitmap for RSVP-TE constraints\n- **max_link_bw / max_resv_bw**: Link capacity in bytes/sec\n- **unreserved_bw**: Available bandwidth per priority level\n- **te_def_metric**: TE metric (may differ from IGP metric)\n- **protection_type**: FRR protection (unprotected, shared, dedicated, etc.)\n- **srlg**: Shared Risk Link Group for diverse path computation\n\n### SR Fields\n- **sr_capabilities**: Node SRGB (Segment Routing Global Block) range\n- **sr_prefix_sids**: Prefix SID for SR-MPLS forwarding\n- **sr_adjacency_sids**: Adjacency SIDs for SR-TE path steering\n- **peer_node_sid**: BGP EPE SID (RFC 9086)\n\n### Notes\n- NULL values indicate the router is not advertising that TLV\n- To enable TE metrics on IOS-XR: `mpls traffic-eng` under IS-IS\n- To enable SR: `segment-routing mpls` under IS-IS with prefix-sid-map",
"mode": "markdown"
},
"title": "About This Dashboard",
"type": "text"
}
],
"schemaVersion": 39,
"tags": [
"obmp-learning",
"obmp",
"obmp-nav"
],
"templating": {
"list": [
{
"current": {},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"definition": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"hide": 0,
"includeAll": false,
"label": "BGP Peer",
"multi": false,
"name": "peer_hash",
"options": [],
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"type": "query"
}
]
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "TE & Segment Routing Analytics",
"uid": "obmp-learn-08",
"version": 1
}


@ -0,0 +1,369 @@
{
"uid": "obmp-learn-09",
"title": "Topology Change & Anomaly Detection",
"tags": [
"obmp-learning",
"obmp",
"obmp-nav"
],
"editable": true,
"schemaVersion": 39,
"time": {
"from": "now-6h",
"to": "now"
},
"templating": {
"list": [
{
"name": "peer_hash",
"label": "BGP Peer",
"type": "query",
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"query": "SELECT __text,__value FROM (\n select peername as __text, peer_hash_id as __value, count(*) as count\n from v_ls_nodes\n group by peername,peer_hash_id) d\nwhere count > 0",
"refresh": 1,
"multi": false
}
]
},
"panels": [
{
"id": 1,
"title": "Link State Changes Over Time",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Links Up\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Links Down\"\nFROM ls_links_log\nWHERE $__timeFilter(timestamp) AND peer_hash_id = '$peer_hash'\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 2,
"title": "Node Changes Over Time",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 0
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Nodes Appeared\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Nodes Withdrawn\"\nFROM ls_nodes_log\nWHERE $__timeFilter(timestamp) AND peer_hash_id = '$peer_hash'\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 3,
"title": "BGP Peer Session Events",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 8
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(pel.timestamp, '5m') as time,\n SUM(CASE WHEN pel.state = 'up' THEN 1 ELSE 0 END) as \"Sessions Up\",\n SUM(CASE WHEN pel.state = 'down' THEN 1 ELSE 0 END) as \"Sessions Down\"\nFROM peer_event_log pel\nWHERE $__timeFilter(pel.timestamp)\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 4,
"title": "RIB Update Rate",
"type": "timeseries",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 8
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT $__timeGroupAlias(timestamp, '5m') as time,\n SUM(CASE WHEN iswithdrawn = false THEN 1 ELSE 0 END) as \"Advertisements\",\n SUM(CASE WHEN iswithdrawn = true THEN 1 ELSE 0 END) as \"Withdrawals\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY 1 ORDER BY 1",
"format": "time_series",
"refId": "A"
}
]
},
{
"id": 5,
"title": "Origin AS Changes (Potential Hijacks)",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 0,
"y": 16
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT DISTINCT ON (r1.prefix, r1.prefix_len)\n r1.prefix::text as \"Prefix\",\n r1.prefix_len as \"Len\",\n r1.origin_as as \"Current Origin AS\",\n r2.origin_as as \"Previous Origin AS\",\n r1.timestamp as \"Changed At\"\nFROM ip_rib_log r1\nJOIN ip_rib_log r2 ON r1.prefix = r2.prefix\n AND r1.prefix_len = r2.prefix_len\n AND r1.timestamp > r2.timestamp\nWHERE r1.origin_as != r2.origin_as\n AND $__timeFilter(r1.timestamp)\nORDER BY r1.prefix, r1.prefix_len, r1.timestamp DESC, r2.timestamp DESC\nLIMIT 50",
"format": "table",
"refId": "A"
}
]
},
{
"id": 6,
"title": "Most Churned Prefixes",
"type": "table",
"gridPos": {
"h": 10,
"w": 12,
"x": 12,
"y": 16
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT prefix::text as \"Prefix\",\n prefix_len as \"Len\",\n COUNT(*) as \"Total Updates\",\n SUM(CASE WHEN iswithdrawn THEN 1 ELSE 0 END) as \"Withdrawals\",\n MIN(timestamp) as \"First Seen\",\n MAX(timestamp) as \"Last Change\",\n CASE \n WHEN COUNT(*) <= 2 THEN 'Stable'\n WHEN COUNT(*) <= 10 THEN 'Moderate'\n ELSE 'Unstable'\n END as \"Stability\"\nFROM ip_rib_log\nWHERE $__timeFilter(timestamp)\nGROUP BY prefix, prefix_len\nHAVING COUNT(*) > 1\nORDER BY COUNT(*) DESC\nLIMIT 30",
"format": "table",
"refId": "A"
}
]
},
{
"id": 7,
"title": "Recent Link State Changes",
"type": "table",
"gridPos": {
"h": 10,
"w": 24,
"x": 0,
"y": 26
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT l.timestamp as \"Time\",\n CASE WHEN l.iswithdrawn THEN 'DOWN' ELSE 'UP' END as \"State\",\n ln.name as \"Local Node\",\n l.local_igp_router_id as \"Local IGP ID\",\n rn.name as \"Remote Node\",\n l.remote_igp_router_id as \"Remote IGP ID\",\n l.igp_metric as \"IGP Metric\",\n l.protocol::text as \"Protocol\"\nFROM ls_links_log l\nLEFT JOIN ls_nodes ln ON ln.hash_id = l.local_node_hash_id AND ln.peer_hash_id = l.peer_hash_id\nLEFT JOIN ls_nodes rn ON rn.hash_id = l.remote_node_hash_id AND rn.peer_hash_id = l.peer_hash_id\nWHERE $__timeFilter(l.timestamp) AND l.peer_hash_id = '$peer_hash'\nORDER BY l.timestamp DESC\nLIMIT 50",
"format": "table",
"refId": "A"
}
]
},
{
"id": 8,
"title": "Multi-Peer Route Consistency",
"type": "table",
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 36
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT r.prefix::text as \"Prefix\",\n r.prefix_len as \"Len\",\n COUNT(DISTINCT r.peer_hash_id) as \"Peer Count\",\n COUNT(DISTINCT ba.origin_as) as \"Distinct Origins\",\n COUNT(DISTINCT ba.as_path_count) as \"Distinct Path Lengths\",\n string_agg(DISTINCT ba.origin_as::text, ', ') as \"Origin ASNs\"\nFROM ip_rib r\nJOIN base_attrs ba ON ba.hash_id = r.base_attr_hash_id\nWHERE r.iswithdrawn = false AND r.isipv4 = true\nGROUP BY r.prefix, r.prefix_len\nHAVING COUNT(DISTINCT ba.origin_as) > 1\nORDER BY COUNT(DISTINCT ba.origin_as) DESC\nLIMIT 30",
"format": "table",
"refId": "A"
}
]
},
{
"id": 9,
"title": "Active Peers",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 0,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM bgp_peers WHERE state = 'up'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 10,
"title": "Total LS Links",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 4,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_links WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false",
"format": "table",
"refId": "A"
}
]
},
{
"id": 11,
"title": "Total LS Nodes",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 8,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_nodes WHERE peer_hash_id = '$peer_hash' AND iswithdrawn = false",
"format": "table",
"refId": "A"
}
]
},
{
"id": 12,
"title": "RIB Updates (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 12,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ip_rib_log WHERE timestamp > NOW() - INTERVAL '24 hours'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 13,
"title": "Link Changes (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 16,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(*) FROM ls_links_log WHERE timestamp > NOW() - INTERVAL '24 hours' AND peer_hash_id = '$peer_hash'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 14,
"title": "Origin Changes (24h)",
"type": "stat",
"gridPos": {
"h": 4,
"w": 4,
"x": 20,
"y": 44
},
"datasource": {
"type": "postgres",
"uid": "obmp_postgres"
},
"targets": [
{
"rawSql": "SELECT COUNT(DISTINCT r1.prefix) FROM ip_rib_log r1\nJOIN ip_rib_log r2 ON r1.prefix = r2.prefix AND r1.prefix_len = r2.prefix_len AND r1.timestamp > r2.timestamp\nWHERE r1.origin_as != r2.origin_as AND r1.timestamp > NOW() - INTERVAL '24 hours'",
"format": "table",
"refId": "A"
}
]
},
{
"id": 15,
"title": "About This Dashboard",
"type": "text",
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 36
},
"options": {
"mode": "markdown",
"content": "## Topology Change & Anomaly Detection\n\nThis dashboard provides heuristic analysis of BMP data to detect network anomalies:\n\n### What to Watch For\n- **Link flaps**: Rapid up/down cycles in the Link State Changes panel indicate instability\n- **Origin AS changes**: Could indicate a route hijack or legitimate migration\n- **Multi-origin prefixes**: Same prefix seen from different origin ASNs across peers\n- **Correlated events**: Peer session drops followed by mass withdrawals indicate convergence events\n\n### Testing with ExaBGP Scenarios\n1. Load `origin_shift` scenario to simulate origin AS changes\n2. Load `hijack_simulation` to see how shorter paths override legitimate routes\n3. Load/unload `churn` scenario repeatedly to generate instability patterns\n\n### Data Sources\n- **ls_links_log / ls_nodes_log**: TimescaleDB hypertables tracking all BGP-LS topology changes\n- **ip_rib_log**: All BGP RIB updates and withdrawals with timestamps\n- **peer_event_log**: BGP session state changes (up/down events)"
}
}
],
"links": [
{
"asDropdown": true,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"obmp-nav"
],
"title": "OBMP Dashboards",
"type": "dashboards"
}
]
}


@ -0,0 +1,71 @@
# OpenBMP — Grafana contact points & notification policy provisioning
# Grafana 9.1.7 (apiVersion: 1)
#
# Defines WHERE alert notifications go (contact points) and WHICH alerts go
# there (the notification policy tree). Pairs with obmp-alerts.yaml in this
# directory.
#
# ----------------------------------------------------------------------
# OPERATOR REVIEW — this file ships with PLACEHOLDERS. Fill them in.
# ----------------------------------------------------------------------
# * The 'obmp-ops' contact point below has BOTH an email and a webhook
# receiver as examples. Delete whichever you do not use and fill in real
# values for the one you keep.
# * EMAIL requires Grafana SMTP to be configured (the [smtp] section of
# grafana.ini, or GF_SMTP_* env vars on the obmp-grafana container).
# Without working SMTP the email receiver silently fails.
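# * WEBHOOK url: point it at an endpoint that accepts Grafana's generic
# webhook JSON (an internal handler or an alert relay). Slack and
# PagerDuty expect their own payload formats, so use Grafana's dedicated
# `slack` / `pagerduty` contact point types for those instead.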
# * After editing, restart Grafana and verify under
# Alerting > Contact points > (test).
# ----------------------------------------------------------------------
apiVersion: 1
# --- Contact points ----------------------------------------------------
contactPoints:
- orgId: 1
name: obmp-ops
receivers:
# ---- Email receiver (requires Grafana SMTP configured) ----
- uid: obmp-ops-email
type: email
settings:
# REPLACE with the real NOC / on-call distribution address(es).
# Comma-separate multiple recipients.
addresses: noc@example.net
singleEmail: false
disableResolveMessage: false
# ---- Webhook receiver (generic JSON POST to an internal handler or relay) ----
# Delete this block if you only use email.
- uid: obmp-ops-webhook
type: webhook
settings:
# REPLACE with your real webhook endpoint.
url: https://hooks.example.net/services/REPLACE-ME
httpMethod: POST
disableResolveMessage: false
# --- Notification policy tree -----------------------------------------
# The root policy routes every alert from obmp-alerts.yaml to 'obmp-ops'.
# Sub-routes split by the `severity` label so critical alerts can page
# faster / repeat sooner than warnings.
policies:
- orgId: 1
receiver: obmp-ops
# Group alerts that share these labels into a single notification.
group_by: ['alertname', 'service']
# Timing for the default (warning-ish) path.
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
routes:
# Critical alerts (peer down, router BMP down): notify fast, repeat
# more often until resolved.
- receiver: obmp-ops
matchers:
- severity = critical
group_wait: 10s
group_interval: 2m
repeat_interval: 1h
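
Since this file ships with placeholders, a pre-flight check before restarting Grafana can catch values that were never filled in. The following is a minimal sketch, not part of the repository: the function name `check_placeholders` is hypothetical, and it assumes the only placeholder markers are the literal strings `example.net` and `REPLACE-ME` used above.

```shell
# Hypothetical helper: fail if shipped placeholder values are still present.
check_placeholders() {
  local file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 2
  fi
  # Ignore YAML comment lines so the operator notes do not trigger a match,
  # then search the remaining lines for the placeholder markers.
  if grep -vE '^[[:space:]]*#' "$file" | grep -qE 'example\.net|REPLACE-ME'; then
    echo "placeholder values remain in $file" >&2
    return 1
  fi
  return 0
}
```

One possible use is `check_placeholders "${OBMP_DATA_ROOT}/grafana/provisioning/alerting/contact-points.yaml" && docker compose -p obmp restart grafana`, which restarts Grafana only once the placeholders are gone.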


@ -0,0 +1,270 @@
# OpenBMP — Grafana unified-alerting rule provisioning
# Grafana 9.1.7 (apiVersion: 1)
#
# Provisioned alert rules for the OpenBMP BGP-monitoring stack. They query the
# PostgreSQL datasource (uid: obmp_postgres) and fire on BGP peer/router
# session loss, peer flap storms, and RPKI-invalid routes.
#
# ----------------------------------------------------------------------
# DEPLOYMENT
# ----------------------------------------------------------------------
# This file is read by Grafana from /etc/grafana/provisioning/alerting/.
# The compose stack bind-mounts ${OBMP_DATA_ROOT}/grafana/provisioning into
# the container, so copy this directory there and restart Grafana:
#
# cp -r obmp-grafana/provisioning/alerting ${OBMP_DATA_ROOT}/grafana/provisioning/
# docker compose -p obmp restart grafana
#
# Pair it with contact-points.yaml (in this directory) for notifications.
#
# ----------------------------------------------------------------------
# OPERATOR REVIEW — fields you should check before relying on these
# ----------------------------------------------------------------------
# * folder: OBMP-Base (the group below names its folder by title, not by
# UID). This reuses the dashboard folder provisioned with folderUid '1001'
# so the rules have a home in the UI. Change it to a dedicated alerting
# folder if you prefer, and keep the title in sync if that folder is renamed.
# * datasourceUid: obmp_postgres — confirmed correct for this stack.
# * Thresholds and `for:` durations below are reasonable starting points.
# Tune them against your production baseline (40 full-table routers will
# have a different normal flap/churn profile than the lab).
# * The reduce/threshold expression refIds (B, C) are internal to each
# rule; do not rename them without updating the matching `expression:`
# and `condition:` references.
# * Alert-rule provisioning YAML is intricate. These definitions are
# intentionally minimal and well-commented. After first load, open each
# rule in the Grafana UI (Alerting > Alert rules) and confirm it
# evaluates without error before depending on it for paging.
# ----------------------------------------------------------------------
apiVersion: 1
groups:
- orgId: 1
name: OpenBMP BGP Health
folder: OBMP-Base
# How often every rule in this group is evaluated.
interval: 1m
rules:
# ------------------------------------------------------------------
# (a) BGP peer down within the last 15 minutes
# ------------------------------------------------------------------
# bgp_peers.state is an enum ('up'/'down'); .timestamp is the last
# state-change time. A peer whose state is 'down' AND changed within
# the last 15 min indicates a recent session loss.
- uid: obmp-peer-down
title: BGP Peer Down (recent)
condition: C
for: 5m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM bgp_peers
WHERE state = 'down'
AND timestamp > (now() AT TIME ZONE 'utc') - interval '15 minutes';
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Fire when one or more peers went down in the last 15 min.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: critical
service: bmp
annotations:
summary: One or more BGP peers went down in the last 15 minutes
description: >
{{ $values.B }} BGP peer(s) are in state 'down' with a state
change within the last 15 minutes. Check the OBMP peer
inventory and the affected routers.
# ------------------------------------------------------------------
# (b) Peer flap storm — >5 down-events for one peer in 1 hour
# ------------------------------------------------------------------
# peer_event_log records every peer state transition. Counting 'down'
# events per peer over the last hour detects a flapping session even
# if the peer is currently 'up'. The inner query groups per peer; the
# outer takes the worst offender's count.
- uid: obmp-peer-flap-storm
title: BGP Peer Flap Storm
condition: C
for: 0m
data:
- refId: A
relativeTimeRange: { from: 3600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT coalesce(max(c), 0)::float8 AS value
FROM (
SELECT count(*) AS c
FROM peer_event_log
WHERE state = 'down'
AND timestamp > (now() AT TIME ZONE 'utc') - interval '1 hour'
GROUP BY peer_hash_id
) s;
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# >5 down-events for a single peer within 1h = flap storm.
conditions:
- evaluator: { type: gt, params: [5] }
labels:
severity: warning
service: bmp
annotations:
summary: A BGP peer is flapping (more than 5 resets in the last hour)
description: >
At least one peer has logged {{ $values.B }} 'down' events in
peer_event_log within the last hour. Investigate link/session
instability on the affected peer.
# ------------------------------------------------------------------
# (c) RPKI-invalid routes present
# ------------------------------------------------------------------
# ip_rib has no RPKI column on this schema, so validity is derived by
# joining against rpki_validator (ROA cache, refreshed by the psql-app
# RPKI cron). Following RFC 6811, a route is "invalid" when at least one
# ROA covers its prefix but NO ROA matches both its origin AS and its
# prefix length (within the ROA's max length).
#
# NOTE: rpki_validator stays empty until the RPKI import (ENABLE_RPKI=1,
# runs roughly every 2h) has completed at least once. Until then this
# rule correctly reports 0.
- uid: obmp-rpki-invalid
title: RPKI-Invalid Routes Present
condition: C
for: 10m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM ip_rib r
WHERE r.iswithdrawn = false
AND r.origin_as IS NOT NULL
AND EXISTS (
SELECT 1 FROM rpki_validator v
WHERE r.prefix <<= v.prefix
)
AND NOT EXISTS (
SELECT 1 FROM rpki_validator v2
WHERE r.prefix <<= v2.prefix
AND r.prefix_len BETWEEN masklen(v2.prefix) AND v2.prefix_len_max
AND v2.origin_as = r.origin_as
);
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Any RPKI-invalid route is worth surfacing. Raise the param
# (e.g. to 10) if you expect a steady-state baseline of
# invalids and only want to alert on spikes.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: warning
service: routing-security
annotations:
summary: RPKI-invalid routes are present in the RIB
description: >
{{ $values.B }} route(s) in ip_rib are RPKI-invalid (a covering
ROA exists but none matches the route's origin AS and prefix length).
Possible mis-origination or hijack; review the RPKI Validation dashboard.
# ------------------------------------------------------------------
# (d) Router BMP session down
# ------------------------------------------------------------------
# routers.state is the BMP session state for each monitored router.
# 'down' means the router's BMP feed to the collector has dropped.
- uid: obmp-router-bmp-down
title: Router BMP Session Down
condition: C
for: 5m
data:
- refId: A
relativeTimeRange: { from: 600, to: 0 }
datasourceUid: obmp_postgres
model:
refId: A
datasource: { type: postgres, uid: obmp_postgres }
format: table
rawSql: >
SELECT count(*)::float8 AS value
FROM routers
WHERE state = 'down';
- refId: B
datasourceUid: __expr__
model:
refId: B
type: reduce
datasource: { type: __expr__, uid: __expr__ }
expression: A
reducer: last
- refId: C
datasourceUid: __expr__
model:
refId: C
type: threshold
datasource: { type: __expr__, uid: __expr__ }
expression: B
# Any router with a down BMP session.
conditions:
- evaluator: { type: gt, params: [0] }
labels:
severity: critical
service: bmp
annotations:
summary: One or more routers have a down BMP session
description: >
{{ $values.B }} router(s) are in BMP state 'down' — the
collector is no longer receiving BMP from them. Check the
router BMP config and reachability to the collector on port 5000.
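
Before depending on these rules for paging, each `rawSql` can be run by hand to confirm it returns a single numeric `value`. The sketch below is a convenience under stated assumptions, not part of the repository: `check_rule_sql` is a hypothetical name, and the container, user, and database defaults (`obmp-psql`, `openbmp`, `openbmp`) are borrowed from the backup script's defaults in this change set and may differ in your deployment.

```shell
# Hypothetical helper: run one rule's SQL through psql inside the Postgres
# container and require a single non-negative numeric result.
check_rule_sql() {
  local sql="$1" out
  out="$(docker exec -i "${OBMP_PG_CONTAINER:-obmp-psql}" \
    psql -U "${OBMP_PG_USER:-openbmp}" -d "${OBMP_PG_DB:-openbmp}" \
    -v ON_ERROR_STOP=1 -At -c "$sql")" || return 1
  # Reject empty output, multiple rows, or anything that is not digits/dots.
  case "$out" in
    ''|*[!0-9.]*)
      echo "unexpected result: $out" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$out"
}
```

For instance, `check_rule_sql "SELECT count(*)::float8 AS value FROM routers WHERE state = 'down';"` exercises the query behind the router BMP rule without waiting for Grafana's next evaluation.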


@ -27,7 +27,7 @@ providers:
# <int> Org id. Default to 1
orgId: 1
# <string> name of the dashboard folder.
folder: 'OBMP-Base'
folder: 'OBMP-Operations'
# <string> folder UID. will be automatically generated if not specified
folderUid: '1001'
# <string> provider type. Default to 'file'
@ -47,7 +47,7 @@ providers:
# <int> Org id. Default to 1
orgId: 1
# <string> name of the dashboard folder.
folder: 'OBMP-History'
folder: 'OBMP-Routing'
# <string> folder UID. will be automatically generated if not specified
folderUid: '1002'
# <string> provider type. Default to 'file'
@ -63,26 +63,6 @@ providers:
path: /var/lib/grafana/dashboards/obmp/History-1002
# <bool> use folder names from filesystem to create folders in Grafana
foldersFromFilesStructure: false
- name: 'OpenBMP-Tops'
# <int> Org id. Default to 1
orgId: 1
# <string> name of the dashboard folder.
folder: 'OBMP-Tops'
# <string> folder UID. will be automatically generated if not specified
folderUid: '1003'
# <string> provider type. Default to 'file'
type: file
# <bool> disable dashboard deletion
disableDeletion: false
# <int> how often Grafana will scan for changed dashboards
updateIntervalSeconds: 30
# <bool> allow updating provisioned dashboards from the UI
allowUiUpdates: true
options:
# <string, required> path to dashboard files on disk. Required when using the 'file' type
path: /var/lib/grafana/dashboards/obmp/Tops-1003
# <bool> use folder names from filesystem to create folders in Grafana
foldersFromFilesStructure: false
- name: 'OpenBMP-LinkState'
# <int> Org id. Default to 1
orgId: 1
@ -125,7 +105,7 @@ providers:
foldersFromFilesStructure: false
- name: 'OpenBMP-Learning'
orgId: 1
folder: 'OBMP-Learning'
folder: 'OBMP-Reference'
folderUid: '2001'
type: file
disableDeletion: false
@ -133,4 +113,15 @@ providers:
allowUiUpdates: true
options:
path: /var/lib/grafana/dashboards/Learning
foldersFromFilesStructure: false
- name: 'OpenBMP-Telemetry'
orgId: 1
folder: 'OBMP-Telemetry'
folderUid: '3001'
type: file
disableDeletion: false
updateIntervalSeconds: 30
allowUiUpdates: true
options:
path: /var/lib/grafana/dashboards/Telemetry-3001
foldersFromFilesStructure: false


@ -0,0 +1,16 @@
apiVersion: 1
datasources:
- name: InfluxDB-Telemetry
uid: obmp_influxdb
type: influxdb
access: proxy
url: http://obmp-influxdb:8086
jsonData:
version: Flux
organization: openbmp
defaultBucket: telemetry
secureJsonData:
token: openbmp-telemetry-token
isDefault: false
editable: true
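
A quick reachability check can confirm the datasource URL before debugging empty panels. This is a sketch under assumptions: `influx_healthy` is a hypothetical helper, it relies on the `/health` endpoint of InfluxDB 2.x, and the hostname `obmp-influxdb` only resolves from inside the compose network, so pass a different base URL when running from the host.

```shell
# Hypothetical helper: succeed only if InfluxDB reports status "pass".
influx_healthy() {
  local base_url="${1:-http://obmp-influxdb:8086}"
  # -f: fail on HTTP errors; -sS: stay quiet but still print errors.
  curl -fsS --max-time 5 "${base_url}/health" \
    | grep -q '"status":[[:space:]]*"pass"'
}
```

Calling `influx_healthy http://localhost:8086` from the host would work only if the compose file publishes that port, which this datasource file does not tell us.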

106
portal/index.html Normal file

@ -0,0 +1,106 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>OpenBMP Lab Portal</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
background: #111217;
color: #d8dee9;
min-height: 100vh;
display: flex;
flex-direction: column;
align-items: center;
padding: 2rem;
}
.header {
text-align: center;
margin-bottom: 2.5rem;
}
.header h1 {
font-size: 1.8rem;
color: #e2e8f0;
margin-bottom: 0.5rem;
}
.header p {
color: #7b8da0;
font-size: 0.95rem;
}
.grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(260px, 1fr));
gap: 1.25rem;
max-width: 900px;
width: 100%;
}
.card {
background: #1a1d26;
border: 1px solid #2a2e3a;
border-radius: 10px;
padding: 1.5rem;
text-decoration: none;
color: inherit;
transition: border-color 0.2s, transform 0.15s;
}
.card:hover {
border-color: #3b82f6;
transform: translateY(-2px);
}
.card .icon {
font-size: 2rem;
margin-bottom: 0.75rem;
display: block;
}
.card h2 {
font-size: 1.1rem;
color: #e2e8f0;
margin-bottom: 0.4rem;
}
.card p {
font-size: 0.85rem;
color: #7b8da0;
line-height: 1.4;
}
.footer {
margin-top: 3rem;
color: #4a5568;
font-size: 0.8rem;
text-align: center;
}
</style>
</head>
<body>
<div class="header">
<h1>OpenBMP Lab</h1>
<p>BGP Monitoring Protocol &middot; Route Analysis &middot; Telemetry</p>
</div>
<div class="grid">
<a href="/grafana/" class="card">
<span class="icon">&#x1F4CA;</span>
<h2>Grafana Dashboards</h2>
<p>BGP analytics, RR Loc-RIB diff, IS-IS topology, telemetry, and 27+ dashboards.</p>
</a>
<a href="/exabgp/" class="card">
<span class="icon">&#x1F6E4;</span>
<h2>ExaBGP Route Injector</h2>
<p>Inject and withdraw BGP routes into the lab fabric via ExaBGP API.</p>
</a>
<a href="/traffic/" class="card">
<span class="icon">&#x1F680;</span>
<h2>Traffic Generator</h2>
<p>RFC 2544 throughput, latency, and loss testing across the network.</p>
</a>
</div>
<div class="footer">
OpenBMP Docker Stack &middot; 9 IOS-XR Routers &middot; CML Lab
</div>
</body>
</html>

105
scripts/pg-backup.sh Executable file

@ -0,0 +1,105 @@
#!/usr/bin/env bash
#
# pg-backup.sh — logical backup of the OpenBMP PostgreSQL database.
#
# Performs a `pg_dump` of the `openbmp` database inside the obmp-psql
# container, writes a timestamped compressed dump to a backup directory,
# and prunes dumps older than the configured retention.
#
# Usage:
# ./pg-backup.sh
#
# Configuration (environment variables, all optional):
# OBMP_DATA_ROOT Base data dir. Default: /var/openbmp
# Backups go to ${OBMP_DATA_ROOT}/backups unless
# OBMP_BACKUP_DIR is set.
# OBMP_BACKUP_DIR Explicit backup directory. Overrides the default.
# OBMP_PG_CONTAINER Postgres container name. Default: obmp-psql
# OBMP_PG_DB Database name. Default: openbmp
# OBMP_PG_USER Database user. Default: openbmp
# OBMP_BACKUP_RETENTION_DAYS Prune dumps older than N days. Default: 14
#
# Output format:
# pg_dump custom format (-Fc), compressed by pg_dump itself (gzip by default).
# Restore with `pg_restore` — see docs/backup-restore.md.
#
# This script is idempotent and safe to run repeatedly. It does not stop
# the database; pg_dump takes a consistent MVCC snapshot of a live DB.
#
# Make it executable once:
# chmod +x scripts/pg-backup.sh
#
# ----------------------------------------------------------------------
# Scheduling via cron
# ----------------------------------------------------------------------
# Run `crontab -e` and add (daily at 02:30, log to a file):
#
# 30 2 * * * OBMP_DATA_ROOT=/var/openbmp /home/user/obmp-docker/scripts/pg-backup.sh >> /var/openbmp/backups/pg-backup.log 2>&1
#
# The script must be able to reach the Docker daemon, so run it as a user
# in the `docker` group (or root). For systemd-based hosts a
# systemd timer is an equally good alternative to cron.
# ----------------------------------------------------------------------
set -euo pipefail
# --- Configuration -----------------------------------------------------
OBMP_DATA_ROOT="${OBMP_DATA_ROOT:-/var/openbmp}"
BACKUP_DIR="${OBMP_BACKUP_DIR:-${OBMP_DATA_ROOT}/backups}"
PG_CONTAINER="${OBMP_PG_CONTAINER:-obmp-psql}"
PG_DB="${OBMP_PG_DB:-openbmp}"
PG_USER="${OBMP_PG_USER:-openbmp}"
RETENTION_DAYS="${OBMP_BACKUP_RETENTION_DAYS:-14}"
TIMESTAMP="$(date +%Y%m%d-%H%M%S)"
DUMP_NAME="openbmp-${TIMESTAMP}.dump"
DUMP_PATH="${BACKUP_DIR}/${DUMP_NAME}"
DUMP_TMP="${DUMP_PATH}.partial"
log() { printf '%s [pg-backup] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"; }
fail() { log "ERROR: $*" >&2; exit 1; }
# --- Pre-flight checks -------------------------------------------------
command -v docker >/dev/null 2>&1 || fail "docker command not found in PATH"
if ! docker inspect -f '{{.State.Running}}' "${PG_CONTAINER}" 2>/dev/null | grep -q true; then
fail "container '${PG_CONTAINER}' is not running"
fi
mkdir -p "${BACKUP_DIR}" || fail "cannot create backup directory ${BACKUP_DIR}"
# --- Backup ------------------------------------------------------------
# Write to a .partial file first, then atomically rename on success so a
# crashed/interrupted run never leaves a truncated dump that looks valid.
log "starting backup of database '${PG_DB}' from container '${PG_CONTAINER}'"
if docker exec "${PG_CONTAINER}" \
pg_dump -U "${PG_USER}" -d "${PG_DB}" -Fc --no-owner --no-privileges \
> "${DUMP_TMP}"; then
mv -f "${DUMP_TMP}" "${DUMP_PATH}"
else
rm -f "${DUMP_TMP}"
fail "pg_dump failed; no backup written"
fi
DUMP_SIZE="$(du -h "${DUMP_PATH}" | cut -f1)"
log "backup complete: ${DUMP_PATH} (${DUMP_SIZE})"
# --- Prune old backups -------------------------------------------------
# Only prune files matching our own naming pattern, so nothing else in the
# directory (logs, manual dumps) is touched.
log "pruning dumps older than ${RETENTION_DAYS} days"
PRUNED=0
while IFS= read -r -d '' old; do
rm -f "${old}"
log " removed $(basename "${old}")"
PRUNED=$((PRUNED + 1))
done < <(find "${BACKUP_DIR}" -maxdepth 1 -type f \
-name 'openbmp-*.dump' -mtime "+${RETENTION_DAYS}" -print0)
log "pruned ${PRUNED} old dump(s)"
# Also clean up any stale .partial files from previous crashed runs.
find "${BACKUP_DIR}" -maxdepth 1 -type f -name 'openbmp-*.dump.partial' \
-mtime +1 -delete 2>/dev/null || true
log "done"
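
A dump that cannot be read back is not much of a backup, so it can be worth checking each new file. The sketch below is not part of pg-backup.sh: `verify_dump` is a hypothetical helper that asks `pg_restore --list` to read the archive's table of contents, which exercises the file header without touching any database. It assumes the same container default (`obmp-psql`) as the script, and it is a light check only; it will not catch every form of corruption.

```shell
# Hypothetical helper: confirm a custom-format dump has a readable TOC.
verify_dump() {
  local dump="$1"
  if [ ! -s "$dump" ]; then
    echo "missing or empty dump: $dump" >&2
    return 1
  fi
  # With no file argument pg_restore reads the archive from stdin;
  # --list prints only the table of contents, which is discarded here.
  docker exec -i "${OBMP_PG_CONTAINER:-obmp-psql}" \
    pg_restore --list < "$dump" > /dev/null
}
```

Inside the script it could be called as `verify_dump "${DUMP_PATH}" || fail "dump is unreadable"` just before the pruning step, so old dumps are never removed on the strength of a bad new one.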

127
setup.sh Executable file

@ -0,0 +1,127 @@
#!/usr/bin/env bash
#
# OpenBMP stack bootstrap — idempotent. Safe to run against an existing
# deployment: every step is guarded and never overwrites live config.
#
# cp .env.example .env # first run only
# $EDITOR .env # fill in HOST_IP, OBMP_DOMAIN, ...
# ./setup.sh
# docker compose up -d # collector core
# docker compose --profile test --profile auth up -d # full stack
#
set -euo pipefail
cd "$(dirname "$0")"
AUTHELIA_IMAGE="authelia/authelia:4.38"
# --- .env -------------------------------------------------------------------
if [ ! -f .env ]; then
cp .env.example .env
echo "Created .env from .env.example."
echo "Edit it (HOST_IP, OBMP_DOMAIN, OBMP_COOKIE_DOMAIN, credentials), then re-run ./setup.sh"
exit 1
fi
# Read a single KEY=value from .env without sourcing it — .env contains keys
# with hyphens (PROX-CML_*) that a shell `source` would choke on.
get_env() { grep -E "^$1=" .env | head -1 | cut -d= -f2- || true; }
# Set KEY=value in .env: replace the line if present, else append.
set_env() {
local key="$1" val="$2"
if grep -qE "^${key}=" .env; then
# `|` delimiter — values are hex, no `|`.
sed -i "s|^${key}=.*|${key}=${val}|" .env
else
printf '%s=%s\n' "$key" "$val" >> .env
fi
}
OBMP_DATA_ROOT="$(get_env OBMP_DATA_ROOT)"
OBMP_DATA_ROOT="${OBMP_DATA_ROOT:-/var/openbmp}"
OBMP_DOMAIN="$(get_env OBMP_DOMAIN)"
OBMP_COOKIE_DOMAIN="$(get_env OBMP_COOKIE_DOMAIN)"
HOST_IP="$(get_env HOST_IP)"
# --- validate ---------------------------------------------------------------
fail=0
for var in HOST_IP OBMP_DOMAIN OBMP_COOKIE_DOMAIN; do
val="$(get_env "$var")"
if [ -z "$val" ] || [[ "$val" == changeme* ]]; then
echo "ERROR: $var is unset or still 'changeme' in .env" >&2
fail=1
fi
done
[ "$fail" -eq 0 ] || { echo "Fix .env and re-run." >&2; exit 1; }
# --- privilege helper -------------------------------------------------------
# $OBMP_DATA_ROOT is often root-owned (e.g. /var/openbmp). Use sudo only if the
# current user cannot write the parent directory.
parent="$(dirname "$OBMP_DATA_ROOT")"
SUDO=""
if [ ! -w "$parent" ] || { [ -d "$OBMP_DATA_ROOT" ] && [ ! -w "$OBMP_DATA_ROOT" ]; }; then
SUDO="sudo"
echo "Note: using sudo for filesystem setup under $OBMP_DATA_ROOT"
fi
# --- data-root directory tree -----------------------------------------------
echo "Creating data tree under $OBMP_DATA_ROOT ..."
for d in config grafana grafana/provisioning kafka-data zk-data zk-log \
postgres/data postgres/ts influxdb authelia; do
$SUDO mkdir -p "$OBMP_DATA_ROOT/$d"
done
# Container processes run as assorted UIDs, so the whole tree is made
# world-writable (777). Lab-permissive; chmod failures are ignored.
$SUDO chmod -R 777 "$OBMP_DATA_ROOT" 2>/dev/null || true
# --- Grafana provisioning ---------------------------------------------------
echo "Syncing Grafana provisioning ..."
$SUDO cp -r obmp-grafana/provisioning/. "$OBMP_DATA_ROOT/grafana/provisioning/"
# --- Authelia secrets -------------------------------------------------------
# Generate only if blank/absent; never overwrite an existing value.
for key in AUTHELIA_SESSION_SECRET AUTHELIA_JWT_SECRET AUTHELIA_STORAGE_ENCRYPTION_KEY; do
cur="$(get_env "$key")"
if [ -z "$cur" ]; then
set_env "$key" "$(openssl rand -hex 32)"
echo "Generated $key"
fi
done
AUTHELIA_SESSION_SECRET="$(get_env AUTHELIA_SESSION_SECRET)"
AUTHELIA_JWT_SECRET="$(get_env AUTHELIA_JWT_SECRET)"
AUTHELIA_STORAGE_ENCRYPTION_KEY="$(get_env AUTHELIA_STORAGE_ENCRYPTION_KEY)"
# --- Authelia config (fresh-deploy only — never clobber a live config) ------
export AUTHELIA_SESSION_SECRET AUTHELIA_JWT_SECRET AUTHELIA_STORAGE_ENCRYPTION_KEY \
OBMP_DOMAIN OBMP_COOKIE_DOMAIN
SUBST='${AUTHELIA_SESSION_SECRET} ${AUTHELIA_JWT_SECRET} ${AUTHELIA_STORAGE_ENCRYPTION_KEY} ${OBMP_DOMAIN} ${OBMP_COOKIE_DOMAIN}'
if [ ! -f "$OBMP_DATA_ROOT/authelia/configuration.yml" ]; then
envsubst "$SUBST" < authelia/configuration.yml.template \
| $SUDO tee "$OBMP_DATA_ROOT/authelia/configuration.yml" > /dev/null
echo "Rendered authelia/configuration.yml"
else
echo "authelia/configuration.yml exists — left untouched"
fi
if [ ! -f "$OBMP_DATA_ROOT/authelia/users_database.yml" ]; then
$SUDO cp authelia/users_database.yml.template \
"$OBMP_DATA_ROOT/authelia/users_database.yml"
echo "Rendered authelia/users_database.yml (demo user: openbmp)"
else
echo "authelia/users_database.yml exists — left untouched"
fi
# --- images -----------------------------------------------------------------
echo "Pulling and building images ..."
docker compose pull --quiet
docker compose --profile test --profile auth build
# --- done -------------------------------------------------------------------
cat <<EOF
Setup complete.
docker compose up -d # BMP collector core
docker compose --profile test --profile auth up -d # full stack (lab + auth)
EOF
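
The `get_env` and `set_env` helpers in this script treat `.env` as plain text instead of sourcing it, which is what lets hyphenated keys such as `PROX-CML_*` survive. The same idea is sketched below in JavaScript, the language of the UI files in this change. `getEnv` and `setEnv` are hypothetical names that do not exist in the repository, and they work on a string rather than on the file, so this illustrates the parsing rule only.

```javascript
// Hypothetical helpers mirroring get_env / set_env from setup.sh.
// They only inspect text, so no line of the file is ever executed.

// Value of the first `KEY=value` line, or '' when the key is absent.
// Everything after the first '=' belongs to the value (like `cut -d= -f2-`).
function getEnv(text, key) {
  for (const line of text.split('\n')) {
    if (line.startsWith(key + '=')) return line.slice(key.length + 1)
  }
  return ''
}

// Replace every `KEY=...` line when the key exists, otherwise append one.
function setEnv(text, key, val) {
  let found = false
  const lines = text.split('\n').map((line) => {
    if (!line.startsWith(key + '=')) return line
    found = true
    return `${key}=${val}`
  })
  if (found) return lines.join('\n')
  const base = text === '' || text.endsWith('\n') ? text : text + '\n'
  return `${base}${key}=${val}\n`
}

const envText = 'HOST_IP=10.0.0.5\nPROX-CML_HOST=lab=1\nAUTHELIA_JWT_SECRET=\n'
console.log(getEnv(envText, 'PROX-CML_HOST'))
console.log(setEnv(envText, 'AUTHELIA_JWT_SECRET', 'abc123'))
```

An empty value (`AUTHELIA_JWT_SECRET=`) reads back as `''`, which is the condition the secret-generation loop checks before writing a new value.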

2
telegraf/Dockerfile Normal file

@ -0,0 +1,2 @@
FROM telegraf:1.28-alpine
COPY telegraf.conf /etc/telegraf/telegraf.conf

75
telegraf/telegraf.conf Normal file

@ -0,0 +1,75 @@
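# Telegraf configuration: gNMI streaming telemetry and Docker container metrics
# Collects interface counters and state from IOS-XR routers, plus per-container
# CPU, memory, network, and block IO for the obmp-* containers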
[global_tags]
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
###############################################################################
# INPUT PLUGINS #
###############################################################################
## gNMI targets — driven by environment variables so the telemetry fleet can
## scale without editing this file. Set in .env:
## GNMI_ADDRESSES — quoted, comma-separated host:port list, e.g.
## GNMI_ADDRESSES="10.0.0.1:57400", "10.0.0.2:57400"
## GNMI_USERNAME / GNMI_PASSWORD — gNMI credentials (uniform across the fleet)
## Every target must have gNMI/grpc enabled and be reachable on the gRPC port.
[[inputs.gnmi]]
addresses = [ ${GNMI_ADDRESSES} ]
username = "${GNMI_USERNAME}"
password = "${GNMI_PASSWORD}"
## No TLS (lab environment)
enable_tls = false
## Use json_ietf encoding (supported by IOS-XR 24.3.1)
encoding = "json_ietf"
## Wait this long before redialing after a connection failure
redial = "10s"
## OpenConfig interface counters (bytes, packets, errors, discards)
[[inputs.gnmi.subscription]]
name = "interface_counters"
origin = "openconfig-interfaces"
path = "/interfaces/interface/state/counters"
subscription_mode = "sample"
sample_interval = "10s"
## OpenConfig interface state (admin/oper status, description, type)
[[inputs.gnmi.subscription]]
name = "interface_state"
origin = "openconfig-interfaces"
path = "/interfaces/interface/state"
subscription_mode = "sample"
sample_interval = "30s"
## Docker container resource metrics — CPU, memory (incl. limit + %), network,
## and block IO for every obmp-* container. Surfaces resource pressure (e.g. a
## container approaching its mem_limit) before it OOM-crashes.
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
gather_services = false
container_name_include = ["obmp-*"]
perdevice = false
total = true
timeout = "10s"
###############################################################################
# OUTPUT PLUGINS #
###############################################################################
[[outputs.influxdb_v2]]
urls = ["http://localhost:8086"]
token = "${INFLUXDB_TOKEN}"
organization = "openbmp"
bucket = "telemetry"
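
Because `addresses = [ ${GNMI_ADDRESSES} ]` is substituted as raw text, the variable has to carry its own quotes and commas. A small sketch of building that value from a plain list of targets follows. `formatGnmiAddresses` is a hypothetical helper, and the fallback port 57400 is taken from the example in the comment above, not from any project default. It assumes IPv4 addresses or hostnames, since a bare IPv6 literal already contains colons.

```javascript
// Hypothetical helper: turn a list of gNMI targets into the text that
// telegraf.conf expects inside `addresses = [ ... ]`.
function formatGnmiAddresses(targets, defaultPort = 57400) {
  return targets
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    // Append the port only when the target does not already carry one.
    .map((t) => (t.includes(':') ? t : `${t}:${defaultPort}`))
    // Each entry must be a quoted TOML string.
    .map((t) => `"${t}"`)
    .join(', ')
}

// Prints: "10.0.0.1:57400", "10.0.0.2:57400"
console.log(formatGnmiAddresses(['10.0.0.1:57400', '10.0.0.2']))
```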

12
traffic-gen-ui/Dockerfile Normal file

@ -0,0 +1,12 @@
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .
RUN npm run build
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 5002
CMD ["nginx", "-g", "daemon off;"]

12
traffic-gen-ui/index.html Normal file

@ -0,0 +1,12 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Traffic Generator</title>
</head>
<body>
<div id="app"></div>
<script type="module" src="/src/main.js"></script>
</body>
</html>

21
traffic-gen-ui/nginx.conf Normal file

@ -0,0 +1,21 @@
server {
listen 5002;
root /usr/share/nginx/html;
index index.html;
location /api/ {
proxy_pass http://localhost:5051/;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
location / {
try_files $uri $uri/ /index.html;
add_header Cache-Control "no-cache, no-store, must-revalidate";
}
location /assets/ {
expires 1y;
add_header Cache-Control "public, immutable";
}
}
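
The trailing slash in `proxy_pass http://localhost:5051/;` matters: when `proxy_pass` includes a URI part, nginx replaces the portion of the request path that matched the `location` prefix with that URI. The function below models only that rewriting rule, not nginx itself. `upstreamPath` is a hypothetical name used for illustration.

```javascript
// Model of nginx prefix replacement for:
//   location /api/ { proxy_pass http://localhost:5051/; }
// The matched prefix ("/api/") is swapped for the proxy_pass URI ("/").
function upstreamPath(requestPath, locationPrefix = '/api/', proxyUri = '/') {
  // Paths outside the prefix are served by another location block.
  if (!requestPath.startsWith(locationPrefix)) return null
  return proxyUri + requestPath.slice(locationPrefix.length)
}

console.log(upstreamPath('/api/flows/3/start')) // /flows/3/start
console.log(upstreamPath('/assets/index.js'))   // null
```

So the backend on port 5051 sees `/flows/...` without the `/api` prefix.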


@ -0,0 +1,17 @@
{
"name": "traffic-gen-ui",
"version": "1.0.0",
"private": true,
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"dependencies": {
"vue": "^3.3.0"
},
"devDependencies": {
"@vitejs/plugin-vue": "^4.2.0",
"vite": "^4.4.0"
}
}

326
traffic-gen-ui/src/App.vue Normal file

@ -0,0 +1,326 @@
<template>
<div class="app-layout">
<!-- HEADER -->
<header class="app-header">
<div class="header-title">
<span class="logo-icon">&#9889;</span>
<h1>Traffic Generator</h1>
</div>
<StatusBar :health="health" :api-error="apiError" @modeChanged="fetchHealth(); fetchAll()" />
</header>
<!-- ERROR BANNER -->
<div v-if="apiError" class="error-banner">
<span class="error-icon">&#9888;</span>
API unreachable: {{ apiError }} &mdash; retrying every 5s
</div>
<!-- MAIN CONTENT -->
<div class="main-content">
<!-- LEFT COLUMN: Flow Builder -->
<aside class="left-col">
<FlowBuilder :key="editFlow ? editFlow.id : 'new'" :editFlow="editFlow"
@created="onFlowSaved" @updated="onFlowSaved" @cancel="editFlow = null" />
</aside>
<!-- RIGHT COLUMN: Tabs -->
<main class="right-col">
<div class="tabs">
<button
v-for="tab in tabs"
:key="tab.id"
class="tab-btn"
:class="{ active: activeTab === tab.id }"
@click="activeTab = tab.id"
>
{{ tab.label }}
</button>
</div>
<div class="tab-content">
<div v-if="activeTab === 'flows'">
<QuickPing />
<FlowTable :flows="flows" @refresh="fetchFlows" @edit="startEdit" />
</div>
<div v-else-if="activeTab === 'tests'">
<TestBuilder @created="fetchTests" @refresh="fetchAll" />
<div style="margin-top: 20px;">
<TestRunner :tests="tests" @refresh="fetchTests" />
</div>
</div>
<ResultsPanel v-else-if="activeTab === 'results'" :tests="tests" />
<StatsMonitor v-else-if="activeTab === 'monitor'" :flows="flows" />
</div>
</main>
</div>
<!-- FOOTER -->
<footer class="app-footer">
<span>Refreshing every 5s (health) / 3s (flows)</span>
<span class="footer-sep">|</span>
<a :href="baseUrl + ':3000'" target="_blank" class="footer-link">Grafana: :3000</a>
<span class="footer-sep">|</span>
<a :href="baseUrl + ':5001'" target="_blank" class="footer-link">Route Injector: :5001</a>
</footer>
</div>
</template>
<script setup>
import { ref, computed, onMounted, onUnmounted } from 'vue'
import { api } from './api.js'
import StatusBar from './components/StatusBar.vue'
import FlowBuilder from './components/FlowBuilder.vue'
import FlowTable from './components/FlowTable.vue'
import QuickPing from './components/QuickPing.vue'
import TestBuilder from './components/TestBuilder.vue'
import TestRunner from './components/TestRunner.vue'
import ResultsPanel from './components/ResultsPanel.vue'
import StatsMonitor from './components/StatsMonitor.vue'
const health = ref(null)
const flows = ref([])
const tests = ref([])
const apiError = ref(null)
const activeTab = ref('flows')
const editFlow = ref(null)
const baseUrl = computed(() => `${window.location.protocol}//${window.location.hostname}`)
function startEdit(flow) { editFlow.value = { ...flow } }
function onFlowSaved() { editFlow.value = null; fetchFlows() }
const tabs = [
{ id: 'flows', label: 'Flows' },
{ id: 'tests', label: 'Tests' },
{ id: 'results', label: 'Results' },
{ id: 'monitor', label: 'Monitor' },
]
async function fetchHealth() {
try {
health.value = await api.health()
apiError.value = null
} catch (e) {
apiError.value = e.message
health.value = null
}
}
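async function fetchFlows() {
try {
const data = await api.flows()
flows.value = data.flows || []
} catch (_) {
// Keep the last good list; fetchHealth() drives the error banner.
}
}
async function fetchTests() {
try {
const data = await api.tests()
tests.value = data.tests || []
} catch (_) {
// Keep the last good list; fetchHealth() drives the error banner.
}
}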
async function fetchAll() {
await Promise.all([fetchFlows(), fetchTests()])
}
let healthTimer = null
let dataTimer = null
onMounted(() => {
fetchHealth()
fetchAll()
healthTimer = setInterval(fetchHealth, 5000)
dataTimer = setInterval(fetchAll, 3000)
})
onUnmounted(() => {
clearInterval(healthTimer)
clearInterval(dataTimer)
})
</script>
<style>
:root {
--bg: #0f1117;
--card-bg: #1a1f2e;
--border: #2d3748;
--accent: #4f9cf9;
--success: #48bb78;
--danger: #fc8181;
--warning: #f6ad55;
--text: #e2e8f0;
--muted: #718096;
--radius: 8px;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
background: var(--bg);
color: var(--text);
font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
font-size: 14px;
line-height: 1.5;
}
button {
cursor: pointer;
font-family: inherit;
font-size: 13px;
border: none;
border-radius: var(--radius);
transition: opacity 0.15s, background 0.15s;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
input, select {
font-family: inherit;
font-size: 13px;
background: var(--bg);
color: var(--text);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 6px 10px;
outline: none;
}
input:focus {
border-color: var(--accent);
}
</style>
<style scoped>
.app-layout {
display: grid;
grid-template-rows: auto auto 1fr auto;
min-height: 100vh;
}
.app-header {
display: flex;
align-items: center;
justify-content: space-between;
padding: 12px 20px;
background: var(--card-bg);
border-bottom: 1px solid var(--border);
gap: 16px;
flex-wrap: wrap;
}
.header-title {
display: flex;
align-items: center;
gap: 10px;
}
.logo-icon {
font-size: 22px;
color: var(--warning);
}
.app-header h1 {
font-size: 18px;
font-weight: 600;
color: var(--text);
letter-spacing: 0.02em;
}
.error-banner {
background: rgba(252, 129, 129, 0.12);
border-bottom: 1px solid var(--danger);
color: var(--danger);
padding: 8px 20px;
font-size: 13px;
display: flex;
align-items: center;
gap: 8px;
}
.error-icon {
font-size: 16px;
}
.main-content {
display: grid;
grid-template-columns: 340px 1fr;
overflow: hidden;
height: calc(100vh - 110px);
}
.left-col {
border-right: 1px solid var(--border);
overflow-y: auto;
padding: 16px;
}
.right-col {
display: flex;
flex-direction: column;
overflow: hidden;
}
.tabs {
display: flex;
gap: 2px;
padding: 12px 16px 0;
background: var(--card-bg);
border-bottom: 1px solid var(--border);
}
.tab-btn {
background: transparent;
color: var(--muted);
padding: 8px 18px;
border-radius: var(--radius) var(--radius) 0 0;
border: 1px solid transparent;
border-bottom: none;
font-weight: 500;
}
.tab-btn:hover {
color: var(--text);
background: rgba(79, 156, 249, 0.08);
}
.tab-btn.active {
color: var(--accent);
background: var(--bg);
border-color: var(--border);
border-bottom: 1px solid var(--bg);
margin-bottom: -1px;
}
.tab-content {
flex: 1;
overflow-y: auto;
padding: 16px;
}
.app-footer {
padding: 8px 20px;
background: var(--card-bg);
border-top: 1px solid var(--border);
color: var(--muted);
font-size: 12px;
display: flex;
align-items: center;
gap: 10px;
}
.footer-sep {
color: var(--border);
}
.footer-link {
color: var(--accent);
text-decoration: none;
}
.footer-link:hover {
text-decoration: underline;
}
</style>

48
traffic-gen-ui/src/api.js Normal file

@ -0,0 +1,48 @@
const BASE = '/traffic/api'
async function req(method, path, body) {
const opts = { method, headers: { 'Content-Type': 'application/json' } }
if (body) opts.body = JSON.stringify(body)
const r = await fetch(BASE + path, opts)
if (!r.ok) throw new Error(`${method} ${path} -> ${r.status}`)
return r.json()
}
export const api = {
health: () => req('GET', '/healthz'),
interfaces: () => req('GET', '/interfaces'),
mode: () => req('GET', '/mode'),
setMode: (mode) => req('POST', '/mode', { mode }),
// Flows
flows: () => req('GET', '/flows'),
createFlow: (f) => req('POST', '/flows', f),
getFlow: (id) => req('GET', `/flows/${id}`),
updateFlow: (id, f) => req('PUT', `/flows/${id}`, f),
deleteFlow: (id) => req('DELETE', `/flows/${id}`),
startFlow: (id) => req('POST', `/flows/${id}/start`),
stopFlow: (id) => req('POST', `/flows/${id}/stop`),
flowStats: (id) => req('GET', `/flows/${id}/stats`),
// Tests
tests: () => req('GET', '/tests'),
createTest: (t) => req('POST', '/tests', t),
getTest: (id) => req('GET', `/tests/${id}`),
startTest: (id) => req('POST', `/tests/${id}/start`),
stopTest: (id) => req('POST', `/tests/${id}/stop`),
testResults: (id) => req('GET', `/tests/${id}/results`),
// Presets
presets: () => req('GET', '/presets'),
loadPreset: (name, overrides) => req('POST', `/presets/${name}`, overrides),
// Stats
statsHistory: () => req('GET', '/stats/history'),
// Ping
ping: (target, count) => req('POST', '/ping', { target, count: count || 5 }),
// Responder
responderStats: () => req('GET', '/responder/stats'),
responderReset: () => req('POST', '/responder/reset'),
}
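
To show how the `req` wrapper behaves on success and on a non-2xx reply, here is a sketch with the fetch implementation injected so it runs without the API. `makeReq` and `fakeFetch` are hypothetical and not part of `api.js`, and the canned responses are made up for the demo. One deliberate difference is noted in the comments: the sketch tests `body !== undefined` where the original tests truthiness.

```javascript
// Same wrapper shape as req() in api.js, with fetch passed in.
function makeReq(fetchImpl, base = '/traffic/api') {
  return async function req(method, path, body) {
    const opts = { method, headers: { 'Content-Type': 'application/json' } }
    // Differs from api.js, which uses `if (body)`: this also sends falsy bodies.
    if (body !== undefined) opts.body = JSON.stringify(body)
    const r = await fetchImpl(base + path, opts)
    if (!r.ok) throw new Error(`${method} ${path} -> ${r.status}`)
    return r.json()
  }
}

// Stand-in for the browser's fetch: records calls and returns canned replies.
const calls = []
async function fakeFetch(url, opts) {
  calls.push({ url, method: opts.method, body: opts.body })
  if (url.endsWith('/flows/missing')) {
    return { ok: false, status: 404, json: async () => ({}) }
  }
  return { ok: true, status: 200, json: async () => ({ id: 'f1' }) }
}

const req = makeReq(fakeFetch)
const demo = (async () => {
  const created = await req('POST', '/flows', { dst_ip: '10.100.0.100', rate_pps: 1000 })
  let message = null
  try {
    await req('GET', '/flows/missing')
  } catch (e) {
    message = e.message // "GET /flows/missing -> 404"
  }
  return { created, message }
})()
demo.then((out) => console.log(out))
```

Callers only ever see the method, path, and status in the error message, which is the text `FlowBuilder.vue` and `FlowTable.vue` pass to `alert`.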


@ -0,0 +1,159 @@
<template>
<div class="flow-builder">
<h3>{{ editing ? 'Edit Flow' : 'Create Flow' }}</h3>
<form @submit.prevent="submit">
<div class="form-row">
<label>Name</label>
<input v-model="form.name" placeholder="My Flow" />
</div>
<div class="form-row">
<label>Destination IP *</label>
<input v-model="form.dst_ip" placeholder="10.100.0.100" required />
</div>
<div class="form-row">
<label>Source IP</label>
<input v-model="form.src_ip" placeholder="auto (from interface)" />
</div>
<div class="form-row">
<label>Dst MAC</label>
<input v-model="form.dst_mac" placeholder="auto" />
</div>
<div class="form-row">
<label>Protocol</label>
<select v-model="form.protocol">
<option value="udp">UDP</option>
<option value="tcp">TCP</option>
<option value="icmp">ICMP</option>
</select>
</div>
<div v-if="form.protocol !== 'icmp'" class="form-row-pair">
<div class="form-row">
<label>Src Port</label>
<input v-model.number="form.src_port" type="number" min="1" max="65535" />
</div>
<div class="form-row">
<label>Dst Port</label>
<input v-model.number="form.dst_port" type="number" min="1" max="65535" />
</div>
</div>
<div class="form-row-pair">
<div class="form-row">
<label>Frame Size (bytes)</label>
<input v-model.number="form.frame_size" type="number" min="64" max="9000" />
</div>
<div class="form-row">
<label>Rate</label>
<input v-model.number="form.rate_val" type="number" min="1" step="any" />
<select v-model="form.rate_unit" class="rate-unit-standalone">
<option value="pps">pps</option>
<option value="kbps">Kbps</option>
<option value="mbps">Mbps</option>
</select>
</div>
</div>
<div class="form-row-pair">
<div class="form-row">
<label>Duration (sec)</label>
<input v-model.number="form.duration" type="number" min="0" :disabled="form.continuous" />
<label class="checkbox-inline">
<input type="checkbox" v-model="form.continuous" @change="onContinuousChange" />
Continuous
</label>
</div>
<div class="form-row">
<label>DSCP</label>
<input v-model.number="form.dscp" type="number" min="0" max="63" />
</div>
</div>
<div class="form-row">
<label>Responder URL (optional)</label>
<input v-model="form.responder_url" placeholder="http://host:5053" />
</div>
<div class="form-actions">
<button type="submit" class="btn btn-accent" :disabled="!form.dst_ip">
{{ editing ? 'Update' : 'Create Flow' }}
</button>
<button v-if="editing" type="button" class="btn btn-muted" @click="$emit('cancel')">Cancel</button>
</div>
</form>
</div>
</template>
<script setup>
import { reactive, computed } from 'vue'
import { api } from '../api.js'
const props = defineProps({ editFlow: Object })
const emit = defineEmits(['created', 'updated', 'cancel'])
const editing = computed(() => !!props.editFlow)
const defaults = {
name: '', dst_ip: '', src_ip: '', dst_mac: '',
protocol: 'udp', src_port: 50000, dst_port: 5001,
frame_size: 512, rate_val: 1000, rate_unit: 'pps', duration: 30,
dscp: 0, responder_url: '', continuous: false,
}
function ppsToDisplay(pps) {
// submit() sends only rate_pps, so the unit a flow was created with cannot be
// recovered; a flow opened for editing is always shown in pps.
return { rate_val: pps, rate_unit: 'pps' }
}
const initData = props.editFlow
? { ...props.editFlow, continuous: props.editFlow.duration === 0, ...ppsToDisplay(props.editFlow.rate_pps) }
: {}
const form = reactive({ ...defaults, ...initData })
function onContinuousChange() { if (form.continuous) form.duration = 0 }
function computePps(val, unit, frameSize) {
if (unit === 'kbps') return Math.max(1, Math.round((val * 1000) / (frameSize * 8)))
if (unit === 'mbps') return Math.max(1, Math.round((val * 1_000_000) / (frameSize * 8)))
return Math.round(val)
}
async function submit() {
try {
const payload = { ...form }
payload.rate_pps = computePps(form.rate_val, form.rate_unit, form.frame_size)
delete payload.rate_val
delete payload.rate_unit
delete payload.continuous
if (form.continuous) payload.duration = 0
if (!payload.src_ip) delete payload.src_ip
if (!payload.dst_mac) delete payload.dst_mac
if (!payload.responder_url) delete payload.responder_url
if (!payload.name) payload.name = `${payload.protocol.toUpperCase()} -> ${payload.dst_ip}`
if (editing.value) {
await api.updateFlow(props.editFlow.id, payload)
emit('updated')
} else {
await api.createFlow(payload)
Object.assign(form, defaults)
emit('created')
}
} catch (e) {
alert('Error: ' + e.message)
}
}
</script>
<style scoped>
.flow-builder { padding: 0; }
h3 { font-size: 15px; margin-bottom: 12px; color: var(--accent); }
.form-row { margin-bottom: 10px; }
.form-row label { display: block; font-size: 11px; color: var(--muted); margin-bottom: 3px; text-transform: uppercase; letter-spacing: 0.05em; }
.form-row input, .form-row select { width: 100%; }
.form-row-pair { display: grid; grid-template-columns: 1fr 1fr; gap: 8px; }
.form-actions { display: flex; gap: 8px; margin-top: 14px; }
.btn { padding: 8px 16px; font-weight: 600; font-size: 13px; }
.btn-accent { background: var(--accent); color: #fff; }
.btn-accent:hover { opacity: 0.9; }
.btn-accent:disabled { opacity: 0.4; }
.btn-muted { background: var(--border); color: var(--text); }
.rate-unit-standalone { width: 100%; margin-top: 4px; }
.checkbox-inline { display: inline-flex !important; align-items: center; gap: 4px; margin-top: 4px; font-size: 12px; cursor: pointer; }
.checkbox-inline input { width: auto; }
</style>
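
The rate conversion in `computePps` is a bit rate divided by the bits per frame, with a floor of 1 pps for the Kbps and Mbps cases. The function is reproduced standalone below with a few worked values. The inputs are arbitrary examples, not figures from the project.

```javascript
// Same arithmetic as computePps in the component above.
function computePps(val, unit, frameSize) {
  if (unit === 'kbps') return Math.max(1, Math.round((val * 1000) / (frameSize * 8)))
  if (unit === 'mbps') return Math.max(1, Math.round((val * 1_000_000) / (frameSize * 8)))
  return Math.round(val)
}

// 10 Mbps of 512-byte frames: 10,000,000 / 4096 = 2441.4 -> 2441 pps
console.log(computePps(10, 'mbps', 512))
// 100 Kbps of 64-byte frames: 100,000 / 512 = 195.3 -> 195 pps
console.log(computePps(100, 'kbps', 64))
// 1 Kbps of 9000-byte frames rounds to 0, so the floor of 1 pps applies
console.log(computePps(1, 'kbps', 9000))
// pps values pass through, rounded to an integer
console.log(computePps(1000, 'pps', 512))
```

Note that the conversion counts the whole frame size as payload bits, so the resulting bit rate is the frame-level rate.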


@ -0,0 +1,116 @@
<template>
<div class="flow-table">
<div v-if="!flows.length" class="empty">No flows created yet. Use the builder to create one.</div>
<table v-else>
<thead>
<tr>
<th>Name</th>
<th>Dst IP</th>
<th>Proto</th>
<th>Size</th>
<th>Rate</th>
<th>State</th>
<th>TX Pkts</th>
<th>TX pps</th>
<th>RX Pkts</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
<tr v-for="f in flows" :key="f.id" :class="{ running: f.state === 'running' }">
<td>{{ f.name || '-' }}</td>
<td class="mono">{{ f.dst_ip }}</td>
<td>{{ f.protocol.toUpperCase() }}</td>
<td>{{ f.frame_size }}B</td>
<td>{{ formatRate(f) }}</td>
<td>
<span class="state-badge" :class="'state-' + f.state">{{ f.state }}</span>
</td>
<td class="mono">{{ formatNum(f.stats?.tx_packets || 0) }}</td>
<td class="mono">{{ pps[f.id] || 0 }}</td>
<td class="mono">{{ formatNum(f.stats?.rx_packets || 0) }}</td>
<td class="actions">
<button v-if="f.state !== 'running'" class="btn-sm btn-go" @click="start(f.id)">Start</button>
<button v-else class="btn-sm btn-stop" @click="stop(f.id)">Stop</button>
<button class="btn-sm btn-edit" @click="emit('edit', f)" :disabled="f.state === 'running'">Edit</button>
<button class="btn-sm btn-del" @click="del(f.id)" :disabled="f.state === 'running'">Del</button>
</td>
</tr>
</tbody>
</table>
</div>
</template>
<script setup>
import { ref, onMounted, onUnmounted } from 'vue'
import { api } from '../api.js'
const props = defineProps({ flows: Array })
const emit = defineEmits(['refresh', 'edit'])
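const pps = ref({})
const prevTx = ref({})
const prevAt = {} // per-flow time (ms) at which tx_packets last changed
const STALE_MS = 10000 // show 0 for a running flow after this long without change
let statsTimer = null
// The parent refreshes `flows` every 3s (see App.vue) while this timer fires
// every second, so tx_packets does not change on every tick. Divide the packet
// delta by the time since the last change instead of assuming 1s per sample.
function computePps() {
const now = Date.now()
for (const f of (props.flows || [])) {
const txNow = f.stats?.tx_packets || 0
const prev = prevTx.value[f.id]
const changed = prev === undefined || txNow !== prev
if (f.state !== 'running') {
pps.value[f.id] = 0
} else if (prev !== undefined && changed) {
const secs = Math.max(0.001, (now - prevAt[f.id]) / 1000)
pps.value[f.id] = Math.max(0, Math.round((txNow - prev) / secs))
} else if (!changed && now - prevAt[f.id] > STALE_MS) {
pps.value[f.id] = 0
}
if (changed || f.state !== 'running') {
prevTx.value[f.id] = txNow
prevAt[f.id] = now
}
}
}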
function formatRate(f) {
// Named `rate` so it does not shadow the `pps` ref declared above.
const rate = f.rate_pps || 0
const mbps = (rate * (f.frame_size || 64) * 8) / 1_000_000
if (mbps >= 1) return mbps.toFixed(1) + ' Mbps'
if (mbps >= 0.001) return (mbps * 1000).toFixed(0) + ' Kbps'
return rate + ' pps'
}
function formatNum(n) {
if (n >= 1000000) return (n / 1000000).toFixed(1) + 'M'
if (n >= 1000) return (n / 1000).toFixed(1) + 'K'
return n
}
onMounted(() => { statsTimer = setInterval(computePps, 1000) })
onUnmounted(() => { clearInterval(statsTimer) })
async function start(id) {
try { await api.startFlow(id); emit('refresh') } catch (e) { alert(e.message) }
}
async function stop(id) {
try { await api.stopFlow(id); emit('refresh') } catch (e) { alert(e.message) }
}
async function del(id) {
try { await api.deleteFlow(id); emit('refresh') } catch (e) { alert(e.message) }
}
</script>
<style scoped>
.flow-table { overflow-x: auto; }
.empty { color: var(--muted); padding: 20px; text-align: center; }
table { width: 100%; border-collapse: collapse; }
th { text-align: left; font-size: 11px; color: var(--muted); text-transform: uppercase; letter-spacing: 0.05em; padding: 6px 8px; border-bottom: 1px solid var(--border); }
td { padding: 8px; border-bottom: 1px solid rgba(45,55,72,0.5); font-size: 13px; }
tr.running { background: rgba(79,156,249,0.05); }
.mono { font-family: 'Cascadia Code', 'Fira Code', monospace; font-size: 12px; }
.state-badge { font-size: 11px; padding: 2px 8px; border-radius: 10px; font-weight: 600; }
.state-idle { background: rgba(113,128,150,0.2); color: var(--muted); }
.state-running { background: rgba(72,187,120,0.15); color: var(--success); }
.state-stopped { background: rgba(246,173,85,0.15); color: var(--warning); }
.actions { display: flex; gap: 4px; }
.btn-sm { padding: 3px 10px; font-size: 11px; font-weight: 600; border-radius: 6px; }
.btn-go { background: var(--success); color: #fff; }
.btn-stop { background: var(--warning); color: #000; }
.btn-del { background: rgba(252,129,129,0.15); color: var(--danger); }
.btn-edit { background: rgba(79,156,249,0.15); color: var(--accent); }
.btn-edit:disabled { opacity: 0.3; }
.btn-del:disabled { opacity: 0.3; }
</style>
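
The Rate and packet-count columns go through `formatRate` and `formatNum`, which pick a unit by threshold. The standalone copy below shows which unit each threshold produces. The flow objects are made-up inputs, and the local variable is named `rate` here only for readability.

```javascript
// Standalone copies of the two formatters used by the table.
function formatRate(f) {
  const rate = f.rate_pps || 0
  const mbps = (rate * (f.frame_size || 64) * 8) / 1_000_000
  if (mbps >= 1) return mbps.toFixed(1) + ' Mbps'
  if (mbps >= 0.001) return (mbps * 1000).toFixed(0) + ' Kbps'
  return rate + ' pps'
}

function formatNum(n) {
  if (n >= 1000000) return (n / 1000000).toFixed(1) + 'M'
  if (n >= 1000) return (n / 1000).toFixed(1) + 'K'
  return n
}

// 1000 pps x 512 B x 8 = 4.096 Mbit/s
console.log(formatRate({ rate_pps: 1000, frame_size: 512 })) // 4.1 Mbps
// 10 pps x 64 B x 8 = 5.12 kbit/s
console.log(formatRate({ rate_pps: 10, frame_size: 64 }))    // 5 Kbps
// 1 pps x 64 B x 8 = 512 bit/s, below 1 Kbps, so pps is shown
console.log(formatRate({ rate_pps: 1, frame_size: 64 }))     // 1 pps
console.log(formatNum(1500), formatNum(2500000), formatNum(999)) // 1.5K 2.5M 999
```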


@ -0,0 +1,76 @@
<template>
<div class="quick-ping">
<div class="ping-row">
<input
v-model="target"
placeholder="IP address to ping..."
@keyup.enter="runPing"
:disabled="pinging"
/>
<button class="btn-ping" @click="runPing" :disabled="!target || pinging">
{{ pinging ? 'Pinging...' : 'Ping' }}
</button>
</div>
<div v-if="result" class="ping-result" :class="result.reachable ? 'reachable' : 'unreachable'">
<div class="ping-summary">
<span class="ping-target">{{ result.target }}</span>
<span v-if="result.reachable" class="ping-status ok">Reachable</span>
<span v-else class="ping-status fail">Unreachable</span>
</div>
<div v-if="result.reachable && result.stats" class="ping-stats">
<span>{{ result.received }}/{{ result.sent }} replies</span>
<span>Min: {{ result.stats.min_ms }}ms</span>
<span>Avg: {{ result.stats.avg_ms }}ms</span>
<span>Max: {{ result.stats.max_ms }}ms</span>
<span>Loss: {{ result.loss_pct }}%</span>
</div>
<div v-if="result.error" class="ping-error">{{ result.error }}</div>
</div>
</div>
</template>
<script setup>
import { ref } from 'vue'
import { api } from '../api.js'
const target = ref('')
const pinging = ref(false)
const result = ref(null)
async function runPing() {
if (!target.value || pinging.value) return
pinging.value = true
result.value = null
try {
result.value = await api.ping(target.value, 5)
} catch (e) {
result.value = { target: target.value, reachable: false, error: e.message }
} finally {
pinging.value = false
}
}
</script>
<style scoped>
.quick-ping { margin-bottom: 16px; }
.ping-row { display: flex; gap: 6px; }
.ping-row input { flex: 1; }
.btn-ping {
padding: 6px 16px; font-weight: 600; font-size: 13px;
background: var(--accent); color: #fff; white-space: nowrap;
}
.btn-ping:disabled { opacity: 0.4; }
.ping-result {
margin-top: 8px; padding: 8px 12px;
border-radius: var(--radius); font-size: 13px;
}
.ping-result.reachable { background: rgba(72,187,120,0.1); border: 1px solid rgba(72,187,120,0.3); }
.ping-result.unreachable { background: rgba(252,129,129,0.1); border: 1px solid rgba(252,129,129,0.3); }
.ping-summary { display: flex; align-items: center; gap: 10px; }
.ping-target { font-weight: 600; font-family: monospace; }
.ping-status { font-size: 11px; padding: 2px 8px; border-radius: 10px; font-weight: 600; }
.ping-status.ok { background: rgba(72,187,120,0.2); color: var(--success); }
.ping-status.fail { background: rgba(252,129,129,0.2); color: var(--danger); }
.ping-stats { display: flex; gap: 12px; margin-top: 6px; font-size: 12px; color: var(--muted); font-family: monospace; }
.ping-error { margin-top: 4px; color: var(--danger); font-size: 12px; }
</style>


@ -0,0 +1,157 @@
<template>
<div class="results-panel">
<h3>Test Results</h3>
<div v-if="!completedTests.length" class="empty">No completed tests yet.</div>
<div v-for="t in completedTests" :key="t.id" class="result-card">
<div class="result-header">
<strong>{{ t.type }} Test</strong>
<span class="result-time">{{ t.completed_at }}</span>
<div class="export-btns">
<button class="btn-sm btn-export" @click="exportJSON(t)">Export JSON</button>
<button class="btn-sm btn-export" @click="exportCSV(t)">Export CSV</button>
</div>
</div>
<div v-if="t.error" class="error-msg">Error: {{ t.error }}</div>
<div v-if="t.results && Object.keys(t.results).length" class="result-table">
<!-- Frame Loss: array of rate steps per frame size -->
<template v-if="t.type === 'frame_loss'">
<div v-for="(rates, size) in t.results" :key="size" class="fl-section">
<div class="fl-title">Frame Size: {{ size }} B</div>
<table>
<thead>
<tr><th>Rate %</th><th>Rate (pps)</th><th>TX Packets</th><th>RX Packets</th><th>Loss %</th></tr>
</thead>
<tbody>
<tr v-for="r in rates" :key="r.rate_pct">
<td class="mono">{{ r.rate_pct }}%</td>
<td class="mono">{{ r.rate_pps }}</td>
<td class="mono">{{ r.tx_packets }}</td>
<td class="mono">{{ r.rx_packets }}</td>
<td class="mono">{{ r.loss_pct }}%</td>
</tr>
</tbody>
</table>
</div>
</template>
<!-- Other test types -->
<table v-else>
<thead>
<tr>
<th>Frame Size (B)</th>
<th v-for="col in resultColumns(t)" :key="col">{{ col }}</th>
</tr>
</thead>
<tbody>
<tr v-for="(val, size) in t.results" :key="size">
<td>{{ size }}</td>
<td v-for="col in resultColumns(t)" :key="col" class="mono">
{{ formatVal(val, col) }}
</td>
</tr>
</tbody>
</table>
</div>
<div v-if="t.results && t.type !== 'frame_loss'" class="result-chart">
<div class="bar-chart">
<div v-for="(val, size) in t.results" :key="size" class="bar-item">
<div class="bar-fill" :style="{ height: barHeight(t, val) + '%' }"></div>
<span class="bar-label">{{ size }}B</span>
</div>
</div>
</div>
</div>
</div>
</template>
<script setup>
import { computed } from 'vue'
const props = defineProps({ tests: Array })
const completedTests = computed(() =>
(props.tests || []).filter(t => (t.state === 'complete' || t.state === 'error') && (t.results || t.error))
)
function resultColumns(t) {
if (t.type === 'throughput') return ['Max Rate (pps)', 'Throughput (Mbps)']
if (t.type === 'latency') return ['Min (ms)', 'Avg (ms)', 'Max (ms)', 'Jitter (ms)']
if (t.type === 'frame_loss') return ['Loss %']
if (t.type === 'back_to_back') return ['Max Burst']
return ['Value']
}
function formatVal(val, col) {
// `typeof null` is 'object', so check for a real object before reading fields.
if (val && typeof val === 'object') {
if (col.includes('Rate')) return val.max_throughput_pps ?? val.max_rate_pps ?? '-'
if (col.includes('Throughput')) {
const pps = val.max_throughput_pps ?? val.max_rate_pps ?? 0
const fs = val.frame_size ?? 64
return pps ? ((pps * fs * 8) / 1_000_000).toFixed(2) : '-'
}
if (col.includes('Min')) return val.min_ms != null ? val.min_ms.toFixed(2) : '-'
if (col.includes('Avg')) return val.avg_ms != null ? val.avg_ms.toFixed(2) : '-'
if (col.includes('Max') && col.includes('ms')) return val.max_ms != null ? val.max_ms.toFixed(2) : '-'
if (col.includes('Jitter')) return val.jitter_ms != null ? val.jitter_ms.toFixed(2) : '-'
if (col.includes('Loss')) return val.loss_pct ?? '-'
if (col.includes('Burst')) return val.max_burst_frames ?? val.max_burst ?? '-'
return JSON.stringify(val)
}
return val
}
// Headline metric of one result row for the bar chart; null-safe.
function barMetric(r) {
if (r && typeof r === 'object') return r.max_throughput_pps || r.max_rate_pps || r.avg_ms || r.loss_pct || r.max_burst_frames || 0
return Number(r) || 0
}
function barHeight(t, val) {
const maxVal = Math.max(...Object.values(t.results).map(barMetric), 1)
return Math.min(100, Math.max(5, (barMetric(val) / maxVal) * 100))
}
function exportJSON(t) {
const blob = new Blob([JSON.stringify(t, null, 2)], { type: 'application/json' })
downloadBlob(blob, `test_${t.type}_${t.id}.json`)
}
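function exportCSV(t) {
if (!t.results) return
let csv
if (t.type === 'frame_loss') {
// Frame-loss results hold an array of rate steps per frame size: one row per step.
csv = 'Frame Size,Rate %,Rate (pps),TX Packets,RX Packets,Loss %\n'
for (const [size, rates] of Object.entries(t.results)) {
for (const r of (Array.isArray(rates) ? rates : [])) {
csv += [size, r.rate_pct, r.rate_pps, r.tx_packets, r.rx_packets, r.loss_pct].join(',') + '\n'
}
}
} else {
const cols = resultColumns(t)
csv = 'Frame Size,' + cols.join(',') + '\n'
for (const [size, val] of Object.entries(t.results)) {
csv += size + ',' + cols.map(c => formatVal(val, c)).join(',') + '\n'
}
}
downloadBlob(new Blob([csv], { type: 'text/csv' }), `test_${t.type}_${t.id}.csv`)
}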
function downloadBlob(blob, name) {
const url = URL.createObjectURL(blob)
const a = document.createElement('a')
a.href = url; a.download = name; a.click()
URL.revokeObjectURL(url)
}
</script>
<style scoped>
h3 { font-size: 15px; margin-bottom: 12px; color: var(--accent); }
.empty { color: var(--muted); padding: 20px; text-align: center; }
.result-card { background: var(--card-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 12px; margin-bottom: 12px; }
.result-header { display: flex; align-items: center; gap: 12px; margin-bottom: 10px; flex-wrap: wrap; }
.result-header strong { font-size: 14px; text-transform: capitalize; }
.result-time { font-size: 11px; color: var(--muted); }
.export-btns { margin-left: auto; display: flex; gap: 4px; }
.btn-sm { padding: 3px 10px; font-size: 11px; font-weight: 600; border-radius: 6px; }
.btn-export { background: rgba(79,156,249,0.12); color: var(--accent); }
.btn-export:hover { background: rgba(79,156,249,0.25); }
table { width: 100%; border-collapse: collapse; }
th { font-size: 11px; color: var(--muted); text-align: left; padding: 4px 8px; border-bottom: 1px solid var(--border); }
td { font-size: 13px; padding: 4px 8px; }
.mono { font-family: monospace; }
.bar-chart { display: flex; align-items: flex-end; gap: 8px; height: 80px; margin-top: 12px; padding: 0 8px; }
.bar-item { flex: 1; display: flex; flex-direction: column; align-items: center; height: 100%; }
.bar-fill { width: 100%; background: var(--accent); border-radius: 3px 3px 0 0; min-height: 4px; transition: height 0.3s; margin-top: auto; }
.bar-label { font-size: 10px; color: var(--muted); margin-top: 4px; }
.fl-section { margin-bottom: 12px; }
.fl-title { font-size: 12px; font-weight: 600; color: var(--accent); margin-bottom: 4px; }
.error-msg { color: var(--danger); font-size: 13px; padding: 8px 0; }
</style>


@ -0,0 +1,196 @@
<template>
<div class="stats-monitor">
<h3>Live Statistics</h3>
<div class="flow-selector">
<label>Flow:</label>
<select v-model="selectedFlow">
<option value="">All Flows</option>
<option v-for="f in flows" :key="f.id" :value="f.id">{{ f.name || f.dst_ip }}</option>
</select>
</div>
<div class="stats-grid">
<div class="stat-card">
<div class="stat-value tx">{{ current.tx_pps || 0 }}</div>
<div class="stat-label">TX pps</div>
</div>
<div class="stat-card">
<div class="stat-value rx">{{ current.rx_pps || 0 }}</div>
<div class="stat-label">RX pps</div>
</div>
<div class="stat-card">
<div class="stat-value tx">{{ (current.tx_mbps || 0).toFixed(2) }}</div>
<div class="stat-label">TX Mbps</div>
</div>
<div class="stat-card">
<div class="stat-value rx">{{ (current.rx_mbps || 0).toFixed(2) }}</div>
<div class="stat-label">RX Mbps</div>
</div>
<div class="stat-card">
<div class="stat-value" :class="lossClass">{{ (current.loss_pct || 0).toFixed(1) }}%</div>
<div class="stat-label">Loss</div>
</div>
<div class="stat-card">
<div class="stat-value">{{ current.avg_latency_ms ? current.avg_latency_ms.toFixed(1) : '-' }}</div>
<div class="stat-label">Avg Latency (ms)</div>
</div>
</div>
<div class="totals">
<span>TX Packets: {{ current.tx_packets || 0 }}</span>
<span>RX Packets: {{ current.rx_packets || 0 }}</span>
<span>TX Bytes: {{ formatBytes(current.tx_bytes || 0) }}</span>
<span>RX Bytes: {{ formatBytes(current.rx_bytes || 0) }}</span>
</div>
<div class="history-chart">
<div class="chart-header">TX/RX Rate History (last 60s)</div>
<div class="sparkline">
<div v-for="(s, i) in history" :key="i" class="spark-bar">
<div class="spark-tx" :style="{ height: sparkHeight(s.tx_pps) + 'px' }"></div>
<div class="spark-rx" :style="{ height: sparkHeight(s.rx_pps) + 'px' }"></div>
</div>
</div>
</div>
</div>
</template>
<script setup>
import { ref, computed, onMounted, onUnmounted } from 'vue'
import { api } from '../api.js'
const props = defineProps({ flows: Array })
const selectedFlow = ref('')
const current = ref({})
const history = ref([])
const lossClass = computed(() => {
const l = current.value.loss_pct || 0
if (l > 5) return 'loss-high'
if (l > 0) return 'loss-med'
return 'loss-ok'
})
let timer = null
async function fetchStats() {
try {
if (selectedFlow.value) {
// Single flow: /flows/<id>/stats returns {flow_id, counters, rates}
// This component reads the rate fields (tx_pps, rx_pps, tx_mbps, rx_mbps, loss_pct, latency)
// from rates and the packet/byte totals from counters.
const s = await api.flowStats(selectedFlow.value)
const rates = s.rates || {}
const counters = s.counters || {}
current.value = {
tx_pps: Math.round(rates.tx_pps || 0),
rx_pps: Math.round(rates.rx_pps || 0),
tx_mbps: rates.tx_mbps || 0,
rx_mbps: rates.rx_mbps || 0,
loss_pct: rates.loss_pct || 0,
avg_latency_ms: rates.latency ? rates.latency.avg_ms : null,
tx_packets: counters.tx_packets || 0,
tx_bytes: counters.tx_bytes || 0,
rx_packets: counters.rx_packets || 0,
rx_bytes: counters.rx_bytes || 0,
}
// Append to history for sparkline
history.value.push({ tx_pps: current.value.tx_pps, rx_pps: current.value.rx_pps })
if (history.value.length > 60) history.value = history.value.slice(-60)
} else {
// All flows: /stats/history returns {history: {flow_id: [samples]}}
const h = await api.statsHistory()
const allHistory = h.history || {}
// Aggregate latest sample across all flows
let txPps = 0, rxPps = 0, txMbps = 0, rxMbps = 0
let txPkts = 0, txBytes = 0, rxPkts = 0, rxBytes = 0
const latencies = []
for (const [, samples] of Object.entries(allHistory)) {
if (!samples.length) continue
const latest = samples[samples.length - 1]
txPps += latest.tx_pps || 0
rxPps += latest.rx_pps || 0
txMbps += latest.tx_mbps || 0
rxMbps += latest.rx_mbps || 0
txPkts += latest.tx_packets || 0
txBytes += latest.tx_bytes || 0
rxPkts += latest.rx_packets || 0
rxBytes += latest.rx_bytes || 0
if (latest.latency && latest.latency.avg_ms) latencies.push(latest.latency.avg_ms)
}
current.value = {
tx_pps: Math.round(txPps),
rx_pps: Math.round(rxPps),
tx_mbps: txMbps,
rx_mbps: rxMbps,
loss_pct: txPkts > 0 ? Math.max(0, ((txPkts - rxPkts) / txPkts) * 100) : 0,
avg_latency_ms: latencies.length ? latencies.reduce((a, b) => a + b, 0) / latencies.length : null,
tx_packets: txPkts,
tx_bytes: txBytes,
rx_packets: rxPkts,
rx_bytes: rxBytes,
}
// Build aggregated sparkline from history samples
// Find max sample count across all flows
const flowIds = Object.keys(allHistory)
if (flowIds.length) {
const maxLen = Math.max(...flowIds.map(id => allHistory[id].length))
const sparkData = []
for (let i = Math.max(0, maxLen - 60); i < maxLen; i++) {
let sTx = 0, sRx = 0
for (const fid of flowIds) {
const s = allHistory[fid][i]
if (s) { sTx += s.tx_pps || 0; sRx += s.rx_pps || 0 }
}
sparkData.push({ tx_pps: Math.round(sTx), rx_pps: Math.round(sRx) })
}
history.value = sparkData
}
}
} catch (_) {
// Ignore fetch errors here; the next poll (1 s later) retries.
}
}
function sparkHeight(val) {
if (!val) return 0
const max = Math.max(...history.value.map(s => Math.max(s.tx_pps || 0, s.rx_pps || 0)), 1)
return Math.max(1, (val / max) * 40)
}
function formatBytes(b) {
if (b < 1024) return b + ' B'
if (b < 1048576) return (b / 1024).toFixed(1) + ' KB'
if (b < 1073741824) return (b / 1048576).toFixed(1) + ' MB'
return (b / 1073741824).toFixed(2) + ' GB'
}
onMounted(() => { fetchStats(); timer = setInterval(fetchStats, 1000) })
onUnmounted(() => { clearInterval(timer) })
</script>
<style scoped>
h3 { font-size: 15px; margin-bottom: 12px; color: var(--accent); }
.flow-selector { margin-bottom: 12px; display: flex; align-items: center; gap: 8px; }
.flow-selector label { font-size: 12px; color: var(--muted); }
.flow-selector select { flex: 1; }
.stats-grid { display: grid; grid-template-columns: repeat(3, 1fr); gap: 10px; margin-bottom: 12px; }
.stat-card { background: var(--card-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 10px; text-align: center; }
.stat-value { font-size: 22px; font-weight: 700; font-family: monospace; }
.stat-value.tx { color: var(--accent); }
.stat-value.rx { color: var(--success); }
.stat-label { font-size: 11px; color: var(--muted); margin-top: 2px; }
.loss-ok { color: var(--success); }
.loss-med { color: var(--warning); }
.loss-high { color: var(--danger); }
.totals { display: flex; gap: 16px; font-size: 12px; color: var(--muted); margin-bottom: 16px; flex-wrap: wrap; }
.history-chart { background: var(--card-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 12px; }
.chart-header { font-size: 12px; color: var(--muted); margin-bottom: 8px; }
.sparkline { display: flex; align-items: flex-end; gap: 1px; height: 50px; }
.spark-bar { flex: 1; display: flex; flex-direction: column; justify-content: flex-end; gap: 1px; }
.spark-tx { background: var(--accent); border-radius: 1px; min-width: 2px; }
.spark-rx { background: var(--success); border-radius: 1px; min-width: 2px; }
</style>
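
The "All Flows" branch of `fetchStats` above sums the newest sample of every flow and derives loss from the summed packet counters. A minimal Python sketch of that aggregation follows. The function name `aggregate_latest` is illustrative, and the sample field names are assumed from what the component reads, not from a documented API schema.

```python
def aggregate_latest(history: dict) -> dict:
    """Sum the newest sample of every flow into one combined reading."""
    keys = ('tx_pps', 'rx_pps', 'tx_mbps', 'rx_mbps',
            'tx_packets', 'rx_packets', 'tx_bytes', 'rx_bytes')
    totals = dict.fromkeys(keys, 0)
    latencies = []
    for samples in history.values():
        if not samples:
            continue  # this flow has no samples yet
        latest = samples[-1]
        for key in keys:
            totals[key] += latest.get(key) or 0
        avg_ms = (latest.get('latency') or {}).get('avg_ms')
        if avg_ms:
            latencies.append(avg_ms)
    tx, rx = totals['tx_packets'], totals['rx_packets']
    # Loss comes from the summed counters rather than an average of per-flow loss.
    totals['loss_pct'] = max(0.0, (tx - rx) / tx * 100) if tx > 0 else 0.0
    totals['avg_latency_ms'] = sum(latencies) / len(latencies) if latencies else None
    return totals
```

Deriving loss from the totals weights each flow by its packet count, so a busy flow influences the combined figure more than an idle one.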


@ -0,0 +1,61 @@
<template>
<div class="status-bar">
<div class="status-badges">
<span class="badge" :class="connected ? 'badge-ok' : 'badge-err'">
{{ connected ? 'API Connected' : 'API Offline' }}
</span>
<span v-if="health" class="badge badge-mode" :class="'mode-' + (health.mode || 'sender')" @click="toggleMode">
{{ (health.mode || 'sender').toUpperCase() }}
</span>
<span v-if="health" class="badge badge-info">
Active Flows: {{ health.active_flows || 0 }}
</span>
<span v-if="health" class="badge badge-info">
Active Tests: {{ health.active_tests || 0 }}
</span>
</div>
</div>
</template>
<script setup>
import { computed, ref } from 'vue'
import { api } from '../api.js'
const props = defineProps({ health: Object, apiError: String })
const emit = defineEmits(['modeChanged'])
const connected = computed(() => !props.apiError && props.health)
const switching = ref(false)
async function toggleMode() {
if (switching.value || !props.health) return
const current = props.health.mode || 'sender'
const next = current === 'sender' ? 'responder' : 'sender'
if (!confirm(`Switch to ${next.toUpperCase()} mode? This will stop all active flows/tests.`)) return
switching.value = true
try {
await api.setMode(next)
emit('modeChanged')
} catch (e) {
alert('Failed to switch mode: ' + e.message)
} finally {
switching.value = false
}
}
</script>
<style scoped>
.status-bar { display: flex; align-items: center; gap: 8px; }
.status-badges { display: flex; gap: 6px; flex-wrap: wrap; }
.badge {
font-size: 11px; padding: 3px 10px; border-radius: 12px;
font-weight: 600; letter-spacing: 0.03em;
}
.badge-ok { background: rgba(72,187,120,0.15); color: var(--success); }
.badge-err { background: rgba(252,129,129,0.15); color: var(--danger); animation: pulse 1.5s infinite; }
.badge-info { background: rgba(79,156,249,0.12); color: var(--accent); }
.badge-mode { cursor: pointer; transition: background 0.2s; }
.badge-mode:hover { opacity: 0.8; }
.mode-sender { background: rgba(72,187,120,0.2); color: var(--success); }
.mode-responder { background: rgba(246,173,85,0.2); color: var(--warning); }
@keyframes pulse { 0%,100% { opacity: 1; } 50% { opacity: 0.5; } }
</style>


@ -0,0 +1,166 @@
<template>
<div class="test-builder">
<h3>RFC 2544 Test</h3>
<div class="form-row">
<label>Test Type</label>
<select v-model="form.type">
<option value="throughput">Throughput (binary search for max rate)</option>
<option value="latency">Latency (measure RTT)</option>
<option value="frame_loss">Frame Loss (loss vs rate curve)</option>
<option value="back_to_back">Back-to-Back (max burst)</option>
</select>
</div>
<div class="form-row">
<label>Destination IP</label>
<input v-model="form.dst_ip" placeholder="10.100.0.1" />
</div>
<div class="form-row-pair">
<div class="form-row">
<label>Protocol</label>
<select v-model="form.protocol">
<option value="udp">UDP</option>
<option value="icmp">ICMP</option>
<option value="tcp">TCP</option>
</select>
</div>
<div class="form-row">
<label>Source IP</label>
<input v-model="form.src_ip" placeholder="auto" />
</div>
</div>
<div class="form-row">
<label>Frame Sizes</label>
<div class="frame-sizes">
<label v-for="s in standardSizes" :key="s" class="checkbox-label">
<input type="checkbox" :value="s" v-model="form.frame_sizes" /> {{ s }}
</label>
</div>
</div>
<div class="form-row-pair">
<div class="form-row">
<label>Trial Duration (sec)</label>
<input v-model.number="form.trial_duration" type="number" min="5" max="300" />
</div>
<div class="form-row">
<label>Max Rate</label>
<input v-model.number="form.max_rate_val" type="number" min="1" step="any" />
<select v-model="form.max_rate_unit" class="rate-unit-standalone">
<option value="pps">pps</option>
<option value="kbps">Kbps</option>
<option value="mbps">Mbps</option>
</select>
</div>
</div>
<div class="form-row">
<label>Acceptable Loss %</label>
<input v-model.number="form.acceptable_loss_pct" type="number" min="0" max="100" step="0.1" />
</div>
<button class="btn btn-accent" @click="create" :disabled="!form.dst_ip">
Create & Run Test
</button>
<div class="presets-section">
<h4>Quick Presets</h4>
<div class="preset-list">
<button v-for="(p, name) in presets" :key="name" class="btn-preset" @click="loadPreset(name)">
<strong>{{ name }}</strong>
<span>{{ p.description }}</span>
</button>
</div>
</div>
</div>
</template>
<script setup>
import { reactive, ref, onMounted } from 'vue'
import { api } from '../api.js'
const emit = defineEmits(['created', 'refresh'])
const standardSizes = [64, 128, 256, 512, 1024, 1280, 1518, 2048, 4096, 9000]
const presets = ref({})
const form = reactive({
type: 'throughput',
dst_ip: '',
src_ip: '',
protocol: 'udp',
frame_sizes: [64, 512, 1518],
trial_duration: 30,
max_rate_val: 10,
max_rate_unit: 'mbps',
acceptable_loss_pct: 0.0,
})
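function computePps(val, unit) {
// Bit rates are converted to pps assuming a fixed 512-byte frame,
// regardless of which frame sizes are selected for the test.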
if (unit === 'kbps') return Math.max(1, Math.round((val * 1000) / (512 * 8)))
if (unit === 'mbps') return Math.max(1, Math.round((val * 1_000_000) / (512 * 8)))
return Math.round(val)
}
onMounted(async () => {
try { const r = await api.presets(); presets.value = r.presets || r } catch (_) {}
})
async function create() {
try {
const payload = {
type: form.type,
flow_config: {
dst_ip: form.dst_ip,
src_ip: form.src_ip || 'auto',
protocol: form.protocol,
src_port: 50000,
dst_port: 5001,
},
frame_sizes: form.frame_sizes,
trial_duration: form.trial_duration,
max_rate_pps: computePps(form.max_rate_val, form.max_rate_unit),
acceptable_loss_pct: form.acceptable_loss_pct,
}
const test = await api.createTest(payload)
await api.startTest(test.id)
emit('created')
} catch (e) { alert(e.message) }
}
async function loadPreset(name) {
const dstIp = prompt('Destination IP for this preset:', '10.100.0.100')
if (!dstIp) return
try {
await api.loadPreset(name, { dst_ip: dstIp })
emit('refresh')
} catch (e) { alert(e.message) }
}
</script>
<style scoped>
h3 { font-size: 15px; margin-bottom: 12px; color: var(--accent); }
h4 { font-size: 13px; margin: 16px 0 8px; color: var(--muted); }
.form-row { margin-bottom: 10px; }
.form-row label { display: block; font-size: 11px; color: var(--muted); margin-bottom: 3px; text-transform: uppercase; letter-spacing: 0.05em; }
.form-row input, .form-row select { width: 100%; }
.form-row-pair { display: grid; grid-template-columns: 1fr 1fr; gap: 8px; }
.rate-unit-standalone { width: 100%; margin-top: 4px; }
.frame-sizes { display: flex; flex-wrap: wrap; gap: 8px; }
.checkbox-label { font-size: 12px; display: flex; align-items: center; gap: 4px; color: var(--text); cursor: pointer; }
.btn { padding: 8px 16px; font-weight: 600; font-size: 13px; width: 100%; margin-top: 8px; }
.btn-accent { background: var(--accent); color: #fff; }
.btn-accent:disabled { opacity: 0.4; }
.preset-list { display: flex; flex-direction: column; gap: 6px; }
.btn-preset {
display: flex; flex-direction: column; align-items: flex-start;
padding: 8px 12px; background: var(--card-bg); border: 1px solid var(--border);
border-radius: var(--radius); text-align: left;
}
.btn-preset:hover { border-color: var(--accent); }
.btn-preset strong { font-size: 12px; color: var(--accent); }
.btn-preset span { font-size: 11px; color: var(--muted); }
</style>
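
The `computePps` helper in the component above converts Kbps or Mbps into packets per second using a fixed 512-byte frame. A minimal Python sketch of the same arithmetic follows, with the frame size exposed as a parameter. The name `rate_to_pps` and its `frame_size` argument are illustrative and not part of this change.

```python
import math


def rate_to_pps(value: float, unit: str, frame_size: int = 512) -> int:
    """Convert a rate given in pps, kbps, or mbps to whole packets per second."""
    bits_per_frame = frame_size * 8

    def round_half_up(x: float) -> int:
        # Mirrors JavaScript's Math.round for non-negative inputs.
        return math.floor(x + 0.5)

    if unit == 'kbps':
        return max(1, round_half_up(value * 1_000 / bits_per_frame))
    if unit == 'mbps':
        return max(1, round_half_up(value * 1_000_000 / bits_per_frame))
    return round_half_up(value)  # already in pps
```

With the default 512-byte frame, 10 Mbps comes out as 2441 pps. Passing the frame size actually under test (64 bytes, say) gives 19531 pps for the same bit rate, which shows how much the fixed assumption matters.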


@ -0,0 +1,195 @@
<template>
<div class="test-runner">
<h3>Running Tests</h3>
<div v-if="!tests.length" class="empty">No tests yet. Create one above and click "Create & Run Test".</div>
<div v-for="t in sortedTests" :key="t.id" class="test-card" :class="'state-' + t.state">
<div class="test-header">
<div class="test-title">
<strong>{{ t.type }}</strong>
<span class="test-state" :class="'ts-' + t.state">{{ t.state }}</span>
<span v-if="t.frame_sizes" class="test-sizes">{{ t.frame_sizes.length }} frame sizes</span>
</div>
<div class="test-actions">
<button v-if="t.state === 'idle'" class="btn-sm btn-go" @click="start(t.id)">Start</button>
<button v-if="t.state === 'running'" class="btn-sm btn-stop" @click="stop(t.id)">Stop</button>
<button v-if="t.state === 'complete' || t.state === 'error'" class="btn-sm btn-del" @click="del(t.id)">Remove</button>
</div>
</div>
<!-- RUNNING: live progress -->
<div v-if="t.state === 'running'" class="progress-section">
<div class="progress-detail">
<span v-if="t.progress">{{ t.progress.message }}</span>
<span v-else>Starting...</span>
<span v-if="t.progress" class="progress-counter">
{{ (t.progress.completed_sizes || []).length }}/{{ t.progress.total_frames }} sizes done
</span>
</div>
<div class="progress-bar">
<div class="progress-fill" :style="{ width: progressPct(t) + '%' }"></div>
</div>
<!-- Show partial results as they come in -->
<div v-if="t.results && Object.keys(t.results).length" class="partial-results">
<div v-for="(val, size) in t.results" :key="size" class="partial-item">
<span class="partial-size">{{ size }}B</span>
<span class="partial-val" v-if="!Array.isArray(val) && val.max_throughput_pps != null">{{ val.max_throughput_pps }} pps</span>
<span class="partial-val" v-else-if="!Array.isArray(val) && val.avg_ms != null">{{ val.avg_ms }}ms avg</span>
<span class="partial-val" v-else-if="!Array.isArray(val) && val.max_burst_frames != null">{{ val.max_burst_frames }} frames</span>
<span class="partial-val" v-else-if="Array.isArray(val)">{{ val.length }} rate steps</span>
</div>
</div>
</div>
<!-- ERROR -->
<div v-if="t.state === 'error'" class="error-msg">{{ t.error || 'Test failed' }}</div>
<!-- COMPLETE: inline results summary -->
<div v-if="t.state === 'complete' && t.results && Object.keys(t.results).length" class="results-preview">
<!-- Frame Loss has a different structure (array per size) -->
<template v-if="t.type === 'frame_loss'">
<div v-for="(rates, size) in t.results" :key="size" class="fl-section">
<div class="fl-title">Frame Size: {{ size }} B</div>
<table>
<thead>
<tr>
<th>Rate %</th>
<th>Rate (pps)</th>
<th>TX Packets</th>
<th>RX Packets</th>
<th>Loss %</th>
</tr>
</thead>
<tbody>
<tr v-for="r in rates" :key="r.rate_pct">
<td class="mono">{{ r.rate_pct }}%</td>
<td class="mono">{{ r.rate_pps }}</td>
<td class="mono">{{ r.tx_packets }}</td>
<td class="mono">{{ r.rx_packets }}</td>
<td class="mono">{{ r.loss_pct }}%</td>
</tr>
</tbody>
</table>
</div>
</template>
<!-- Other test types: single value per size -->
<table v-else>
<thead>
<tr>
<th>Frame Size</th>
<th v-if="t.type === 'throughput'">Max Rate (pps)</th>
<th v-if="t.type === 'throughput'">Throughput</th>
<th v-if="t.type === 'latency'">Avg (ms)</th>
<th v-if="t.type === 'latency'">Min/Max (ms)</th>
<th v-if="t.type === 'back_to_back'">Max Burst</th>
</tr>
</thead>
<tbody>
<tr v-for="(val, size) in t.results" :key="size">
<td>{{ size }} B</td>
<td v-if="t.type === 'throughput'" class="mono">{{ val.max_throughput_pps || '-' }}</td>
<td v-if="t.type === 'throughput'" class="mono">{{ formatMbps(val) }}</td>
<td v-if="t.type === 'latency'" class="mono">{{ val.avg_ms != null ? val.avg_ms.toFixed(2) : '-' }}</td>
<td v-if="t.type === 'latency'" class="mono">{{ val.min_ms != null && val.max_ms != null ? val.min_ms.toFixed(2) + ' / ' + val.max_ms.toFixed(2) : '-' }}</td>
<td v-if="t.type === 'back_to_back'" class="mono">{{ val.max_burst_frames ?? '-' }}</td>
</tr>
</tbody>
</table>
</div>
<div class="test-meta">
<span v-if="t.started_at">Started: {{ t.started_at }}</span>
<span v-if="t.completed_at">Completed: {{ t.completed_at }}</span>
<span v-if="t.state === 'running' && t.started_at">Elapsed: {{ elapsed(t) }}</span>
</div>
</div>
</div>
</template>
<script setup>
import { computed } from 'vue'
import { api } from '../api.js'
const props = defineProps({ tests: Array })
const emit = defineEmits(['refresh'])
const sortedTests = computed(() => {
const order = { running: 0, idle: 1, complete: 2, error: 3 }
return [...(props.tests || [])].sort((a, b) => (order[a.state] ?? 9) - (order[b.state] ?? 9))
})
function progressPct(t) {
if (!t.progress || !t.progress.total_frames) return 10
const done = (t.progress.completed_sizes || []).length
const partial = t.progress.frame_idx > done ? 0.5 : 0
return Math.min(95, ((done + partial) / t.progress.total_frames) * 100)
}
function formatMbps(val) {
const pps = val.max_throughput_pps || 0
const fs = val.frame_size || 64
if (!pps) return '-'
const mbps = (pps * fs * 8) / 1_000_000
return mbps.toFixed(1) + ' Mbps'
}
function elapsed(t) {
if (!t.started_at) return ''
const start = new Date(t.started_at).getTime()
const secs = Math.round((Date.now() - start) / 1000)
const m = Math.floor(secs / 60)
const s = secs % 60
return m > 0 ? `${m}m ${s}s` : `${s}s`
}
async function start(id) {
try { await api.startTest(id); emit('refresh') } catch (e) { alert(e.message) }
}
async function stop(id) {
try { await api.stopTest(id); emit('refresh') } catch (e) { alert(e.message) }
}
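async function del(id) {
// NOTE: no delete request is sent from this handler; it only asks the
// parent to refresh, and `id` is currently unused.
emit('refresh')
}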
</script>
<style scoped>
h3 { font-size: 15px; margin-bottom: 12px; color: var(--accent); }
.empty { color: var(--muted); padding: 16px; text-align: center; font-size: 13px; }
.test-card { background: var(--card-bg); border: 1px solid var(--border); border-radius: var(--radius); padding: 12px; margin-bottom: 10px; }
.test-card.state-running { border-color: var(--accent); }
.test-card.state-complete { border-color: var(--success); }
.test-card.state-error { border-color: var(--danger); }
.test-header { display: flex; justify-content: space-between; align-items: center; }
.test-title { display: flex; align-items: center; gap: 8px; }
.test-title strong { font-size: 14px; text-transform: capitalize; }
.test-state { font-size: 11px; padding: 2px 8px; border-radius: 10px; font-weight: 600; }
.ts-idle { background: rgba(113,128,150,0.2); color: var(--muted); }
.ts-running { background: rgba(79,156,249,0.15); color: var(--accent); }
.ts-complete { background: rgba(72,187,120,0.15); color: var(--success); }
.ts-error { background: rgba(252,129,129,0.15); color: var(--danger); }
.test-sizes { font-size: 11px; color: var(--muted); }
.test-actions { display: flex; gap: 4px; }
.btn-sm { padding: 3px 10px; font-size: 11px; font-weight: 600; border-radius: 6px; }
.btn-go { background: var(--success); color: #fff; }
.btn-stop { background: var(--warning); color: #000; }
.btn-del { background: rgba(252,129,129,0.15); color: var(--danger); }
.progress-section { margin: 10px 0; }
.progress-detail { display: flex; justify-content: space-between; font-size: 12px; color: var(--muted); margin-bottom: 6px; font-family: monospace; }
.progress-counter { color: var(--accent); font-weight: 600; }
.progress-bar { height: 6px; background: var(--border); border-radius: 3px; overflow: hidden; }
.progress-fill { height: 100%; background: var(--accent); border-radius: 3px; transition: width 0.5s; }
.partial-results { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 8px; }
.partial-item { background: rgba(72,187,120,0.1); border: 1px solid rgba(72,187,120,0.2); padding: 2px 8px; border-radius: 6px; font-size: 11px; }
.partial-size { font-weight: 600; color: var(--success); }
.partial-val { color: var(--text); margin-left: 4px; font-family: monospace; }
.error-msg { color: var(--danger); font-size: 13px; padding: 8px 0; }
.results-preview { margin-top: 10px; }
.results-preview table { width: 100%; border-collapse: collapse; }
.results-preview th { font-size: 11px; color: var(--muted); text-align: left; padding: 4px 8px; border-bottom: 1px solid var(--border); }
.results-preview td { font-size: 13px; padding: 4px 8px; }
.mono { font-family: monospace; }
.fl-section { margin-bottom: 12px; }
.fl-title { font-size: 12px; font-weight: 600; color: var(--accent); margin-bottom: 4px; }
.test-meta { display: flex; gap: 12px; margin-top: 8px; font-size: 11px; color: var(--muted); }
</style>
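
`formatMbps` in the component above turns a throughput result in pps into a bit rate. The arithmetic is sketched below in Python. `throughput_mbps` is an illustrative name, and the 64-byte default mirrors the component's fallback when a result carries no `frame_size`.

```python
def throughput_mbps(pps: float, frame_size: int = 64) -> float:
    """Bit rate in Mbps for a packet rate at a given frame size in bytes."""
    return (pps * frame_size * 8) / 1_000_000
```

Because the fallback is 64 bytes, a result for a larger frame that lacks `frame_size` would be under-reported. Taking the frame size from the results key may be a safer source.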


@ -0,0 +1,3 @@
import { createApp } from 'vue'
import App from './App.vue'
createApp(App).mount('#app')


@ -0,0 +1,15 @@
import { defineConfig } from 'vite'
import vue from '@vitejs/plugin-vue'
export default defineConfig({
base: '/traffic/',
plugins: [vue()],
server: {
proxy: {
'/api': {
target: 'http://localhost:5051',
rewrite: path => path.replace(/^\/api/, '')
}
}
}
})

traffic-gen/Dockerfile Normal file

@ -0,0 +1,8 @@
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
tcpreplay libpcap-dev procps iputils-ping && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir flask scapy psutil
COPY . /traffic-gen/
WORKDIR /traffic-gen
EXPOSE 5051
CMD ["python3", "server.py"]


@ -0,0 +1 @@
# traffic-gen engine package


@ -0,0 +1,120 @@
"""
Packet Builder - constructs Scapy packets from flow configuration.
Each generated packet embeds:
- Magic bytes b'TGEN' (4 bytes)
- Sequence number (4 bytes, big-endian)
- Sender timestamp in nanoseconds (8 bytes, big-endian)
- Padding to reach requested frame_size
"""
import struct
import time

from scapy.all import (
    Ether, IP, UDP, TCP, ICMP, Dot1Q, Raw, conf,
)

MAGIC = b'TGEN'
HEADER_LEN = 4 + 4 + 8  # magic + seq + timestamp_ns


def _build_payload(seq: int, frame_size: int, header_overhead: int) -> Raw:
    """Build payload with magic bytes, sequence number, send timestamp,
    and padding to reach the desired frame_size."""
    timestamp_ns = time.time_ns()
    header = MAGIC + struct.pack('!I', seq) + struct.pack('!Q', timestamp_ns)
    # frame_size includes Ethernet header (14) + FCS (4) in standard accounting,
    # but Scapy doesn't add FCS, so we target frame_size - 4 total bytes on wire.
    # header_overhead accounts for Ether + IP + L4 headers already present.
    pad_len = max(0, frame_size - 4 - header_overhead - HEADER_LEN)
    return Raw(load=header + (b'\x00' * pad_len))


def stamp_payload(payload_bytes: bytes, seq: int) -> bytes:
    """Re-stamp an existing payload with a new sequence number and fresh timestamp."""
    timestamp_ns = time.time_ns()
    return (
        MAGIC
        + struct.pack('!I', seq)
        + struct.pack('!Q', timestamp_ns)
        + payload_bytes[HEADER_LEN:]
    )


def parse_payload(payload_bytes: bytes):
    """Extract (seq, timestamp_ns) from a TGEN payload, or None if invalid."""
    if len(payload_bytes) < HEADER_LEN:
        return None
    if payload_bytes[:4] != MAGIC:
        return None
    seq = struct.unpack('!I', payload_bytes[4:8])[0]
    timestamp_ns = struct.unpack('!Q', payload_bytes[8:16])[0]
    return seq, timestamp_ns


def build_packet(flow_config: dict, seq: int = 0):
    """Build a Scapy packet from a flow configuration dict.

    Required keys:
        dst_ip, protocol
    Optional keys:
        src_mac, dst_mac, src_ip, src_port, dst_port, dscp, vlan_id, frame_size
    """
    protocol = flow_config.get('protocol', 'udp').lower()
    frame_size = flow_config.get('frame_size', 512)

    # --- Layer 2 ---
    src_mac = flow_config.get('src_mac', 'auto')
    dst_mac = flow_config.get('dst_mac')
    ether_kwargs = {}
    if src_mac and src_mac != 'auto':
        ether_kwargs['src'] = src_mac
    if dst_mac:
        ether_kwargs['dst'] = dst_mac
    pkt = Ether(**ether_kwargs)
    header_overhead = 14  # Ethernet

    # --- VLAN ---
    vlan_id = flow_config.get('vlan_id')
    if vlan_id is not None:
        pkt = pkt / Dot1Q(vlan=int(vlan_id))
        header_overhead += 4

    # --- Layer 3 ---
    ip_kwargs = {'dst': flow_config['dst_ip']}
    src_ip = flow_config.get('src_ip')
    if src_ip and src_ip != 'auto':
        ip_kwargs['src'] = src_ip
    dscp = flow_config.get('dscp', 0)
    if dscp:
        ip_kwargs['tos'] = int(dscp) << 2
    pkt = pkt / IP(**ip_kwargs)
    header_overhead += 20  # IP (no options)

    # --- Layer 4 ---
    if protocol == 'udp':
        src_port = flow_config.get('src_port') or 12000
        dst_port = flow_config.get('dst_port') or 5001
        pkt = pkt / UDP(sport=int(src_port), dport=int(dst_port))
        header_overhead += 8
    elif protocol == 'tcp':
        src_port = flow_config.get('src_port') or 12000
        dst_port = flow_config.get('dst_port') or 80
        pkt = pkt / TCP(sport=int(src_port), dport=int(dst_port), flags='S')
        header_overhead += 20
    elif protocol == 'icmp':
        pkt = pkt / ICMP()
        header_overhead += 8
    else:
        raise ValueError(f'Unsupported protocol: {protocol}')

    # --- Payload ---
    pkt = pkt / _build_payload(seq, frame_size, header_overhead)
    return pkt
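
The 16-byte TGEN header and the padding arithmetic described in this module can be exercised without Scapy. The sketch below uses only the standard library. `pack_header`, `unpack_header`, and `pad_length` are illustrative names rather than functions from this change, though the layout and the formula follow the module's own constants.

```python
import struct

MAGIC = b'TGEN'
HEADER_LEN = 16  # 4-byte magic + 4-byte sequence + 8-byte timestamp


def pack_header(seq: int, timestamp_ns: int) -> bytes:
    """Serialize the header in network byte order."""
    return MAGIC + struct.pack('!IQ', seq, timestamp_ns)


def unpack_header(data: bytes):
    """Return (seq, timestamp_ns), or None if the data is not a TGEN payload."""
    if len(data) < HEADER_LEN or data[:4] != MAGIC:
        return None
    return struct.unpack('!IQ', data[4:16])


def pad_length(frame_size: int, header_overhead: int) -> int:
    """Padding bytes needed after the header; 4 bytes are reserved for the FCS."""
    return max(0, frame_size - 4 - header_overhead - HEADER_LEN)
```

For a 64-byte UDP frame the overhead is 14 + 20 + 8 = 42 bytes, leaving 2 bytes of padding. For TCP the overhead is 54 bytes, so the result clamps to 0 and the frame sent is larger than the requested 64 bytes.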


@ -0,0 +1,204 @@
"""
Responder - high-performance UDP packet receiver for TGEN traffic.
Uses multiple receiver threads on SO_REUSEPORT UDP sockets for parallel
packet processing. Each thread has its own socket and stats counters
to avoid contention.
"""
import logging
import os
import socket
import struct
import threading
import time

from engine.packet_builder import MAGIC, HEADER_LEN

log = logging.getLogger(__name__)

DEFAULT_LISTEN_PORT = 5001
RECV_BUF_SIZE = 16 * 1024 * 1024
NUM_WORKERS = int(os.environ.get('RESPONDER_WORKERS', 4))


class _WorkerStats:
    """Per-worker stats: no sharing, no locks."""
    __slots__ = ('rx_packets', 'rx_bytes', 'out_of_order', 'duplicates',
                 'last_seq', 'lat_buf', 'lat_idx', 'lat_count')

    def __init__(self):
        self.rx_packets = 0
        self.rx_bytes = 0
        self.out_of_order = 0
        self.duplicates = 0
        self.last_seq = -1
        self.lat_buf = [0.0] * 4096
        self.lat_idx = 0
        self.lat_count = 0


class Responder:
    def __init__(self, mode: str = 'log', listen_port: int = DEFAULT_LISTEN_PORT):
        self._mode = mode
        self._listen_port = listen_port
        self._sockets = []
        self._threads = []
        self._workers = []
        self._running = False
        self._stop_event = threading.Event()

    def start(self, interface: str = None):
        if self._running:
            return
        self._stop_event.clear()
        n = NUM_WORKERS
        for i in range(n):
            sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            try:
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, RECV_BUF_SIZE)
            except OSError:
                pass
            sock.settimeout(0.5)
            sock.bind(('0.0.0.0', self._listen_port))
            self._sockets.append(sock)
            ws = _WorkerStats()
            self._workers.append(ws)
            t = threading.Thread(target=self._recv_loop, args=(sock, ws),
                                 daemon=True, name=f'responder-rx-{i}')
            self._threads.append(t)
            t.start()
        actual_buf = self._sockets[0].getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        self._running = True
        log.info('Responder started on port=%d mode=%s workers=%d rcvbuf=%d',
                 self._listen_port, self._mode, n, actual_buf)

    def stop(self):
        self._stop_event.set()
        for t in self._threads:
            if t.is_alive():
                t.join(timeout=3)
        for s in self._sockets:
            try:
                s.close()
            except Exception:
                pass
        self._sockets.clear()
        self._threads.clear()
        self._workers.clear()
        self._running = False
        log.info('Responder stopped')

    def is_running(self) -> bool:
        return self._running

    def get_stats(self) -> dict:
        rx_packets = 0
        rx_bytes = 0
        out_of_order = 0
        duplicates = 0
        all_lat = []
        for ws in self._workers:
            rx_packets += ws.rx_packets
            rx_bytes += ws.rx_bytes
            out_of_order += ws.out_of_order
            duplicates += ws.duplicates
            # Ring buffer: once it has wrapped, every slot holds a valid sample.
            n = min(ws.lat_count, len(ws.lat_buf))
            if n > 0:
                all_lat.extend(ws.lat_buf[:n])
        latency = {}
        if all_lat:
            avg = sum(all_lat) / len(all_lat)
            mn = min(all_lat)
            mx = max(all_lat)
            jitter = 0.0
            if len(all_lat) > 1:
                # Mean absolute difference between consecutive samples.
                jitter = sum(abs(all_lat[i] - all_lat[i-1]) for i in range(1, len(all_lat))) / (len(all_lat) - 1)
            latency = {
                'min_ms': round(mn, 3),
                'max_ms': round(mx, 3),
                'avg_ms': round(avg, 3),
                'jitter_ms': round(jitter, 3),
                'samples': len(all_lat),
            }
        return {
            'rx_packets': rx_packets,
            'rx_bytes': rx_bytes,
            'out_of_order': out_of_order,
            'duplicates': duplicates,
            'latency': latency,
            'running': self._running,
        }

    def reset_stats(self):
        for ws in self._workers:
            ws.rx_packets = 0
            ws.rx_bytes = 0
            ws.last_seq = -1
            ws.out_of_order = 0
            ws.duplicates = 0
            ws.lat_idx = 0
            ws.lat_count = 0
        log.info('Responder stats reset')
def _recv_loop(self, sock, ws):
"""Per-worker receive loop."""
echo = self._mode == 'echo'
recvfrom = sock.recvfrom
time_ns = time.time_ns
stop_is_set = self._stop_event.is_set
lat_buf = ws.lat_buf
lat_buf_len = len(lat_buf)
magic = MAGIC
while not stop_is_set():
try:
data, addr = recvfrom(65535)
except socket.timeout:
continue
except OSError:
if stop_is_set():
break
raise
rx_ns = time_ns()
dlen = len(data)
if dlen < HEADER_LEN or data[:4] != magic:
continue
seq = int.from_bytes(data[4:8], 'big')
sender_ns = int.from_bytes(data[8:16], 'big')
ws.rx_packets += 1
ws.rx_bytes += dlen
last = ws.last_seq
if seq == last:
ws.duplicates += 1
elif seq < last:
ws.out_of_order += 1
ws.last_seq = seq
lat_ms = (rx_ns - sender_ns) / 1_000_000
if 0 < lat_ms < 60000:
idx = ws.lat_idx
lat_buf[idx] = lat_ms
ws.lat_idx = (idx + 1) % lat_buf_len
ws.lat_count += 1
if echo:
try:
sock.sendto(data + struct.pack('!Q', rx_ns), addr)
except Exception:
pass
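
The receive loop above reads a fixed header from each datagram: a 4-byte magic, a 4-byte big-endian sequence number, and an 8-byte big-endian send timestamp in nanoseconds. The sketch below shows that layout in isolation. `MAGIC = b'TGEN'` and `HEADER_LEN = 16` are assumptions, since `engine.packet_builder` is not part of this diff, and `pack_header`, `parse_header`, and `one_way_latency_ms` are hypothetical helper names.

```python
import struct

# Assumed values: the real constants are defined in engine.packet_builder.
MAGIC = b'TGEN'
HEADER_LEN = 16  # 4-byte magic + 4-byte sequence + 8-byte send timestamp (ns)


def pack_header(seq: int, sent_ns: int) -> bytes:
    """Build the header the way the sender stamps it: network byte order."""
    return MAGIC + struct.pack('!IQ', seq, sent_ns)


def parse_header(data: bytes):
    """Return (seq, sent_ns), or None for short or foreign datagrams."""
    if len(data) < HEADER_LEN or data[:4] != MAGIC:
        return None
    seq = int.from_bytes(data[4:8], 'big')
    sent_ns = int.from_bytes(data[8:16], 'big')
    return seq, sent_ns


def one_way_latency_ms(sent_ns: int, rx_ns: int) -> float:
    """Latency in milliseconds between the two nanosecond timestamps."""
    return (rx_ns - sent_ns) / 1_000_000


header = pack_header(seq=7, sent_ns=1_000_000)
print(parse_header(header), one_way_latency_ms(1_000_000, 3_500_000))
```

The one-way figure is only meaningful when the sender and responder clocks are synchronized; the responder's `0 < lat_ms < 60000` filter discards samples that fall outside that window.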


@@ -0,0 +1,435 @@
"""
RFC 2544 test implementations:
- ThroughputTest: binary search for max zero-loss throughput
- LatencyTest: measure latency at a given rate
- FrameLossTest: measure loss at decreasing rates
- BackToBackTest: find max burst length with zero loss
"""
import logging
import socket
import struct
import threading
import time
import urllib.request
import json
from typing import Dict, List, Optional
from scapy.all import send, sr, conf, IP, ICMP
from engine.packet_builder import build_packet, parse_payload, MAGIC
log = logging.getLogger(__name__)
conf.verb = 0
class _BaseTest:
"""Base class for RFC 2544 tests."""
def __init__(self, test_id: str, flow_config: dict, frame_sizes: List[int],
trial_duration: float = 60, max_rate_pps: int = 10000,
acceptable_loss_pct: float = 0.0, responder_url: Optional[str] = None):
self.test_id = test_id
self.flow_config = dict(flow_config)
self.frame_sizes = frame_sizes
self.trial_duration = trial_duration
self.max_rate_pps = max_rate_pps
self.acceptable_loss_pct = acceptable_loss_pct
self.responder_url = responder_url # e.g. "http://172.30.0.10:5053"
self.state = 'idle' # idle -> running -> complete/error
self.results = {}
self.error = None
self.started_at = None
self.completed_at = None
# Progress tracking
self._progress_msg = ''
self._current_frame_idx = 0
self._current_trial_tx = 0
self._thread: Optional[threading.Thread] = None
self._stop_event = threading.Event()
self._lock = threading.Lock()
def start(self):
if self.state == 'running':
return
self._stop_event.clear()
self.state = 'running'
self.started_at = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())
self.results = {}
self.error = None
self._thread = threading.Thread(target=self._run_safe, daemon=True,
name=f'test-{self.test_id[:8]}')
self._thread.start()
def stop(self):
self._stop_event.set()
if self._thread and self._thread.is_alive():
self._thread.join(timeout=self.trial_duration + 5)
if self.state == 'running':
self.state = 'error'
self.error = 'Cancelled by user'
def _run_safe(self):
try:
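self._run()
if self._is_stopped():
# stop() was requested: report cancellation even if the thread exits before stop() re-checks state
self.state = 'error'
self.error = 'Cancelled by user'
elif self.state == 'running':
self.state = 'complete'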
except Exception as e:
log.error('Test %s error: %s', self.test_id[:8], e)
self.state = 'error'
self.error = str(e)
finally:
self.completed_at = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime())
def _run(self):
raise NotImplementedError
def _is_stopped(self) -> bool:
return self._stop_event.is_set()
def _responder_reset(self):
"""Reset responder stats before a trial."""
if not self.responder_url:
return
try:
req = urllib.request.Request(
f'{self.responder_url}/responder/reset', method='POST',
data=b'{}', headers={'Content-Type': 'application/json'})
urllib.request.urlopen(req, timeout=3)
except Exception as e:
log.warning('Responder reset failed: %s', e)
def _responder_stats(self) -> Optional[dict]:
"""Query responder for rx stats after a trial."""
if not self.responder_url:
return None
try:
req = urllib.request.Request(f'{self.responder_url}/responder/stats')
resp = urllib.request.urlopen(req, timeout=5)
return json.loads(resp.read())
except Exception as e:
log.warning('Responder stats query failed: %s', e)
return None
def _send_trial(self, frame_size: int, rate_pps: int, duration: float):
"""Send packets at a given rate for a duration. Returns (tx_count, rx_count, latencies)."""
flow = dict(self.flow_config)
flow['frame_size'] = frame_size
protocol = flow.get('protocol', 'udp').lower()
# Reset responder counters before trial
self._responder_reset()
tx_count = 0
rx_count = 0
latencies = []
start = time.time()
if protocol == 'icmp':
# ICMP: use sr() to measure latency from responses
seq = 0
while (time.time() - start) < duration and not self._is_stopped():
pkt = build_packet(flow, seq=seq)
seq += 1
answered, _ = sr(pkt[IP], timeout=1, verbose=0)
tx_count += 1
for sent_pkt, recv_pkt in answered:
rx_count += 1
rtt_ms = (recv_pkt.time - sent_pkt.sent_time) * 1000
latencies.append(rtt_ms)
elapsed = time.time() - start
expected = elapsed * rate_pps
if tx_count > expected:
sleep_time = (tx_count - expected) / rate_pps
if sleep_time > 0:
self._stop_event.wait(min(sleep_time, 0.1))
else:
# UDP/TCP: high-performance raw socket path
dst_ip = flow['dst_ip']
pkt_template = build_packet(flow, seq=0)
ip_template = bytes(pkt_template[pkt_template.firstlayer().payload.__class__])
magic_offset = ip_template.find(MAGIC)
# Find and zero UDP checksum in template so receivers accept packets
# IP header length from IHL field (byte 0, low nibble) * 4
ip_ihl = (ip_template[0] & 0x0F) * 4
ip_proto = ip_template[9] # protocol field
udp_csum_offset = ip_ihl + 6 if ip_proto == 17 else -1 # 17 = UDP
raw_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
raw_sock.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
batch_size = max(1, min(rate_pps // 5, 500))
interval = batch_size / rate_pps if rate_pps > 0 else 1.0
seq = 0
try:
while (time.time() - start) < duration and not self._is_stopped():
batch_start = time.time()
for _ in range(batch_size):
pkt_bytes = bytearray(ip_template)
if magic_offset >= 0:
struct.pack_into('!I', pkt_bytes, magic_offset + 4, seq)
struct.pack_into('!Q', pkt_bytes, magic_offset + 8, time.time_ns())
pkt_bytes[10:12] = b'\x00\x00' # zero IP checksum
if udp_csum_offset > 0:
pkt_bytes[udp_csum_offset:udp_csum_offset + 2] = b'\x00\x00'
try:
raw_sock.sendto(bytes(pkt_bytes), (dst_ip, 0))
tx_count += 1
except Exception:
pass
seq += 1
batch_elapsed = time.time() - batch_start
sleep_time = interval - batch_elapsed
if sleep_time > 0:
self._stop_event.wait(sleep_time)
finally:
raw_sock.close()
# Query responder for actual rx stats (UDP/TCP path)
if protocol != 'icmp':
resp_stats = self._responder_stats()
if resp_stats and resp_stats.get('rx_packets', 0) > 0:
rx_count = resp_stats['rx_packets']
lat = resp_stats.get('latency', {})
if lat.get('samples', 0) > 0:
latencies = [lat['avg_ms']] # Use avg as representative
return tx_count, rx_count, latencies
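# Sketch (not part of the original change): the raw-socket path above paces
# traffic by sending a batch and then sleeping for the rest of the batch
# interval. The standalone function below repeats that arithmetic so the
# resulting batch sizes can be inspected; `pacing` is a hypothetical name.

```python
def pacing(rate_pps: int):
    """Return (batch_size, interval_s) using the same arithmetic as the send loop."""
    # Aim for about five batches per second, capped at 500 packets per batch.
    batch_size = max(1, min(rate_pps // 5, 500))
    # Time budget for one batch; falls back to 1 s when the rate is zero.
    interval = batch_size / rate_pps if rate_pps > 0 else 1.0
    return batch_size, interval


for rate in (3, 100, 10_000, 1_000_000):
    size, interval = pacing(rate)
    print(f'{rate:>9} pps -> batch of {size:>3} every {interval * 1000:.3f} ms')
```

# At 10,000 pps this yields 500 back-to-back packets every 50 ms, so the
# traffic is bursty rather than evenly spaced; that is worth keeping in mind
# when interpreting loss figures.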
def get_info(self) -> dict:
# Reverse-lookup the slug for this test class
type_slug = next((k for k, v in TEST_TYPES.items() if v is self.__class__), self.__class__.__name__)
info = {
'id': self.test_id,
'test_id': self.test_id,
'type': type_slug,
'state': self.state,
'results': self.results,
'error': self.error,
'frame_sizes': self.frame_sizes,
'started_at': self.started_at,
'completed_at': self.completed_at,
}
if self.state == 'running':
info['progress'] = {
'frame_idx': self._current_frame_idx,
'total_frames': len(self.frame_sizes),
'message': self._progress_msg,
'completed_sizes': list(self.results.keys()),
}
return info
class ThroughputTest(_BaseTest):
"""Binary search for maximum throughput with acceptable loss."""
def _run(self):
for idx, fs in enumerate(self.frame_sizes):
if self._is_stopped():
break
self._current_frame_idx = idx
low = 0
high = self.max_rate_pps
best_rate = 0
convergence_threshold = max(1, int(self.max_rate_pps * 0.01))
step = 0
log.info('Throughput test: frame_size=%d, searching [%d, %d] pps', fs, low, high)
while (high - low) > convergence_threshold and not self._is_stopped():
mid = (low + high) // 2
if mid == 0:
break
step += 1
self._progress_msg = f'Frame {fs}B: trial {step}, testing {mid} pps [{low}-{high}]'
tx, rx, _ = self._send_trial(fs, mid, self.trial_duration)
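measurable = bool(self.responder_url) or self.flow_config.get('protocol', 'udp').lower() == 'icmp'
if tx == 0:
loss_pct = 100.0
elif rx > 0 or measurable:
# With a responder or ICMP replies, rx == 0 means every frame was lost
loss_pct = ((tx - rx) / tx) * 100
else:
loss_pct = 0.0 # No responder and not ICMP: loss cannot be measured, assume success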
log.info(' frame=%d rate=%d tx=%d rx=%d loss=%.2f%%',
fs, mid, tx, rx, loss_pct)
if loss_pct <= self.acceptable_loss_pct:
best_rate = mid
low = mid + 1
else:
high = mid - 1
self.results[str(fs)] = {
'max_throughput_pps': best_rate,
'frame_size': fs,
}
log.info('Throughput result: frame_size=%d -> %d pps', fs, best_rate)
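# Sketch (not part of the original change): the binary search in
# ThroughputTest._run, isolated from packet sending. `search_max_rate` and
# `fake_dut` are hypothetical names; `fake_dut` stands in for a real trial and
# simply reports loss above 7000 pps.

```python
def search_max_rate(max_rate_pps, loss_pct_at, acceptable_loss_pct=0.0):
    """Binary-search the highest rate whose measured loss is acceptable."""
    low, high, best = 0, max_rate_pps, 0
    # Stop once the search window is within 1% of the maximum rate.
    threshold = max(1, int(max_rate_pps * 0.01))
    while (high - low) > threshold:
        mid = (low + high) // 2
        if mid == 0:
            break
        if loss_pct_at(mid) <= acceptable_loss_pct:
            best = mid        # this rate passed, try higher
            low = mid + 1
        else:
            high = mid - 1    # this rate lost frames, try lower
    return best


def fake_dut(rate_pps):
    """Hypothetical device: lossless up to 7000 pps, 5% loss above that."""
    return 0.0 if rate_pps <= 7000 else 5.0


print(search_max_rate(10_000, fake_dut))
```

# Because the loop exits once the window is narrower than the threshold, the
# reported rate can sit slightly below the true maximum, by roughly the
# threshold at most.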
class LatencyTest(_BaseTest):
"""Measure latency at a specified rate."""
def _run(self):
rate = self.flow_config.get('rate_pps', 100)
for idx, fs in enumerate(self.frame_sizes):
if self._is_stopped():
break
self._current_frame_idx = idx
self._progress_msg = f'Frame {fs}B: sending at {rate} pps for {self.trial_duration}s'
log.info('Latency test: frame_size=%d at %d pps for %ds', fs, rate, self.trial_duration)
_, _, latencies = self._send_trial(fs, rate, self.trial_duration)
if latencies:
avg_ms = sum(latencies) / len(latencies)
min_ms = min(latencies)
max_ms = max(latencies)
jitter_ms = (
sum(abs(latencies[i] - latencies[i - 1]) for i in range(1, len(latencies)))
/ max(1, len(latencies) - 1)
) if len(latencies) > 1 else 0.0
self.results[str(fs)] = {
'frame_size': fs,
'min_ms': round(min_ms, 3),
'avg_ms': round(avg_ms, 3),
'max_ms': round(max_ms, 3),
'jitter_ms': round(jitter_ms, 3),
'samples': len(latencies),
}
else:
self.results[str(fs)] = {
'frame_size': fs,
'min_ms': None, 'avg_ms': None, 'max_ms': None,
'jitter_ms': None, 'samples': 0,
'note': 'No responses received (use ICMP or configure responder)',
}
log.info('Latency result: frame_size=%d -> %s', fs, self.results[str(fs)])
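# Sketch (not part of the original change): the jitter figure computed in
# LatencyTest._run is the mean absolute difference between consecutive
# latency samples. `latency_summary` is a hypothetical helper that reproduces
# that calculation on a plain list.

```python
def latency_summary(samples):
    """Summarize latency samples (ms); returns None when there are no samples."""
    if not samples:
        return None
    # Jitter: average of |sample[i] - sample[i-1]| over consecutive pairs.
    deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
    jitter = sum(deltas) / len(deltas) if deltas else 0.0
    return {
        'min_ms': round(min(samples), 3),
        'avg_ms': round(sum(samples) / len(samples), 3),
        'max_ms': round(max(samples), 3),
        'jitter_ms': round(jitter, 3),
        'samples': len(samples),
    }


print(latency_summary([1.0, 2.0, 4.0]))
```

# On the UDP/TCP path _send_trial returns a single averaged value from the
# responder, so min, avg, and max are identical and jitter is 0.0 there.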
class FrameLossTest(_BaseTest):
"""Measure frame loss at decreasing rates (100%, 90%, 80%, ...)."""
def _run(self):
for idx, fs in enumerate(self.frame_sizes):
if self._is_stopped():
break
self._current_frame_idx = idx
results_for_size = []
for pct in range(100, 0, -10):
if self._is_stopped():
break
rate = int(self.max_rate_pps * pct / 100)
if rate == 0:
continue
self._progress_msg = f'Frame {fs}B: testing at {pct}% rate ({rate} pps)'
log.info('FrameLoss test: frame_size=%d rate=%d (%d%%)', fs, rate, pct)
tx, rx, _ = self._send_trial(fs, rate, self.trial_duration)
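measurable = bool(self.responder_url) or self.flow_config.get('protocol', 'udp').lower() == 'icmp'
if tx == 0:
loss_pct = 100.0
elif rx > 0 or measurable:
# With a responder or ICMP replies, rx == 0 means every frame was lost
loss_pct = ((tx - rx) / tx) * 100
else:
loss_pct = 0.0 # No responder and not ICMP: loss cannot be measured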
results_for_size.append({
'rate_pct': pct,
'rate_pps': rate,
'tx_packets': tx,
'rx_packets': rx,
'loss_pct': round(loss_pct, 3),
})
self.results[str(fs)] = results_for_size
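# Sketch (not part of the original change): the loss calculation used by the
# tests has three cases: nothing was sent, loss can be measured, or no receive
# count is available. `loss_percent` and its `measurable` flag are
# hypothetical; returning None for the unmeasurable case is one possible
# design, shown here as an alternative to reporting 0.0.

```python
from typing import Optional


def loss_percent(tx: int, rx: int, measurable: bool = True) -> Optional[float]:
    """Frame loss as a percentage of transmitted frames."""
    if tx <= 0:
        return 100.0          # nothing left the sender: treat as total loss
    if not measurable:
        return None           # no responder or replies: loss is unknown
    # Clamp at zero in case the receiver counted stray frames.
    return max(0.0, (tx - rx) / tx * 100)


print(loss_percent(1000, 990), loss_percent(100, 0), loss_percent(100, 0, measurable=False))
```

# Keeping "unknown" distinct from "zero loss" avoids a fully dropped trial
# being recorded as a pass.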
class BackToBackTest(_BaseTest):
"""Find maximum burst length with zero loss."""
def _run(self):
for idx, fs in enumerate(self.frame_sizes):
if self._is_stopped():
break
low = 1
high = self.max_rate_pps # Use max_rate_pps as max burst length
best_burst = 0
convergence = max(1, high // 100)
self._current_frame_idx = idx
log.info('BackToBack test: frame_size=%d searching burst [%d, %d]', fs, low, high)
while (high - low) > convergence and not self._is_stopped():
mid = (low + high) // 2
if mid == 0:
break
flow = dict(self.flow_config)
flow['frame_size'] = fs
protocol = flow.get('protocol', 'udp').lower()
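# Send burst of 'mid' packets as fast as possible
self._responder_reset()
tx_count = 0
rx_count = 0
for seq in range(mid):
if self._is_stopped():
break
pkt = build_packet(flow, seq=seq)
if protocol == 'icmp':
answered, _ = sr(pkt[IP], timeout=0.5, verbose=0)
tx_count += 1
rx_count += len(answered)
else:
send(pkt[IP], verbose=0)
tx_count += 1
# After the burst: UDP/TCP frames are never answered, so ask the responder what arrived
if protocol != 'icmp':
resp_stats = self._responder_stats()
if resp_stats:
rx_count = resp_stats.get('rx_packets', 0)
measurable = protocol == 'icmp' or bool(self.responder_url)
if tx_count == 0:
loss_pct = 100.0
elif measurable:
loss_pct = ((tx_count - rx_count) / tx_count) * 100
else:
loss_pct = 0.0 # No responder and not ICMP: loss cannot be measured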
log.info(' burst=%d tx=%d rx=%d loss=%.2f%%', mid, tx_count, rx_count, loss_pct)
if loss_pct <= self.acceptable_loss_pct:
best_burst = mid
low = mid + 1
else:
high = mid - 1
self.results[str(fs)] = {
'frame_size': fs,
'max_burst_frames': best_burst,
}
log.info('BackToBack result: frame_size=%d -> %d frames', fs, best_burst)
# Factory function
TEST_TYPES = {
'throughput': ThroughputTest,
'latency': LatencyTest,
'frame_loss': FrameLossTest,
'back_to_back': BackToBackTest,
}
def create_test(test_id: str, test_type: str, flow_config: dict, **kwargs):
"""Create an RFC 2544 test instance by type name."""
cls = TEST_TYPES.get(test_type)
if cls is None:
raise ValueError(f'Unknown test type: {test_type}. Available: {list(TEST_TYPES)}')
return cls(test_id=test_id, flow_config=flow_config, **kwargs)


@@ -0,0 +1,318 @@
"""
FlowSender - manages traffic generation with background threads per flow.
"""
import logging
import shutil
import socket
import struct
import threading
import time
import urllib.request
import json
from scapy.all import send, sendpfast, sr, conf
from engine.packet_builder import build_packet, stamp_payload, MAGIC, HEADER_LEN
log = logging.getLogger(__name__)
# Suppress Scapy verbosity globally
conf.verb = 0
HAS_TCPREPLAY = shutil.which('tcpreplay') is not None
class FlowSender:
"""Manages sending threads for multiple flows."""
def __init__(self):
self._lock = threading.Lock()
self._flows = {} # flow_id -> flow_config dict
self._threads = {} # flow_id -> Thread
self._stop_events = {} # flow_id -> Event
self._stats = {} # flow_id -> {tx_packets, tx_bytes, ...}
# ------------------------------------------------------------------
# Flow CRUD
# ------------------------------------------------------------------
def add_flow(self, flow_id: str, flow_config: dict):
with self._lock:
self._flows[flow_id] = flow_config
self._stats[flow_id] = {
'tx_packets': 0, 'tx_bytes': 0,
'rx_packets': 0, 'rx_bytes': 0,
'latency_samples': [],
}
def get_flow(self, flow_id: str):
with self._lock:
return self._flows.get(flow_id)
def get_all_flows(self):
with self._lock:
return dict(self._flows)
def update_flow(self, flow_id: str, updates: dict):
with self._lock:
if flow_id not in self._flows:
return False
self._flows[flow_id].update(updates)
return True
def remove_flow(self, flow_id: str):
self.stop(flow_id)
with self._lock:
self._flows.pop(flow_id, None)
self._stats.pop(flow_id, None)
# ------------------------------------------------------------------
# Start / Stop
# ------------------------------------------------------------------
def start(self, flow_id: str):
with self._lock:
if flow_id not in self._flows:
raise KeyError(f'Flow {flow_id} not found')
if flow_id in self._threads and self._threads[flow_id].is_alive():
return # already running
self._flows[flow_id]['state'] = 'running'
self._stats[flow_id] = {
'tx_packets': 0, 'tx_bytes': 0,
'rx_packets': 0, 'rx_bytes': 0,
'latency_samples': [],
}
stop_event = threading.Event()
self._stop_events[flow_id] = stop_event
t = threading.Thread(
target=self._send_loop,
args=(flow_id, stop_event),
daemon=True,
name=f'sender-{flow_id[:8]}',
)
self._threads[flow_id] = t
t.start()
def stop(self, flow_id: str):
with self._lock:
ev = self._stop_events.pop(flow_id, None)
if ev:
ev.set()
t = self._threads.pop(flow_id, None)
if flow_id in self._flows:
self._flows[flow_id]['state'] = 'stopped'
if t and t.is_alive():
t.join(timeout=5)
def is_running(self, flow_id: str) -> bool:
with self._lock:
t = self._threads.get(flow_id)
return t is not None and t.is_alive()
# ------------------------------------------------------------------
# Stats
# ------------------------------------------------------------------
def get_stats(self, flow_id: str) -> dict:
with self._lock:
s = self._stats.get(flow_id, {})
return dict(s)
def get_all_stats(self) -> dict:
with self._lock:
return {fid: dict(s) for fid, s in self._stats.items()}
# ------------------------------------------------------------------
# Internal send loop
# ------------------------------------------------------------------
def _send_loop(self, flow_id: str, stop_event: threading.Event):
with self._lock:
flow = dict(self._flows[flow_id])
rate_pps = flow.get('rate_pps', 1000)
duration = flow.get('duration', 30)
protocol = flow.get('protocol', 'udp').lower()
responder_url = flow.get('responder_url')
use_icmp_sr = (protocol == 'icmp' and not responder_url)
# Build template packet
pkt_template = build_packet(flow, seq=0)
pkt_bytes_len = len(bytes(pkt_template))
seq = 0
start_time = time.time()
last_responder_poll = 0
log.info('Flow %s: starting send loop at %d pps for %ds',
flow_id[:8], rate_pps, duration)
# Capture responder baseline so we report deltas, not cumulative totals
responder_baseline_rx = 0
responder_baseline_bytes = 0
if responder_url:
try:
base = self._fetch_responder(responder_url)
responder_baseline_rx = base.get('rx_packets', 0)
responder_baseline_bytes = base.get('rx_bytes', 0)
# Also reset responder so baseline is clean
self._reset_responder(responder_url)
responder_baseline_rx = 0
responder_baseline_bytes = 0
except Exception:
pass
raw_sock = None
try:
if use_icmp_sr:
self._send_loop_icmp(flow_id, flow, stop_event, rate_pps, duration)
return
# --- High-performance path: raw socket ---
dst_ip = flow['dst_ip']
# Build template as raw IP bytes (strip Ethernet layer)
ip_template = bytes(pkt_template[pkt_template.firstlayer().payload.__class__])
# Find where TGEN magic starts in the IP-layer bytes
magic_offset = ip_template.find(MAGIC)
# Find and zero UDP checksum offset in template
ip_ihl = (ip_template[0] & 0x0F) * 4
ip_proto = ip_template[9]
udp_csum_offset = ip_ihl + 6 if ip_proto == 17 else -1 # 17 = UDP
raw_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
raw_sock.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
# Adaptive batching: send bursts then sleep to hit target rate
batch_size = max(1, min(rate_pps // 5, 500))
interval = batch_size / rate_pps if rate_pps > 0 else 1.0
while not stop_event.is_set():
elapsed = time.time() - start_time
if duration and elapsed >= duration:
break
batch_start = time.time()
sent_this_batch = 0
for _ in range(batch_size):
pkt_bytes = bytearray(ip_template)
if magic_offset >= 0:
struct.pack_into('!I', pkt_bytes, magic_offset + 4, seq)
struct.pack_into('!Q', pkt_bytes, magic_offset + 8, time.time_ns())
pkt_bytes[10:12] = b'\x00\x00' # zero IP checksum
if udp_csum_offset > 0:
pkt_bytes[udp_csum_offset:udp_csum_offset + 2] = b'\x00\x00'
try:
raw_sock.sendto(bytes(pkt_bytes), (dst_ip, 0))
sent_this_batch += 1
except Exception:
pass
seq += 1
with self._lock:
stats = self._stats.get(flow_id)
if stats:
stats['tx_packets'] += sent_this_batch
stats['tx_bytes'] += pkt_bytes_len * sent_this_batch
# Poll responder for rx stats periodically
if responder_url and (time.time() - last_responder_poll) >= 1.0:
self._poll_responder(flow_id, responder_url,
responder_baseline_rx, responder_baseline_bytes)
last_responder_poll = time.time()
# Precise rate limiting: sleep remaining time for this batch
batch_elapsed = time.time() - batch_start
sleep_time = interval - batch_elapsed
if sleep_time > 0:
stop_event.wait(sleep_time)
except Exception as e:
log.error('Flow %s: send loop error: %s', flow_id[:8], e)
finally:
if raw_sock:
raw_sock.close()
with self._lock:
if flow_id in self._flows:
self._flows[flow_id]['state'] = 'stopped'
# Final responder poll
if responder_url:
self._poll_responder(flow_id, responder_url,
responder_baseline_rx, responder_baseline_bytes)
log.info('Flow %s: send loop finished. seq=%d', flow_id[:8], seq)
def _send_loop_icmp(self, flow_id, flow, stop_event, rate_pps, duration):
"""ICMP mode: use sr() to measure latency from router responses."""
pkt_template = build_packet(flow, seq=0)
pkt_bytes_len = len(bytes(pkt_template))
seq = 0
start_time = time.time()
try:
while not stop_event.is_set():
elapsed = time.time() - start_time
if duration and elapsed >= duration:
break
pkt = build_packet(flow, seq=seq)
answered, _ = sr(pkt[pkt.firstlayer().payload.__class__],
timeout=1, verbose=0)
with self._lock:
stats = self._stats.get(flow_id)
if stats:
stats['tx_packets'] += 1
stats['tx_bytes'] += pkt_bytes_len
for sent_pkt, recv_pkt in answered:
rtt_ms = (recv_pkt.time - sent_pkt.sent_time) * 1000
stats['rx_packets'] += 1
stats['rx_bytes'] += len(bytes(recv_pkt))
stats['latency_samples'].append(rtt_ms)
if len(stats['latency_samples']) > 1000:
stats['latency_samples'] = stats['latency_samples'][-1000:]
seq += 1
sleep_time = (1.0 / rate_pps) - (time.time() - start_time - elapsed)
if sleep_time > 0:
stop_event.wait(sleep_time)
except Exception as e:
log.error('Flow %s: ICMP send error: %s', flow_id[:8], e)
finally:
with self._lock:
if flow_id in self._flows:
self._flows[flow_id]['state'] = 'stopped'
def _fetch_responder(self, responder_url: str) -> dict:
"""Fetch raw stats from the responder."""
url = responder_url.rstrip('/') + '/responder/stats'
req = urllib.request.Request(url, method='GET')
req.add_header('Accept', 'application/json')
with urllib.request.urlopen(req, timeout=2) as resp:
return json.loads(resp.read().decode())
def _reset_responder(self, responder_url: str):
"""Reset responder counters."""
url = responder_url.rstrip('/') + '/responder/reset'
req = urllib.request.Request(url, method='POST')
req.add_header('Content-Type', 'application/json')
with urllib.request.urlopen(req, timeout=2) as resp:
resp.read()
def _poll_responder(self, flow_id: str, responder_url: str,
baseline_rx: int = 0, baseline_bytes: int = 0):
"""Poll a responder's /responder/stats endpoint for rx metrics."""
try:
data = self._fetch_responder(responder_url)
rx_pkts = data.get('rx_packets', 0) - baseline_rx
rx_bytes = data.get('rx_bytes', 0) - baseline_bytes
with self._lock:
stats = self._stats.get(flow_id)
if stats:
stats['rx_packets'] = max(0, rx_pkts)
stats['rx_bytes'] = max(0, rx_bytes)
lat = data.get('latency', {})
if lat.get('avg_ms') is not None:
stats['latency_samples'].append(lat['avg_ms'])
if len(stats['latency_samples']) > 1000:
stats['latency_samples'] = stats['latency_samples'][-1000:]
except Exception as e:
log.debug('Responder poll error for flow %s: %s', flow_id[:8], e)
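
The raw-socket path in `_send_loop` finds the UDP checksum by reading the IPv4 header length (IHL) and protocol fields from the template bytes, then zeroes the checksum after stamping each packet. The sketch below reproduces that offset calculation on a hand-built header. `udp_checksum_offset` is a hypothetical helper, and the addresses and ports are placeholder values.

```python
import struct


def udp_checksum_offset(ip_bytes: bytes) -> int:
    """Offset of the UDP checksum within an IPv4 packet, or -1 if not UDP."""
    ihl = (ip_bytes[0] & 0x0F) * 4   # header length in bytes (low nibble * 4)
    proto = ip_bytes[9]              # IPv4 protocol field
    return ihl + 6 if proto == 17 else -1  # checksum is bytes 6-7 of the UDP header


# 20-byte IPv4 header (version 4, IHL 5, protocol 17) followed by an 8-byte UDP header.
ip_header = struct.pack('!BBHHHBBH4s4s', 0x45, 0, 28, 0, 0, 64, 17, 0,
                        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
udp_header = struct.pack('!HHHH', 5000, 5001, 8, 0xBEEF)
packet = bytearray(ip_header + udp_header)

offset = udp_checksum_offset(packet)
packet[offset:offset + 2] = b'\x00\x00'  # zero checksum: "no checksum" for UDP over IPv4
print(offset, packet[offset:offset + 2])
```

Zeroing the field is needed because the stamped sequence number and timestamp change the payload, which would invalidate any checksum carried over from the template. A zero UDP checksum is valid over IPv4 only; UDP over IPv6 requires a real checksum.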

Some files were not shown because too many files have changed in this diff