obmp-docker/DOCS.md
sam dcebf15bb3 Add Phase 4: gNMI streaming telemetry and traffic generator
- gNMI integration: NETCONF script to enable gRPC on all 9 routers,
  Telegraf container with gnmi input plugin, InfluxDB for time-series
  storage, 3 Grafana telemetry dashboards (utilization, errors, combined)
- Traffic generator: Scapy-based dual-mode container (sender/responder)
  with Flask API, RFC 2544 test suite (throughput, latency, frame-loss,
  back-to-back), Vue 3 web UI with flow builder, test runner, real-time
  stats monitor, and results export
- docker-compose.yml updated with influxdb, telegraf, traffic-gen,
  traffic-gen-ui services
- Full documentation in DOCS.md sections 15-16

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 15:29:44 -07:00

OpenBMP + ExaBGP Route Injector — Full Documentation

Table of Contents

  1. What Is This Project?
  2. Architecture
  3. Prerequisites
  4. Initial Setup (First Time)
  5. IOS-XR Router Configuration
  6. Starting and Stopping
  7. Route Injection User Guide
  8. ExaBGP Control Panel (Web UI)
  9. Grafana Dashboards
  10. Sanity Checks
  11. Relevant Commands Reference
  12. Troubleshooting
  13. Data Retention
  14. Environment Variables Reference
  15. gNMI Streaming Telemetry (Phase 4)
  16. Traffic Generator (Phase 4)

1. What Is This Project?

This is a BGP monitoring lab stack built on OpenBMP and deployed via Docker Compose. It collects, stores, and visualizes BGP routing data from a Cisco IOS-XR lab network running in Cisco Modeling Labs (CML).

What it does:

  • Receives BMP (BGP Monitoring Protocol, RFC 7854) telemetry from routers on TCP port 5000
  • Streams BMP data through Kafka into a TimescaleDB/PostgreSQL database
  • Provides 30 Grafana dashboards (17 operational + 6 learning + 4 advanced analytics + 3 streaming telemetry) for real-time and historical BGP analysis
  • Includes an ExaBGP route injector that peers with the two CORE routers and injects synthetic BGP routes, enabling testing of BGP policy, route propagation, and Grafana dashboards without needing internet connectivity
  • Provides a Vue 3 web UI at :5001 for point-and-click scenario management, live route tables, and peer monitoring

The lab network:

  • AS 65020 — 9 Cisco IOS-XR routers in CML (iBGP full mesh via route-reflectors)
  • AS 65100 — ExaBGP container (eBGP peer to both CORE routers)
  • CORE-01: 10.100.0.100 (CML-R9K-CORE-01)
  • CORE-02: 10.100.0.200 (CML-R9K-CORE-02)
  • Host IP: 10.40.40.202 (ExaBGP binds here; reachable from CML management network)

2. Architecture

IOS-XR Routers (9x, AS 65020)
  BMP telemetry on TCP 5000
         |
         v
  obmp-collector (openbmp/collector:2.2.3)
         |
         v
  obmp-kafka (confluentinc/cp-kafka:7.1.1)
    + obmp-zookeeper (confluentinc/cp-zookeeper:7.1.1)
         |
         v
  obmp-psql-app (openbmp/psql-app:2.2.2)
    Java consumer — writes parsed BGP data to PostgreSQL
         |
         v
  obmp-psql (openbmp/postgres:2.2.1)
    PostgreSQL 14 + TimescaleDB
         |
         +---------> obmp-grafana (grafana/grafana:9.1.7)  :3000
         |              30 dashboards, PostgreSQL + InfluxDB datasources
         +---------> obmp-whois (openbmp/whois:2.2.0)      :4300
                       WHOIS query server backed by the DB

ExaBGP (obmp-exabgp, built locally)
  python:3.11-slim + exabgp 5.x + Flask API
  Peers eBGP to CORE-01 and CORE-02 (AS 65100 -> AS 65020)
  HTTP API on :5050 — inject/withdraw routes on demand
  Routes propagate via iBGP mesh to all 9 routers -> BMP -> DB -> Grafana

gNMI Streaming Telemetry (Phase 4):
  IOS-XR Routers (gRPC :57400)
         |
         v
  obmp-telegraf (telegraf:1.28 + gnmi plugin)
         |
         v
  obmp-influxdb (influxdb:2.7)  :8086
         |
         v
  obmp-grafana (InfluxDB datasource -> Telemetry dashboards)

Traffic Generator (Phase 4):
  obmp-traffic-gen (python:3.11 + Scapy + Flask)  :5051
    Dual-mode: sender (generate traffic) / responder (echo/log)
    RFC 2544 testing, custom packet flows
  obmp-traffic-gen-ui (Vue 3 + NGINX)  :5002

Container Summary

| Container | Image | Port(s) | Role |
|---|---|---|---|
| obmp-zookeeper | confluentinc/cp-zookeeper:7.1.1 | 2181 (internal) | Kafka coordination |
| obmp-kafka | confluentinc/cp-kafka:7.1.1 | 9092 | Message broker |
| obmp-collector | openbmp/collector:2.2.3 | 5000 | BMP receiver |
| obmp-psql-app | openbmp/psql-app:2.2.2 | 9005 | Kafka→PostgreSQL consumer |
| obmp-psql | openbmp/postgres:2.2.1 | 5432 | TimescaleDB storage |
| obmp-grafana | grafana/grafana:9.1.7 | 3000 | Visualization |
| obmp-whois | openbmp/whois:2.2.0 | 4300 | WHOIS query server |
| obmp-exabgp | local build | 5050 (host net) | BGP route injector |
| obmp-exabgp-ui | local build | 5001 (host net) | Route injector web UI |
| obmp-influxdb | influxdb:2.7 | 8086 | Time-series DB for telemetry |
| obmp-telegraf | local build | - (host net) | gNMI telemetry collector |
| obmp-traffic-gen | local build | 5051 (host net) | Scapy traffic generator |
| obmp-traffic-gen-ui | local build | 5002 (host net) | Traffic generator web UI |

3. Prerequisites

  • Docker Engine (20.10+) and Docker Compose v2
  • Host IP 10.40.40.202 reachable from the CML management network
  • CML routers with BMP configured pointing to 10.40.40.202:5000
  • CML CORE routers configured with ExaBGP as eBGP neighbor (see Section 5)
  • OBMP_DATA_ROOT directory created (default: /var/openbmp)

4. Initial Setup (First Time)

4.1 Clone the repository

git clone <this-repo-url>
cd obmp-docker

4.2 Create persistent data directories

export OBMP_DATA_ROOT=/var/openbmp
# /var is root-owned, so every mkdir needs sudo until permissions are opened up
sudo mkdir -p $OBMP_DATA_ROOT
sudo mkdir -p ${OBMP_DATA_ROOT}/config
sudo mkdir -p ${OBMP_DATA_ROOT}/kafka-data
sudo mkdir -p ${OBMP_DATA_ROOT}/zk-data
sudo mkdir -p ${OBMP_DATA_ROOT}/zk-log
sudo mkdir -p ${OBMP_DATA_ROOT}/postgres/data
sudo mkdir -p ${OBMP_DATA_ROOT}/postgres/ts
sudo mkdir -p ${OBMP_DATA_ROOT}/grafana
sudo mkdir -p ${OBMP_DATA_ROOT}/grafana/dashboards
sudo chmod -R 777 $OBMP_DATA_ROOT

4.3 Initialise the database (first run only)

Create the init trigger file — this causes psql-app to create all tables on startup:

touch ${OBMP_DATA_ROOT}/config/init_db

Warning: Do not create this file on subsequent runs unless you want to wipe and recreate the entire database.

4.4 Copy Grafana provisioning files

cp -r obmp-grafana/provisioning ${OBMP_DATA_ROOT}/grafana/
cp -r obmp-grafana/dashboards   ${OBMP_DATA_ROOT}/grafana/

4.5 Start the stack

OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d

Wait ~2 minutes for all services to initialise (especially PostgreSQL and psql-app which run schema migrations).

4.6 Verify everything is running

docker compose -p obmp ps
docker compose -p obmp logs --tail=20 psql-app

5. IOS-XR Router Configuration

The ExaBGP container peers eBGP with both CORE routers. Each CORE router must be configured with:

5.1 Route policies (apply once per router)

route-policy EXABGP_IN
  pass
end-policy

route-policy EXABGP_OUT
  drop
end-policy

5.2 BGP neighbor block

router bgp 65020
 neighbor 10.40.40.202
  remote-as 65100
  description ExaBGP-Route-Injector
  ebgp-multihop 5
  update-source MgmtEth0/RP0/CPU0/0
  !
  address-family ipv4 unicast
   route-policy EXABGP_IN in
   route-policy EXABGP_OUT out
   next-hop-self
  !
 !
!

5.3 Static route for next-hop resolution

IOS-XR BGP does not use the default route (0.0.0.0/0) to resolve BGP next-hops. A more-specific static route for the ExaBGP host subnet is required in the default VRF:

router static
 address-family ipv4 unicast
  10.40.40.0/24 10.100.0.254
 !
!
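The next-hop rule stated above can be modelled in a few lines. This is only an illustrative sketch of the described behaviour, not how IOS-XR implements it; the function name `resolving_route` and its `allow_default` flag are hypothetical.

```python
import ipaddress


def resolving_route(next_hop, routes, allow_default=False):
    """Return the longest-match route covering next_hop, or None.

    The default route (prefix length 0) is skipped unless allow_default
    is True, mirroring the behaviour described in this section.
    """
    addr = ipaddress.ip_address(next_hop)
    candidates = [
        net
        for net in map(ipaddress.ip_network, routes)
        if addr in net and (allow_default or net.prefixlen > 0)
    ]
    # Longest prefix wins; None means the next-hop is unresolved.
    return max(candidates, key=lambda net: net.prefixlen, default=None)


if __name__ == "__main__":
    # Only a default route: the ExaBGP next-hop stays unresolved.
    print(resolving_route("10.40.40.202", ["0.0.0.0/0"]))
    # With the static route from this section it resolves.
    print(resolving_route("10.40.40.202", ["0.0.0.0/0", "10.40.40.0/24"]))
```

With only 0.0.0.0/0 present the function returns None, which corresponds to injected routes being received but not selected as bestpaths.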

5.4 Config notes

| Knob | Why |
|---|---|
| remote-as 65100 | ExaBGP presents as AS 65100 (eBGP to your AS 65020 mesh) |
| ebgp-multihop 5 | Host and router are on different subnets |
| update-source MgmtEth0/RP0/CPU0/0 | ExaBGP is reachable via the management interface |
| next-hop-self | Replace ExaBGP's next-hop (10.40.40.202) with the CORE router's address when reflecting into iBGP, so that all routers can resolve the next-hop |
| EXABGP_OUT drops | Prevents the lab from advertising its own prefixes back to ExaBGP |
| Static route | Required: IOS-XR BGP will not install injected routes as bestpaths without a specific route to the next-hop |

5.5 NETCONF alternative

See exabgp/iosxr_bgp_config.md for a Python/ncclient script that pushes all of the above config programmatically.

Credentials: username=webui, password=cisco, port 830.


6. Starting and Stopping

Start all services

OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d

Stop all services (preserve data)

docker compose -p obmp down

Stop and remove all data (full reset)

docker compose -p obmp down -v
sudo rm -rf /var/openbmp

Rebuild the ExaBGP container (after code changes)

docker compose -p obmp build exabgp
docker compose -p obmp up -d exabgp

Restart a single service

docker compose -p obmp restart <service>
# e.g.:
docker compose -p obmp restart exabgp
docker compose -p obmp restart psql-app

7. Route Injection User Guide

The ExaBGP container exposes a Flask REST API on port 5050 (host network). The inject.py CLI wraps this API.

7.1 Setup

cd exabgp
pip install requests   # only needed if running inject.py from the host

7.2 Check status

python3 inject.py status

Output shows API health, active route count, and peer states:

{
  "status": "ok",
  "active_routes": 77,
  "peers": {
    "10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
    "10.100.0.200": {"state": "up", "updated": "2026-03-05T10:00:00Z"}
  }
}
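A monitoring script could consume this payload directly. The sketch below assumes the JSON shape shown above (a `peers` map keyed by peer address, each entry with a `state` field); the helper name `peers_not_up` is hypothetical and not part of inject.py.

```python
import json

# Sample payload copied from the status output above.
SAMPLE_STATUS = json.loads("""
{
  "status": "ok",
  "active_routes": 77,
  "peers": {
    "10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
    "10.100.0.200": {"state": "up", "updated": "2026-03-05T10:00:00Z"}
  }
}
""")


def peers_not_up(status: dict) -> list[str]:
    """Return the sorted addresses of peers whose state is not "up"."""
    peers = status.get("peers", {})
    return sorted(addr for addr, info in peers.items() if info.get("state") != "up")


if __name__ == "__main__":
    down = peers_not_up(SAMPLE_STATUS)
    print("all peers up" if not down else "peers down: " + ", ".join(down))
```

An empty list means both CORE sessions are established; any address in the list is a session worth checking from the router side.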

7.3 List available scenarios

python3 inject.py scenarios
Scenario Routes Description
internet_sample ~94 Partial internet table — real public prefixes, realistic AS paths (Cloudflare, Google, AWS, Azure, etc.)
churn 30 RFC documentation prefixes for announce/withdraw churn testing
blackhole 5 /32 prefixes with RTBH community (65100:666 + 65535:666)
anycast 3 Same prefixes with varying AS paths and MEDs (best-path testing)
full_table 500+ Large partial internet table with synthetic /24s
lab_prefixes 8 Enterprise/SP-style routes with communities and local-pref
convergence_test 10 Prefixes for timing BGP convergence — announce then check ip_rib_log timestamps
route_leak 10 Real prefixes re-announced with short AS paths — simulates a route leak (community 65100:999)
hijack_simulation 10 Prefixes claimed directly by AS 65100 — simulates a prefix hijack (community 65100:hijack)
te_community_steering 15 Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green)
origin_shift 5 Prefixes with changed origin AS — simulates origin migration for anomaly detection
path_diversity 10 Same prefixes with different AS paths/MEDs — demonstrates best-path selection

7.4 Load a scenario

python3 inject.py scenario internet_sample

Routes propagate: ExaBGP → CORE-01/CORE-02 (eBGP) → all 9 routers (iBGP) → BMP → Kafka → PostgreSQL → Grafana.

7.5 Withdraw a scenario

python3 inject.py withdraw-scenario internet_sample

7.6 Announce individual prefixes

python3 inject.py announce 10.0.0.0/8 \
  --as-path 65100 3356 15169 \
  --community 65100:100 \
  --med 100

7.7 Withdraw individual prefixes

python3 inject.py withdraw 10.0.0.0/8

7.8 Withdraw everything

python3 inject.py withdraw-all

7.9 Generate route churn (populate history tables)

The churn command cycles the churn scenario repeatedly, generating ip_rib_log and stats_chg_* entries that power Grafana's history dashboards.

# 5 cycles, 30 seconds apart
python3 inject.py churn --count 5 --interval 30

# Run indefinitely until Ctrl+C
python3 inject.py churn

7.10 REST API directly (curl)

BASE=http://localhost:5050

# Health
curl $BASE/healthz

# List scenarios
curl $BASE/scenarios

# Load scenario
curl -X POST $BASE/scenario/internet_sample

# Announce custom prefix
curl -X POST $BASE/announce \
  -H 'Content-Type: application/json' \
  -d '{"prefixes":["10.0.0.0/8"],"as_path":[65100,3356,15169],"communities":["65100:100"]}'

# Withdraw all
curl -X POST $BASE/withdraw/all

# Peer state
curl $BASE/peers
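The same announce call can be made from Python with only the standard library. This is a minimal sketch that reuses the endpoint and JSON fields from the curl example above; the helper name `build_announce_request` is hypothetical.

```python
import json
import urllib.request

BASE = "http://localhost:5050"  # same base URL as the curl examples


def build_announce_request(prefixes, as_path, communities=None, base=BASE):
    """Build (but do not send) a POST /announce request."""
    body = {"prefixes": list(prefixes), "as_path": list(as_path)}
    if communities:
        body["communities"] = list(communities)
    return urllib.request.Request(
        url=f"{base}/announce",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_announce_request(["10.0.0.0/8"], [65100, 3356, 15169], ["65100:100"])
    # Printing only; nothing is sent until the request is passed to urlopen().
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen(req, timeout=5)` sends it; that step needs the ExaBGP container to be running on the host.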

7.11 Adding custom scenarios

Edit exabgp/scenarios/__init__.py. Add an entry to SCENARIOS following the existing pattern:

SCENARIOS['my_scenario'] = {
    'description': 'My custom routes',
    'routes': [
        _r('192.0.2.0/24', [65100, 65200], communities=['65100:100']),
    ],
}

The scenarios/ directory is volume-mounted into the container, so edits do not require an image rebuild. However, the Python module is imported once at container start, so restart the container after editing:

docker compose -p obmp restart exabgp
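Because a bad entry only shows up after the restart, it can help to sanity-check the values before adding them. The sketch below is a standalone check and does not use the project's `_r` helper; `check_route` is a hypothetical name, and it assumes standard communities in numeric `ASN:value` form with 16-bit halves.

```python
import ipaddress


def check_route(prefix, as_path, communities=()):
    """Return a list of problems found in one route definition (empty if none)."""
    problems = []
    try:
        # strict=True rejects prefixes with host bits set, e.g. 192.0.2.1/24
        ipaddress.ip_network(prefix, strict=True)
    except ValueError as exc:
        problems.append(f"bad prefix {prefix!r}: {exc}")
    for asn in as_path:
        if not (isinstance(asn, int) and 0 < asn < 2**32):
            problems.append(f"bad ASN {asn!r}")
    for community in communities:
        parts = community.split(":")
        numeric = all(p.isascii() and p.isdigit() and int(p) < 2**16 for p in parts)
        if len(parts) != 2 or not numeric:
            problems.append(f"bad community {community!r}")
    return problems


if __name__ == "__main__":
    print(check_route("192.0.2.0/24", [65100, 65200], ["65100:100"]))
```

An empty list means the prefix, AS path, and communities all parsed cleanly.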

8. ExaBGP Control Panel (Web UI)

Access: http://10.40.40.202:5001

A Vue 3 single-page app served by NGINX that proxies /api/ to the ExaBGP Flask API on port 5050. No login required.

Layout

┌─────────────────────────────────────────────────────────────┐
│  OpenBMP Route Injector   [API OK] [77 routes] [2/2 UP]    │
├──────────────────────┬──────────────────────────────────────┤
│  SCENARIOS           │  [Routes] [Inject] [Peers] tabs      │
│                      │                                      │
│  [internet_sample]   │  Routes tab: searchable/paginated    │
│  [LOAD] [UNLOAD]     │  table with per-row Withdraw button  │
│                      │                                      │
│  [churn]             │  Inject tab: manual prefix form      │
│  [LOAD] [START CHURN]│  (prefix, AS path, communities, MED) │
│                      │                                      │
│  [blackhole] ...     │  Peers tab: per-peer UP/DOWN cards   │
├──────────────────────┴──────────────────────────────────────┤
│  Refreshing every 5s                                        │
└─────────────────────────────────────────────────────────────┘

Features

  • Live status bar: API health, active route count, peer UP/DOWN badges; auto-refreshes every 5 seconds
  • Scenario panel: Load/Unload buttons for every available scenario, with loading states and feedback
  • Churn control: Start/stop churn cycles with configurable count and interval sliders directly in the browser
  • Route table: Searchable, paginated (20/page) table of active routes; per-row Withdraw button; Withdraw All
  • Manual inject form: Announce any prefix with custom AS path, communities, MED, local-pref
  • Peer cards: Per-peer state display with UP (green) / DOWN (red pulsing) indicators

Rebuild after code changes

docker compose -p obmp build exabgp-ui
docker compose -p obmp up -d exabgp-ui

9. Grafana Dashboards

Access: http://10.40.40.202:3000

Default credentials: admin / openbmp (anonymous access also enabled)

Dashboard Categories

| Category | Dashboard | Description |
|---|---|---|
| General | OBMP Home | Overview / landing page |
| Base | Inventory | Router and peer inventory |
| Base | Looking Glass | Real-time RIB lookup by prefix |
| Base | ASN View | ASN-level routing view |
| History | Prefix History | Route change history for a prefix |
| History | Prefix History by ASN | Filtered by origin AS |
| History | Prefix History by Community | Filtered by BGP community |
| Tops | Top Prefixes | Most-updated prefixes |
| Tops | Top L3VPN Prefixes | L3VPN equivalent |
| Link State | LS Nodes | IS-IS link-state node database |
| Link State | LS Links | IS-IS link-state link database |
| Link State | LS Topology | Network topology map |
| Link State | LS Prefixes | Link-state prefix database |
| Link State | LS History | Link-state change history |
| L3VPN | L3VPN Looking Glass | VPN RIB lookup |
| L3VPN | L3VPN Prefix History | VPN route change history |
| L3VPN | L3VPN RIB Browser | Full VPN RIB browser |

History dashboards require ip_rib_log and stats_chg_* table data. Run inject.py churn to populate these.

OBMP-Learning Dashboards (folder: OBMP-Learning)

Six learning-focused dashboards in a separate folder, designed to teach BGP concepts using live lab data.

| Dashboard | UID | What it teaches |
|---|---|---|
| BGP Update Rate & Churn | obmp-learn-01 | Network stability: advertisements vs withdrawals over time from ip_rib_log; per-peer update counts |
| Peer Session Health & Flap Analysis | obmp-learn-02 | BGP session stability: state timeline, flap count, uptime %, last reset reason |
| AS Path Analysis | obmp-learn-03 | Internet topology: path length distribution, longest paths, top origin ASNs, transit frequency |
| RPKI Validation Status | obmp-learn-04 | BGP security: Valid / Invalid / NotFound breakdown; invalid routes (potential hijacks) table |
| Route Churn & Stability Score | obmp-learn-05 | Prefix stability: tiered churn score (Very Stable / Stable / Moderate / Unstable) per prefix |
| BGP Attribute Explorer | obmp-learn-06 | BGP path attributes: community list distribution, MED values, local-pref spread per peer |

RPKI note: The rpki_validator table is populated by a cron job in psql-app every 2 hours. Dashboard obmp-learn-04 will show zero counts until the cron runs — check ENABLE_RPKI=1 in docker-compose.yml.

Advanced Analytics Dashboards (folder: OBMP-Learning)

Four advanced dashboards that go beyond basic BMP monitoring, unlocking TE/SR data and providing heuristic analysis.

| Dashboard | UID | What it provides |
|---|---|---|
| Database Schema Map | obmp-learn-07 | Interactive schema reference: live table row counts, entity relationships, column details for all 33 tables and 11 views |
| TE & Segment Routing Analytics | obmp-learn-08 | Exposes TE/SR fields from BGP-LS: link bandwidth, admin groups, SRLG, SR SIDs, adjacency SIDs, protection types |
| Topology Change & Anomaly Detection | obmp-learn-09 | Heuristic analysis: link state changes over time, origin AS hijack detection, convergence timeline, route consistency |
| Link Utilization & TE Thought Experiment | obmp-learn-10 | BGP-LS capacity data (bandwidth, TE metrics) + integration guide for streaming telemetry (gNMI/MDT) |

TE/SR data note: Some TE fields (admin_group, max_link_bw, srlg, sr_adjacency_sids) may be NULL if routers don't advertise those TLVs. Enable mpls traffic-eng under IS-IS and segment-routing mpls for full data.

Database Schema Reference

A standalone database schema reference is also available at DB_SCHEMA.md in the repo root. It documents all 33 tables, 11 views, TE/SR columns, enum types, and common query patterns.


10. Sanity Checks

10.1 All containers running

docker compose -p obmp ps

All containers should show running. If any are restarting, check logs:

docker compose -p obmp logs --tail=50 <service>

10.2 ExaBGP peers up

python3 exabgp/inject.py status

Both 10.100.0.100 and 10.100.0.200 should show "state": "up".

Or check from the router side:

show bgp neighbors 10.40.40.202
show bgp summary | inc 10.40.40.202

10.3 Routes accepted by CORE routers

After loading internet_sample:

# On CORE-01 or CORE-02:
show bgp summary
# Expect: 77 accepted prefixes, 77 are bestpaths from 10.40.40.202

show bgp 8.8.8.0/24
# Expect: best path via 10.40.40.202 (eBGP), also iBGP copies from other routers

10.4 Routes in OpenBMP database

docker exec -it obmp-psql psql -U openbmp -c "
  SELECT count(DISTINCT prefix) AS unique_prefixes,
         count(DISTINCT peer_hash_id) AS peers_reporting
  FROM ip_rib
  WHERE isIPv4 = true AND isWithdrawn = false;
"

Expect ~129 unique prefixes and 56 peers_reporting (9 routers × ~6 peers each) after loading internet_sample.

10.5 Kafka is healthy

docker exec -it obmp-kafka kafka-topics --bootstrap-server localhost:29092 --list

Should show topics like openbmp.parsed.unicast_prefix, openbmp.parsed.peer, etc.

10.6 Grafana datasource

Open http://10.40.40.202:3000 → Configuration → Data Sources → OpenBMP → Test. Should return "Database Connection OK".

10.7 BMP collector receiving data

docker compose -p obmp logs --tail=30 collector

Should show connections from router management IPs.

10.8 psql-app consumer is caught up

docker compose -p obmp logs --tail=30 psql-app

Should show periodic cron job outputs (RPKI sync, IRR sync, global_ip_rib updates).


11. Relevant Commands Reference

Docker Compose

# Start stack
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d

# Stop stack
docker compose -p obmp down

# Show status
docker compose -p obmp ps

# Follow logs (all services)
docker compose -p obmp logs -f

# Follow logs (specific service)
docker compose -p obmp logs -f exabgp
docker compose -p obmp logs -f psql-app
docker compose -p obmp logs -f collector

# Rebuild and restart ExaBGP
docker compose -p obmp build exabgp && docker compose -p obmp up -d exabgp

# Restart a service
docker compose -p obmp restart psql-app

Route Injection (from exabgp/ directory)

# API health and peer states
python3 inject.py status

# List active routes
python3 inject.py routes

# List scenarios
python3 inject.py scenarios

# Load a scenario
python3 inject.py scenario internet_sample
python3 inject.py scenario churn
python3 inject.py scenario blackhole
python3 inject.py scenario full_table
python3 inject.py scenario lab_prefixes

# Withdraw a scenario
python3 inject.py withdraw-scenario internet_sample

# Withdraw all active routes
python3 inject.py withdraw-all

# Announce a specific prefix
python3 inject.py announce 10.0.0.0/8 --as-path 65100 3356 15169 --community 65100:100

# Withdraw a specific prefix
python3 inject.py withdraw 10.0.0.0/8

# Run churn (populate history tables)
python3 inject.py churn --count 5 --interval 30

Database Queries

# Connect to database (shell)
docker exec -it obmp-psql psql -U openbmp -d openbmp

-- The statements below run inside the psql session, so comments use SQL syntax.

-- Count unique prefixes in RIB
SELECT count(DISTINCT prefix) FROM ip_rib WHERE isIPv4=true AND isWithdrawn=false;

-- Show recent route changes
SELECT prefix, origin_as, iswithdrawn, timestamp
FROM ip_rib_log
ORDER BY timestamp DESC LIMIT 20;

-- Show peer summary
SELECT name, state, timestamp_last_updated
FROM bgp_peers
ORDER BY state, name;

-- Show routes from ExaBGP peer
SELECT prefix, origin_as, as_path
FROM ip_rib
WHERE peer_hash_id IN (
  SELECT hash_id FROM bgp_peers WHERE peer_addr = '10.40.40.202'
)
AND isWithdrawn = false;

IOS-XR Verification (on router CLI)

show bgp neighbors 10.40.40.202
show bgp neighbors 10.40.40.202 received routes
show bgp summary
show bgp 8.8.8.0/24
show bgp 1.1.1.0/24
show route 8.8.8.0/24

12. Troubleshooting

ExaBGP container keeps restarting

Check logs:

docker compose -p obmp logs --tail=50 exabgp

Common causes and fixes:

| Symptom | Cause | Fix |
|---|---|---|
| Exits after "welcome" banner | Missing or wrong env file path | startup.sh generates /usr/local/etc/exabgp/exabgp.env; verify this path exists in the container |
| Process api killed 5 times | Wrong Python path in conf | Conf uses /usr/local/bin/python3, which is correct for python:3.11-slim |
| drop = true in env | ExaBGP drops privileges to nobody, can't bind 179 | startup.sh patches drop = false; check the sed lines ran |
| `__pycache__` Permission denied during build | Root-owned cache from previous container run | .dockerignore excludes `**/__pycache__`; confirm the file exists |

BGP sessions not establishing

  1. Verify host IP 10.40.40.202 is reachable from CML management network: ping 10.40.40.202 from router
  2. Check ExaBGP peer state: python3 exabgp/inject.py status
  3. On router: show bgp neighbors 10.40.40.202 — look for error codes
  4. Common IOS-XR errors:
    • no-update-source-config — add update-source MgmtEth0/RP0/CPU0/0
    • no-ipv6-address — ensure only IPv4 unicast AF is configured (no IPv6)
    • TCP refused — check port 179 is reachable (ExaBGP uses network_mode: host)

Routes received but not bestpath

IOS-XR BGP requires a specific route to resolve the BGP next-hop (10.40.40.202). The default route (0.0.0.0/0) is insufficient.

router static
 address-family ipv4 unicast
  10.40.40.0/24 10.100.0.254

Verify: show bgp 1.1.1.0/24 — should show Status: s (active), bestpath.

Grafana shows no data

  1. Check datasource: Configuration → Data Sources → OpenBMP → Test
  2. Verify psql-app is writing: docker compose -p obmp logs psql-app
  3. Check the database directly (see database queries above)
  4. History dashboards need route churn — run python3 inject.py churn

Kafka not starting

Zookeeper must be healthy first. Check:

docker compose -p obmp logs zookeeper
docker compose -p obmp restart kafka

psql-app fails to start

Usually a PostgreSQL connection issue or schema mismatch. Check:

docker compose -p obmp logs psql-app
# If "relation does not exist" errors: re-trigger DB init
# (this wipes and recreates the entire database, see Section 4.3)
touch /var/openbmp/config/init_db
docker compose -p obmp restart psql-app

13. Data Retention

Configured in docker-compose.yml via POSTGRES_DROP_* environment variables:

| Table | Default Retention |
|---|---|
| peer_event_log | 1 year |
| stat_reports | 4 weeks |
| ip_rib_log | 4 weeks |
| alerts | 4 weeks |
| ls_nodes_log | 4 months |
| ls_links_log | 4 months |
| ls_prefixes_log | 4 months |
| stats_chg_byprefix | 4 weeks |
| stats_chg_byasn | 4 weeks |
| stats_chg_bypeer | 4 weeks |
| stats_ip_origins | 4 weeks |
| stats_peer_rib | 4 weeks |
| stats_peer_update_counts | 4 weeks |

Adjust in docker-compose.yml under the psql-app service environment block.


14. Environment Variables Reference

ExaBGP container

| Variable | Default | Description |
|---|---|---|
| EXABGP_LOCAL_IP | 10.40.40.202 | Host IP ExaBGP binds to and uses as router-id |
| EXABGP_LOCAL_AS | 65100 | ExaBGP's AS number |
| EXABGP_PEER_AS | 65020 | AS of the IOS-XR lab |
| EXABGP_PEER_1 | 10.100.0.100 | First CORE router to peer with |
| EXABGP_PEER_2 | 10.100.0.200 | Second CORE router to peer with |
| EXABGP_API_PORT | 5050 | Flask API port |

psql-app container (key variables)

| Variable | Default | Description |
|---|---|---|
| MEM | 3 | JVM heap in GB |
| ENABLE_RPKI | 1 | Enable RPKI sync from Cloudflare |
| ENABLE_IRR | 1 | Enable IRR sync |
| ENABLE_DBIP | 1 | Enable DB-IP geolocation import |
| POSTGRES_REPORT_WINDOW | 8 minute | Aggregation window for summary tables |

inject.py (CLI)

| Variable | Default | Description |
|---|---|---|
| EXABGP_API | http://localhost:5050 | ExaBGP API base URL |

15. gNMI Streaming Telemetry (Phase 4)

Overview

gNMI (gRPC Network Management Interface) adds data-plane visibility alongside BMP's control-plane monitoring. Telegraf collects real-time interface counters from all 9 IOS-XR routers via gNMI subscriptions and stores them in InfluxDB. Grafana queries InfluxDB for telemetry dashboards.

Architecture

IOS-XR Routers (9x, gRPC port 57400)
         |
    gNMI subscriptions (10s sample)
         |
         v
  obmp-telegraf (telegraf:1.28 + gnmi input plugin)
    host networking → reaches routers on 10.100.0.x
         |
         v
  obmp-influxdb (influxdb:2.7, port 8086)
    bucket: "telemetry", org: "openbmp"
         |
         v
  obmp-grafana (InfluxDB datasource, Flux queries)
    3 dashboards in OBMP-Telemetry folder

Enabling gRPC on Routers

The routers need gRPC enabled before Telegraf can collect telemetry. A NETCONF script is provided:

# From the host (requires ncclient: pip install ncclient)
cd /home/user/obmp-docker/gnmi
python3 gnmi_grpc_config.py

This connects to all 9 routers via NETCONF (port 830, credentials webui/cisco) and pushes:

grpc
 port 57400
 no-tls

Verify on router:

show grpc status

Expected: gRPC listening on port 57400.

Telemetry Data Collected

Telegraf subscribes to two IOS-XR YANG paths at 10-second intervals:

| Subscription | YANG Path | Data |
|---|---|---|
| interface_counters | Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters | bytes/packets in/out, errors, drops, CRC |
| interface_rates | Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/data-rate | bits/sec in/out, packet rate |
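The generic counters are cumulative, so a rate has to be derived from two consecutive samples. The sketch below shows that arithmetic for the 10-second sample interval; the function name `rate_bps` is hypothetical, and the 64-bit counter width is an assumption rather than something stated in this document.

```python
COUNTER_MAX = 2**64  # assumption: byte counters are unsigned 64-bit values


def rate_bps(prev_bytes, curr_bytes, interval_s=10.0):
    """Average bits per second between two cumulative byte-counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Modulo arithmetic tolerates a single counter wrap between samples.
    delta = (curr_bytes - prev_bytes) % COUNTER_MAX
    return delta * 8 / interval_s


if __name__ == "__main__":
    # 1,250,000 bytes in 10 seconds is 1 Mbit/s.
    print(rate_bps(1_000_000, 2_250_000))
```

A counter reset (for example after a router reload) is indistinguishable from a wrap in this model and would show up as a single large spike.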

InfluxDB Access

  • URL: http://localhost:8086
  • Org: openbmp
  • Bucket: telemetry
  • Token: openbmp-telemetry-token
  • Retention: 30 days

Grafana Telemetry Dashboards

Three dashboards in the OBMP-Telemetry folder:

| Dashboard | UID | Description |
|---|---|---|
| Interface Utilization | obmp-telem-01 | Input/output bytes rate, packets rate, top interfaces by throughput |
| Interface Errors | obmp-telem-02 | CRC errors, input/output errors, drops, overruns |
| Combined BMP + Telemetry | obmp-telem-03 | Mixed datasource: BGP peer status (PostgreSQL) alongside interface counters (InfluxDB) |

All dashboards have $router and $interface template variables for filtering.

Troubleshooting gNMI

# Check Telegraf logs for gNMI connection status
docker logs obmp-telegraf --tail 50

# Verify InfluxDB has data
curl -s "http://localhost:8086/api/v2/query?org=openbmp" \
  -H "Authorization: Token openbmp-telemetry-token" \
  -H "Content-Type: application/vnd.flux" -H "Accept: application/csv" \
  --data 'from(bucket:"telemetry") |> range(start: -5m) |> limit(n:5)'

# Check InfluxDB health
curl http://localhost:8086/health
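For scripted checks, the data query can also be issued from Python. The sketch below builds the request against the InfluxDB 2.x `/api/v2/query` endpoint, which accepts a Flux script as the POST body with content type `application/vnd.flux`. The helper name `build_flux_request` is hypothetical; the bucket, org, and token defaults are the values listed under InfluxDB Access.

```python
import urllib.request


def build_flux_request(bucket="telemetry", org="openbmp",
                       token="openbmp-telemetry-token",
                       minutes=5, limit=5, base="http://localhost:8086"):
    """Build (but do not send) a Flux query for the most recent samples."""
    flux = f'from(bucket:"{bucket}") |> range(start: -{minutes}m) |> limit(n:{limit})'
    return urllib.request.Request(
        url=f"{base}/api/v2/query?org={org}",
        data=flux.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_flux_request()
    # Printing only; pass req to urllib.request.urlopen() to run the query.
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

An empty CSV response from a running InfluxDB usually means Telegraf has not written any points in the selected time range.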

16. Traffic Generator (Phase 4)

Overview

A portable, containerized traffic generator with a web UI for RFC 2544 testing and custom packet flows. Built with Scapy + Flask (backend) and Vue 3 + NGINX (frontend). The container supports dual-mode operation: sender (generate traffic) or responder (receive/echo packets).

Accessing the UI

  • Web UI: http://localhost:5002
  • API: http://localhost:5051

Dual-Mode Operation

Set via TRAFFIC_GEN_MODE environment variable in docker-compose.yml:

| Mode | Description |
|---|---|
| sender (default) | Generates traffic, runs RFC 2544 tests, sends custom flows |
| responder | Listens for incoming test packets, echoes/timestamps them, reports receive stats |

Typical deployment: One instance as sender on the host, optionally a second instance as responder on another endpoint. Without a responder, the sender uses ICMP echo for latency measurement (routers respond natively).

Creating Flows

Use the Flow Builder panel (left sidebar) in the UI:

| Field | Default | Description |
|---|---|---|
| Name | - | Human-readable flow name |
| Destination IP | 10.100.0.100 | Target router IP |
| Source IP | 10.40.40.202 | Host IP |
| Protocol | UDP | UDP, TCP, or ICMP |
| Source Port | 50000 | (UDP/TCP only) |
| Destination Port | 5001 | (UDP/TCP only) |
| Frame Size | 512 | Packet size in bytes |
| Rate (pps) | 1000 | Packets per second |
| Duration | 30 | Seconds (0 = infinite) |
| DSCP | 0 | Differentiated Services Code Point |

After creating a flow, use the Flows tab to Start/Stop/Delete flows.
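When choosing Frame Size and Rate (pps), it helps to know how much bandwidth a flow will offer and what the theoretical ceiling is. The sketch below does that arithmetic for Ethernet. It assumes Frame Size means the full Ethernet frame including FCS (as in RFC 2544), and adds the 20 bytes of preamble and inter-frame gap that each frame occupies on the wire. The function names and the 1 Gbit/s link speed in the example are illustrative assumptions.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def line_rate_pps(link_bps, frame_size):
    """Theoretical maximum frames per second for a given frame size."""
    return int(link_bps // ((frame_size + ETHERNET_OVERHEAD_BYTES) * 8))


def offered_load_bps(frame_size, pps):
    """Bits per second on the wire for a flow, including per-frame overhead."""
    return (frame_size + ETHERNET_OVERHEAD_BYTES) * 8 * pps


if __name__ == "__main__":
    # Default flow from the table above: 512-byte frames at 1000 pps.
    print(offered_load_bps(512, 1000))
    # Ceiling for 64-byte frames on an assumed 1 Gbit/s link.
    print(line_rate_pps(1_000_000_000, 64))
```

A Scapy-based sender will typically saturate the CPU well below these ceilings, so the numbers are an upper bound rather than a target.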

RFC 2544 Testing

Use the Tests tab to configure and run RFC 2544 tests:

| Test Type | Description |
|---|---|
| Throughput | Binary search for maximum zero-loss forwarding rate |
| Latency | Measure round-trip time at determined throughput rate |
| Frame Loss | Loss percentage vs. offered load curve |
| Back-to-Back | Maximum burst length at line rate with zero loss |

Parameters:

  • Base Flow: Select a previously created flow as the test template
  • Frame Sizes: Standard sizes: 64, 128, 256, 512, 1024, 1280, 1518 bytes
  • Trial Duration: Per-frame-size test duration (5-300 sec)
  • Max Rate (pps): Upper bound for binary search
  • Acceptable Loss %: Threshold for pass/fail
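The Throughput test is described above as a binary search bounded by Max Rate (pps) and judged against Acceptable Loss %. A minimal sketch of such a search follows. It is not taken from the traffic generator's source: `find_max_rate` and the caller-supplied `measure_loss(rate_pps)` callback (standing in for one trial that returns loss in percent) are hypothetical, and the search assumes loss does not decrease as the rate increases.

```python
def find_max_rate(measure_loss, max_rate_pps, acceptable_loss_pct=0.0, min_rate_pps=1):
    """Return the highest rate (pps) whose measured loss is acceptable, or 0 if none."""
    lo, hi = min_rate_pps, max_rate_pps
    best = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        if measure_loss(mid) <= acceptable_loss_pct:
            best = mid       # trial passed: try a higher rate
            lo = mid + 1
        else:
            hi = mid - 1     # trial failed: back off
    return best


if __name__ == "__main__":
    # Synthetic device under test that starts dropping above 3000 pps.
    def fake_trial(rate_pps):
        return 0.0 if rate_pps <= 3000 else 5.0

    print(find_max_rate(fake_trial, max_rate_pps=10_000))
```

Each iteration corresponds to one trial of Trial Duration seconds, so a search up to 10,000 pps needs at most about 14 trials per frame size.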

Quick Presets

Six built-in presets are available in the Tests tab:

| Preset | Description |
|---|---|
| quick_icmp | ICMP ping to CORE-01 at 10 pps |
| udp_flood_small | 64-byte UDP at 5000 pps |
| udp_flood_large | 1518-byte UDP at 1000 pps |
| rfc2544_throughput | Full throughput test with standard frame sizes |
| rfc2544_latency | Latency measurement with standard frame sizes |
| tcp_session | TCP flow at 500 pps |

API Reference

| Method | Path | Description |
|---|---|---|
| GET | /healthz | Health check + engine status |
| GET | /interfaces | Available network interfaces |
| GET | /mode | Current mode (sender/responder) |
| GET/POST | /flows | List / create flows |
| GET/PUT/DELETE | /flows/<id> | Get / update / delete flow |
| POST | /flows/<id>/start | Start sending |
| POST | /flows/<id>/stop | Stop sending |
| GET | /flows/<id>/stats | Real-time stats for a flow |
| GET/POST | /tests | List / create RFC 2544 tests |
| GET | /tests/<id> | Test details + results |
| POST | /tests/<id>/start | Start test execution |
| POST | /tests/<id>/stop | Abort test |
| GET | /tests/<id>/results | Exportable results |
| GET | /presets | Available test presets |
| POST | /presets/<name> | Create flow + test from preset |
| GET | /stats/history | Stats ring buffer (300 samples) |
| GET | /responder/stats | Responder-mode receive stats |
| POST | /responder/reset | Reset responder counters |

Integration with gNMI Telemetry

The key value of combining the traffic generator with gNMI: send traffic while watching real-time interface counters.

  1. Create a UDP flow targeting a router (e.g., R9K-01 at 10.100.0.1)
  2. Open the Grafana Interface Utilization dashboard, select that router
  3. Start the flow — gNMI counters show traffic appearing on the interface
  4. Run an RFC 2544 throughput test — Grafana shows the stepped traffic pattern from binary search iterations
  5. Compare Scapy-reported stats with gNMI-reported counters for cross-validation

The Combined BMP + Telemetry dashboard shows both control-plane (BMP BGP updates) and data-plane (gNMI interface counters) side by side, enabling correlation of BGP changes with traffic impact.

Environment Variables

| Variable | Default | Description |
|---|---|---|
| TRAFFIC_GEN_API_PORT | 5051 | Flask API listen port |
| TRAFFIC_GEN_MODE | sender | Operating mode: sender or responder |
| INFLUXDB_TOKEN | openbmp-telemetry-token | InfluxDB auth token (Telegraf) |