OpenBMP + ExaBGP Route Injector — Full Documentation
Table of Contents
- What Is This Project?
- Architecture
- Prerequisites
- Initial Setup (First Time)
- IOS-XR Router Configuration
- Starting and Stopping
- Route Injection User Guide
- ExaBGP Control Panel (Web UI)
- Grafana Dashboards
- Sanity Checks
- Relevant Commands Reference
- Troubleshooting
- Data Retention
- Environment Variables Reference
- gNMI Streaming Telemetry (Phase 4)
- Traffic Generator (Phase 4)
1. What Is This Project?
This is a BGP Monitoring Platform (BMP) lab stack deployed via Docker Compose. It collects, stores, and visualizes BGP routing data from a Cisco IOS-XR lab network (running in Cisco Modeling Labs / CML).
What it does:
- Receives BMP (BGP Monitoring Protocol, RFC 7854) telemetry from routers on TCP port 5000
- Streams BMP data through Kafka into a TimescaleDB/PostgreSQL database
- Provides 30 Grafana dashboards (17 operational + 6 learning + 4 advanced analytics + 3 streaming telemetry) for real-time and historical BGP analysis
- Includes an ExaBGP route injector that peers with the two CORE routers and injects synthetic BGP routes, enabling testing of BGP policy, route propagation, and Grafana dashboards without needing internet connectivity
- Provides a Vue 3 web UI at `:5001` for point-and-click scenario management, live route tables, and peer monitoring
The lab network:
- AS 65020 — 9 Cisco IOS-XR routers in CML (iBGP full mesh via route-reflectors)
- AS 65100 — ExaBGP container (eBGP peer to both CORE routers)
- CORE-01: `10.100.0.100` (CML-R9K-CORE-01)
- CORE-02: `10.100.0.200` (CML-R9K-CORE-02)
- Host IP: `10.40.40.202` (ExaBGP binds here; reachable from CML management network)
2. Architecture
IOS-XR Routers (9x, AS 65020)
BMP telemetry on TCP 5000
|
v
obmp-collector (openbmp/collector:2.2.3)
|
v
obmp-kafka (confluentinc/cp-kafka:7.1.1)
+ obmp-zookeeper (confluentinc/cp-zookeeper:7.1.1)
|
v
obmp-psql-app (openbmp/psql-app:2.2.2)
Java consumer — writes parsed BGP data to PostgreSQL
|
v
obmp-psql (openbmp/postgres:2.2.1)
PostgreSQL 14 + TimescaleDB
|
+---------> obmp-grafana (grafana/grafana:9.1.7) :3000
| 30 dashboards, PostgreSQL + InfluxDB datasources
+---------> obmp-whois (openbmp/whois:2.2.0) :4300
WHOIS query server backed by the DB
ExaBGP (obmp-exabgp, built locally)
python:3.11-slim + exabgp 5.x + Flask API
Peers eBGP to CORE-01 and CORE-02 (AS 65100 -> AS 65020)
HTTP API on :5050 — inject/withdraw routes on demand
Routes propagate via iBGP mesh to all 9 routers -> BMP -> DB -> Grafana
gNMI Streaming Telemetry (Phase 4):
IOS-XR Routers (gRPC :57400)
|
v
obmp-telegraf (telegraf:1.28 + gnmi plugin)
|
v
obmp-influxdb (influxdb:2.7) :8086
|
v
obmp-grafana (InfluxDB datasource -> Telemetry dashboards)
Traffic Generator (Phase 4):
obmp-traffic-gen (python:3.11 + Scapy + Flask) :5051
Dual-mode: sender (generate traffic) / responder (echo/log)
RFC 2544 testing, custom packet flows
obmp-traffic-gen-ui (Vue 3 + NGINX) :5002
Container Summary
| Container | Image | Port(s) | Role |
|---|---|---|---|
| obmp-zookeeper | confluentinc/cp-zookeeper:7.1.1 | 2181 (internal) | Kafka coordination |
| obmp-kafka | confluentinc/cp-kafka:7.1.1 | 9092 | Message broker |
| obmp-collector | openbmp/collector:2.2.3 | 5000 | BMP receiver |
| obmp-psql-app | openbmp/psql-app:2.2.2 | 9005 | Kafka→PostgreSQL consumer |
| obmp-psql | openbmp/postgres:2.2.1 | 5432 | TimescaleDB storage |
| obmp-grafana | grafana/grafana:9.1.7 | 3000 | Visualization |
| obmp-whois | openbmp/whois:2.2.0 | 4300 | WHOIS query server |
| obmp-exabgp | local build | 5050 (host net) | BGP route injector |
| obmp-exabgp-ui | local build | 5001 (host net) | Route injector web UI |
| obmp-influxdb | influxdb:2.7 | 8086 | Time-series DB for telemetry |
| obmp-telegraf | local build | - (host net) | gNMI telemetry collector |
| obmp-traffic-gen | local build | 5051 (host net) | Scapy traffic generator |
| obmp-traffic-gen-ui | local build | 5002 (host net) | Traffic generator web UI |
3. Prerequisites
- Docker Engine (20.10+) and Docker Compose v2
- Host IP `10.40.40.202` reachable from the CML management network
- CML routers with BMP configured pointing to `10.40.40.202:5000`
- CML CORE routers configured with ExaBGP as eBGP neighbor (see Section 5)
- `OBMP_DATA_ROOT` directory created (default: `/var/openbmp`)
4. Initial Setup (First Time)
4.1 Clone the repository
git clone <this-repo-url>
cd obmp-docker
4.2 Create persistent data directories
export OBMP_DATA_ROOT=/var/openbmp
sudo mkdir -p $OBMP_DATA_ROOT
# The root directory was created with sudo, so the subdirectories need it too
sudo mkdir -p ${OBMP_DATA_ROOT}/config
sudo mkdir -p ${OBMP_DATA_ROOT}/kafka-data
sudo mkdir -p ${OBMP_DATA_ROOT}/zk-data
sudo mkdir -p ${OBMP_DATA_ROOT}/zk-log
sudo mkdir -p ${OBMP_DATA_ROOT}/postgres/data
sudo mkdir -p ${OBMP_DATA_ROOT}/postgres/ts
sudo mkdir -p ${OBMP_DATA_ROOT}/grafana
sudo mkdir -p ${OBMP_DATA_ROOT}/grafana/dashboards
sudo chmod -R 777 $OBMP_DATA_ROOT
4.3 Initialise the database (first run only)
Create the init trigger file — this causes psql-app to create all tables on startup:
touch ${OBMP_DATA_ROOT}/config/init_db
Warning: Do not create this file on subsequent runs unless you want to wipe and recreate the entire database.
4.4 Copy Grafana provisioning files
cp -r obmp-grafana/provisioning ${OBMP_DATA_ROOT}/grafana/
cp -r obmp-grafana/dashboards ${OBMP_DATA_ROOT}/grafana/
4.5 Start the stack
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d
Wait ~2 minutes for all services to initialise (especially PostgreSQL and psql-app which run schema migrations).
4.6 Verify everything is running
docker compose -p obmp ps
docker compose -p obmp logs --tail=20 psql-app
5. IOS-XR Router Configuration
The ExaBGP container peers eBGP with both CORE routers. Each CORE router must be configured with:
5.1 Route policies (apply once per router)
route-policy EXABGP_IN
pass
end-policy
route-policy EXABGP_OUT
drop
end-policy
5.2 BGP neighbor block
router bgp 65020
neighbor 10.40.40.202
remote-as 65100
description ExaBGP-Route-Injector
ebgp-multihop 5
update-source MgmtEth0/RP0/CPU0/0
!
address-family ipv4 unicast
route-policy EXABGP_IN in
route-policy EXABGP_OUT out
next-hop-self
!
!
!
5.3 Static route for next-hop resolution
IOS-XR BGP does not use the default route (0.0.0.0/0) to resolve BGP next-hops. A more-specific static route for the ExaBGP host subnet is required in the default VRF:
router static
address-family ipv4 unicast
10.40.40.0/24 10.100.0.254
!
!
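The resolution rule can be illustrated with a small check. This is a minimal sketch that models the rule as stated above (a next-hop counts as resolvable only if a route more specific than 0.0.0.0/0 covers it); it is not IOS-XR's actual implementation, and `resolvable_without_default` is a hypothetical helper name.

```python
import ipaddress

def resolvable_without_default(next_hop: str, routes: list[str]) -> bool:
    """True if a route other than the default route covers the next-hop."""
    hop = ipaddress.ip_address(next_hop)
    for route in routes:
        net = ipaddress.ip_network(route)
        # prefixlen 0 is the default route, which this rule ignores
        if net.prefixlen > 0 and hop in net:
            return True
    return False

# Before the static route: only a default route exists
print(resolvable_without_default("10.40.40.202", ["0.0.0.0/0"]))
# After adding the static route from this section
print(resolvable_without_default("10.40.40.202", ["0.0.0.0/0", "10.40.40.0/24"]))
```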
5.4 Config notes
| Knob | Why |
|---|---|
| `remote-as 65100` | ExaBGP presents as AS 65100 (eBGP to your AS 65020 mesh) |
| `ebgp-multihop 5` | Host and router are on different subnets |
| `update-source MgmtEth0/RP0/CPU0/0` | ExaBGP is reachable via the management interface |
| `next-hop-self` | Replace ExaBGP's next-hop (10.40.40.202) with the CORE router's address when reflecting into iBGP, so that all routers can resolve the next-hop |
| `EXABGP_OUT` drops | Prevents the lab from advertising its own prefixes back to ExaBGP |
| Static route | Required: IOS-XR BGP will not install injected routes as bestpaths without a specific route to the next-hop |
5.5 NETCONF alternative
See exabgp/iosxr_bgp_config.md for a Python/ncclient script that pushes all of the above config programmatically.
Credentials: username=webui, password=cisco, port 830.
6. Starting and Stopping
Start all services
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d
Stop all services (preserve data)
docker compose -p obmp down
Stop and remove all data (full reset)
docker compose -p obmp down -v
sudo rm -rf /var/openbmp
Rebuild the ExaBGP container (after code changes)
docker compose -p obmp build exabgp
docker compose -p obmp up -d exabgp
Restart a single service
docker compose -p obmp restart <service>
# e.g.:
docker compose -p obmp restart exabgp
docker compose -p obmp restart psql-app
7. Route Injection User Guide
The ExaBGP container exposes a Flask REST API on port 5050 (host network). The inject.py CLI wraps this API.
7.1 Setup
cd exabgp
pip install requests # only needed if running inject.py from the host
7.2 Check status
python3 inject.py status
Output shows API health, active route count, and peer states:
{
"status": "ok",
"active_routes": 77,
"peers": {
"10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
"10.100.0.200": {"state": "up", "updated": "2026-03-05T10:00:00Z"}
}
}
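A script can consume the same payload, for example as a pre-flight check before loading a scenario. The sketch below assumes the JSON shape shown above (the `peers` and `state` keys are taken from that sample); `peers_down` is an illustrative helper, not part of inject.py.

```python
import json

# Sample payload copied from the status output above.
SAMPLE_STATUS = """
{
  "status": "ok",
  "active_routes": 77,
  "peers": {
    "10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
    "10.100.0.200": {"state": "up", "updated": "2026-03-05T10:00:00Z"}
  }
}
"""

def peers_down(status: dict) -> list[str]:
    """Return the addresses of peers whose state is not 'up'."""
    return sorted(
        addr for addr, info in status.get("peers", {}).items()
        if info.get("state") != "up"
    )

status = json.loads(SAMPLE_STATUS)
down = peers_down(status)
print("all peers up" if not down else f"peers down: {', '.join(down)}")
```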
7.3 List available scenarios
python3 inject.py scenarios
| Scenario | Routes | Description |
|---|---|---|
| `internet_sample` | ~94 | Partial internet table: real public prefixes, realistic AS paths (Cloudflare, Google, AWS, Azure, etc.) |
| `churn` | 30 | RFC documentation prefixes for announce/withdraw churn testing |
| `blackhole` | 5 | /32 prefixes with RTBH community (65100:666 + 65535:666) |
| `anycast` | 3 | Same prefixes with varying AS paths and MEDs (best-path testing) |
| `full_table` | 500+ | Large partial internet table with synthetic /24s |
| `lab_prefixes` | 8 | Enterprise/SP-style routes with communities and local-pref |
| `convergence_test` | 10 | Prefixes for timing BGP convergence: announce, then check `ip_rib_log` timestamps |
| `route_leak` | 10 | Real prefixes re-announced with short AS paths; simulates a route leak (community 65100:999) |
| `hijack_simulation` | 10 | Prefixes claimed directly by AS 65100; simulates a prefix hijack (community 65100:hijack) |
| `te_community_steering` | 15 | Routes tagged with TE communities for color-based steering (65020:100=red, 65020:200=blue, 65020:300=green) |
| `origin_shift` | 5 | Prefixes with changed origin AS; simulates origin migration for anomaly detection |
| `path_diversity` | 10 | Same prefixes with different AS paths/MEDs; demonstrates best-path selection |
7.4 Load a scenario
python3 inject.py scenario internet_sample
Routes propagate: ExaBGP → CORE-01/CORE-02 (eBGP) → all 9 routers (iBGP) → BMP → Kafka → PostgreSQL → Grafana.
7.5 Withdraw a scenario
python3 inject.py withdraw-scenario internet_sample
7.6 Announce individual prefixes
python3 inject.py announce 10.0.0.0/8 \
--as-path 65100 3356 15169 \
--community 65100:100 \
--med 100
7.7 Withdraw individual prefixes
python3 inject.py withdraw 10.0.0.0/8
7.8 Withdraw everything
python3 inject.py withdraw-all
7.9 Generate route churn (populate history tables)
The churn command cycles the churn scenario repeatedly, generating ip_rib_log and stats_chg_* entries that power Grafana's history dashboards.
# 5 cycles, 30 seconds apart
python3 inject.py churn --count 5 --interval 30
# Run indefinitely until Ctrl+C
python3 inject.py churn
7.10 REST API directly (curl)
BASE=http://localhost:5050
# Health
curl $BASE/healthz
# List scenarios
curl $BASE/scenarios
# Load scenario
curl -X POST $BASE/scenario/internet_sample
# Announce custom prefix
curl -X POST $BASE/announce \
-H 'Content-Type: application/json' \
-d '{"prefixes":["10.0.0.0/8"],"as_path":[65100,3356,15169],"communities":["65100:100"]}'
# Withdraw all
curl -X POST $BASE/withdraw/all
# Peer state
curl $BASE/peers
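The same announce call can be made from Python with only the standard library. This is a sketch that mirrors the curl example above; the payload keys are copied from it, `build_announce_request` is a hypothetical helper, and the request is built but not sent so the snippet runs without a live stack.

```python
import json
import os
import urllib.request

# Same default as the EXABGP_API variable used by inject.py.
BASE = os.environ.get("EXABGP_API", "http://localhost:5050")

def build_announce_request(prefixes, as_path, communities, base=BASE):
    """Build (but do not send) the POST /announce request shown above."""
    body = json.dumps({
        "prefixes": prefixes,
        "as_path": as_path,
        "communities": communities,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base}/announce",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_announce_request(["10.0.0.0/8"], [65100, 3356, 15169], ["65100:100"])
print(req.get_method(), req.full_url)
# To send it against a running stack:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       print(resp.status, resp.read().decode())
```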
7.11 Adding custom scenarios
Edit exabgp/scenarios/__init__.py. Add an entry to SCENARIOS following the existing pattern:
SCENARIOS['my_scenario'] = {
'description': 'My custom routes',
'routes': [
_r('192.0.2.0/24', [65100, 65200], communities=['65100:100']),
],
}
The scenarios/ directory is volume-mounted into the container, so edits do not require an image rebuild. The Python module is imported once at container start, however, so restart the container after editing:
docker compose -p obmp restart exabgp
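Because the module is only re-read on restart, a mistyped prefix is easy to miss until the container comes back up. A minimal pre-check using only the standard library might look like the following; `check_prefixes` is an illustrative helper and not part of the repository.

```python
import ipaddress

def check_prefixes(prefixes):
    """Return (prefix, error) pairs for entries that are not valid CIDR networks."""
    problems = []
    for prefix in prefixes:
        try:
            # strict=True rejects prefixes with host bits set, e.g. 192.0.2.1/24
            ipaddress.ip_network(prefix, strict=True)
        except ValueError as exc:
            problems.append((prefix, str(exc)))
    return problems

candidates = ["192.0.2.0/24", "192.0.2.1/24", "198.51.100.0/33"]
for prefix, error in check_prefixes(candidates):
    print(f"{prefix}: {error}")
```

Running it over the prefixes of a new scenario before restarting catches both host-bit mistakes and out-of-range mask lengths.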
8. ExaBGP Control Panel (Web UI)
Access: http://10.40.40.202:5001
A Vue 3 single-page app served by NGINX that proxies /api/ to the ExaBGP Flask API on port 5050. No login required.
Layout
┌─────────────────────────────────────────────────────────────┐
│ OpenBMP Route Injector [API OK] [77 routes] [2/2 UP] │
├──────────────────────┬──────────────────────────────────────┤
│ SCENARIOS │ [Routes] [Inject] [Peers] tabs │
│ │ │
│ [internet_sample] │ Routes tab: searchable/paginated │
│ [LOAD] [UNLOAD] │ table with per-row Withdraw button │
│ │ │
│ [churn] │ Inject tab: manual prefix form │
│ [LOAD] [START CHURN]│ (prefix, AS path, communities, MED) │
│ │ │
│ [blackhole] ... │ Peers tab: per-peer UP/DOWN cards │
├──────────────────────┴──────────────────────────────────────┤
│ Refreshing every 5s │
└─────────────────────────────────────────────────────────────┘
Features
- Live status bar — API health, active route count, peer UP/DOWN badges; auto-refreshes every 5 seconds
- Scenario panel: Load/Unload buttons for each scenario, with loading states and feedback
- Churn control — Start/stop churn cycles with configurable count and interval sliders directly in the browser
- Route table — Searchable, paginated (20/page) table of active routes; per-row Withdraw button; Withdraw All
- Manual inject form — Announce any prefix with custom AS path, communities, MED, local-pref
- Peer cards — Per-peer state display with UP (green) / DOWN (red pulsing) indicators
Rebuild after code changes
docker compose -p obmp build exabgp-ui
docker compose -p obmp up -d exabgp-ui
9. Grafana Dashboards
Access: http://10.40.40.202:3000
Default credentials: admin / openbmp (anonymous access also enabled)
Dashboard Categories
| Category | Dashboard | Description |
|---|---|---|
| General | OBMP Home | Overview / landing page |
| Base | Inventory | Router and peer inventory |
| Base | Looking Glass | Real-time RIB lookup by prefix |
| Base | ASN View | ASN-level routing view |
| History | Prefix History | Route change history for a prefix |
| History | Prefix History by ASN | Filtered by origin AS |
| History | Prefix History by Community | Filtered by BGP community |
| Tops | Top Prefixes | Most-updated prefixes |
| Tops | Top L3VPN Prefixes | L3VPN equivalent |
| Link State | LS Nodes | IS-IS link-state node database |
| Link State | LS Links | IS-IS link-state link database |
| Link State | LS Topology | Network topology map |
| Link State | LS Prefixes | Link-state prefix database |
| Link State | LS History | Link-state change history |
| L3VPN | L3VPN Looking Glass | VPN RIB lookup |
| L3VPN | L3VPN Prefix History | VPN route change history |
| L3VPN | L3VPN RIB Browser | Full VPN RIB browser |
History dashboards require `ip_rib_log` and `stats_chg_*` table data. Run `inject.py churn` to populate these.
OBMP-Learning Dashboards (folder: OBMP-Learning)
Six learning-focused dashboards in a separate folder, designed to teach BGP concepts using live lab data.
| Dashboard | UID | What it teaches |
|---|---|---|
| BGP Update Rate & Churn | `obmp-learn-01` | Network stability: advertisements vs withdrawals over time from `ip_rib_log`; per-peer update counts |
| Peer Session Health & Flap Analysis | `obmp-learn-02` | BGP session stability: state timeline, flap count, uptime %, last reset reason |
| AS Path Analysis | `obmp-learn-03` | Internet topology: path length distribution, longest paths, top origin ASNs, transit frequency |
| RPKI Validation Status | `obmp-learn-04` | BGP security: Valid / Invalid / NotFound breakdown; invalid routes (potential hijacks) table |
| Route Churn & Stability Score | `obmp-learn-05` | Prefix stability: tiered churn score (Very Stable / Stable / Moderate / Unstable) per prefix |
| BGP Attribute Explorer | `obmp-learn-06` | BGP path attributes: community list distribution, MED values, local-pref spread per peer |
RPKI note: The `rpki_validator` table is populated by a cron job in `psql-app` every 2 hours. Dashboard `obmp-learn-04` will show zero counts until the cron runs; check `ENABLE_RPKI=1` in `docker-compose.yml`.
Advanced Analytics Dashboards (folder: OBMP-Learning)
Four advanced dashboards that go beyond basic BMP monitoring, unlocking TE/SR data and providing heuristic analysis.
| Dashboard | UID | What it provides |
|---|---|---|
| Database Schema Map | `obmp-learn-07` | Interactive schema reference: live table row counts, entity relationships, column details for all 33 tables and 11 views |
| TE & Segment Routing Analytics | `obmp-learn-08` | Exposes TE/SR fields from BGP-LS: link bandwidth, admin groups, SRLG, SR SIDs, adjacency SIDs, protection types |
| Topology Change & Anomaly Detection | `obmp-learn-09` | Heuristic analysis: link state changes over time, origin AS hijack detection, convergence timeline, route consistency |
| Link Utilization & TE Thought Experiment | `obmp-learn-10` | BGP-LS capacity data (bandwidth, TE metrics) + integration guide for streaming telemetry (gNMI/MDT) |
TE/SR data note: Some TE fields (admin_group, max_link_bw, srlg, sr_adjacency_sids) may be NULL if routers don't advertise those TLVs. Enable `mpls traffic-eng` under IS-IS and `segment-routing mpls` for full data.
Database Schema Reference
A standalone database schema reference is also available at DB_SCHEMA.md in the repo root. It documents all 33 tables, 11 views, TE/SR columns, enum types, and common query patterns.
10. Sanity Checks
10.1 All containers running
docker compose -p obmp ps
All containers should show running. If any are restarting, check logs:
docker compose -p obmp logs --tail=50 <service>
10.2 ExaBGP peers up
python3 exabgp/inject.py status
Both 10.100.0.100 and 10.100.0.200 should show "state": "up".
Or check from the router side:
show bgp neighbors 10.40.40.202
show bgp summary | inc 10.40.40.202
10.3 Routes accepted by CORE routers
After loading internet_sample:
# On CORE-01 or CORE-02:
show bgp summary
# Expect: 77 accepted prefixes, 77 are bestpaths from 10.40.40.202
show bgp 8.8.8.0/24
# Expect: best path via 10.40.40.202 (eBGP), also iBGP copies from other routers
10.4 Routes in OpenBMP database
docker exec -it obmp-psql psql -U openbmp -c "
SELECT count(DISTINCT prefix) AS unique_prefixes,
count(DISTINCT peer_hash_id) AS peers_reporting
FROM ip_rib
WHERE isIPv4 = true AND isWithdrawn = false;
"
Expect ~129 unique prefixes and 56 peers_reporting (9 routers × ~6 peers each) after loading internet_sample.
10.5 Kafka is healthy
docker exec -it obmp-kafka kafka-topics --bootstrap-server localhost:29092 --list
Should show topics like openbmp.parsed.unicast_prefix, openbmp.parsed.peer, etc.
10.6 Grafana datasource
Open http://10.40.40.202:3000 → Configuration → Data Sources → OpenBMP → Test.
Should return "Database Connection OK".
10.7 BMP collector receiving data
docker compose -p obmp logs --tail=30 collector
Should show connections from router management IPs.
10.8 psql-app consumer is caught up
docker compose -p obmp logs --tail=30 psql-app
Should show periodic cron job outputs (RPKI sync, IRR sync, global_ip_rib updates).
11. Relevant Commands Reference
Docker Compose
# Start stack
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d
# Stop stack
docker compose -p obmp down
# Show status
docker compose -p obmp ps
# Follow logs (all services)
docker compose -p obmp logs -f
# Follow logs (specific service)
docker compose -p obmp logs -f exabgp
docker compose -p obmp logs -f psql-app
docker compose -p obmp logs -f collector
# Rebuild and restart ExaBGP
docker compose -p obmp build exabgp && docker compose -p obmp up -d exabgp
# Restart a service
docker compose -p obmp restart psql-app
Route Injection (from exabgp/ directory)
# API health and peer states
python3 inject.py status
# List active routes
python3 inject.py routes
# List scenarios
python3 inject.py scenarios
# Load a scenario
python3 inject.py scenario internet_sample
python3 inject.py scenario churn
python3 inject.py scenario blackhole
python3 inject.py scenario full_table
python3 inject.py scenario lab_prefixes
# Withdraw a scenario
python3 inject.py withdraw-scenario internet_sample
# Withdraw all active routes
python3 inject.py withdraw-all
# Announce a specific prefix
python3 inject.py announce 10.0.0.0/8 --as-path 65100 3356 15169 --community 65100:100
# Withdraw a specific prefix
python3 inject.py withdraw 10.0.0.0/8
# Run churn (populate history tables)
python3 inject.py churn --count 5 --interval 30
Database Queries
# Connect to database (shell); the statements below are run at the psql prompt
docker exec -it obmp-psql psql -U openbmp -d openbmp
-- Count unique prefixes in RIB
SELECT count(DISTINCT prefix) FROM ip_rib WHERE isIPv4=true AND isWithdrawn=false;
-- Show recent route changes
SELECT prefix, origin_as, iswithdrawn, timestamp
FROM ip_rib_log
ORDER BY timestamp DESC LIMIT 20;
-- Show peer summary
SELECT name, state, timestamp_last_updated
FROM bgp_peers
ORDER BY state, name;
-- Show routes from ExaBGP peer
SELECT prefix, origin_as, as_path
FROM ip_rib
WHERE peer_hash_id IN (
SELECT hash_id FROM bgp_peers WHERE peer_addr = '10.40.40.202'
)
AND isWithdrawn = false;
IOS-XR Verification (on router CLI)
show bgp neighbors 10.40.40.202
show bgp neighbors 10.40.40.202 received routes
show bgp summary
show bgp 8.8.8.0/24
show bgp 1.1.1.0/24
show route 8.8.8.0/24
12. Troubleshooting
ExaBGP container keeps restarting
Check logs:
docker compose -p obmp logs --tail=50 exabgp
Common causes and fixes:
| Symptom | Cause | Fix |
|---|---|---|
| Exits after "welcome" banner | Missing or wrong env file path | startup.sh generates /usr/local/etc/exabgp/exabgp.env — verify this path exists in container |
| `Process api killed 5 times` | Wrong Python path in conf | Conf uses `/usr/local/bin/python3`, which is correct for `python:3.11-slim` |
| `drop = true` in env | ExaBGP drops privileges to `nobody`, can't bind 179 | `startup.sh` patches `drop = false`; check the sed lines ran |
| `__pycache__` Permission denied during build | Root-owned cache from previous container run | `.dockerignore` excludes `**/__pycache__`; confirm file exists |
BGP sessions not establishing
- Verify host IP `10.40.40.202` is reachable from CML management network: `ping 10.40.40.202` from router
- Check ExaBGP peer state: `python3 exabgp/inject.py status`
- On router: `show bgp neighbors 10.40.40.202` and look for error codes
- Common IOS-XR errors:
  - `no-update-source-config`: add `update-source MgmtEth0/RP0/CPU0/0`
  - `no-ipv6-address`: ensure only IPv4 unicast AF is configured (no IPv6)
  - TCP refused: check port 179 is reachable (ExaBGP uses `network_mode: host`)
Routes received but not bestpath
IOS-XR BGP requires a specific route to resolve the BGP next-hop (10.40.40.202). The default route (0.0.0.0/0) is insufficient.
router static
address-family ipv4 unicast
10.40.40.0/24 10.100.0.254
Verify: show bgp 1.1.1.0/24 — should show Status: s (active), bestpath.
Grafana shows no data
- Check datasource: Configuration → Data Sources → OpenBMP → Test
- Verify psql-app is writing: `docker compose -p obmp logs psql-app`
- Check the database directly (see database queries above)
- History dashboards need route churn; run `python3 inject.py churn`
Kafka not starting
Zookeeper must be healthy first. Check:
docker compose -p obmp logs zookeeper
docker compose -p obmp restart kafka
psql-app fails to start
Usually a PostgreSQL connection issue or schema mismatch. Check:
docker compose -p obmp logs psql-app
# If "relation does not exist" errors: re-trigger DB init (this wipes and recreates the database, see Section 4.3)
touch /var/openbmp/config/init_db
docker compose -p obmp restart psql-app
13. Data Retention
Configured in docker-compose.yml via POSTGRES_DROP_* environment variables:
| Table | Default Retention |
|---|---|
| peer_event_log | 1 year |
| stat_reports | 4 weeks |
| ip_rib_log | 4 weeks |
| alerts | 4 weeks |
| ls_nodes_log | 4 months |
| ls_links_log | 4 months |
| ls_prefixes_log | 4 months |
| stats_chg_byprefix | 4 weeks |
| stats_chg_byasn | 4 weeks |
| stats_chg_bypeer | 4 weeks |
| stats_ip_origins | 4 weeks |
| stats_peer_rib | 4 weeks |
| stats_peer_update_counts | 4 weeks |
Adjust in docker-compose.yml under the psql-app service environment block.
14. Environment Variables Reference
ExaBGP container
| Variable | Default | Description |
|---|---|---|
| `EXABGP_LOCAL_IP` | `10.40.40.202` | Host IP ExaBGP binds to and uses as router-id |
| `EXABGP_LOCAL_AS` | `65100` | ExaBGP's AS number |
| `EXABGP_PEER_AS` | `65020` | AS of the IOS-XR lab |
| `EXABGP_PEER_1` | `10.100.0.100` | First CORE router to peer with |
| `EXABGP_PEER_2` | `10.100.0.200` | Second CORE router to peer with |
| `EXABGP_API_PORT` | `5050` | Flask API port |
psql-app container (key variables)
| Variable | Default | Description |
|---|---|---|
| `MEM` | `3` | JVM heap in GB |
| `ENABLE_RPKI` | `1` | Enable RPKI sync from Cloudflare |
| `ENABLE_IRR` | `1` | Enable IRR sync |
| `ENABLE_DBIP` | `1` | Enable DB-IP geolocation import |
| `POSTGRES_REPORT_WINDOW` | `8 minute` | Aggregation window for summary tables |
inject.py (CLI)
| Variable | Default | Description |
|---|---|---|
| `EXABGP_API` | `http://localhost:5050` | ExaBGP API base URL |
15. gNMI Streaming Telemetry (Phase 4)
Overview
gNMI (gRPC Network Management Interface) adds data-plane visibility alongside BMP's control-plane monitoring. Telegraf collects real-time interface counters from all 9 IOS-XR routers via gNMI subscriptions and stores them in InfluxDB. Grafana queries InfluxDB for telemetry dashboards.
Architecture
IOS-XR Routers (9x, gRPC port 57400)
|
gNMI subscriptions (10s sample)
|
v
obmp-telegraf (telegraf:1.28 + gnmi input plugin)
host networking → reaches routers on 10.100.0.x
|
v
obmp-influxdb (influxdb:2.7, port 8086)
bucket: "telemetry", org: "openbmp"
|
v
obmp-grafana (InfluxDB datasource, Flux queries)
3 dashboards in OBMP-Telemetry folder
Enabling gRPC on Routers
The routers need gRPC enabled before Telegraf can collect telemetry. A NETCONF script is provided:
# From the host (requires ncclient: pip install ncclient)
cd /home/user/obmp-docker/gnmi
python3 gnmi_grpc_config.py
This connects to all 9 routers via NETCONF (port 830, credentials webui/cisco) and pushes:
grpc
port 57400
no-tls
Verify on router:
show grpc status
Expected: gRPC listening on port 57400.
Telemetry Data Collected
Telegraf subscribes to two IOS-XR YANG paths at 10-second intervals:
| Subscription | YANG Path | Data |
|---|---|---|
| interface_counters | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters` | bytes/packets in/out, errors, drops, CRC |
| interface_rates | `Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/data-rate` | bits/sec in/out, packet rate |
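Assuming the generic counters are cumulative totals, a throughput figure comes from the difference between two consecutive samples divided by the sample interval. The sketch below shows that arithmetic for the 10-second interval used here; the function names and the example counter values are illustrative and do not come from the dashboards' actual Flux queries.

```python
def rate_bps(bytes_prev: int, bytes_now: int, interval_s: float) -> float:
    """Average bit rate between two samples of a cumulative byte counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = bytes_now - bytes_prev
    if delta < 0:
        # Counter reset or wrap; this sketch reports zero for the interval.
        return 0.0
    return delta * 8 / interval_s

def utilization_pct(bps: float, link_speed_bps: float) -> float:
    """Bit rate as a percentage of link speed."""
    return 100.0 * bps / link_speed_bps

# Made-up example: 12,500,000 bytes received during one 10 s interval on a 1 Gbit/s link.
bps = rate_bps(1_000_000_000, 1_012_500_000, 10)
print(f"{bps / 1e6:.1f} Mbit/s, {utilization_pct(bps, 1e9):.1f}% of 1 Gbit/s")
```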
InfluxDB Access
- URL: `http://localhost:8086`
- Org: `openbmp`
- Bucket: `telemetry`
- Token: `openbmp-telemetry-token`
- Retention: 30 days
Grafana Telemetry Dashboards
Three dashboards in the OBMP-Telemetry folder:
| Dashboard | UID | Description |
|---|---|---|
| Interface Utilization | obmp-telem-01 | Input/output bytes rate, packets rate, top interfaces by throughput |
| Interface Errors | obmp-telem-02 | CRC errors, input/output errors, drops, overruns |
| Combined BMP + Telemetry | obmp-telem-03 | Mixed datasource — BGP peer status (PostgreSQL) alongside interface counters (InfluxDB) |
All dashboards have $router and $interface template variables for filtering.
Troubleshooting gNMI
# Check Telegraf logs for gNMI connection status
docker logs obmp-telegraf --tail 50
# Verify InfluxDB has data
curl -s -X POST "http://localhost:8086/api/v2/query?org=openbmp" \
-H "Authorization: Token openbmp-telemetry-token" \
-H "Content-Type: application/vnd.flux" \
-H "Accept: application/csv" \
--data 'from(bucket:"telemetry") |> range(start: -5m) |> limit(n:5)'
# Check InfluxDB health
curl http://localhost:8086/health
16. Traffic Generator (Phase 4)
Overview
A portable, containerized traffic generator with a web UI for RFC 2544 testing and custom packet flows. Built with Scapy + Flask (backend) and Vue 3 + NGINX (frontend). The container supports dual-mode operation: sender (generate traffic) or responder (receive/echo packets).
Accessing the UI
- Web UI: `http://localhost:5002`
- API: `http://localhost:5051`
Dual-Mode Operation
Set via TRAFFIC_GEN_MODE environment variable in docker-compose.yml:
| Mode | Description |
|---|---|
| `sender` (default) | Generates traffic, runs RFC 2544 tests, sends custom flows |
| `responder` | Listens for incoming test packets, echoes/timestamps them, reports receive stats |
Typical deployment: One instance as sender on the host, optionally a second instance as responder on another endpoint. Without a responder, the sender uses ICMP echo for latency measurement (routers respond natively).
Creating Flows
Use the Flow Builder panel (left sidebar) in the UI:
| Field | Default | Description |
|---|---|---|
| Name | - | Human-readable flow name |
| Destination IP | `10.100.0.100` | Target router IP |
| Source IP | `10.40.40.202` | Host IP |
| Protocol | UDP | UDP, TCP, or ICMP |
| Source Port | 50000 | (UDP/TCP only) |
| Destination Port | 5001 | (UDP/TCP only) |
| Frame Size | 512 | Packet size in bytes |
| Rate (pps) | 1000 | Packets per second |
| Duration | 30 | Seconds (0 = infinite) |
| DSCP | 0 | Differentiated Services Code Point |
After creating a flow, use the Flows tab to Start/Stop/Delete flows.
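Frame size and packet rate together determine the offered load, which is useful to estimate before starting a flow. The following sketch works through the arithmetic for the Flow Builder defaults; it counts only the configured frame size (no preamble or inter-frame gap), and both helper names are illustrative.

```python
def offered_load_bps(frame_size_bytes: int, rate_pps: int) -> int:
    """Offered load in bits per second, counting only the configured frame size."""
    return frame_size_bytes * 8 * rate_pps

def total_packets(rate_pps: int, duration_s: int):
    """Packets sent over the flow's lifetime; None when duration is 0 (infinite)."""
    return None if duration_s == 0 else rate_pps * duration_s

# Flow Builder defaults: 512-byte frames, 1000 pps, 30 s
print(offered_load_bps(512, 1000))  # 4096000 bits/s, about 4.1 Mbit/s
print(total_packets(1000, 30))      # 30000
```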
RFC 2544 Testing
Use the Tests tab to configure and run RFC 2544 tests:
| Test Type | Description |
|---|---|
| Throughput | Binary search for maximum zero-loss forwarding rate |
| Latency | Measure round-trip time at determined throughput rate |
| Frame Loss | Loss percentage vs. offered load curve |
| Back-to-Back | Maximum burst length at line rate with zero loss |
Parameters:
- Base Flow: Select a previously created flow as the test template
- Frame Sizes: Standard sizes: 64, 128, 256, 512, 1024, 1280, 1518 bytes
- Trial Duration: Per-frame-size test duration (5–300 sec)
- Max Rate (pps): Upper bound for binary search
- Acceptable Loss %: Threshold for pass/fail
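The throughput test's binary search can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the container's actual code: `find_throughput`, the `resolution_pps` stopping criterion, and the simulated `fake_trial` device (with a made-up 3200 pps capacity) are all hypothetical.

```python
from typing import Callable

def find_throughput(
    send_trial: Callable[[int], float],
    max_rate_pps: int,
    acceptable_loss_pct: float = 0.0,
    resolution_pps: int = 10,
) -> int:
    """Binary-search the highest rate whose measured loss stays within the threshold.

    send_trial(rate_pps) must run one trial and return the loss percentage.
    """
    low, high = 0, max_rate_pps
    best = 0
    while high - low > resolution_pps:
        rate = (low + high) // 2
        if send_trial(rate) <= acceptable_loss_pct:
            best, low = rate, rate   # passed: search higher
        else:
            high = rate              # failed: search lower
    return best

# Simulated device under test that starts dropping above 3200 pps (made-up figure).
def fake_trial(rate_pps: int) -> float:
    capacity = 3200
    return 0.0 if rate_pps <= capacity else 100.0 * (rate_pps - capacity) / rate_pps

print(find_throughput(fake_trial, max_rate_pps=5000))
```

Each iteration halves the search interval, so the number of trials per frame size grows only logarithmically with Max Rate; total test time is roughly that count multiplied by Trial Duration.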
Quick Presets
Six built-in presets are available in the Tests tab:
| Preset | Description |
|---|---|
| quick_icmp | ICMP ping to CORE-01 at 10 pps |
| udp_flood_small | 64-byte UDP at 5000 pps |
| udp_flood_large | 1518-byte UDP at 1000 pps |
| rfc2544_throughput | Full throughput test with standard frame sizes |
| rfc2544_latency | Latency measurement with standard frame sizes |
| tcp_session | TCP flow at 500 pps |
API Reference
| Method | Path | Description |
|---|---|---|
| GET | `/healthz` | Health check + engine status |
| GET | `/interfaces` | Available network interfaces |
| GET | `/mode` | Current mode (sender/responder) |
| GET/POST | `/flows` | List / create flows |
| GET/PUT/DELETE | `/flows/<id>` | Get / update / delete flow |
| POST | `/flows/<id>/start` | Start sending |
| POST | `/flows/<id>/stop` | Stop sending |
| GET | `/flows/<id>/stats` | Real-time stats for a flow |
| GET/POST | `/tests` | List / create RFC 2544 tests |
| GET | `/tests/<id>` | Test details + results |
| POST | `/tests/<id>/start` | Start test execution |
| POST | `/tests/<id>/stop` | Abort test |
| GET | `/tests/<id>/results` | Exportable results |
| GET | `/presets` | Available test presets |
| POST | `/presets/<name>` | Create flow + test from preset |
| GET | `/stats/history` | Stats ring buffer (300 samples) |
| GET | `/responder/stats` | Responder-mode receive stats |
| POST | `/responder/reset` | Reset responder counters |
Integration with gNMI Telemetry
The key value of combining the traffic generator with gNMI: send traffic while watching real-time interface counters.
- Create a UDP flow targeting a router (e.g., R9K-01 at 10.100.0.1)
- Open the Grafana Interface Utilization dashboard, select that router
- Start the flow — gNMI counters show traffic appearing on the interface
- Run an RFC 2544 throughput test — Grafana shows the stepped traffic pattern from binary search iterations
- Compare Scapy-reported stats with gNMI-reported counters for cross-validation
The Combined BMP + Telemetry dashboard shows both control-plane (BMP BGP updates) and data-plane (gNMI interface counters) side by side, enabling correlation of BGP changes with traffic impact.
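The cross-validation step amounts to comparing two packet counts with some tolerance, since the interface also carries background traffic such as BGP and management sessions. A minimal sketch, assuming a 2% tolerance chosen purely for illustration and using made-up figures:

```python
def counters_agree(sent_packets: int, observed_packets: int, tolerance_pct: float = 2.0) -> bool:
    """True when the observed count is within tolerance_pct of what the sender reported."""
    if sent_packets == 0:
        return observed_packets == 0
    deviation_pct = abs(observed_packets - sent_packets) * 100.0 / sent_packets
    return deviation_pct <= tolerance_pct

# Made-up figures: 30,000 packets sent, interface counter delta of 29,850.
print(counters_agree(30_000, 29_850))   # 0.5% deviation
print(counters_agree(30_000, 27_000))   # 10% deviation
```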
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `TRAFFIC_GEN_API_PORT` | `5051` | Flask API listen port |
| `TRAFFIC_GEN_MODE` | `sender` | Operating mode: sender or responder |
| `INFLUXDB_TOKEN` | `openbmp-telemetry-token` | InfluxDB auth token (Telegraf) |