# OpenBMP + ExaBGP Route Injector — Full Documentation

## Table of Contents

1. [What Is This Project?](#1-what-is-this-project)
2. [Architecture](#2-architecture)
3. [Prerequisites](#3-prerequisites)
4. [Initial Setup (First Time)](#4-initial-setup-first-time)
5. [IOS-XR Router Configuration](#5-ios-xr-router-configuration)
6. [Starting and Stopping](#6-starting-and-stopping)
7. [Route Injection User Guide](#7-route-injection-user-guide)
8. [Grafana Dashboards](#8-grafana-dashboards)
9. [Sanity Checks](#9-sanity-checks)
10. [Relevant Commands Reference](#10-relevant-commands-reference)
11. [Troubleshooting](#11-troubleshooting)
12. [Data Retention](#12-data-retention)
13. [Environment Variables Reference](#13-environment-variables-reference)

---

## 1. What Is This Project?

This is a **BGP Monitoring Protocol (BMP) lab stack** deployed via Docker Compose. It collects, stores, and visualizes BGP routing data from a Cisco IOS-XR lab network running in Cisco Modeling Labs (CML).

**What it does:**

- Receives BMP (RFC 7854) telemetry from routers on TCP port 5000
- Streams BMP data through Kafka into a TimescaleDB/PostgreSQL database
- Provides 17 Grafana dashboards for real-time and historical BGP analysis
- Includes an **ExaBGP route injector** that peers with the two CORE routers and injects synthetic BGP routes, so BGP policy, route propagation, and the Grafana dashboards can be tested without internet connectivity

**The lab network:**

- AS 65020: 9 Cisco IOS-XR routers in CML (iBGP full mesh via route-reflectors)
- AS 65100: ExaBGP container (eBGP peer to both CORE routers)
- CORE-01: `10.100.0.100` (CML-R9K-CORE-01)
- CORE-02: `10.100.0.200` (CML-R9K-CORE-02)
- Host IP: `10.40.40.202` (ExaBGP binds here; reachable from CML management network)

---

## 2. Architecture

```
IOS-XR Routers (9x, AS 65020)
  BMP telemetry on TCP 5000
        |
        v
obmp-collector (openbmp/collector:2.2.3)
        |
        v
obmp-kafka (confluentinc/cp-kafka:7.1.1)
  + obmp-zookeeper (confluentinc/cp-zookeeper:7.1.1)
        |
        v
obmp-psql-app (openbmp/psql-app:2.2.2)
  Java consumer: writes parsed BGP data to PostgreSQL
        |
        v
obmp-psql (openbmp/postgres:2.2.1)
  PostgreSQL 14 + TimescaleDB
        |
        +---------> obmp-grafana (grafana/grafana:9.1.7) :3000
        |             17 dashboards, PostgreSQL datasource
        +---------> obmp-whois (openbmp/whois:2.2.0) :4300
                      WHOIS query server backed by the DB

ExaBGP (obmp-exabgp, built locally)
  python:3.11-slim + exabgp 5.x + Flask API
  Peers eBGP to CORE-01 and CORE-02 (AS 65100 -> AS 65020)
  HTTP API on :5050 (inject/withdraw routes on demand)
  Routes propagate via iBGP mesh to all 9 routers -> BMP -> DB -> Grafana
```

### Container Summary

| Container | Image | Port(s) | Role |
|-----------|-------|---------|------|
| obmp-zookeeper | confluentinc/cp-zookeeper:7.1.1 | 2181 (internal) | Kafka coordination |
| obmp-kafka | confluentinc/cp-kafka:7.1.1 | 9092 | Message broker |
| obmp-collector | openbmp/collector:2.2.3 | 5000 | BMP receiver |
| obmp-psql-app | openbmp/psql-app:2.2.2 | 9005 | Kafka→PostgreSQL consumer |
| obmp-psql | openbmp/postgres:2.2.1 | 5432 | TimescaleDB storage |
| obmp-grafana | grafana/grafana:9.1.7 | 3000 | Visualization |
| obmp-whois | openbmp/whois:2.2.0 | 4300 | WHOIS query server |
| obmp-exabgp | local build | 5050 (host net) | BGP route injector |

---

## 3. Prerequisites

- Docker Engine (20.10+) and Docker Compose v2
- Host IP `10.40.40.202` reachable from the CML management network
- CML routers with BMP configured pointing to `10.40.40.202:5000`
- CML CORE routers configured with ExaBGP as eBGP neighbor (see Section 5)
- `OBMP_DATA_ROOT` directory created (default: `/var/openbmp`)

---

## 4. Initial Setup (First Time)

### 4.1 Clone the repository

```bash
git clone <this-repo-url>
cd obmp-docker
```

### 4.2 Create persistent data directories

```bash
export OBMP_DATA_ROOT=/var/openbmp
# The root directory is created with sudo, so the subdirectories need sudo too
sudo mkdir -p "$OBMP_DATA_ROOT"
sudo mkdir -p "${OBMP_DATA_ROOT}/config"
sudo mkdir -p "${OBMP_DATA_ROOT}/kafka-data"
sudo mkdir -p "${OBMP_DATA_ROOT}/zk-data"
sudo mkdir -p "${OBMP_DATA_ROOT}/zk-log"
sudo mkdir -p "${OBMP_DATA_ROOT}/postgres/data"
sudo mkdir -p "${OBMP_DATA_ROOT}/postgres/ts"
sudo mkdir -p "${OBMP_DATA_ROOT}/grafana"
sudo mkdir -p "${OBMP_DATA_ROOT}/grafana/dashboards"
sudo chmod -R 777 "$OBMP_DATA_ROOT"
```

### 4.3 Initialise the database (first run only)

Create the init trigger file. It causes psql-app to create all tables on startup:

```bash
touch ${OBMP_DATA_ROOT}/config/init_db
```

> **Warning:** Do not create this file on subsequent runs unless you want to wipe and recreate the entire database.

### 4.4 Copy Grafana provisioning files

```bash
cp -r obmp-grafana/provisioning ${OBMP_DATA_ROOT}/grafana/
cp -r obmp-grafana/dashboards ${OBMP_DATA_ROOT}/grafana/
```

### 4.5 Start the stack

```bash
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d
```

Wait ~2 minutes for all services to initialise (especially PostgreSQL and psql-app, which run schema migrations).

### 4.6 Verify everything is running

```bash
docker compose -p obmp ps
docker compose -p obmp logs --tail=20 psql-app
```

---

## 5. IOS-XR Router Configuration

The ExaBGP container peers eBGP with both CORE routers. Each CORE router must be configured with the following.

### 5.1 Route policies (apply once per router)

```
route-policy EXABGP_IN
  pass
end-policy

route-policy EXABGP_OUT
  drop
end-policy
```

### 5.2 BGP neighbor block

```
router bgp 65020
 neighbor 10.40.40.202
  remote-as 65100
  description ExaBGP-Route-Injector
  ebgp-multihop 5
  update-source MgmtEth0/RP0/CPU0/0
  !
  address-family ipv4 unicast
   route-policy EXABGP_IN in
   route-policy EXABGP_OUT out
   next-hop-self
  !
 !
!
```

### 5.3 Static route for next-hop resolution

IOS-XR BGP does not use the default route (0.0.0.0/0) to resolve BGP next-hops. A more-specific static route for the ExaBGP host subnet is required in the default VRF:

```
router static
 address-family ipv4 unicast
  10.40.40.0/24 10.100.0.254
 !
!
```

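The resolution rule described in this section can be illustrated with a small Python sketch. It models a routing table as a plain list of prefixes and picks the longest match for the next-hop while skipping the default route. The function name `resolve_next_hop`, the `allow_default` flag, and the example table contents (including the `10.100.0.0/24` connected subnet) are assumptions made for illustration; this is not router code and not part of the repository.

```python
import ipaddress

def resolve_next_hop(next_hop, rib_prefixes, allow_default=False):
    """Return the longest-match prefix covering next_hop, or None.

    Mirrors the behaviour described above: the default route is
    ignored unless allow_default is True.
    """
    addr = ipaddress.ip_address(next_hop)
    candidates = []
    for prefix in rib_prefixes:
        net = ipaddress.ip_network(prefix)
        if net.prefixlen == 0 and not allow_default:
            continue  # skip 0.0.0.0/0
        if addr in net:
            candidates.append(net)
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.prefixlen)

# Hypothetical table: a default route plus an assumed connected lab subnet
before = ["0.0.0.0/0", "10.100.0.0/24"]
# Same table after adding the static route from this section
after = before + ["10.40.40.0/24"]

print(resolve_next_hop("10.40.40.202", before))  # no usable route
print(resolve_next_hop("10.40.40.202", after))   # matches the static route
```

With only the default route present, the lookup finds nothing usable, which corresponds to injected routes being received but not selected as best paths.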
### 5.4 Config notes

| Knob | Why |
|------|-----|
| `remote-as 65100` | ExaBGP presents as AS 65100 (eBGP to your AS 65020 mesh) |
| `ebgp-multihop 5` | Host and router are on different subnets |
| `update-source MgmtEth0/RP0/CPU0/0` | ExaBGP is reachable via the management interface |
| `next-hop-self` | Intended to replace ExaBGP's next-hop (10.40.40.202) with the CORE router's address when routes are reflected into iBGP, so that all routers can resolve the next-hop. On IOS-XR this knob acts on updates sent to the neighbor it is configured under, so confirm it is also set toward the iBGP peers |
| `EXABGP_OUT` drops | Prevents the lab from advertising its own prefixes back to ExaBGP |
| Static route | Required: IOS-XR BGP will not install injected routes as bestpaths without a specific route to the next-hop |

### 5.5 NETCONF alternative

See `exabgp/iosxr_bgp_config.md` for a Python/ncclient script that pushes all of the above config programmatically.

Credentials: `username=webui`, `password=cisco`, port 830.

---

## 6. Starting and Stopping

### Start all services

```bash
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d
```

### Stop all services (preserve data)

```bash
docker compose -p obmp down
```

### Stop and remove all data (full reset)

```bash
docker compose -p obmp down -v
sudo rm -rf /var/openbmp
```

### Rebuild the ExaBGP container (after code changes)

```bash
docker compose -p obmp build exabgp
docker compose -p obmp up -d exabgp
```

### Restart a single service

```bash
docker compose -p obmp restart <service>
# e.g.:
docker compose -p obmp restart exabgp
docker compose -p obmp restart psql-app
```

---

## 7. Route Injection User Guide

The ExaBGP container exposes a Flask REST API on port 5050 (host network). The `inject.py` CLI wraps this API.

### 7.1 Setup

```bash
cd exabgp
pip install requests   # only needed if running inject.py from the host
```

### 7.2 Check status

```bash
python3 inject.py status
```

Output shows API health, active route count, and peer states:

```json
{
  "status": "ok",
  "active_routes": 77,
  "peers": {
    "10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
    "10.100.0.200": {"state": "up", "updated": "2026-03-05T10:00:00Z"}
  }
}
```

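For scripted checks, the status document can be inspected programmatically. The following is a minimal sketch that assumes the JSON shape shown in the status output; the helper name `peers_down` is hypothetical, and the `"down"` value in the sample is an assumption, since only `"up"` appears in this documentation.

```python
import json

def peers_down(status):
    """Return the sorted peer addresses whose state is anything but 'up'."""
    peers = status.get("peers", {})
    return sorted(addr for addr, info in peers.items()
                  if info.get("state") != "up")

# Sample document in the documented shape; the second peer is marked
# "down" purely for illustration (that state name is an assumption).
sample = json.loads("""
{
  "status": "ok",
  "active_routes": 77,
  "peers": {
    "10.100.0.100": {"state": "up", "updated": "2026-03-05T10:00:00Z"},
    "10.100.0.200": {"state": "down", "updated": "2026-03-05T10:00:00Z"}
  }
}
""")

down = peers_down(sample)
if sample.get("status") == "ok" and not down:
    print("all peers up")
else:
    print("peers not up:", ", ".join(down))
```

Treating every state other than `"up"` as a failure avoids depending on the exact names the API uses for non-established sessions.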
### 7.3 List available scenarios

```bash
python3 inject.py scenarios
```

| Scenario | Routes | Description |
|----------|--------|-------------|
| `internet_sample` | ~94 | Partial internet table: real public prefixes, realistic AS paths (Cloudflare, Google, AWS, Azure, etc.) |
| `churn` | 30 | RFC documentation prefixes for announce/withdraw churn testing |
| `blackhole` | 5 | /32 prefixes with RTBH community (65100:666 + 65535:666) |
| `anycast` | 3 | Same prefixes with varying AS paths and MEDs (best-path testing) |
| `full_table` | 500+ | Large partial internet table with synthetic /24s |
| `lab_prefixes` | 8 | Enterprise/SP-style routes with communities and local-pref |

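The two community values used by the `blackhole` scenario are standard BGP communities, which are carried on the wire as 32-bit values: the upper 16 bits hold the first number and the lower 16 bits the second. `65535:666` is the well-known BLACKHOLE community from RFC 7999. The sketch below shows the encoding; the function name `encode_community` is an assumption for illustration and is not taken from the repository.

```python
def encode_community(text):
    """Encode an 'ASN:value' standard community as a 32-bit integer."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError(f"both halves must fit in 16 bits: {text!r}")
    return (high << 16) | low

for community in ("65100:666", "65535:666"):
    print(community, hex(encode_community(community)))
```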
### 7.4 Load a scenario

```bash
python3 inject.py scenario internet_sample
```

Routes propagate: ExaBGP → CORE-01/CORE-02 (eBGP) → all 9 routers (iBGP) → BMP → Kafka → PostgreSQL → Grafana.

### 7.5 Withdraw a scenario

```bash
python3 inject.py withdraw-scenario internet_sample
```

### 7.6 Announce individual prefixes

```bash
python3 inject.py announce 10.0.0.0/8 \
  --as-path 65100 3356 15169 \
  --community 65100:100 \
  --med 100
```

### 7.7 Withdraw individual prefixes

```bash
python3 inject.py withdraw 10.0.0.0/8
```

### 7.8 Withdraw everything

```bash
python3 inject.py withdraw-all
```

### 7.9 Generate route churn (populate history tables)

The `churn` command cycles the churn scenario repeatedly, generating `ip_rib_log` and `stats_chg_*` entries that power Grafana's history dashboards.

```bash
# 5 cycles, 30 seconds apart
python3 inject.py churn --count 5 --interval 30

# Run indefinitely until Ctrl+C
python3 inject.py churn
```

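Conceptually, a churn run is an announce/withdraw loop. The sketch below shows one way such a loop could be structured. It is not the implementation in `inject.py`, which is not reproduced in this document; the function `run_churn`, its parameters, and the choice to wait `interval` seconds after each step are all assumptions.

```python
import time

def run_churn(announce, withdraw, count=None, interval=30.0, sleep=time.sleep):
    """Alternate announce/withdraw calls; return the completed cycle count.

    count=None loops until interrupted with Ctrl+C.
    """
    done = 0
    try:
        while count is None or done < count:
            announce()
            sleep(interval)
            withdraw()
            sleep(interval)
            done += 1
    except KeyboardInterrupt:
        pass  # stop cleanly on Ctrl+C
    return done

# Dry run with stand-in callables and no real waiting
events = []
cycles = run_churn(
    announce=lambda: events.append("announce"),
    withdraw=lambda: events.append("withdraw"),
    count=2,
    interval=0,
    sleep=lambda seconds: None,
)
print(cycles, events)
```

Passing the announce and withdraw actions as callables keeps the loop testable without a running API; in practice they would wrap the scenario load and withdraw calls.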
### 7.10 REST API directly (curl)

```bash
BASE=http://localhost:5050

# Health
curl $BASE/healthz

# List scenarios
curl $BASE/scenarios

# Load scenario
curl -X POST $BASE/scenario/internet_sample

# Announce custom prefix
curl -X POST $BASE/announce \
  -H 'Content-Type: application/json' \
  -d '{"prefixes":["10.0.0.0/8"],"as_path":[65100,3356,15169],"communities":["65100:100"]}'

# Withdraw all
curl -X POST $BASE/withdraw/all

# Peer state
curl $BASE/peers
```

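The announce call can also be made from Python using only the standard library. This sketch builds the same request as the curl example for `/announce`, using the three JSON fields shown there (`prefixes`, `as_path`, `communities`). The helper name `build_announce_request` is hypothetical, and the prefix validation step is an addition for illustration rather than documented API behaviour.

```python
import ipaddress
import json
import urllib.request

def build_announce_request(base_url, prefixes, as_path, communities=()):
    """Build (without sending) a POST /announce request."""
    for prefix in prefixes:
        ipaddress.ip_network(prefix)  # raises ValueError on a malformed prefix
    payload = {
        "prefixes": list(prefixes),
        "as_path": [int(asn) for asn in as_path],
        "communities": list(communities),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/announce",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_announce_request(
    "http://localhost:5050", ["10.0.0.0/8"], [65100, 3356, 15169], ["65100:100"]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it against a running API: urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it makes the payload easy to inspect or log before anything reaches the routers.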
### 7.11 Adding custom scenarios

Edit `exabgp/scenarios/__init__.py`. Add an entry to `SCENARIOS` following the existing pattern:

```python
SCENARIOS['my_scenario'] = {
    'description': 'My custom routes',
    'routes': [
        _r('192.0.2.0/24', [65100, 65200], communities=['65100:100']),
    ],
}
```

The `scenarios/` directory is volume-mounted into the container, so edits do not require an image rebuild. The Python module is imported once at container start, though, so **restart the container** after editing:

```bash
docker compose -p obmp restart exabgp
```

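For larger custom scenarios, the prefix list can be generated rather than typed by hand. The sketch below produces one entry per /24 inside a supernet. The function `synthetic_routes` and the dictionary shape it returns are hypothetical: what the repository's `_r` helper returns is not shown in this document, so in practice the generated values would be passed to `_r(...)` following the existing pattern.

```python
import ipaddress

def synthetic_routes(supernet, as_path, limit, communities=()):
    """Return up to `limit` route dicts, one per /24 inside `supernet`."""
    routes = []
    for subnet in ipaddress.ip_network(supernet).subnets(new_prefix=24):
        if len(routes) >= limit:
            break
        routes.append({
            "prefix": str(subnet),
            "as_path": list(as_path),
            "communities": list(communities),
        })
    return routes

# 198.18.0.0/15 is reserved for benchmarking (RFC 2544), which makes it
# a reasonable source of lab-only prefixes.
routes = synthetic_routes("198.18.0.0/15", [65100, 65200], limit=4,
                          communities=["65100:100"])
for route in routes:
    print(route["prefix"], route["as_path"], route["communities"])
```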
---

## 8. Grafana Dashboards

Access: `http://10.40.40.202:3000`
Default credentials: `admin` / `openbmp` (anonymous access also enabled)

### Dashboard Categories

| Category | Dashboard | Description |
|----------|-----------|-------------|
| General | OBMP Home | Overview / landing page |
| Base | Inventory | Router and peer inventory |
| Base | Looking Glass | Real-time RIB lookup by prefix |
| Base | ASN View | ASN-level routing view |
| History | Prefix History | Route change history for a prefix |
| History | Prefix History by ASN | Filtered by origin AS |
| History | Prefix History by Community | Filtered by BGP community |
| Tops | Top Prefixes | Most-updated prefixes |
| Tops | Top L3VPN Prefixes | L3VPN equivalent |
| Link State | LS Nodes | IS-IS link-state node database |
| Link State | LS Links | IS-IS link-state link database |
| Link State | LS Topology | Network topology map |
| Link State | LS Prefixes | Link-state prefix database |
| Link State | LS History | Link-state change history |
| L3VPN | L3VPN Looking Glass | VPN RIB lookup |
| L3VPN | L3VPN Prefix History | VPN route change history |
| L3VPN | L3VPN RIB Browser | Full VPN RIB browser |

> History dashboards require `ip_rib_log` and `stats_chg_*` table data. Run `inject.py churn` to populate these.

---

## 9. Sanity Checks

### 9.1 All containers running

```bash
docker compose -p obmp ps
```

All containers should show `running`. If any are restarting, check logs:

```bash
docker compose -p obmp logs --tail=50 <service>
```

### 9.2 ExaBGP peers up

```bash
python3 exabgp/inject.py status
```

Both `10.100.0.100` and `10.100.0.200` should show `"state": "up"`.

Or check from the router side:

```
show bgp neighbors 10.40.40.202
show bgp summary | inc 10.40.40.202
```

### 9.3 Routes accepted by CORE routers

After loading `internet_sample`, run the following on CORE-01 or CORE-02 (lines starting with `!` are comments):

```
show bgp summary
! Expect: 77 accepted prefixes, 77 are bestpaths from 10.40.40.202

show bgp 8.8.8.0/24
! Expect: best path via 10.40.40.202 (eBGP), also iBGP copies from other routers
```

### 9.4 Routes in OpenBMP database

```bash
docker exec -it obmp-psql psql -U openbmp -c "
  SELECT count(DISTINCT prefix) AS unique_prefixes,
         count(DISTINCT peer_hash_id) AS peers_reporting
  FROM ip_rib
  WHERE isIPv4 = true AND isWithdrawn = false;
"
```

Expect `~129 unique prefixes` and `56 peers_reporting` (9 routers × ~6 peers each) after loading `internet_sample`.

### 9.5 Kafka is healthy

```bash
docker exec -it obmp-kafka kafka-topics --bootstrap-server localhost:29092 --list
```

Should show topics like `openbmp.parsed.unicast_prefix`, `openbmp.parsed.peer`, etc.

### 9.6 Grafana datasource

Open `http://10.40.40.202:3000` → Configuration → Data Sources → OpenBMP → Test.
Should return "Database Connection OK".

### 9.7 BMP collector receiving data

```bash
docker compose -p obmp logs --tail=30 collector
```

Should show connections from router management IPs.

### 9.8 psql-app consumer is caught up

```bash
docker compose -p obmp logs --tail=30 psql-app
```

Should show periodic cron job outputs (RPKI sync, IRR sync, global_ip_rib updates).

---

## 10. Relevant Commands Reference

### Docker Compose

```bash
# Start stack
OBMP_DATA_ROOT=/var/openbmp docker compose -p obmp up -d

# Stop stack
docker compose -p obmp down

# Show status
docker compose -p obmp ps

# Follow logs (all services)
docker compose -p obmp logs -f

# Follow logs (specific service)
docker compose -p obmp logs -f exabgp
docker compose -p obmp logs -f psql-app
docker compose -p obmp logs -f collector

# Rebuild and restart ExaBGP
docker compose -p obmp build exabgp && docker compose -p obmp up -d exabgp

# Restart a service
docker compose -p obmp restart psql-app
```

### Route Injection (from `exabgp/` directory)

```bash
# API health and peer states
python3 inject.py status

# List active routes
python3 inject.py routes

# List scenarios
python3 inject.py scenarios

# Load a scenario
python3 inject.py scenario internet_sample
python3 inject.py scenario churn
python3 inject.py scenario blackhole
python3 inject.py scenario full_table
python3 inject.py scenario lab_prefixes

# Withdraw a scenario
python3 inject.py withdraw-scenario internet_sample

# Withdraw all active routes
python3 inject.py withdraw-all

# Announce a specific prefix
python3 inject.py announce 10.0.0.0/8 --as-path 65100 3356 15169 --community 65100:100

# Withdraw a specific prefix
python3 inject.py withdraw 10.0.0.0/8

# Run churn (populate history tables)
python3 inject.py churn --count 5 --interval 30
```

### Database Queries

```bash
# Connect to database (the SQL statements below are run at the psql prompt)
docker exec -it obmp-psql psql -U openbmp -d openbmp

-- Count unique prefixes in RIB
SELECT count(DISTINCT prefix) FROM ip_rib WHERE isIPv4=true AND isWithdrawn=false;

-- Show recent route changes
SELECT prefix, origin_as, iswithdrawn, timestamp
FROM ip_rib_log
ORDER BY timestamp DESC LIMIT 20;

-- Show peer summary
SELECT name, state, timestamp_last_updated
FROM bgp_peers
ORDER BY state, name;

-- Show routes from ExaBGP peer
SELECT prefix, origin_as, as_path
FROM ip_rib
WHERE peer_hash_id IN (
  SELECT hash_id FROM bgp_peers WHERE peer_addr = '10.40.40.202'
)
AND isWithdrawn = false;
```

### IOS-XR Verification (on router CLI)

```
show bgp neighbors 10.40.40.202
show bgp neighbors 10.40.40.202 received routes
show bgp summary
show bgp 8.8.8.0/24
show bgp 1.1.1.0/24
show route 8.8.8.0/24
```

---

## 11. Troubleshooting

### ExaBGP container keeps restarting

Check logs:

```bash
docker compose -p obmp logs --tail=50 exabgp
```

Common causes and fixes:

| Symptom | Cause | Fix |
|---------|-------|-----|
| Exits after "welcome" banner | Missing or wrong env file path | `startup.sh` generates `/usr/local/etc/exabgp/exabgp.env`; verify this path exists in the container |
| Process `api` killed 5 times | Wrong Python path in conf | Conf uses `/usr/local/bin/python3`, which is correct for python:3.11-slim |
| `drop = true` in env | ExaBGP drops privileges to nobody and can't bind port 179 | `startup.sh` patches `drop = false`; check that the sed lines ran |
| `__pycache__ Permission denied` during build | Root-owned cache from previous container run | `.dockerignore` excludes `**/__pycache__`; confirm the file exists |

### BGP sessions not establishing

1. Verify host IP `10.40.40.202` is reachable from the CML management network: `ping 10.40.40.202` from the router
2. Check ExaBGP peer state: `python3 exabgp/inject.py status`
3. On the router, run `show bgp neighbors 10.40.40.202` and look for error codes
4. Common IOS-XR errors:
   - `no-update-source-config`: add `update-source MgmtEth0/RP0/CPU0/0`
   - `no-ipv6-address`: ensure only the IPv4 unicast AF is configured (no IPv6)
   - TCP refused: check port 179 is reachable (ExaBGP uses `network_mode: host`)

### Routes received but not bestpath

IOS-XR BGP requires a specific route to resolve the BGP next-hop (10.40.40.202). The default route (0.0.0.0/0) is insufficient.

```
router static
 address-family ipv4 unicast
  10.40.40.0/24 10.100.0.254
```

Verify with `show bgp 1.1.1.0/24`, which should show `Status: s (active), bestpath`.

### Grafana shows no data

1. Check datasource: Configuration → Data Sources → OpenBMP → Test
2. Verify psql-app is writing: `docker compose -p obmp logs psql-app`
3. Check the database directly (see database queries above)
4. History dashboards need route churn: run `python3 inject.py churn`

### Kafka not starting

Zookeeper must be healthy first. Check:

```bash
docker compose -p obmp logs zookeeper
docker compose -p obmp restart kafka
```

### psql-app fails to start

Usually a PostgreSQL connection issue or schema mismatch. Check:

```bash
docker compose -p obmp logs psql-app
# If "relation does not exist" errors: re-trigger DB init
# (this wipes and recreates the database, see Section 4.3)
touch /var/openbmp/config/init_db
docker compose -p obmp restart psql-app
```

---

## 12. Data Retention

Configured in `docker-compose.yml` via `POSTGRES_DROP_*` environment variables:

| Table | Default Retention |
|-------|-------------------|
| peer_event_log | 1 year |
| stat_reports | 4 weeks |
| ip_rib_log | 4 weeks |
| alerts | 4 weeks |
| ls_nodes_log | 4 months |
| ls_links_log | 4 months |
| ls_prefixes_log | 4 months |
| stats_chg_byprefix | 4 weeks |
| stats_chg_byasn | 4 weeks |
| stats_chg_bypeer | 4 weeks |
| stats_ip_origins | 4 weeks |
| stats_peer_rib | 4 weeks |
| stats_peer_update_counts | 4 weeks |

Adjust in `docker-compose.yml` under the `psql-app` service environment block.

---

## 13. Environment Variables Reference

### ExaBGP container

| Variable | Default | Description |
|----------|---------|-------------|
| `EXABGP_LOCAL_IP` | `10.40.40.202` | Host IP ExaBGP binds to and uses as router-id |
| `EXABGP_LOCAL_AS` | `65100` | ExaBGP's AS number |
| `EXABGP_PEER_AS` | `65020` | AS of the IOS-XR lab |
| `EXABGP_PEER_1` | `10.100.0.100` | First CORE router to peer with |
| `EXABGP_PEER_2` | `10.100.0.200` | Second CORE router to peer with |
| `EXABGP_API_PORT` | `5050` | Flask API port |

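As an illustration of how these variables and their defaults fit together, the sketch below reads them into a small settings object. It is an assumption about how a consumer might load them, not the container's actual startup code; the names `ExaBGPSettings` and `load_settings` are hypothetical.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class ExaBGPSettings:
    local_ip: str
    local_as: int
    peer_as: int
    peers: tuple
    api_port: int

def load_settings(env=None):
    """Read the ExaBGP variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ExaBGPSettings(
        local_ip=env.get("EXABGP_LOCAL_IP", "10.40.40.202"),
        local_as=int(env.get("EXABGP_LOCAL_AS", "65100")),
        peer_as=int(env.get("EXABGP_PEER_AS", "65020")),
        peers=(
            env.get("EXABGP_PEER_1", "10.100.0.100"),
            env.get("EXABGP_PEER_2", "10.100.0.200"),
        ),
        api_port=int(env.get("EXABGP_API_PORT", "5050")),
    )

# Override only the API port; everything else uses the table defaults
settings = load_settings({"EXABGP_API_PORT": "6060"})
print(settings)
```

Accepting an explicit mapping instead of always reading `os.environ` makes the defaults easy to verify without touching the real environment.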
### psql-app container (key variables)

| Variable | Default | Description |
|----------|---------|-------------|
| `MEM` | `3` | JVM heap in GB |
| `ENABLE_RPKI` | `1` | Enable RPKI sync from Cloudflare |
| `ENABLE_IRR` | `1` | Enable IRR sync |
| `ENABLE_DBIP` | `1` | Enable DB-IP geolocation import |
| `POSTGRES_REPORT_WINDOW` | `8 minute` | Aggregation window for summary tables |

### inject.py (CLI)

| Variable | Default | Description |
|----------|---------|-------------|
| `EXABGP_API` | `http://localhost:5050` | ExaBGP API base URL |