Two recurring layout issues across dashboards I built this session:
1) Right-placed legend tables ate 30% of each panel width.
2) Default h:9 panels left ~40% of the viewport empty on a 1080p
display (total dashboard height ~18 grid rows vs ~30 available).
Stack Resources (Telemetry-3001/stack_resources.json):
* 3 timeseries: legend placement right -> bottom, calcs [max] -> [last,max],
added sortBy: Max desc so top consumers float to the top of the legend.
* Bumped all 4 panels h: 9 -> 14 (dashboard total 18 -> 28 rows).
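A minimal sketch of the per-panel change above, written as a Python transform over a panel dict. The sample panel is hypothetical, and the legend keys (placement, calcs, sortBy, sortDesc) are assumed to match the Grafana timeseries legend options; the actual edit was made in stack_resources.json and may differ in detail.

```python
import copy


def retune_panel(panel, new_height):
    """Return a copy of a panel dict with a taller grid height and, for
    timeseries panels, a bottom legend sorted by Max descending."""
    out = copy.deepcopy(panel)
    out["gridPos"]["h"] = new_height
    if out.get("type") == "timeseries":
        legend = out.setdefault("options", {}).setdefault("legend", {})
        legend["placement"] = "bottom"     # was "right"
        legend["calcs"] = ["last", "max"]  # was ["max"]
        legend["sortBy"] = "Max"           # assumed key name
        legend["sortDesc"] = True          # assumed key name
    return out


# Hypothetical panel, shaped like a Grafana timeseries panel.
panel = {
    "type": "timeseries",
    "gridPos": {"h": 9, "w": 12, "x": 0, "y": 0},
    "options": {
        "legend": {"displayMode": "table", "placement": "right", "calcs": ["max"]}
    },
}
updated = retune_panel(panel, 14)
```

The function copies the panel rather than mutating it, so the before and after states can be compared when reviewing the change.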
Kafka Ingestion Lag and Live BGP Churn (Telemetry-3001/*):
* Bumped timeseries panels h: 9 -> 12; second-row y: 13 -> 16.
Dashboard total 22 -> 28 rows.
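The row totals can be checked with a short Python sketch: a dashboard's height is the lowest bottom edge (y + h) over its panels. The layout below is hypothetical. It assumes a 4-row strip above the first timeseries row, which is only inferred from the second row starting at y: 13 with h: 9 panels.

```python
def total_rows(panels):
    """Dashboard height in grid rows: the largest y + h over all panels."""
    return max(p["gridPos"]["y"] + p["gridPos"]["h"] for p in panels)


# Hypothetical layout: a 4-row header strip, then two timeseries rows.
before = [
    {"gridPos": {"y": 0, "h": 4}},
    {"gridPos": {"y": 4, "h": 9}},
    {"gridPos": {"y": 13, "h": 9}},
]
after = [
    {"gridPos": {"y": 0, "h": 4}},
    {"gridPos": {"y": 4, "h": 12}},
    {"gridPos": {"y": 16, "h": 12}},
]

print(total_rows(before), total_rows(after))  # 22 28
```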
Policy Diff (obmp/History-1002/policy_diff.json):
* Bumped bottom-row panels h: 8 -> 11. Total 24 -> 27 rows.
Untouched (already adequate, scrollable by design, or built earlier):
evpn_rib (30 rows), global_table (38), router_diff (52), and the
Maps-1006 dashboards (already h:22-28 single panels).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
RCA: the exabgp container was OOM-killed because its 512m mem_limit was
far too small for the full-table feature (900K route objects in memory).
Raises the limit to a parameterized 6g default (EXABGP_MEM_LIMIT).
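For scale, a small Python sketch of the parameterized default. Only EXABGP_MEM_LIMIT and the 512m and 6g values come from this commit; the helper names are hypothetical, and how the compose file expresses the default is not shown here.

```python
import os

# Binary multipliers for Docker-style size suffixes.
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_mem(value):
    """Convert a Docker-style size such as '512m' or '6g' to bytes."""
    value = value.strip().lower()
    if value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)  # a bare number is taken as bytes


def exabgp_mem_limit(env=os.environ):
    """Resolve the limit, falling back to 6g when the variable is unset or empty."""
    return parse_mem(env.get("EXABGP_MEM_LIMIT") or "6g")


old_limit = parse_mem("512m")
new_limit = exabgp_mem_limit({})  # empty env, so the 6g default applies
print(new_limit // old_limit)  # 12
```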
Adds Docker healthchecks to 14 services (port/HTTP probes) so unhealthy
containers are visible. Adds a Telegraf docker input that collects
per-container CPU/memory/IO into InfluxDB, plus a "Stack Resources"
dashboard, so resource pressure is caught before it causes an OOM crash.
Telegraf runs with an overridden entrypoint so it keeps root and can read
the docker socket.
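As an illustration of what a port probe checks, here is a minimal Python sketch of a TCP connect test. It is not the healthcheck command used in the compose file; the function name and timeout are assumptions.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A probe like this only shows that something is listening on the port; an HTTP probe additionally confirms that the service answers requests.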
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>