Guide · Intermediate · February 21, 2026 · 7 min read · 9 min hands-on

9 Docker Compose Patterns I Use in Every Homelab Stack

Practical patterns from a production 9-service monitoring stack — environment defaults, health checks, memory limits, and the small details that prevent 3 AM debugging sessions.

docker-compose · homelab · docker · self-hosting · guide
Series: Homelab Monitoring Stack (Part 4 of 4)
← Previous

Monitoring Proxmox with Grafana and Prometheus: A Practical Setup


Most Docker Compose files I see in homelab repos are fragile. They work on the author's machine, break on yours, and give you zero information about why. The patterns below come from a 9-service monitoring stack I built and tested on a box with 8 GB of RAM. Every snippet is pulled from that real compose file — nothing hypothetical.


Environment Variables With Defaults

The single most impactful pattern for making a compose file portable: ${VAR:-default} syntax. Users who care about customization edit .env. Everyone else gets working defaults without touching YAML.

ports:
  - "${GRAFANA_PORT:-3000}:3000"
environment:
  - GF_SECURITY_ADMIN_USER=${GRAFANA_ADMIN_USER:-admin}
  - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}
  - GF_AUTH_ANONYMOUS_ENABLED=${GRAFANA_ANONYMOUS_ENABLED:-false}
command:
  - "--storage.tsdb.retention.time=${PROMETHEUS_RETENTION:-15d}"
  - "--storage.tsdb.retention.size=${PROMETHEUS_RETENTION_SIZE:-10GB}"

The monitoring stack has 9 services with roughly 20 configurable values. Not one requires editing docker-compose.yml. Copy .env.example to .env, change what matters, run docker compose up -d. That .env.example becomes the documentation — inline comments explain every variable:

# How long to keep metrics (default: 15 days)
PROMETHEUS_RETENTION=15d
# Max disk space for metrics (default: 10GB)
PROMETHEUS_RETENTION_SIZE=10GB

Ship an .env.example, never a populated .env. Users who get a pre-filled .env will inevitably run with your test credentials.
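
The flip side: some values arguably should not have a default at all. Compose also supports the ${VAR:?error} form, which aborts docker compose up with the given message when the variable is unset or empty. The stack above deliberately defaults everything so the first run needs zero configuration, but if you would rather force users to set a real admin password, a sketch would look like this (the error message is illustrative):

environment:
  # Refuses to start if GRAFANA_ADMIN_PASSWORD is unset or empty in .env
  - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:?set GRAFANA_ADMIN_PASSWORD in .env}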


Health Checks on Every Service

Without health checks, Docker has no idea whether your container is actually functioning or just technically alive with a running PID. The difference matters when other services depend on it.

Grafana's health check:

healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
  interval: 30s
  timeout: 5s
  retries: 3

Prometheus uses its dedicated health endpoint:

healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy"]
  interval: 30s
  timeout: 5s
  retries: 3

Uptime Kuma is the awkward one — no wget or curl in the image, just Node.js:

healthcheck:
  test: ["CMD-SHELL", "node -e \"const http = require('http'); http.get('http://localhost:3001/api/status-page/heartbeat', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) }).on('error', () => process.exit(1))\""]
  interval: 30s
  timeout: 5s
  retries: 3

Ugly. But it works, and that's the point — you use whatever HTTP client the image gives you. Alpine-based images have wget. Debian-based often have curl. Node images have node. Check what's available before reaching for apt-get install in a health check command (I've seen people do this; don't).

The interval: 30s with retries: 3 means Docker marks a container unhealthy after three consecutive failed checks, roughly 90 seconds. Tight enough to catch real problems, loose enough to survive a slow Loki ingester warmup.
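
One field the snippets above don't use is start_period: probes that fail inside that window don't count against the retry budget, which helps services that need a moment to initialize. A sketch of Grafana's check with a grace period added; the 60s value is my assumption, not taken from the stack:

healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
  interval: 30s
  timeout: 5s
  retries: 3
  # Probe failures during the first 60s are ignored while the service starts up
  start_period: 60s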


Memory Limits

If you're running on a 4-8 GB homelab box, one misbehaving container can OOM-kill everything. Memory limits turn a full-system crash into one container restarting.

deploy:
  resources:
    limits:
      memory: 256M

The actual limits from the monitoring stack:

Service          Memory Limit   Why
Prometheus       512M           Largest consumer — TSDB, WAL, query execution
Grafana          256M           Dashboard rendering, plugin loading
Loki             256M           Log ingestion and TSDB
Uptime Kuma      256M           Node.js runtime + SQLite
cAdvisor         128M           Kernel metrics collection
Promtail         128M           Log tailing and shipping
Node Exporter    64M            Minimal — reads /proc and /sys
Alertmanager     64M            Lightweight alert routing
PVE Exporter     64M            Python + API polling

Total budget: roughly 1.7 GB for 9 services (the limits above sum to 1,728 MB). That leaves breathing room on an 8 GB machine running other things.

I arrived at these numbers by running the stack under load and watching actual consumption. Prometheus peaked around 380 MB with 15 days of retention and 7 scrape targets. The 512M limit gives it headroom without letting a runaway query eat all your RAM.

(An aside: I initially set Promtail to 64M and it kept getting killed during first boot when it tried to ingest all existing Docker logs at once. 128M fixed it. Your move, first-boot burst ingestion.)
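
If you want to express a soft floor as well as a hard cap, the Compose spec also accepts reservations next to limits. The stack itself only sets limits; a sketch for Prometheus with both, where the 256M reservation is an assumption of mine:

deploy:
  resources:
    limits:
      memory: 512M        # hard cap: the container is OOM-killed beyond this
    reservations:
      memory: 256M        # soft floor Docker tries to keep available for the container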


Named Volumes vs. Bind Mounts

Two different volume patterns serve two different purposes. The compose file uses both, deliberately.

Named volumes for persistent data — things the service writes and manages:

volumes:
  - grafana-data:/var/lib/grafana
  - prometheus-data:/prometheus
  - loki-data:/loki
  - alertmanager-data:/alertmanager
  - uptime-kuma-data:/app/data
  - promtail-positions:/tmp
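
For those names to resolve, the compose file also needs a matching top-level volumes: key; without it, Compose aborts with an undefined-volume error. A minimal declaration, presumably close to what the stack ships:

# Top level of docker-compose.yml, sibling of services:
volumes:
  grafana-data:
  prometheus-data:
  loki-data:
  alertmanager-data:
  uptime-kuma-data:
  promtail-positions: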

Bind mounts for config files you author and the service reads:

volumes:
  - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
  - ./prometheus/alerts:/etc/prometheus/alerts:ro
  - ./grafana/provisioning:/etc/grafana/provisioning:ro
  - ./grafana/dashboards:/var/lib/grafana/dashboards:ro

Named volumes survive docker compose down and get managed by Docker's storage driver. Bind mounts point at files you version control. Mixing these up — putting Prometheus data in a bind mount, or configs in a named volume — creates pain that shows up weeks later when you try to back up or redeploy.


Read-Only Config Mounts

Every bind-mounted config file has :ro appended:

volumes:
  - ./loki/loki.yml:/etc/loki/loki.yml:ro
  - ./promtail/promtail.yml:/etc/promtail/promtail.yml:ro
  - ./alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro

Two reasons. First, defense in depth — a compromised container can't modify its own config to do something you didn't intend. Second, it makes intent explicit. When I see :ro I know that file flows one direction: host into container, never back.

Node Exporter takes this further with read-only mounts of the host filesystem:

volumes:
  - /proc:/host/proc:ro
  - /sys:/host/sys:ro
  - /:/host:ro,rslave

That :ro,rslave on the root mount matters — rslave propagates mount events from the host into the container (so new mounts appear), but :ro prevents the container from writing anything back. Node Exporter needs to see your filesystems to report disk usage. It does not need write access.
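
The mounts only matter if Node Exporter is told to read from /host instead of its own /proc and /sys. That is conventionally done with its path flags in the service's command; the values below are the standard ones, not copied from the stack's compose file:

command:
  - "--path.procfs=/host/proc"
  - "--path.sysfs=/host/sys"
  - "--path.rootfs=/host"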


Dedicated Networks

networks:
  monitoring:
    name: monitoring
    driver: bridge

Every service joins this network:

networks:
  - monitoring

The name: monitoring is deliberate. Without it, Docker prefixes the project directory name — you'd get homelab-monitoring_monitoring or whatever your folder is called. Explicit names make docker network inspect monitoring predictable.

Inside this network, services resolve each other by name. Grafana's datasource config references http://prometheus:9090 and http://loki:3100. No IPs. If Prometheus restarts and gets a different internal IP, everything still works.
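
That wiring lives in the provisioning directory bind-mounted earlier (./grafana/provisioning). A sketch of what the datasource file plausibly contains; the file name and the isDefault choice are assumptions on my part:

# grafana/provisioning/datasources/datasources.yml (illustrative file name)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090   # resolved via the monitoring network's DNS
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100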

A common mistake: using the default bridge network for everything. When your monitoring stack, your media stack, and your home automation stack all share a network, every container can talk to every other container. Dedicated networks provide isolation. Your Prometheus doesn't need to reach your Plex server.


Restart Policies

Every service uses the same policy:

restart: unless-stopped

Not restart: always. The difference: unless-stopped respects manual docker stop commands. If you stop Grafana to debug something, it stays stopped. always would restart it immediately, fighting you.

Not on-failure either. on-failure only restarts on non-zero exit codes. If a service gets OOM-killed, the exit code is 137 (SIGKILL) — on-failure handles that. But some services exit cleanly with code 0 during transient issues and need to come back. unless-stopped covers both cases.

One line. Boring. Prevents you from waking up to a dead monitoring stack because Loki crashed at 2 AM and nobody restarted it.


Container Naming

container_name: monitoring-grafana
container_name: monitoring-prometheus
container_name: monitoring-loki
container_name: monitoring-promtail
container_name: monitoring-node-exporter
container_name: monitoring-cadvisor
container_name: monitoring-alertmanager
container_name: monitoring-pve-exporter
container_name: monitoring-uptime-kuma

The monitoring- prefix does two things. docker ps becomes scannable — you can instantly tell which stack a container belongs to. And it prevents name collisions. If you have a grafana container in your monitoring stack and another in a dev stack, Docker will refuse to start the second one. Prefixed names eliminate the ambiguity.

docker ps --filter name=monitoring- gives you just this stack. Fast.


Dependency Ordering With Health Conditions

Promtail ships logs to Loki. If Promtail starts before Loki is ready, log batches fail and get retried — noisy, wasteful, and alarming if you're watching the logs.

promtail:
  depends_on:
    loki:
      condition: service_healthy

This is the only dependency declaration in the stack because it's the only one that genuinely matters. Prometheus scrapes targets on an interval — if a target isn't up yet, the scrape fails silently and succeeds on the next cycle. Grafana queries Prometheus on demand — if Prometheus isn't ready, the dashboard shows "No data" and refreshes automatically. These services tolerate startup ordering gracefully.

Promtail is different. It discovers existing Docker log files on boot and immediately starts shipping thousands of lines. Sending that burst to a Loki instance that's still initializing its TSDB produces a wall of 503 errors. The condition: service_healthy gates Promtail startup until Loki's /ready endpoint returns 200.
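
Worth spelling out: condition: service_healthy only works if the loki service defines its own health check. A sketch of what that looks like against the /ready endpoint, assuming the image ships a busybox-style wget (verify what your image provides before copying this):

healthcheck:
  test: ["CMD", "wget", "--spider", "-q", "http://localhost:3100/ready"]
  interval: 30s
  timeout: 5s
  retries: 3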

Don't add depends_on between services that handle missing peers gracefully on their own. You'll just slow down your startup for no reason.


All Nine Patterns

  1. ${VAR:-default} everywhere — zero YAML editing for users
  2. Health checks on every service — using whatever HTTP client the image provides
  3. Memory limits — total stack budget around 1.7 GB for 9 services
  4. Named volumes for data, bind mounts for config — clean separation of concerns
  5. :ro on all config mounts — config flows host to container, never back
  6. Dedicated named network — service DNS, stack isolation, predictable naming
  7. unless-stopped restart policy — respects manual stops, covers all failure modes
  8. Stack-prefixed container names — monitoring-* for instant docker ps filtering
  9. depends_on with condition: service_healthy — only where startup order actually matters

None of these are exotic. Most are one or two lines. I have been guilty of skipping half of them in personal projects and regretting it six months later when I try to redeploy on a different box. The difference between a compose file that works on your machine and one that works on anyone's machine is usually just these small, boring decisions compounding.


The Homelab Monitoring Stack kit uses every pattern above across a production-tested 9-service Docker Compose file — Grafana, Prometheus, Loki, Alertmanager, Node Exporter, cAdvisor, Promtail, PVE Exporter, and Uptime Kuma. Pre-built dashboards, 23 alert rules, and documentation included. Free download, yours to modify.


Related Product

Homelab Monitoring Stack — Complete Docker Compose + Grafana Dashboards

A 9-service Docker Compose monitoring stack with 7 pre-built Grafana dashboards (68 panels), 23 Prometheus alert rules, Loki log aggregation, and full documentation. Copy .env, run docker compose up, done.

Get Free Download

Read Next

Build Log · 17 min

I Built a 9-Service Homelab Monitoring Stack and Shipped It as a Product — Here's the Full Build Log

A chronological build log of creating a complete homelab monitoring stack with Grafana, Prometheus, Loki, cAdvisor, Alertmanager, and Uptime Kuma — 9 Docker services, 7 dashboards, 68 panels, 23 alert rules. Every decision and error documented.

Tutorial · 21 min

Deploy a Complete Homelab Monitoring Stack with Docker Compose: Grafana, Prometheus, Loki, and 23 Alert Rules

Step-by-step tutorial for deploying a 9-service monitoring stack on any Linux server. Prometheus for metrics, Loki for logs, Grafana for dashboards, Alertmanager for notifications, plus Proxmox and Uptime Kuma. One docker compose up and you have 7 dashboards and 23 pre-configured alert rules.

Guide · 10 min

Monitoring Proxmox with Grafana and Prometheus: A Practical Setup

How to monitor Proxmox VMs, LXC containers, node health, and storage from Grafana using the PVE Exporter and Prometheus, with pre-built dashboards and alert rules.