Introduction
Docker Compose is often relegated to local development, but it is increasingly used for single-host production deployments of small-to-medium applications. The common criticism that “Compose is not for production” overlooks a key point: for many workloads, a well-configured Compose stack on a single VM provides the right balance of simplicity and reliability.
This guide covers production-grade Compose practices — treating your compose files as infrastructure-as-code with proper version control, CI/CD integration, and production-specific hardening.
1. Compose File Structure and Versioning
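Follow the Compose Specification rather than the legacy versioned file formats: Docker Compose v2 treats the top-level version key as obsolete and ignores it, so it can be omitted. Adopt a multi-file strategy to separate concerns: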
compose.yml # Base configuration (shared across environments)
compose.override.yml # Local development overrides (not used in production)
compose.prod.yml # Production-specific overrides
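As a rough sketch of how these files layer, a production override usually restates only what differs from the base file. The image tag and the NODE_ENV variable below are placeholders, not values from a real project.

```yaml
# compose.prod.yml (illustrative override; tag and values are placeholders)
services:
  app:
    image: myapp:1.4.2          # pin a release instead of a moving tag
    restart: unless-stopped
    environment:
      NODE_ENV: production      # assumes a Node.js app; adjust for your runtime
```

Compose merges files in the order they are given, so later files win: docker compose -f compose.yml -f compose.prod.yml up -d. When -f is passed explicitly, compose.override.yml is not loaded automatically, which keeps development overrides out of production.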
Leverage YAML anchors to reduce duplication across services:
    x-logging: &logging
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

    services:
      app:
        image: myapp:latest
        logging: *logging
        restart: unless-stopped
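Use --env-file to inject environment-specific variables instead of hardcoding them, and never commit files that contain secrets. Prefer docker compose (v2, shipped as a Docker CLI plugin) over the standalone docker-compose (v1), which reached end of life in 2023 and no longer receives updates.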
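Variables from the env file are substituted into the compose file at parse time. A minimal sketch, assuming hypothetical variable names APP_TAG and LOG_LEVEL:

```yaml
# compose.yml (fragment): values come from the file passed via --env-file.
# APP_TAG and LOG_LEVEL are hypothetical variable names.
services:
  app:
    # ${VAR:?message} aborts with the message if VAR is unset or empty
    image: myapp:${APP_TAG:?APP_TAG must be set}
    environment:
      # ${VAR:-default} falls back to "info" if VAR is unset or empty
      LOG_LEVEL: ${LOG_LEVEL:-info}
```

With an env file named, say, .env.prod, this would be started with docker compose --env-file .env.prod up -d. Running docker compose --env-file .env.prod config first prints the fully resolved file, which is a cheap way to catch a missing variable before deploying.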
2. Service Dependencies and Health Checks
Relying on depends_on alone is insufficient for production: in its short form it only controls start order and does not wait for a dependency to be ready. Use condition: service_healthy so that a service starts only after its dependencies pass their health checks.
    services:
      postgres:
        image: postgres:16
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
          interval: 5s
          timeout: 5s
          retries: 5
          start_period: 30s

      app:
        depends_on:
          postgres:
            condition: service_healthy
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:3000/healthz"]
          interval: 10s
          timeout: 3s
          retries: 3
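Applied across a whole stack (for example Nginx → App → Redis → PostgreSQL), this cascading health check pattern ensures each service starts only once its dependencies report healthy. Note that the app check above assumes curl is installed in the image, and that $$ escapes the dollar sign so the variable is expanded inside the container rather than by Compose.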
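The full chain could be written out roughly as follows. This is a sketch rather than a complete stack: image tags, the port, and the /healthz path are assumptions, and credentials and other environment settings are omitted for brevity.

```yaml
# Sketch of the chain: nginx waits for app, app waits for redis and postgres.
services:
  postgres:
    image: postgres:16
    healthcheck:
      # $$ defers expansion to the container shell; falls back to "postgres"
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-postgres}"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7
    healthcheck:
      # redis-cli ships in the official image
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  app:
    image: myapp:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

  nginx:
    image: nginx:1.27
    depends_on:
      app:
        condition: service_healthy
```

The conditions only gate startup. If a dependency becomes unhealthy later, Compose does not stop or restart the services that depend on it, so the application still needs its own retry logic.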
3. Networking Configuration
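Define custom bridge networks to isolate services from one another. Compose already attaches every service to a project-level default network with DNS-based service discovery, but a single shared network lets every container reach every other one. (Docker's built-in bridge network, the one used by a plain docker run, has no DNS-based discovery at all.)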
    networks:
      frontend:
        driver: bridge
      backend:
        driver: bridge
        internal: true
      monitoring:
        driver: bridge

    services:
      app:
        networks:
          frontend:
          backend:
          monitoring:
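A three-network topology provides clean separation: frontend for the reverse proxy and app, backend (internal, with no route to the outside) for the app and database, and monitoring for observability tools. Service names already resolve through Docker's embedded DNS on every network a service joins; add network aliases when a service needs an additional stable name.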
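To make the separation concrete, the sketch below shows which service might join which network. The service names and the db alias are illustrative, and the monitoring network is left out to keep the example short.

```yaml
# Sketch: the proxy never shares a network with the database.
services:
  proxy:
    image: traefik:v3
    networks:
      - frontend              # can reach app, cannot reach postgres

  app:
    image: myapp:latest
    networks:
      - frontend
      - backend

  postgres:
    image: postgres:16
    networks:
      backend:
        aliases:
          - db                # app can connect to "db" or "postgres"

networks:
  frontend:
  backend:
    internal: true            # containers here get no outbound connectivity
```

One consequence of internal: true is that a service attached only to backend cannot reach external hosts, so anything that must call out needs a second, non-internal network.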
4. Resource Constraints and Limits
Production containers must have resource boundaries to prevent a single service from starving the host.
    services:
      app:
        deploy:
          resources:
            limits:
              cpus: '0.5'
              memory: 256M
              pids: 100
            reservations:
              cpus: '0.25'
              memory: 128M
| Directive | Purpose |
|---|---|
| limits.cpus | Maximum CPU cores (e.g., '0.5' = half a core) |
| limits.memory | Maximum memory before OOM kill |
| limits.pids | Maximum number of processes and threads in the container |
| reservations | Guaranteed minimum resources |
| ulimits | File descriptor and process limits (e.g., nofile: 65536); set at the service level, not under deploy |
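For Node.js apps, set --max-old-space-size (a value in MiB) somewhat below the container memory limit, leaving headroom for buffers, stacks, and other non-heap memory. If the V8 heap is allowed to grow up to the limit, the container is likely to be OOM-killed before the garbage collector frees anything.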
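A possible way to wire this up, together with the ulimits entry from the table, is sketched below. The 192 MiB figure (75% of the 256M limit) is a common rule of thumb rather than a fixed rule, so tune it against real memory profiles.

```yaml
# Sketch: heap ceiling kept below the container limit, plus file-descriptor limits.
services:
  app:
    image: myapp:latest
    environment:
      NODE_OPTIONS: "--max-old-space-size=192"   # MiB; assumed 75% of the limit
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          memory: 256M
```

Setting the flag through NODE_OPTIONS avoids changing the image's start command, which keeps the tuning in the compose file next to the limit it depends on.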
5. Logging and Observability
Logs are critical for debugging production issues, and the default json-file driver writes them to the host with no size cap unless rotation is configured. Set rotation options to prevent disk exhaustion:
    x-logging: &logging
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"
Alternative drivers for centralized aggregation:
| Driver | Destination | Use Case |
|---|---|---|
| fluentd | Fluentd → Elasticsearch | Full-text search and analytics |
| gelf | Graylog | Structured log management |
| awslogs | Amazon CloudWatch | AWS-native deployments |
Have your application emit structured JSON, one object per line on stdout or stderr, so the aggregator can parse fields without custom patterns. Use docker compose logs --tail 100 --follow for real-time debugging.
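Switching a service to one of these drivers is a small change to its logging block. The sketch below uses the fluentd driver; the collector address and the tag prefix are placeholders for your own setup.

```yaml
# Sketch: ship one service's logs to a Fluentd collector.
services:
  app:
    image: myapp:latest
    logging:
      driver: fluentd
      options:
        fluentd-address: "localhost:24224"   # where the collector listens (assumed)
        fluentd-async: "true"                # connect in the background instead of blocking
        tag: "myapp.{{.Name}}"               # {{.Name}} expands to the container name
```

Option values are passed as strings, which is why "true" is quoted. Without the async option, a container configured this way fails to start when the collector is unreachable, so decide deliberately which failure mode you prefer.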
6. Secrets and Environment Management
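Never hardcode secrets in compose files. Compose secrets are mounted into the container as files under /run/secrets/<name>. In Swarm mode, secrets created with docker secret create are also encrypted at rest in the cluster's Raft log. With a file: source on a single host, the secret is an ordinary file that Compose mounts into the container, so protect it with file permissions and keep it out of version control. A file-based secret is declared like this: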
    secrets:
      db_password:
        file: ./secrets/db_password.txt

    services:
      app:
        secrets:
          - db_password
For non-Swarm Compose, you can also keep values in .env files (listed in .gitignore), store them encrypted with sops or age, and decrypt them at deploy time. Reference variables with ${VAR} syntax:
    services:
      app:
        environment:
          - DATABASE_URL=postgres://user:${DB_PASSWORD}@postgres:5432/myapp
Rotate secrets regularly and use a naming convention like SERVICE_NAME_SECRET for clarity.
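Where the image supports it, reading the secret from its mounted file keeps the value out of the container's environment and out of docker inspect output. A sketch, assuming the official postgres image (which honours POSTGRES_PASSWORD_FILE) and an application that reads a hypothetical DB_PASSWORD_FILE variable:

```yaml
# Sketch: both containers read the password from /run/secrets/db_password.
# DB_PASSWORD_FILE is hypothetical; the app must implement reading it.
secrets:
  db_password:
    file: ./secrets/db_password.txt

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password

  app:
    image: myapp:latest
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
```

Only services that list the secret get the file mounted, which doubles as a simple access control between services in the same stack.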
7. Rolling Updates and Zero-Downtime Deployments
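deploy.update_config and deploy.rollback_config describe rolling updates gated on health checks. These keys take effect under Swarm (docker stack deploy); docker compose up on a single host ignores them, so treat the block below as the Swarm-ready form of the policy: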
    services:
      app:
        deploy:
          replicas: 3
          update_config:
            parallelism: 1
            delay: 10s
            order: start-first
            failure_action: rollback
          rollback_config:
            parallelism: 1
            order: stop-first
| Parameter | Description |
|---|---|
| parallelism | Number of containers updated at once |
| delay | Pause between update groups |
| order | start-first (new container starts before the old one stops) or stop-first (the default) |
| failure_action | pause, continue, or rollback |
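On a single host, run docker compose up --detach --wait. It recreates the containers whose image or configuration changed and returns only once services are running and healthy, reporting an error if one fails to become healthy. Because Compose stops the old container before starting its replacement, expect a brief interruption per service unless a reverse proxy is balancing across several replicas.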
8. Reverse Proxy Integration
Traefik is a popular reverse proxy for Compose deployments. It discovers services dynamically via Docker labels and automates TLS certificate management with Let’s Encrypt.
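    services:
      traefik:
        image: traefik:v3
        command:
          - "--providers.docker=true"
          # Route only containers that opt in with traefik.enable=true
          - "--providers.docker.exposedbydefault=false"
          - "--entrypoints.websecure.address=:443"
          - "--certificatesresolvers.le.acme.tlschallenge=true"
          # Placeholder contact address; storage path persists issued certificates
          - "--certificatesresolvers.le.acme.email=admin@example.com"
          - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
        ports:
          - "443:443"
        volumes:
          - "/var/run/docker.sock:/var/run/docker.sock:ro"
          - "letsencrypt:/letsencrypt"

      app:
        labels:
          - "traefik.enable=true"
          - "traefik.http.routers.app.rule=Host(`example.com`)"
          - "traefik.http.routers.app.entrypoints=websecure"
          - "traefik.http.routers.app.tls.certresolver=le"
          - "traefik.http.services.app.loadbalancer.server.port=3000"

    volumes:
      letsencrypt: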
Alternatives: Nginx requires manual upstream definitions and SSL termination configuration. Caddy offers automatic HTTPS similar to Traefik but with less dynamic reconfiguration.
9. Monitoring and Backup
Define a separate monitoring.yml compose file for the observability stack (cAdvisor + Prometheus + Grafana). This keeps concerns separated — monitoring can be updated independently of the application.
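A starting point for such a file might look like the sketch below. Image tags, the published port, and the prometheus.yml scrape configuration are all assumptions to adapt; in production, pin each image to a release tag.

```yaml
# monitoring.yml (sketch): tags, ports, and file paths are assumptions.
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    volumes:                        # read-only host mounts that cAdvisor inspects
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    volumes:
      # Hypothetical scrape config with a job targeting cadvisor:8080
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "127.0.0.1:3001:3000"       # localhost only; expose via the proxy or a tunnel
    volumes:
      - grafana-data:/var/lib/grafana
    restart: unless-stopped

volumes:
  prometheus-data:
  grafana-data:
```

Running it under its own project name, for example docker compose -f monitoring.yml -p monitoring up -d, keeps its containers and volumes separate from the application stack so that either can be torn down without touching the other.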
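For database backups, schedule docker exec dumps via cron. The entry below assumes the container is literally named postgres (for example via container_name); otherwise Compose generates a name such as <project>-postgres-1. The % characters are escaped because cron treats an unescaped % as a newline: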
0 3 * * * docker exec postgres pg_dump -U user mydb > /backups/db_$(date +\%Y\%m\%d).sql
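Use restart: unless-stopped for production services. Both policies restart a container that exits on its own. The difference shows after a manual docker stop: with restart: always the container comes back the next time the Docker daemon restarts, whereas with unless-stopped it stays stopped.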
Conclusion
Docker Compose is production-viable for the right scale of deployments. Treat compose files as code, design cascading health checks, set resource limits, centralize logs, manage secrets properly, and implement rolling updates. These practices transform Compose from a development convenience into a robust deployment tool.
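When scaling needs grow beyond a single host, much of the same compose file can be reused. Docker Swarm accepts it through docker stack deploy, ignoring a few Compose-only keys such as build, and tools such as Kompose can convert it into Kubernetes manifests as a starting point. That makes Compose a pragmatic first step toward container orchestration.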
