Docker Compose for Production: Deployment Best Practices

Production Docker Compose best practices: compose file versioning, service dependencies, health checks, logging drivers, resource limits, secrets, and rolling updates.

Introduction

Docker Compose is often relegated to local development, but it is increasingly used for single-host production deployments of small-to-medium applications. The common criticism that “Compose is not for production” overlooks a key point: for many workloads, a well-configured Compose stack on a single VM provides the right balance of simplicity and reliability.

This guide covers production-grade Compose practices — treating your compose files as infrastructure-as-code with proper version control, CI/CD integration, and production-specific hardening.


1. Compose File Structure and Versioning

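Use the Compose Specification, which Docker Compose v2 implements. The top-level version field (for example 3.8) is obsolete there and can be omitted. Adopt a multi-file strategy for separating concerns: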

compose.yml          # Base configuration (shared across environments)
compose.override.yml # Local development overrides (not used in production)
compose.prod.yml     # Production-specific overrides

Leverage YAML anchors to reduce duplication across services:

x-logging: &logging
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "3"

services:
  app:
    image: myapp:latest
    logging: *logging
    restart: unless-stopped

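Use --env-file to supply environment-specific values for variable interpolation, so that secrets and per-environment settings are not hardcoded in the compose files. Prefer docker compose (v2, the Docker CLI plugin) over the deprecated docker-compose (v1).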
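As a rough sketch of the multi-file strategy, a production override might look like the following. The file contents, the APP_TAG and APP_ENV variables, and the .env.prod filename are illustrative assumptions, not taken from any particular project.

```yaml
# compose.prod.yml: hypothetical production override, merged on top of compose.yml.
# Assumed invocation (the .env.prod filename is illustrative):
#   docker compose --env-file .env.prod -f compose.yml -f compose.prod.yml up -d
services:
  app:
    # Pin an explicit tag in production. APP_TAG is an assumed variable from the
    # env file; the ":?" form aborts with this message if it is unset.
    image: myapp:${APP_TAG:?set APP_TAG in the env file}
    restart: unless-stopped
    environment:
      APP_ENV: production   # hypothetical application setting
```

Passing -f explicitly also means compose.override.yml is not picked up automatically, which keeps development overrides out of production.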


2. Service Dependencies and Health Checks

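Relying on depends_on alone is insufficient for production: in its short form it only waits for the dependency's container to start, not for the service inside it to be ready. Use condition: service_healthy to ensure services wait for dependencies to pass health checks before starting.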

services:
  postgres:
    image: postgres:16
    healthcheck:
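      # $$ defers expansion to the container's shell; falls back to "postgres" if POSTGRES_USER is unset
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-postgres}"]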
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 30s

  app:
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

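Extended across a full stack, where Nginx depends on the app and the app depends on Redis and PostgreSQL, this cascading health check pattern ensures each service starts only when its dependencies are truly ready. Note that depends_on governs startup order; it does not by itself react to a dependency failing later.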
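The full cascade could be sketched roughly as follows. The image tags, the DB_PASSWORD variable, and the /healthz endpoint on port 3000 are assumptions carried over from or added to the example above, and the app image is assumed to ship curl.

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}   # assumed to come from the env file
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  app:
    image: myapp:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3

  nginx:
    image: nginx:stable
    depends_on:
      app:
        condition: service_healthy   # waits for the app health check above
```

Startup therefore proceeds in the reverse direction of the dependency chain: PostgreSQL and Redis first, then the app, then Nginx.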


3. Networking Configuration

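Use custom networks with driver: bridge for service isolation. Compose already places each project on its own user-defined network with DNS-based service discovery (unlike Docker's legacy default bridge network, which lacks it), but that single network lets every service reach every other one.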

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true
  monitoring:
    driver: bridge

services:
  app:
    networks:
      frontend:
      backend:
      monitoring:

A three-network topology provides clean separation: frontend for the reverse proxy and app, backend (internal, with no external connectivity) for the app and database, and monitoring for observability tools. Service names already resolve via DNS on each shared network; add network aliases when a service needs an additional stable name.
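To illustrate how services are spread over these networks, here is a minimal sketch. The service names, images, and the db alias are illustrative assumptions; the monitoring network is left out for brevity.

```yaml
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true   # containers on this network get no route to the outside

services:
  proxy:
    image: traefik:v3
    networks:
      - frontend     # the proxy never joins backend, so it cannot reach the database

  app:
    image: myapp:latest
    networks:
      - frontend
      - backend

  postgres:
    image: postgres:16
    networks:
      backend:
        aliases:
          - db       # extra DNS name; "postgres" still resolves as well
```

Only the app bridges the two networks, so a compromised proxy has no direct path to PostgreSQL.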


4. Resource Constraints and Limits

Production containers must have resource boundaries to prevent a single service from starving the host.

services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
          pids: 100
        reservations:
          cpus: '0.25'
          memory: 128M

| Directive | Purpose |
| --- | --- |
| limits.cpus | Maximum CPU cores (e.g., '0.5' = half a core) |
| limits.memory | Maximum memory before OOM kill |
| reservations | Guaranteed minimum resources |
| ulimits | File descriptor and process limits (e.g., nofile: 65536) |

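For Node.js apps, align --max-old-space-size with the memory limit: set it somewhat below the limit, leaving headroom for non-heap memory, so that V8 collects garbage before the container is OOM-killed and does not thrash against a heap ceiling that is too tight.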
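A minimal sketch of that alignment, assuming the 256M limit from the example above and an assumed ratio of roughly 75% for the V8 heap:

```yaml
services:
  app:
    image: myapp:latest
    environment:
      # --max-old-space-size is given in MiB. 192 is an assumed value (~75% of
      # the 256M limit), leaving headroom for buffers, stacks, and native memory.
      NODE_OPTIONS: "--max-old-space-size=192"
    deploy:
      resources:
        limits:
          memory: 256M
```

The right ratio depends on the workload; apps that hold large buffers or use native addons may need more headroom outside the heap.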


5. Logging and Observability

Log rotation and centralized logging are critical for running and debugging production services. At a minimum, configure the json-file driver with rotation to prevent disk exhaustion:

x-logging: &logging
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "5"

Alternative drivers for centralized aggregation:

| Driver | Destination | Use Case |
| --- | --- | --- |
| fluentd | Fluentd → Elasticsearch | Full-text search and analytics |
| gelf | Graylog | Structured log management |
| awslogs | Amazon CloudWatch | AWS-native deployments |

Emit structured JSON logs, one object per line, to stdout and stderr so that Docker captures them and downstream aggregators can parse them without custom rules. Use docker compose logs --tail 100 --follow for real-time debugging.
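Switching a service to one of the drivers in the table could look like this sketch for fluentd. The collector address and the tag format are assumptions; adjust them to wherever the Fluentd collector actually listens.

```yaml
x-logging-fluentd: &logging-fluentd
  driver: "fluentd"
  options:
    fluentd-address: "localhost:24224"   # assumed collector address
    fluentd-async: "true"                # connect in the background instead of blocking container start
    tag: "docker.{{.Name}}"              # tag each record with the container name

services:
  app:
    image: myapp:latest
    logging: *logging-fluentd
```

Keeping the driver settings behind a YAML anchor, as with the json-file example, makes it easy to switch every service at once.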


6. Secrets and Environment Management

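Never hardcode secrets in compose files. Use the top-level secrets element, which mounts each secret as a file under /run/secrets/<name> inside the container. In Swarm mode, Docker Secrets are additionally encrypted at rest in the cluster; with plain Compose, a file-based secret is simply mounted from the host, so protect the source file with filesystem permissions: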

secrets:
  db_password:
    file: ./secrets/db_password.txt

services:
  app:
    secrets:
      - db_password

For non-Swarm Compose, you can also use .env files (listed in .gitignore), kept encrypted with sops or age and decrypted at deploy time. Reference variables via ${VAR} syntax:

services:
  app:
    environment:
      - DATABASE_URL=postgres://user:${DB_PASSWORD}@postgres:5432/myapp

Rotate secrets regularly and use a naming convention like SERVICE_NAME_SECRET for clarity.
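On the consuming side, a service reads the mounted secret file rather than an environment variable. As a sketch, the official postgres image accepts a POSTGRES_PASSWORD_FILE variable that points at such a file; the secret name and path follow the earlier example.

```yaml
secrets:
  db_password:
    file: ./secrets/db_password.txt   # keep this file out of version control

services:
  postgres:
    image: postgres:16
    environment:
      # The image reads the password from this file instead of from
      # POSTGRES_PASSWORD, so the value never appears in the environment.
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
```

Applications without a built-in *_FILE convention need to read /run/secrets/db_password themselves at startup.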


7. Rolling Updates and Zero-Downtime Deployments

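In Swarm mode (docker stack deploy), use deploy.update_config to orchestrate rolling updates with health check gating. Plain docker compose up does not act on update_config or rollback_config: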

services:
  app:
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
        failure_action: rollback
      rollback_config:
        parallelism: 1
        order: stop-first

| Parameter | Description |
| --- | --- |
| parallelism | Number of containers updated at once |
| delay | Pause between update groups |
| order | start-first (blue-green) or stop-first |
| failure_action | pause, continue, or rollback |

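Without Swarm, run docker compose up --detach --wait: Compose recreates the containers whose image or configuration changed and returns only once services are running and, where a health check is defined, healthy. Because containers are replaced in place, this can cause a brief interruption per service.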


8. Reverse Proxy Integration

Traefik is a popular reverse proxy for Compose deployments. It discovers services dynamically via Docker labels and automates SSL certificate management with Let’s Encrypt.

services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.le.acme.tlschallenge=true"
    ports:
      - "443:443"
    volumes:
      # Read-only mount: Traefik only needs to read container metadata
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

  app:
    labels:
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

Alternatives: Nginx requires manual upstream definitions and SSL termination configuration. Caddy offers automatic HTTPS similar to Traefik but with less dynamic reconfiguration.
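A somewhat fuller Traefik sketch ties the router to the HTTPS entrypoint and the certificate resolver and persists issued certificates. The email address, the letsencrypt volume name, and the storage path are placeholders and assumptions, not values from a real deployment.

```yaml
services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"   # only route containers that opt in
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.le.acme.tlschallenge=true"
      - "--certificatesresolvers.le.acme.email=admin@example.com"   # placeholder address
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
    ports:
      - "443:443"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "letsencrypt:/letsencrypt"   # persist certificates across restarts

  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`example.com`)"
      - "traefik.http.routers.app.entrypoints=websecure"
      - "traefik.http.routers.app.tls.certresolver=le"
      - "traefik.http.services.app.loadbalancer.server.port=3000"

volumes:
  letsencrypt:
```

Without persistent ACME storage, Traefik would request new certificates on every restart, which can run into Let's Encrypt rate limits.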


9. Monitoring and Backup

Define a separate monitoring.yml compose file for the observability stack (cAdvisor + Prometheus + Grafana). This keeps concerns separated — monitoring can be updated independently of the application.
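Such a monitoring.yml might look roughly like the sketch below. The image tags, the Grafana host port, the prometheus.yml scrape config, and the volume names are assumptions for illustration.

```yaml
# monitoring.yml: hypothetical observability stack, started independently, e.g.
#   docker compose -f monitoring.yml -p monitoring up -d
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro   # assumed scrape config
      - prometheus-data:/prometheus

  grafana:
    image: grafana/grafana:latest
    ports:
      - "127.0.0.1:3001:3000"   # assumed host port, bound to localhost only
    volumes:
      - grafana-data:/var/lib/grafana

volumes:
  prometheus-data:
  grafana-data:
```

Running it under its own project name keeps its lifecycle separate, so restarting or upgrading the monitoring stack never touches application containers.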

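For database backups, schedule docker exec dumps via cron. The entry below assumes the container is named postgres (for example via container_name), since Compose-generated names usually include the project name: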

0 3 * * * docker exec postgres pg_dump -U user mydb > /backups/db_$(date +\%Y\%m\%d).sql

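Use restart: unless-stopped for production services. Both policies leave a manually stopped container down, but restart: always brings it back when the Docker daemon restarts (for example after a reboot), whereas unless-stopped keeps it stopped.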


Conclusion

Docker Compose is production-viable for the right scale of deployments. Treat compose files as code, design cascading health checks, set resource limits, centralize logs, manage secrets properly, and implement rolling updates. These practices transform Compose from a development convenience into a robust deployment tool.

When scaling needs grow beyond a single host, much of the same compose file carries over to Docker Swarm (via docker stack deploy) or can be converted for Kubernetes with tools such as Kompose, making Compose a pragmatic starting point for container orchestration.