Docker: From Development to Production Containers
A production-focused guide to Docker — from writing optimized Dockerfiles and multi-stage builds to Docker Compose workflows, image security scanning, registry management, and Kaniko for rootless CI/CD builds. Covers volumes, healthchecks, BuildKit, networking modes, and .dockerignore. Based on containerizing 26 microservices.
By Jose Nobile | Updated 2026-04-23 | 22 min read
Table of Contents
- Dockerfile Best Practices
- .dockerignore
- Multi-Stage Builds
- BuildKit
- Docker Compose for Local Development
- Volumes and Bind Mounts
- Healthchecks
- Image Optimization
- Security Scanning
- Registry Management
- Kaniko for CI/CD Builds
- Container Networking
- Real-World: Production Containerization
- Latest Docker Features (2025-2026)
Dockerfile Best Practices
A Dockerfile is a recipe for building a container image. Every instruction creates a layer, and layers are cached. The order of instructions directly impacts build speed and image size. Place instructions that change rarely (installing system packages) before instructions that change often (copying application code). This maximizes cache hits and cuts rebuild times from minutes to seconds.
Use specific base image tags, never latest. Pin to a digest or version tag like node:20.11-alpine3.19 for reproducible builds. Combine RUN instructions with && to minimize layers and clean up temporary files in the same layer. Every layer persists in the final image, so downloading, building, and cleaning up in separate RUN statements bloats the image even if the files are deleted later.
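Both rules (a pinned tag, and a single RUN that installs, uses, and cleans up in the same layer) can be sketched as follows; the package chosen here is only illustrative:

```dockerfile
# Pin to an exact version tag, never :latest
FROM node:20.11-alpine3.19

# One RUN: install and clean up before the layer is committed,
# so the package index cache never persists into the image
RUN apk add --no-cache --virtual .build-deps python3 make g++ \
    && npm rebuild \
    && apk del .build-deps
```

Splitting the install and the `apk del` into separate RUN instructions would leave the build tools baked into the earlier layer even though a later layer deletes them.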
Always include a .dockerignore file to exclude node_modules/, .git/, *.md, test files, and other artifacts that should not be in the build context. A bloated build context slows down docker build because the entire context is sent to the daemon before any instruction executes. In production, every microservice has a curated .dockerignore that reduces the build context from ~500MB to ~20MB.
# Bad: any source change invalidates the npm install cache
COPY . .
RUN npm install
# Good: Only re-install when dependencies change
COPY package.json package-lock.json ./
RUN npm ci --production
COPY . .
.dockerignore
The .dockerignore file works like .gitignore but for the Docker build context. Without it, docker build sends every file in the build directory to the Docker daemon, including .git/ (often hundreds of megabytes), node_modules/, test fixtures, documentation, and IDE configuration. This wastes time, bandwidth, and can leak secrets into the image.
Place .dockerignore in the root of your build context (next to the Dockerfile). List directories and file patterns that have no business inside the container: version control history, development dependencies, test suites, CI configuration, local environment files, and editor settings. A well-maintained .dockerignore can reduce the build context from hundreds of megabytes to under 20MB.
Be explicit rather than permissive. Some teams use an allowlist approach: ignore everything with * then selectively un-ignore what the build actually needs with !. This prevents accidental inclusion of new files added to the repository. In production, every microservice uses the allowlist pattern, ensuring that only src/, package.json, package-lock.json, and tsconfig.json enter the build context.
.git
.gitignore
node_modules
npm-debug.log*
dist
coverage
*.md
.env*
.vscode
.idea
Dockerfile*
docker-compose*
__tests__
*.test.ts
*.spec.ts
.github
.gitlab-ci.yml
# .dockerignore - Allowlist approach (stricter)
*
!src/
!package.json
!package-lock.json
!tsconfig.json
Multi-Stage Builds
Multi-stage builds are the single most impactful technique for reducing Docker image size. They allow you to use one image for building (with compilers, dev dependencies, build tools) and a separate, minimal image for the final runtime. Only the artifacts you explicitly COPY --from=build end up in the production image. In production, multi-stage builds cut Node.js service images from 1.2GB (with dev dependencies) to 180MB (Alpine + production deps + compiled code).
For TypeScript services, the pattern is: Stage 1 compiles TypeScript to JavaScript with all dev dependencies, Stage 2 copies only the compiled JavaScript and production node_modules into an Alpine-based runtime image. For Go services, it is even more dramatic: the final image can be scratch (empty) since Go compiles to a static binary with no runtime dependencies. For the smallest attack surface, use Google's distroless images (gcr.io/distroless/nodejs20-debian12) which include the runtime but no shell, package manager, or OS utilities.
Name your stages with AS for clarity: FROM node:20-alpine AS build. You can also target specific stages during development with docker build --target=build to get an image with dev tools and test runners. This makes the same Dockerfile serve both development and production needs.
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY tsconfig.json ./
COPY src/ ./src/
RUN npm run build
# Stage 2: Production (Alpine)
FROM node:20-alpine AS production
WORKDIR /app
RUN addgroup -g 1001 app && adduser -u 1001 -G app -D app
COPY package.json package-lock.json ./
RUN npm ci --production && npm cache clean --force
COPY --from=build /app/dist ./dist
USER app
EXPOSE 3000
CMD ["node", "dist/index.js"]
# Alternative Stage 2: Distroless (no shell, minimal attack surface)
# FROM gcr.io/distroless/nodejs20-debian12
# WORKDIR /app
# COPY --from=build /app/dist ./dist
# COPY --from=build /app/node_modules ./node_modules
# USER 1000
# CMD ["dist/index.js"]
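The Go variant described earlier is worth seeing in full. A minimal sketch, assuming a module whose main package sits at the repository root; the output path and flags are illustrative:

```dockerfile
# Stage 1: build a static binary (CGO disabled, symbols stripped)
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /out/server .

# Stage 2: scratch is an empty base; the final image is just the binary
FROM scratch
COPY --from=build /out/server /server
# Copy CA certificates so outbound TLS still works
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
USER 65534
ENTRYPOINT ["/server"]
```

The resulting image is typically 10-20MB, and with no shell or OS utilities present, most container-escape tooling simply has nothing to execute.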
BuildKit
BuildKit is Docker's next-generation build engine, enabled by default since Docker 23.0. It replaces the legacy builder with parallel build execution, better caching, and new Dockerfile features. If you are on an older Docker version, enable it with DOCKER_BUILDKIT=1 docker build or set { "features": { "buildkit": true } } in the Docker daemon configuration.
BuildKit's most impactful feature is parallel stage execution. In a multi-stage Dockerfile, independent stages build concurrently instead of sequentially, cutting total build time significantly. It also provides cache mount support via RUN --mount=type=cache,target=/root/.npm npm ci, which persists package manager caches across builds without bloating the image layer. This eliminates the need for npm cache clean --force and speeds up dependency installation.
BuildKit introduces secret mounts (RUN --mount=type=secret,id=npmrc cat /run/secrets/npmrc) that inject secrets at build time without writing them to any image layer. This is critical for private npm registries or API keys needed during the build. Other features include --mount=type=ssh for SSH agent forwarding (private Git repos), inline build cache export (--cache-to and --cache-from), and heredoc syntax for multi-line RUN instructions.
export DOCKER_BUILDKIT=1
# syntax=docker/dockerfile:1
# Cache mount: persist npm cache across builds
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
RUN npm run build
# Secret mount: private registry auth without leaking to layers
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
# SSH mount: clone private repos at build time
RUN --mount=type=ssh git clone git@github.com:org/private-lib.git
# Inline cache for CI/CD
docker build --cache-from type=registry,ref=registry/app:cache \
--cache-to type=registry,ref=registry/app:cache \
-t registry/app:latest .
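The heredoc syntax mentioned above replaces long `&&` chains with plain multi-line scripts. A sketch (the commands themselves are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
# Heredoc RUN (BuildKit only): one layer, no backslash continuations
RUN <<EOF
apk add --no-cache tini
mkdir -p /app/data
chown -R node:node /app
EOF
```

Each heredoc body runs as a single shell script in one layer, so the caching behavior is identical to an equivalent `&&` chain but far easier to read and edit.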
Docker Compose for Local Development
Docker Compose defines multi-container applications in a single YAML file. It is the standard tool for local development environments that replicate production service dependencies. A single docker compose up spins up your application alongside MySQL, Redis, Elasticsearch, or any other service your app depends on, with networking already configured between them.
Use volumes to mount your source code into the container for hot-reloading during development. Combine with depends_on and health checks to ensure services start in the correct order. The profiles feature lets you define optional services (like debug tools or test databases) that only run when explicitly activated with --profile debug.
Maintain separate Compose files for different contexts: compose.yaml for core services, compose.override.yaml for local development overrides (volume mounts, debug ports), and compose.test.yaml for CI test environments. Compose automatically merges compose.yaml with compose.override.yaml, so developers get local overrides without modifying the base file. Note: the version key is deprecated in Compose V2 and can be omitted.
services:
api:
build:
context: .
target: build
volumes:
- ./src:/app/src
ports:
- "3000:3000"
depends_on:
mysql:
condition: service_healthy
environment:
- DB_HOST=mysql
- REDIS_HOST=redis
healthcheck:
# node:20-alpine images do not include curl; busybox wget is available
test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
interval: 10s
timeout: 5s
retries: 3
start_period: 15s
mysql:
image: mysql:8.0
environment:
MYSQL_ROOT_PASSWORD: dev
MYSQL_DATABASE: myapp
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
interval: 5s
retries: 10
volumes:
- mysql_data:/var/lib/mysql
redis:
image: redis:7-alpine
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
retries: 5
volumes:
mysql_data:
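Because Compose merges `compose.override.yaml` automatically, the override file only needs the local-development deltas. A sketch against the base file above (service names and ports are illustrative), including an optional `profiles` service:

```yaml
# compose.override.yaml - merged automatically by `docker compose up`
services:
  api:
    build:
      target: build          # dev stage with dev dependencies and tooling
    volumes:
      - ./src:/app/src       # hot-reload source changes without rebuilds
    ports:
      - "9229:9229"          # Node.js inspector port for debugging
    environment:
      - LOG_LEVEL=debug

  # Optional service: starts only with `docker compose --profile debug up`
  adminer:
    image: adminer:latest
    profiles: ["debug"]
    ports:
      - "8080:8080"
```

CI can skip the override entirely by passing explicit files: `docker compose -f compose.yaml -f compose.test.yaml up`.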
Volumes and Bind Mounts
Docker containers are ephemeral: when a container is removed, all data inside it is lost. Volumes solve this by persisting data outside the container's writable layer. Docker manages named volumes in /var/lib/docker/volumes/, and they survive container recreation, upgrades, and restarts. Use named volumes for databases, file uploads, and any stateful data that must persist.
Bind mounts map a host directory directly into the container. They are essential for local development: mount your src/ directory so code changes appear instantly inside the container without rebuilding. However, bind mounts have performance implications on macOS and Windows due to filesystem translation overhead. Use Docker's :cached or :delegated consistency flags, or the newer VirtioFS file-sharing backend, to improve performance on non-Linux hosts.
In production, prefer named volumes over bind mounts for portability and management. Use tmpfs mounts for sensitive data that should never be written to disk (session tokens, temporary secrets). For databases in containers (development only), always use a named volume to avoid losing data on docker compose down. The docker volume prune command cleans up dangling volumes, but use it carefully to avoid deleting data volumes.
docker run -v mysql_data:/var/lib/mysql mysql:8.0
# Bind mount (host directory into container)
docker run -v $(pwd)/src:/app/src node:20-alpine
# tmpfs mount (in-memory, never written to disk)
docker run --tmpfs /tmp:rw,noexec,size=100m myapp
# Compose volumes with driver options
volumes:
mysql_data:
driver: local
uploads:
driver: local
driver_opts:
type: none
o: bind
device: /data/uploads
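One companion pattern worth knowing: named volumes are not directly browsable on the host, so backing one up means mounting it into a throwaway container. A sketch using the `mysql_data` volume from above:

```shell
# Back up a named volume to a tarball in the current directory
docker run --rm \
  -v mysql_data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/mysql_data.tgz -C /data .

# Restore it into a fresh volume
docker volume create mysql_data_restored
docker run --rm \
  -v mysql_data_restored:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/mysql_data.tgz -C /data
```

Mounting the source volume read-only (`:ro`) during backup guards against accidental writes while the database may still be running.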
Healthchecks
Healthchecks tell Docker whether a container is functioning correctly, not just running. A container can have its process alive but be deadlocked, out of memory, or unable to serve requests. The HEALTHCHECK instruction in a Dockerfile or the healthcheck key in Compose defines a command that Docker runs periodically. If the check fails consecutively, Docker marks the container as unhealthy.
Define healthchecks that test the actual functionality of your service, not just process existence. For HTTP services, hit a /health or /readyz endpoint. For databases, use the native ping command (mysqladmin ping, redis-cli ping, pg_isready). The start_period parameter gives the container time to initialize before healthchecks begin failing, which prevents false negatives during startup.
Healthchecks enable Compose's depends_on: condition: service_healthy, which ensures dependent services only start after their dependencies are genuinely ready, not just running. Without healthchecks, your API container might start before MySQL is ready to accept connections, causing startup crashes. In production, every Compose service has a healthcheck, and the API service waits for MySQL, Redis, and Elasticsearch to all report healthy before starting.
FROM node:20-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --production
COPY . .
HEALTHCHECK --interval=15s --timeout=5s --start-period=10s --retries=3 \
CMD ["node", "-e", "require('http').get('http://localhost:3000/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) })"]
CMD ["node", "dist/index.js"]
# Compose healthchecks
services:
api:
healthcheck:
# curl is absent from alpine-based images; busybox wget works everywhere
test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
interval: 10s
timeout: 5s
retries: 3
start_period: 15s
postgres:
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 5s
retries: 10
Image Optimization
Image size directly impacts pull times, storage costs, and attack surface. Every megabyte counts when you are pulling images across 20+ services on every deployment. Start with Alpine-based images (node:20-alpine is 50MB vs. node:20 at 350MB). If your application has glibc dependencies that break on Alpine's musl, use Debian slim variants (node:20-slim at 80MB) instead.
Layer caching is your best friend. Docker caches each layer and reuses it if the instruction and all parent layers are unchanged. Structure your Dockerfile so that the most frequently changing content (source code) comes last. Use npm ci instead of npm install for deterministic, cache-friendly installs. Copy only package.json and package-lock.json before running npm ci, then copy the rest of the source code.
For the smallest possible images, consider Google's distroless base images (gcr.io/distroless/nodejs20-debian12, gcr.io/distroless/static-debian12). Distroless images contain only the language runtime and your application — no shell, no package manager, no OS utilities. This eliminates entire classes of vulnerabilities and reduces images to their absolute minimum size. Use docker image history and dive (a TUI for exploring image layers) to identify bloat. In production, regular image audits with dive reduced the average microservice image from 350MB to 140MB, saving ~4.2GB of registry storage per deployment across 20 services.
Alpine Base Images
Alpine Linux images are 5-10x smaller than Debian-based alternatives. Use node:20-alpine, python:3.12-alpine. Watch for musl compatibility issues with native modules.
Distroless Images
Google's distroless images contain only the runtime. No shell, no package manager. Use gcr.io/distroless/nodejs20-debian12 for Node.js or gcr.io/distroless/static-debian12 for Go binaries. Smallest attack surface possible.
Layer Cache Order
System packages first, then dependency files, then npm ci, then source code. Changing a source file only rebuilds from the COPY instruction onward, keeping all dependency layers cached.
Dive for Analysis
Run dive your-image:tag to interactively explore each layer, see wasted space, and identify files that should not be in the image. Target an image efficiency score above 90%.
Security Scanning
Container images inherit vulnerabilities from their base OS packages, language runtime, and application dependencies. A single unpatched CVE in a base image affects every service built on it. Scan images at build time (in CI/CD), at push time (registry-side), and continuously (scheduled scans of running images). Tools like Trivy, Grype, and Snyk Container detect known CVEs in OS packages, language libraries, and even Dockerfile misconfigurations.
Integrate scanning into your CI pipeline as a gate: fail the build if critical or high-severity vulnerabilities are found. Trivy runs in under 30 seconds and outputs SARIF format for GitLab/GitHub security dashboards. Snyk provides deeper dependency tree analysis and fix suggestions. In production, every GitLab CI pipeline runs trivy image --severity HIGH,CRITICAL --exit-code 1 before pushing to GCR. This caught 3 critical CVEs in the Node.js base image within the first month.
Beyond scanning, follow container security hygiene: run as a non-root user (USER app), use read-only root filesystems where possible, drop all Linux capabilities and add back only what is needed, and never embed secrets in the image. Use BuildKit secret mounts for build-time credentials. Use distroless or scratch images for the smallest possible attack surface — no shell means no shell-based exploits.
trivy image --severity HIGH,CRITICAL gcr.io/myproject/api:v2.4.1
# Trivy: fail CI on findings
trivy image --severity HIGH,CRITICAL --exit-code 1 $IMAGE
# Trivy: scan Dockerfile for misconfigurations
trivy config --severity HIGH,CRITICAL ./Dockerfile
# Trivy: generate SARIF report for GitLab/GitHub
trivy image --format sarif --output trivy-results.sarif $IMAGE
# Snyk: scan image with fix suggestions
snyk container test $IMAGE --severity-threshold=high
# Snyk: monitor for new vulnerabilities
snyk container monitor $IMAGE
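Wired into GitLab CI as a gate, the Trivy invocation above might look like the following job; the stage, job, and variable names are illustrative:

```yaml
# .gitlab-ci.yml - block the pipeline on HIGH/CRITICAL findings
scan:
  stage: test
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    # Non-zero exit fails the job, so vulnerable images never get deployed
    - trivy image --severity HIGH,CRITICAL --exit-code 1 "$IMAGE"
    # Keep a machine-readable report alongside the pass/fail gate
    - trivy image --format sarif --output trivy-results.sarif "$IMAGE"
  artifacts:
    paths:
      - trivy-results.sarif
```

Running the gating scan and the report generation as separate commands keeps the failure signal simple: the job fails on findings, and the SARIF file is available either way.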
Registry Management
A container registry stores and distributes your Docker images. Google Container Registry (GCR) and its successor Artifact Registry, Amazon ECR, and GitHub Container Registry (ghcr.io) are the main cloud options. Choose the registry in the same cloud provider as your cluster to minimize pull latency and egress costs. GKE nodes pull from GCR/Artifact Registry at wire speed with no egress charges.
Tag images with semantic versions and Git SHA for traceability: gcr.io/myproject/api:v2.4.1 for releases and gcr.io/myproject/api:abc1234 for every build. Never use :latest in production Kubernetes manifests — it defeats the purpose of immutable deployments and makes rollbacks impossible. Implement a retention policy to automatically delete images older than N days or keep only the last N tags to control storage costs.
For multi-architecture images (amd64 + arm64), use docker buildx to build and push manifest lists. This is increasingly important as teams adopt ARM-based nodes (AWS Graviton, GKE Arm) for cost savings. In production, images are pushed to Google Artifact Registry with both version tags and Git SHA tags, and a lifecycle policy retains only the last 30 versions per service.
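The buildx flow for those manifest lists looks like this; image names and tags are illustrative:

```shell
# One-time: create and select a builder that supports multi-platform builds
docker buildx create --name multiarch --use

# Build amd64 + arm64 in one pass and push the manifest list
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t gcr.io/myproject/api:v2.4.1 \
  -t gcr.io/myproject/api:abc1234 \
  --push .

# Verify both architectures are present in the pushed manifest
docker buildx imagetools inspect gcr.io/myproject/api:v2.4.1
```

Clients then pull the same tag everywhere; the registry serves the image matching each node's architecture automatically.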
Google Artifact Registry
GCR successor. Regional repositories with fine-grained IAM, vulnerability scanning, and cleanup policies. Free egress to GKE in the same region. Used in production for all 20+ service images.
Amazon ECR
Tightly integrated with ECS/EKS. Lifecycle policies for automatic image cleanup. Cross-region replication for multi-region deployments. Free pull from within the same region.
GitHub Container Registry
ghcr.io integrates with GitHub Actions. Public images are free. Good for open-source projects and small teams using GitHub for CI/CD.
Kaniko for CI/CD Builds
Kaniko builds Docker images inside a container without requiring a Docker daemon or privileged mode. This solves the "Docker-in-Docker" security problem in CI/CD environments like GitLab CI, where running a privileged Docker daemon inside a CI job is a security risk. Kaniko executes Dockerfile instructions in userspace, building the image and pushing directly to a registry.
Configuration is minimal: mount your build context, pass the Dockerfile path, and specify the destination registry. Kaniko supports layer caching via a remote cache repository, which dramatically speeds up builds by reusing unchanged layers from previous builds. In production, Kaniko with remote caching reduced average build times from 4 minutes to 90 seconds across all microservices.
Kaniko supports all standard Dockerfile instructions, multi-stage builds, and build arguments. The --cache=true and --cache-repo flags enable the remote cache. Use --snapshot-mode=redo for faster layer snapshotting on large images. The --use-new-run flag improves RUN instruction performance by 20-30%.
In production, every GitLab CI pipeline uses Kaniko for image builds. No privileged containers, no Docker daemon, no Docker-in-Docker. The pipeline authenticates to GCR via a service account key mounted as a Kubernetes secret. Build, push, and scan happen in under 2 minutes per service.
build:
stage: build
image:
name: gcr.io/kaniko-project/executor:v1.22.0-debug
entrypoint: [""]
script:
- /kaniko/executor
--context $CI_PROJECT_DIR
--dockerfile $CI_PROJECT_DIR/Dockerfile
--destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
--destination $CI_REGISTRY_IMAGE:$CI_COMMIT_TAG
--cache=true
--cache-repo=$CI_REGISTRY_IMAGE/cache
--snapshot-mode=redo
Container Networking
Docker provides several network drivers, each suited to different use cases. The default bridge network isolates containers from the host and from each other unless ports are explicitly published. User-defined bridge networks (created with docker network create) add automatic DNS resolution between containers by name, which is why Docker Compose creates one per project. All Compose services share a custom bridge network automatically and can reach each other by service name.
The host network mode removes network isolation entirely: the container shares the host's network stack directly. This eliminates the overhead of network address translation and is useful for high-performance scenarios where the container needs maximum network throughput. Use --network host or network_mode: host in Compose. The tradeoff is no port isolation — the container's ports bind directly to the host.
The overlay network driver enables multi-host networking for Docker Swarm and is the foundation for container communication across multiple machines. Overlay networks encrypt traffic between nodes by default and support service discovery. In Kubernetes environments, overlay networking is handled by the CNI plugin (Calico, Cilium, or GKE's native VPC) rather than Docker's overlay driver, but the concept is the same: containers on different hosts communicate over a virtual network.
For local development security, publish ports with 127.0.0.1:3000:3000 to bind only to localhost instead of all interfaces. For debugging networking issues: docker network inspect bridge shows container IPs and connections, docker exec container curl other-service:3000 tests inter-container connectivity, and docker logs reveals connection errors.
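The debugging flow above, end to end, with illustrative container and image names (`myapp` is a placeholder):

```shell
# User-defined bridge: containers resolve each other by name via DNS
docker network create appnet
docker run -d --name redis --network appnet redis:7-alpine
docker run -d --name api --network appnet \
  -p 127.0.0.1:3000:3000 myapp   # publish only on localhost

# Test name resolution and connectivity from inside a container
docker exec api ping -c1 redis

# Inspect the network: subnet, gateway, connected containers and IPs
docker network inspect appnet
```

The same commands against the default `bridge` network would fail the `ping redis` step, because automatic DNS resolution only exists on user-defined bridges.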
Bridge (Default)
Isolated network with port publishing. User-defined bridges add DNS resolution. Default for Compose. Best for most local development and single-host production.
Host
Container shares host's network stack. No NAT overhead. Maximum throughput. No port isolation. Use for performance-critical services or when the container needs host network access.
Overlay
Multi-host networking for Docker Swarm. Encrypted inter-node traffic. Service discovery across hosts. In Kubernetes, CNI plugins (Calico, Cilium) serve the same purpose.
Real-World: Production Containerization
Every microservice is containerized with Docker, built with Kaniko in GitLab CI, and deployed to GKE via Helm charts. The containerization strategy evolved from naive single-stage Dockerfiles to optimized multi-stage builds, cutting image sizes by 85% and build times by 60%.
20+ Containerized Services
Node.js/TypeScript microservices, each with a multi-stage Dockerfile, Alpine base, and non-root user. Average image size: 140MB. All built with Kaniko in GitLab CI, pushed to Google Artifact Registry.
Kaniko + GitLab CI
Zero Docker daemons in CI. Kaniko builds with remote layer caching cut build times from 4 minutes to 90 seconds. Trivy security scanning gates every image before it reaches the registry.
Docker Compose for Dev
Local development mirrors production topology: API services, MySQL, Redis, and Elasticsearch all defined in Compose with healthchecks. New developers go from clone to running environment in under 5 minutes.
Latest Docker Features (2025-2026)
Docker Scout: Docker Scout is now the primary security scanning tool in the Docker ecosystem. It provides real-time CVE detection during builds, policy evaluation against organizational security standards, and VEX (Vulnerability Exploitability eXchange) document support for managing exceptions. Scout exposes a Prometheus endpoint for vulnerability metrics, enabling integration with existing monitoring stacks. Unlike external scanners that run post-build, Scout integrates directly into the build process, catching vulnerabilities before images are pushed to registries.
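Scout is driven from the docker CLI; a few representative commands, with an illustrative image name:

```shell
# One-screen summary of vulnerabilities and base-image status
docker scout quickview myapp:latest

# Full CVE listing, filtered to the severities that gate deploys
docker scout cves --only-severity critical,high myapp:latest

# Compare a candidate build against the currently deployed tag
docker scout compare myapp:latest --to myapp:v2.4.0

# Base-image upgrade suggestions that would remove known CVEs
docker scout recommendations myapp:latest
```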
Docker Compose v5 ("Mont Blanc") (Dec 2025): Compose v5 removes the internal builder entirely, delegating all builds to Docker Bake. This yields faster builds with better caching, native multi-platform support, and consistent build behavior between docker compose build and standalone Bake invocations. Compose v5 also ships with a new official Go SDK, enabling programmatic control of Compose stacks from custom tooling and CI/CD systems.
Docker Build Cloud: Cloud-based builds with shared build caches across teams. Instead of each developer and CI runner maintaining their own local cache, Build Cloud provides a centralized cache that all team members and pipelines share. This dramatically reduces build times for teams distributed across locations and eliminates cold-cache penalties when CI runners are ephemeral.
Docker Engine v29 (Jan 2026): The containerd image store is now the default for new installations, replacing the legacy storage driver. This aligns Docker with the broader container ecosystem (Kubernetes already uses containerd). Engine v29 also adds nftables backend support for networking, modernizing Docker's iptables dependency. The minimum API version is now 1.44, requiring Docker v25 or later as the client -- older clients will need upgrading.
Docker Model Runner and MCP (Desktop 4.40+, April 2026): Docker Model Runner, powered by llama.cpp, lets developers run LLMs locally via an OpenAI-compatible API with no extra setup. GPU acceleration is supported on macOS (Metal) and on Windows (NVIDIA and Qualcomm). Model Runner now works with Docker Compose and Testcontainers (Java and Go), integrating AI services into microservice stacks. The Docker AI Agent fully embraces the Model Context Protocol (MCP), exposing capabilities as MCP servers for Claude Desktop, Cursor, or any MCP-compatible client. The built-in MCP Toolkit (Desktop 4.42+) discovers and manages 100+ MCP servers without extensions. As of Docker Desktop 4.69.0 (April 15, 2026), Docker ships roughly weekly with rapid AI tooling iteration -- MCP profile template cards, expanded model support, and inference request inspection for debugging AI workflows.
Docker Scout
Real-time CVE detection during builds. Policy evaluation, VEX exceptions, and Prometheus metrics endpoint. The primary Docker security scanning tool.
Compose v5 "Mont Blanc"
Removed internal builder, delegates to Bake. Faster builds, better caching, multi-platform support. New official Go SDK for programmatic control.
Docker Build Cloud
Cloud-based builds with shared caches across teams. Eliminates cold-cache penalties on ephemeral CI runners. Faster builds for distributed teams.
Docker Engine v29
Current stable (2026). Containerd image store as default. nftables backend support. Native IPv6 (dual-stack, IPv4-only, or IPv6-only). Minimum API version 1.44 (Docker v25+ required). Modernized networking stack.
Compose v5.1.2
Latest Compose release (April 9, 2026). The docker-compose (hyphenated) command is fully deprecated -- use docker compose (space-separated). Version field in compose files is no longer required and is ignored.
Model Runner & MCP
Run LLMs locally via OpenAI-compatible API (llama.cpp). GPU on macOS, Windows NVIDIA, and Qualcomm. Compose and Testcontainers integration. Built-in MCP Toolkit manages 100+ servers. Desktop 4.69.0 (April 2026).