Docker · Apr 15, 2026 · 1 min read

Reproducible test environments with Docker

"Works on my machine" is a process failure. Containers turn your test environment into something you can version, share and trust.

The phrase "works on my machine" is a polite way of saying the environment isn't reproducible. Containers fix that: the runtime, browsers and dependencies become an artifact you can version and ship — identical locally and in CI.

Pin everything

A reproducible image starts with pinned versions: an exact base image tag and exact package versions in requirements.txt. Floating tags such as latest drift, and drift is how a green suite turns red overnight with no code change.

Dockerfile

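# Pin the base image to an exact release tag; keep it in step with the
# playwright version pinned in requirements.txt.
FROM mcr.microsoft.com/playwright/python:v1.49.0-noble

WORKDIR /app

# Install dependencies first so this layer stays cached until requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the source last; it changes most often.
COPY . .

# "-n auto" requires pytest-xdist in requirements.txt.
CMD ["pytest", "-n", "auto", "--maxfail=1"]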
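
Pinning the base image only helps if requirements.txt is pinned as well. As a minimal sketch, a check like the following could run in CI to flag any requirement that is not locked with ==. The function name unpinned_requirements and the sample package versions are illustrative assumptions, not part of the original setup, and the parser deliberately handles only simple requirement lines.

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, --hash, ...)
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    # Hypothetical requirements.txt content, for illustration only.
    sample = """\
pytest==8.3.4
pytest-xdist==3.6.1
playwright==1.49.0
requests>=2.0
"""
    print(unpinned_requirements(sample))  # ['requests>=2.0']
```

Failing the build when this list is non-empty keeps a floating dependency from slipping in unnoticed.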

Keep the build cache working for you

Order layers from least to most volatile. Dependencies change rarely, so copy and install them before the source — most rebuilds then skip the slow install step.

Copy requirements.txt and install before copying source.
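Pass --no-cache-dir to pip so its download cache doesn't bloat the image.
Add a .dockerignore so local junk (.git, __pycache__, virtualenvs) stays out of the build context and never invalidates the COPY . . layer.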
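
To see why the ordering matters, here is a toy model of layer caching. It is a simplification, not Docker's actual algorithm: it assumes each layer's cache key depends on the parent layer, the instruction text, and the content of any files that instruction copies. The names layer_keys and reused_layers and the sample files are hypothetical.

```python
import hashlib


def layer_keys(steps, files):
    """Toy model: a layer's key hashes the parent key, the instruction,
    and the content of the files that instruction copies in."""
    keys, parent = [], ""
    for instruction, copied in steps:
        h = hashlib.sha256(parent.encode())
        h.update(instruction.encode())
        for name in sorted(copied):
            h.update(name.encode())
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(before, after):
    """Count leading layers whose keys are unchanged (the cache hits)."""
    count = 0
    for a, b in zip(before, after):
        if a != b:
            break
        count += 1
    return count


# Dependencies copied and installed before the source.
DEPS_FIRST = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ["requirements.txt", "test_app.py"]),
]
# Everything copied up front, then installed.
SOURCE_FIRST = [
    ("COPY . .", ["requirements.txt", "test_app.py"]),
    ("RUN pip install -r requirements.txt", []),
]

old = {"requirements.txt": "pytest==8.3.4\n", "test_app.py": "def test_a(): assert True\n"}
new = dict(old, **{"test_app.py": "def test_a(): assert 1 == 1\n"})  # source edit only

print(reused_layers(layer_keys(DEPS_FIRST, old), layer_keys(DEPS_FIRST, new)))      # 2
print(reused_layers(layer_keys(SOURCE_FIRST, old), layer_keys(SOURCE_FIRST, new)))  # 0
```

With dependencies first, a source-only edit reuses the copy and install layers; with the source first, the edit invalidates the very first layer and the install runs again.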

Compose the whole world

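Real apps need a database, a queue, maybe a mock server. docker compose brings them up together so the suite tests the system, not a fragment of it. In the file below, depends_on with condition: service_healthy holds the tests container back until the Postgres healthcheck passes.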

docker-compose.yml

services:
  tests:
    build: .
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: test
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      retries: 10
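
Inside the Compose network the tests container reaches Postgres at the service name db. As a minimal sketch, the suite could build its connection URL like this. The helper database_url is hypothetical; the defaults mirror the compose file above (password test, plus the postgres image's default user and database), and the PG* names are the standard libpq environment variables, used here so a developer can point the suite at a different server.

```python
import os


def database_url(env=None):
    """Build a Postgres URL, defaulting to the values in docker-compose.yml."""
    env = os.environ if env is None else env
    host = env.get("PGHOST", "db")              # Compose service name doubles as hostname
    port = env.get("PGPORT", "5432")
    user = env.get("PGUSER", "postgres")        # image default when POSTGRES_USER is unset
    password = env.get("PGPASSWORD", "test")    # POSTGRES_PASSWORD in the compose file
    name = env.get("PGDATABASE", "postgres")    # image default when POSTGRES_DB is unset
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


if __name__ == "__main__":
    print(database_url({}))  # postgresql://postgres:test@db:5432/postgres
```

Keeping the defaults aligned with the compose file means the suite needs no extra configuration in the container, while an override such as PGHOST=localhost still works outside it.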

Now the same command (for example, docker compose up --build --exit-code-from tests) runs the same way on every laptop and every CI runner. That reproducibility is the foundation that parallelism, debugging and confidence all build on.