Reproducible test environments with Docker
"Works on my machine" is a process failure. Containers turn your test environment into something you can version, share and trust.
The phrase "works on my machine" is a polite way of saying the environment isn't reproducible. Containers fix that: the runtime, browsers and dependencies become an artifact you can version and ship — identical locally and in CI.
Pin everything
A reproducible image starts with pinned versions. Floating tags drift, and drift is how a green suite turns red overnight with no code change.
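Pinning covers the Python dependencies as well as the base image. As a minimal sketch, assuming requirements.txt uses plain `name==version` lines, a small check like the one below could flag entries that are left floating. The function name `unpinned_requirements` is hypothetical, and the pattern is deliberately simple: it does not handle `--hash` options or URL requirements.

```python
import re

# A line counts as pinned only if it names an exact version with "==",
# optionally followed by an environment marker after ";".
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[A-Za-z0-9._+!-]+\s*(;.*)?$"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return the requirement lines that do not pin an exact version."""
    offenders = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            offenders.append(line)
    return offenders
```

Run against the contents of requirements.txt in a CI step, a non-empty result is a reason to fail the build before drift reaches the test suite. The Dockerfile that follows applies the same idea to the base image by naming an exact tag.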
FROM mcr.microsoft.com/playwright/python:v1.49.0-noble
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["pytest", "-n", "auto", "--maxfail=1"]

Keep the build cache working for you
Order layers from least to most volatile. Dependencies change rarely, so copy and install them before the source — most rebuilds then skip the slow install step.
- Copy requirements.txt and install before copying source.
- Use --no-cache-dir to keep the image lean.
- Add a .dockerignore so local junk never invalidates the cache.
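To see why the ordering matters, here is a toy model of a layer cache. It is a simplified illustration, not Docker's actual implementation: each layer's key is assumed to depend on its instruction, a fingerprint of the files it reads, and the key of the layer before it. The names `layer_keys` and `rebuilt_steps` and the fingerprint strings are made up for the example.

```python
import hashlib

Step = tuple[str, str]  # (instruction, fingerprint of the files it reads)


def layer_keys(steps: list[Step]) -> list[str]:
    """Chain-hash each step with its parent, so one change invalidates all later layers."""
    keys: list[str] = []
    parent = ""
    for instruction, fingerprint in steps:
        digest = hashlib.sha256(f"{parent}|{instruction}|{fingerprint}".encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(before: list[Step], after: list[Step]) -> list[str]:
    """Instructions whose cache key differs between two builds."""
    old, new = layer_keys(before), layer_keys(after)
    return [instr for (instr, _), a, b in zip(after, old, new) if a != b]


INSTALL = "RUN pip install --no-cache-dir -r requirements.txt"


def deps_first(src: str) -> list[Step]:
    return [("COPY requirements.txt .", "reqs-v1"), (INSTALL, ""), ("COPY . .", src)]


def source_first(src: str) -> list[Step]:
    return [("COPY . .", src), (INSTALL, "")]


# Only the source changes between the two builds in each comparison.
print(rebuilt_steps(deps_first("src-v1"), deps_first("src-v2")))
# ['COPY . .']
print(rebuilt_steps(source_first("src-v1"), source_first("src-v2")))
# ['COPY . .', 'RUN pip install --no-cache-dir -r requirements.txt']
```

In this model, a source edit with dependencies copied first rebuilds only the final copy, while copying the source first drags the slow install step along with it.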
Compose the whole world
Real apps need a database, a queue, maybe a mock server. docker compose brings them up together so the suite tests the system, not a fragment of it.
services:
  tests:
    build: .
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: test
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      retries: 10

Now the same command runs the same way on every laptop and every CI runner. That reproducibility is the foundation everything else (parallelism, debugging, confidence) builds on.
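Inside the Compose network above, the tests container can reach Postgres by its service name, db, on the default port 5432. The healthcheck already gates startup, so the sketch below is optional: it shows one way a test session might fail fast with a clear message when the suite is run outside Compose. The `DB_HOST` and `DB_PORT` environment variables and both function names are assumptions for illustration, not part of the setup above.

```python
import os
import socket
import time


def database_address() -> tuple[str, int]:
    """Read the database host and port, defaulting to the Compose service name."""
    # DB_HOST and DB_PORT are hypothetical overrides for runs outside Compose.
    return os.environ.get("DB_HOST", "db"), int(os.environ.get("DB_PORT", "5432"))


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection succeeds, or False when the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # brief pause before retrying
    return False
```

A session-scoped pytest fixture could call `wait_for_port(*database_address())` and abort the run with an explicit error if it returns False, which is easier to diagnose than a connection failure buried in the first test.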