You are building a Docker image. The build reaches step 8 of 12 and then crashes with the dreaded error: no space left on device. You run docker system prune -a, confirm the deletion, see "Total reclaimed space: 12.4GB" — and then the very next build fails with the same error. This scenario plays out on developer machines and CI servers thousands of times every day, and the standard advice of "just run docker prune" solves it less than half the time.
The reason is that Docker stores data in multiple locations, and docker system prune only cleans a subset of them. Understanding where Docker puts data, what the prune commands actually remove, and what they silently leave behind is essential for anyone running Docker in production or on a busy development machine.
Understanding Where Docker Stores Data
Docker stores its data in a root directory, typically /var/lib/docker on Linux. This directory contains several subdirectories, each storing a different type of data:
overlay2/ — This is the storage driver directory where image layers and container filesystem layers are stored. Each layer is a directory containing the filesystem diff from the previous layer. On a busy system, this directory can easily consume 50-100 GB or more. When you pull an image, every layer in that image creates a subdirectory here. When you run a container, an additional writable layer is created on top. When you build images, every build step creates a new layer.
volumes/ — Named Docker volumes created with docker volume create or implicitly by docker-compose.yml volume declarations. These persist even after the containers that created them are removed.
containers/ — Metadata and logs for each container. The log files here can grow to gigabytes if you are not limiting log size. A single container with verbose logging and no log rotation can fill a disk entirely on its own.
buildkit/ — BuildKit build cache, including source code snapshots, intermediate build results, and downloaded dependencies. This is separate from the image layer cache, and docker system prune only trims the unused parts of it; the full cache is only cleared with the --all flag, and BuildKit additionally runs its own garbage collection with separate limits.
tmp/ — Temporary files used during builds. Large COPY operations create temporary archives here that can consume significant space during the build even if the final image is small.
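A quick way to see which of these subdirectories is actually consuming the disk is a plain du against the data root (a minimal sketch, assuming the default /var/lib/docker location):
# Per-subdirectory usage of Docker's data root, smallest to largest
sudo du -h --max-depth=1 /var/lib/docker | sort -h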
For Docker's own accounting of that space, run:
docker system df
docker system df -v
The verbose flag shows every image, container, and volume with its size. Depending on your Docker version, though, it gives little or no detail on the BuildKit cache. For a dedicated view of that cache, run:
docker buildx du
This command, available in Docker 20.10 and later, shows the BuildKit cache usage separately. On a machine that builds frequently, this can easily be 20-40 GB that docker system df does not itemize.
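For more detail on each cache record, including when it was created and last used, add the verbose flag:
docker buildx du --verbose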
Why docker system prune Does Not Fix It
The docker system prune command removes stopped containers, dangling images (images without a tag), unused networks, and optionally build cache. However, it has significant blind spots:
It does not remove named volumes. If your docker-compose.yml declares named volumes for databases, uploads, logs, or caches, those volumes persist forever. A PostgreSQL volume that has been accumulating WAL files and table data for months can easily consume 10-20 GB. To see orphaned volumes, run docker volume ls -f dangling=true and remove them with docker volume prune (on Docker Engine 23.0 and later, add --all, because the default there only removes anonymous volumes).
It does not remove tagged images you are not using. If you have pulled 50 versions of your application image over the past month (myapp:v1.0 through myapp:v1.50), all 50 are still on disk. Prune only removes images with no tag. To see all images sorted by size, run docker images --format "{{.Repository}}:{{.Tag}} {{.Size}}" | sort -k2 -h.
It does not fully clean BuildKit cache. BuildKit maintains its own cache separate from the image layer cache. Even with docker builder prune, BuildKit can retain cache entries that it considers potentially useful. To force a complete BuildKit cache wipe, run docker builder prune --all --force.
It does not clean container logs. Container log files in /var/lib/docker/containers/*/ can grow without limit unless you configure log rotation. A container that has been running for months with DEBUG logging can produce log files exceeding 50 GB. Check log sizes with sudo sh -c 'du -sh /var/lib/docker/containers/*/' (the glob has to be expanded by a root shell, because the data root is not readable by regular users).
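When a single container's log is the culprit, you can find and truncate it without touching anything else. A minimal sketch, assuming the default json-file logging driver and data root; the container ID in the path is a placeholder:
# List the five largest container logs
sudo sh -c 'du -h /var/lib/docker/containers/*/*-json.log | sort -h | tail -n 5'
# Truncate one in place; the container keeps writing to the same file
sudo truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log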
The Nuclear Option: Reclaim Everything
When you need to reclaim all Docker disk space immediately, run these commands in order:
# Stop all running containers
docker stop $(docker ps -q)
# Remove all containers
docker rm $(docker ps -aq)
# Remove all images
docker rmi $(docker images -q) --force
# Remove all volumes, named and anonymous
docker volume rm $(docker volume ls -q)
# Remove all networks
docker network prune --force
# Remove all BuildKit cache
docker builder prune --all --force
# Verify
docker system df
This removes everything. On a development machine, this is usually fine — you will re-pull images as needed. On a production server, obviously do not do this unless you have planned the downtime.
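If you script this sequence, for example on a CI node, the xargs -r form is more robust: it simply does nothing when a list is empty instead of erroring out. A sketch of the same wipe using that pattern:
# Force-remove all containers, then images, volumes, and build cache
docker ps -aq | xargs -r docker rm -f
docker images -q | xargs -r docker rmi -f
docker volume ls -q | xargs -r docker volume rm
docker builder prune --all --force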
Hidden Culprit 1: Overlay2 Orphaned Layers
The most insidious space consumer is orphaned overlay2 layers. These are filesystem layers that are no longer referenced by any image or container but still exist on disk. They can accumulate when builds are interrupted, when the Docker daemon crashes during image deletion, or when Docker's internal reference counting gets out of sync.
To check for orphaned layers, compare the number of directories in /var/lib/docker/overlay2/ with the number Docker knows about:
# Count filesystem directories
sudo ls /var/lib/docker/overlay2/ | wc -l
# Count layers Docker tracks
docker inspect $(docker images -q) | jq '.[].RootFS.Layers[]' | sort -u | wc -l
The two counts will never match exactly, because container layers and the l/ link directory add filesystem entries that the image list does not track, but if the filesystem count is dramatically higher, you have orphaned layers. There is no safe way to delete individual overlay2 directories by hand: Docker's layer metadata under /var/lib/docker/image/ would still reference them and the daemon would start throwing missing-layer errors. The reliable cleanup is to stop Docker, verify nothing on the host still needs its containers, and remove the entire data directory so the layer store and its metadata are reset together. Docker recreates the directory structure on startup. You will need to re-pull all images afterward, and be aware that this also deletes named volumes.
sudo systemctl stop docker
sudo rm -rf /var/lib/docker
sudo systemctl start docker
This is a drastic measure. For less aggressive cleanup, use docker image prune -a, which removes all images not used by any container (running or stopped), then check whether the space is recovered.
Hidden Culprit 2: BuildKit Source Snapshots
BuildKit, the default build engine since Docker Engine 23.0 (and earlier than that in Docker Desktop), caches source code snapshots for every build context you send. If you are building frequently with large build contexts (because you forgot to create a proper .dockerignore file), each build stores a snapshot of your entire project directory.
Create a comprehensive .dockerignore file to reduce build context size:
node_modules
.git
.next
dist
build
*.log
.env*
coverage
.nyc_output
__pycache__
*.pyc
.pytest_cache
.venv
tmp
Without a .dockerignore, a Next.js project with node_modules and .git sends 500 MB to 1 GB as build context for every single build. With a proper .dockerignore, the same project sends 10-20 MB. Over 50 builds, that difference is 25 GB versus 500 MB of cached build contexts.
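To sanity-check how much context you are actually sending, BuildKit prints a "transferring context" size in its build output; you can also approximate it locally before building. A rough sketch that mirrors the heaviest entries in the ignore file above (du's --exclude is a GNU option):
# Approximate build context size with the largest ignored directories excluded
du -sh --exclude=node_modules --exclude=.git --exclude=.next .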
Hidden Culprit 3: Multi-Stage Build Orphans
Multi-stage builds create intermediate results for each stage. The builder stage in a typical two-stage Dockerfile produces a large image with all build dependencies, source code, and compiled output, and Docker keeps it cached for build performance. With the legacy builder these intermediates show up as untagged images that docker images hides unless you add the --all or -a flag; with BuildKit they live in the build cache that docker buildx du reports instead.
# Show all images including intermediate layers
docker images -a
# Remove all dangling (intermediate) images
docker image prune --force
On a CI server that builds multiple projects, intermediate images from multi-stage builds can accumulate to 50+ GB. Schedule a regular cleanup job that runs docker image prune -a --filter "until=24h" to remove images created more than 24 hours ago.
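On a plain cron setup, that cleanup can be a single crontab entry; the time and retention window below are examples, and a fuller script appears in the next section:
# Nightly at 03:00: remove unused images created more than 24 hours ago
0 3 * * * docker image prune -a --filter "until=24h" --force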
Preventing the Problem: Production Best Practices
Rather than fighting disk space issues reactively, implement these preventive measures:
Configure Docker log rotation. Add a default logging driver configuration to /etc/docker/daemon.json:
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
This limits each container's log to three files of 10 MB each, capping total log storage at 30 MB per container. Restart Docker after making this change; the new defaults apply only to containers created afterward, so long-running containers need to be recreated to pick them up.
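If changing daemon.json is not convenient right away, the same limits can be set per container at run time; nginx here is just a stand-in image:
# Per-container log rotation without touching the daemon configuration
docker run -d --log-driver json-file --log-opt max-size=10m --log-opt max-file=3 nginx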
Schedule automated cleanup. Create a cron job or systemd timer that runs daily:
#!/bin/bash
# /etc/cron.daily/docker-cleanup
docker image prune -a --filter "until=72h" --force
docker container prune --filter "until=24h" --force
docker volume prune --force
docker builder prune --keep-storage=5GB --force
The --keep-storage flag for builder prune keeps up to 5 GB of build cache for performance, evicting the least recently used entries once that budget is exceeded.
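One detail that is easy to miss: run-parts only executes files in /etc/cron.daily that are marked executable, so set the bit after creating the script:
sudo chmod +x /etc/cron.daily/docker-cleanup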
Monitor disk usage. Set up monitoring alerts when Docker's disk usage exceeds 80 percent of the available space. Tools like Prometheus with the node_exporter can track filesystem usage and alert before the disk fills up.
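If a full monitoring stack is not in place yet, even a small scheduled check buys you a warning. A minimal sketch, assuming GNU df and a working local mail setup; the alert address is a placeholder:
#!/bin/bash
# Warn when the filesystem holding Docker's data root passes 80 percent
usage=$(df --output=pcent /var/lib/docker | tail -n 1 | tr -dc '0-9')
if [ "$usage" -ge 80 ]; then
  echo "Docker data root is ${usage}% full on $(hostname)" | mail -s "Docker disk alert" ops@example.com
fi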
Use separate storage for Docker. On production servers, mount /var/lib/docker on a separate disk or partition. This prevents Docker from filling the root filesystem, which can crash the entire operating system. A 100 GB dedicated volume for Docker is a good starting point for most workloads.
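A sketch of moving an existing installation onto such a volume, assuming the new disk is already mounted at /mnt/docker-data (the path is an example) and using Docker's data-root setting:
sudo systemctl stop docker
sudo rsync -aHX /var/lib/docker/ /mnt/docker-data/
# Set "data-root": "/mnt/docker-data" in /etc/docker/daemon.json, then:
sudo systemctl start docker
docker info --format '{{.DockerRootDir}}'   # should now print /mnt/docker-data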
CI/CD Specific Recommendations
CI servers are particularly prone to disk space issues because they build many different projects and rarely clean up between builds. GitHub Actions hosted runners start fresh each run, but self-hosted runners accumulate layers over time.
For self-hosted GitHub Actions runners, GitLab runners, and Jenkins agents, add a pre-build cleanup step that limits Docker to a fixed storage budget:
# Before each build
docker builder prune --keep-storage=10GB --force
docker image prune --filter "until=48h" --force
This keeps the most recent 10 GB of build cache for speed while preventing unbounded growth. Adjust the values based on your available disk space and build frequency.
Managing Docker's disk usage is a maintenance task that deserves the same attention as log rotation and database backups. Ignore it, and you will get woken up at 3 AM by a full disk. Automate it, and you will never think about it again.
ZeonEdge manages Docker infrastructure including automated cleanup, monitoring, and CI/CD pipeline optimization. Learn more about our DevOps services.
Marcus Rodriguez
Lead DevOps Engineer specializing in CI/CD pipelines, container orchestration, and infrastructure automation.