Maciej Łebkowski

Cleaning up docker to reclaim disk space

I’m a big fan of docker. For more than two years now I have believed it will change how we deploy applications: not only web apps, but CLI tools as well, and maybe even GUI apps. That doesn’t change the fact that I see a lot of drawbacks.

One of the main things that bothers me when using docker is how it hogs disk space. Recently I kept running into issues with my setup because disk space was “leaking” somewhere and I couldn’t tell where it went. Here are some tips on how to clean up leftover images, containers, and volumes, or avoid leaving them behind in the first place.

Good practices regarding keeping disk usage to a minimum

First of all, docker by default doesn’t care about using the disk space. Most of the commands leave a trace behind, make a copy of something or replace an item without removing the previous version. Let’s take a look at the most common ones:

  • docker pull and docker build create new docker images. Each layer is cached and shared through aufs, which by itself reduces disk usage, but previous versions and layers are left dangling.

    You can remove untagged images by running:

    # hint osx users: your version of xargs won’t have the -r switch
    # so just skip it (you may encounter an error if there are no
    # images to clean up)
    
    docker images --no-trunc | grep '<none>' | awk '{ print $3 }' \
        | xargs -r docker rmi
    
  • docker run leaves the container behind by default. This is convenient if you’d like to review the process later: look at the logs or the exit status. It also stores the aufs filesystem changes, so you can commit the container as a new image.

    This can be expensive in terms of disk space, especially during testing. Remember to pass the --rm flag to docker run if you don’t need to inspect the container later. At the time of writing this flag doesn’t work with background containers (-d), so you’ll be left with finished containers anyway (docker 1.13 and later accept --rm together with -d). Clean up dead and exited containers with:

    docker ps --filter status=dead --filter status=exited -aq \
      | xargs docker rm -v
    
  • docker rm does not remove the volumes created by the container. I can’t figure out why the default would be this way, but you need to use the -v flag to remove the volumes along with the container.
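
If you’d rather not depend on the -r switch at all, the xargs caveat mentioned in the hint above can be worked around with a small wrapper. This is only a sketch: run_if_any is a made-up helper name, not part of docker or xargs. It buffers stdin and invokes the command only when something actually arrived.

```shell
# run_if_any CMD [ARGS...]
# Hypothetical helper: read whitespace-separated items from stdin and pass
# them to CMD through xargs, but do nothing at all when stdin is empty.
run_if_any() {
  items=$(cat)
  if [ -z "$items" ]; then
    return 0
  fi
  printf '%s\n' "$items" | xargs "$@"
}

# Example with a harmless command instead of docker:
printf 'abc123\ndef456\n' | run_if_any echo would remove
```

With docker it could be used as: docker ps --filter status=exited -aq | run_if_any docker rm -v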

Docker filesystem storage and volumes

There are three main ways docker stores files:

  • By default, everything you save to disk inside the container is saved in the aufs layer. This doesn’t create problems if you clean up unused containers and images.
  • If you mount a file or directory from the host (using docker run -v /host/path:/container/path …) the files are stored in the host filesystem, so it’s easy to track them and this causes no problems either.
  • The third way is docker volumes. Those are special paths that are mapped to a special directory under /var/lib/docker/volumes/ on the host. A lot of images use volumes to share files between containers (using the --volumes-from option) or to persist data so you won’t lose it after the process exits (the data-only containers pattern).

Now, since there is no built-in tool to list volumes and their state (before docker 1.9, at least), it’s easy to leave them on disk even after all processes have exited and all containers have been removed. The following command inspects all containers (running or not) and compares them to the created volumes, printing only the paths that are not referenced by any container:

#!/usr/bin/env bash

find '/var/lib/docker/volumes/' -mindepth 1 -maxdepth 1 -type d | grep -vFf <(
  docker ps -aq | xargs docker inspect | jq -r '.[]|.Mounts|.[]|.Name|select(.)'
)

What it does, step by step:

  • List all created volumes
  • List all containers and inspect them, creating a JSON array with all the entries
  • Format the output using jq to get all the names of every mounted volume
  • Exclude (grep -vFf) mounted volumes from the list of all volumes

You need to run this as root and have the jq utility installed.
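
One caveat with the exclude step: grep -F matches substrings, so a mounted volume named data would also hide a volume named data-backup from the list. The following is a rough sketch of a stricter comparison, assuming you compare bare volume names (for example the basename of each directory) rather than full paths. The helper name unreferenced and the volume names are made up for illustration.

```shell
# unreferenced ALL_FILE USED_FILE
# Hypothetical helper: print the lines of ALL_FILE that do not appear,
# as whole lines (-x), in USED_FILE. An empty USED_FILE means nothing
# is in use, so every line of ALL_FILE is printed.
unreferenced() {
  if [ -s "$2" ]; then
    grep -vxFf "$2" "$1"
  else
    cat "$1"
  fi
}

# Sample data standing in for "all volumes" and "mounted volumes":
all=$(mktemp)
used=$(mktemp)
printf '%s\n' data data-backup cache > "$all"
printf '%s\n' data > "$used"

# Prints data-backup and cache; plain -F without -x would drop data-backup.
unreferenced "$all" "$used"
rm -f "$all" "$used"
```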

The command doesn’t remove anything by itself, but piping the results to xargs -r rm -fr will.
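
Since rm -fr under /var/lib/docker is unforgiving, it may be worth looking at the list before deleting anything. Here is a minimal sketch of a dry-run wrapper. Both the helper name remove_paths and the DRY_RUN variable are assumptions made for this example, not something docker provides.

```shell
# remove_paths: read one path per line from stdin.
# DRY_RUN is a hypothetical switch; it defaults to 1, which only prints
# what would be deleted. Set DRY_RUN=0 to really delete.
remove_paths() {
  while IFS= read -r target; do
    [ -n "$target" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'would remove: %s\n' "$target"
    else
      rm -rf -- "$target"
    fi
  done
}

# Harmless demonstration on a throwaway directory:
demo=$(mktemp -d)
printf '%s\n' "$demo" | remove_paths              # only prints the path
printf '%s\n' "$demo" | DRY_RUN=0 remove_paths    # actually deletes it
```

The volume listing command could then end in | remove_paths instead of | xargs -r rm -fr, switching to DRY_RUN=0 once the output looks right.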

Hint: docker 1.9 has a new volume management system, so it’s way easier with this version:

docker volume ls -qf dangling=true | xargs -r docker volume rm

Recap

Save the following script to clean up everything at once:

https://gist.github.com/mlebkowski/471d2731176fb11e81aa

#!/bin/bash

# remove exited containers:
docker ps --filter status=dead --filter status=exited -aq | xargs -r docker rm -v

# remove unused images:
docker images --no-trunc | grep '<none>' | awk '{ print $3 }' | xargs -r docker rmi

# remove unused volumes (needs to be run as root):
find '/var/lib/docker/volumes/' -mindepth 1 -maxdepth 1 -type d | grep -vFf <(
  docker ps -aq | xargs docker inspect | jq -r '.[]|.Mounts|.[]|.Name|select(.)'
) | xargs -r rm -fr

This maintenance should trim all the fat and reduce your docker environment only to the used containers, images, and volumes.
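
On docker 1.9 or newer, the same recap could probably be written without find and jq by leaning on the dangling=true filters. The following is a sketch under that assumption, not a replacement for the gist above. The function names are made up, and each step is a separate function so it can be run on its own.

```shell
#!/bin/sh
# Hypothetical variant of the cleanup script (assumes docker >= 1.9).
# $ids and $names are left unquoted on purpose so that each ID or name
# becomes a separate argument.

# remove dead and exited containers together with their volumes
clean_containers() {
  ids=$(docker ps -aq --filter status=dead --filter status=exited)
  [ -z "$ids" ] || docker rm -v $ids
}

# remove untagged (dangling) images
clean_images() {
  ids=$(docker images -q --no-trunc --filter dangling=true)
  [ -z "$ids" ] || docker rmi $ids
}

# remove volumes that no container references
clean_volumes() {
  names=$(docker volume ls -qf dangling=true)
  [ -z "$names" ] || docker volume rm $names
}

# To run everything at once:
# clean_containers; clean_images; clean_volumes
```

Each function skips the removal command entirely when there is nothing to clean, so it does not need the xargs -r switch.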

About the author

My name is Maciej Łebkowski. I’m a full stack software engineer with 25 years of experience, currently part of the WonderProxy team.

https://wondernetwork.com/about

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This means that you may use it for commercial purposes, adapt upon it, but you need to release it under the same license. In any case you must give credit to the original author of the work (Maciej Łebkowski), including a URI to the work.