Docker Registry Deployment Patterns

A container registry is best thought of as an artifact store for Docker images. Once your CI system has validated the container image for behavior, performance, content and security, it needs a place to put the known-good image. A registry is the place to store it.

This article will take a look at the private Docker registry. Please add other solutions and patterns in the comments if your tool or topology is not included. Bonus points if you have a technical write-up on how to use it that can be linked to.

Your Very Own Private Registry

Many organizations cannot use a SaaS-based private artifact store for reasons of culture, policy, or performance. In these cases, they must run their own registry, and attention shifts to how this might be done.

The emerging best practice today appears to be running a registry process on each host in a compute pool while storing image data in a global storage provider such as S3 (public cloud) or Swift (private cloud).

Let’s take a look at the more common registry deployment patterns along with their pros and cons.

Single Container, Local Storage

This is not a recommended way to run a registry for any serious workload. It is, however, handy for standing one up quickly to do the research around build pipelines and workflow implications.

Pros

  • Trivially easy to do when you are first solving a broader problem

Cons

  • It is in one place, thus subject to scaling issues
  • It is in one place, thus a single point of failure
  • If using a dedicated data volume, the registry and data containers must have host affinity
  • Must be secured with TLS. This is often done with host-level (rather than container-level) certs, which creates a host-level configuration management issue (Puppet, Chef, et al.)
  • Skipping TLS means each Docker daemon in the host pool must be run with the `--insecure-registry` flag, which is also a configuration management issue. Any flag with the word insecure in it should be viewed with a healthy amount of skepticism!
  • An orchestration engine must be told where the registry is. To solve a general case this is usually done with service discovery, thus adding complexity.
  • Upgrading the registry means a loss of data or the use of a dedicated data volume (see below)
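The TLS point above can be sketched as follows. This is a hedged example assuming the official registry:2 image; the /opt/certs path and file names are illustrative placeholders, and the `REGISTRY_HTTP_TLS_*` environment variables are the registry's standard mechanism for overriding its config file.

```shell
# Sketch: serving the registry over TLS with container-level certs.
# /opt/certs and the cert file names are illustrative placeholders.
RUN_TLS_REGISTRY="docker run -d -p 5000:5000 \
  -v /opt/certs:/certs \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $RUN_TLS_REGISTRY
fi
```

Keeping the certs inside the container (rather than on the host) at least moves the configuration management burden from the host layer to the deployment layer.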

You can find simple directions for running a registry this way here.
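For the research use case described above, standing one up is a one-liner. A minimal sketch, assuming the official registry:2 image; data lives inside the container and is lost when the container is removed, which is acceptable only for experiments:

```shell
# Sketch: a throwaway registry with local, in-container storage.
# Data is lost when the container is removed, so experiments only.
RUN_REGISTRY="docker run -d -p 5000:5000 --restart=always --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $RUN_REGISTRY
  # Then tag and push against localhost:5000, e.g.:
  # docker tag alpine localhost:5000/alpine && docker push localhost:5000/alpine
fi
```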

Single Container with Dedicated Data Volume

It is generally a good idea to separate an application from its data. If the reasoning is not clear, check out the 12 Factor method for more detail. Basically, it boils down to the ease with which the application can be upgraded and the cattle-like nature of application containers.

Pros

  • Easy to deploy
  • Easy to upgrade the registry process with a new container (e.g. when version 2.3 comes out)

Cons

  • The container is a single point of failure.
  • This method only makes sense if both the registry and the volume container are on the same host
  • Affinity between the registry and the data container adds orchestration complexity
  • Backing up data is kludgy
  • All the same TLS and `--insecure-registry` issues apply
  • Must find the registry via service discovery or hard code its location in the orchestration layer.
  • A typical registry will take up many gigabytes of storage. Containers of such size are not easy to manage.

The simple instructions above can be combined with this documentation on data volume containers to try this deployment pattern.
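The pattern can be sketched as a pair of commands. This is a hedged example assuming the registry:2 image; the container names are illustrative. The data container is created but never started; it exists only to own the /var/lib/registry volume:

```shell
# Sketch: a dedicated data volume container holding /var/lib/registry,
# consumed by the registry container via --volumes-from.
CREATE_DATA="docker create -v /var/lib/registry --name registry-data registry:2 true"
RUN_REGISTRY="docker run -d -p 5000:5000 --volumes-from registry-data --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $CREATE_DATA    # never started; exists only to own the volume
  $RUN_REGISTRY
fi
```

Upgrading then means stopping and removing only the registry container and running a newer image with the same `--volumes-from` switch; the data container is untouched.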

Single Container, Host Storage

Writing data to a host volume is another way to think about running a registry. This has many of the same pitfalls as using a dedicated data container (see above).

Pros

  • Trivial to deploy
  • Data backups can be done via common practices

Cons

  • The container is a single point of failure.
  • The registry container must be on the same host as the data, thus causing orchestration complexity
  • All the same TLS and `--insecure-registry` issues apply
  • Must find the registry via service discovery or hard code its location in the orchestration layer.

If global storage such as S3 is not available, this is probably the best option for your registry deployment in terms of durability.

The simple instructions linked above, combined with a simple `--volume` (`-v`) switch, make this possible.
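Concretely, a hedged sketch (registry:2 image assumed; /opt/registry-data is an illustrative host path):

```shell
# Sketch: bind-mount a host directory so image data outlives the container.
# /opt/registry-data is an illustrative path on the host.
RUN_REGISTRY="docker run -d -p 5000:5000 \
  -v /opt/registry-data:/var/lib/registry \
  --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $RUN_REGISTRY
fi
```

Because the data now sits in a plain host directory, the usual filesystem backup tooling applies.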

Single Container, Global storage

Run the registry on a single (possibly dedicated) host, but use an S3, Swift, Azure, or Ceph back end.

This addresses concerns about treating containers and hosts as pets (rather than cattle) on the storage side. However, the host running the registry is something of a snowflake (pet) itself.

Pros

  • Data is handled in a known-good way in terms of availability and backups. There is a great deal of lore around backing up S3, Ceph, et al.
  • No host affinity necessary for the registry container.

Cons

  • The container is a single point of failure.
  • All the same TLS and `--insecure-registry` issues apply.
  • Must find the registry via service discovery or hard-code its location in the orchestration layer.
  • To really gain IO boosts, a blob cache must be run, which means host affinity between the registry and blob cache containers.
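The S3 case can be sketched with the registry's environment-variable configuration overrides. A hedged example: the bucket name and region are placeholders, and credentials should come from an IAM role or your secrets tooling rather than the command line:

```shell
# Sketch: registry with an S3 storage back end, configured via the
# registry's REGISTRY_STORAGE_* environment-variable overrides.
# Bucket and region are placeholders; credentials are assumed to come
# from an IAM role or secrets tooling, not the command line.
RUN_REGISTRY="docker run -d -p 5000:5000 \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_BUCKET=my-registry-bucket \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $RUN_REGISTRY
fi
```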

Registry Container per Host

This will likely become a best practice for running a private registry over time. In this scenario, a registry container is run on each host in a compute pool. Data is stored in global storage such as S3, Ceph etc.

In fully realized environments of disposable compute, a host can simply be killed and replaced if its registry container dies and fails to restart.

Pros

  • No need to set up TLS or use `--insecure-registry`
  • The registry is always available at localhost, thus simplifying the orchestration of starting and upgrading services in an application.
  • Ease of deployment: a good orchestration tool will start a registry on each new host.
  • The blob cache can be deployed on a per-host basis as well.

Cons

  • It is unclear from the documentation whether any consistency (ACID-style) concerns are in play with multiple registry processes writing to a single back end. (Use in the field has yet to surface a problem.)

For an example of how to run a registry in this mode, check out this article from StackEngine.
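Putting the pieces together, a hedged sketch of the per-host pattern (registry:2 image assumed; bucket name and region are placeholders). Binding to loopback is what removes the TLS and `--insecure-registry` burden, since clients only ever talk to localhost:5000:

```shell
# Sketch: one registry per host, bound to loopback only, all sharing a
# single S3 bucket. Clients push/pull via localhost:5000, so no TLS or
# --insecure-registry daemon configuration is required.
RUN_REGISTRY="docker run -d -p 127.0.0.1:5000:5000 \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_BUCKET=shared-registry-bucket \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  --restart=always --name registry registry:2"

# Only attempt the run when a Docker daemon is actually available.
if command -v docker >/dev/null 2>&1; then
  $RUN_REGISTRY
fi
```

An orchestration tool would run this on every host joining the pool, which is what makes the registry's location a constant rather than a service-discovery problem.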

Conclusion

Many organizations want to run their own private image registry for various reasons. While there are many patterns to follow, the one with the most promise at this time appears to be running the registry process on each host in a compute pool while connecting to shared backend storage.

Boyd Hemphill

Boyd Hemphill is a DevOps raconteur and thought leader in the silicon hills of Austin, Texas. Boyd founded Austin DevOps when he learned this thing he was doing had a name. Boyd organizes the Docker Austin meetup and the first-ever Container Days in Austin, Texas. In his professional life, Boyd has been a developer (PL/SQL and PHP), DBA (Oracle and MySQL), and system administrator. He sees Docker as containers for mere mortals in the same way Slicehost was virtualization for mere mortals in 2009. Currently Boyd is the Director of Evangelism for StackEngine, where he educates and espouses DevOps practices as they relate to Linux Containers.

