As part of our internal Knowledge Friday initiative, I decided to spend my time on a topic I have been following lately: the introduction of a feature in Docker that finally attempts to combat the file system (I/O) performance issues found most prominently in Docker for Mac.
DISCLAIMER: This is published just before DockerCon 2017 kicks off over in the US later today. As has been the case every year, there will be announcements of new features not covered here yet; I’ll update this blog post if anything relevant pops up.
TL;DR: This post focuses on using Docker for Mac 17.04 (edge channel atm) out of the box with the new :cached mount option, and specifically on whether “File access in mounted volumes extremely slow” is solved or not, using eZ Platform v1.9.
I started off early as a Docker fan, before v1.0. I looked into it quite a bit, and even ended up speaking about the topics surrounding it at conferences and meetups in 2015 and 2016 as a result of that. What got me excited was the potential for simplifications across organizations, dev/QA/Support teams, and customers everywhere by having a shared environment that can be distributed together with the code and be universally executed everywhere.
You might say “A dream”, because if reality has shown us anything over the last two years, it is that Docker and the now wider container industry haven’t really reached that potential yet. There are clear issues with running it in production, better described elsewhere, and there are issues blocking its use even for local development, as I’ll cover a bit here.
For the development issue(s) many solutions or workarounds exist: everything from full-fledged alternative docker-machine wrappers such as dinghy, to file syncing solutions (d4m-nfs, docker-sync, ..), to the ever-present, pragmatic and safest solution of just running Linux natively or in a VM.
But this post will rather focus on improvements to the out of the box Docker for Mac solution itself.
The I/O problem with Docker for Mac
When we set up a dev environment using Docker, like you would when using a VM/Vagrant, we ideally want to share files between host and container (guest). So we let Docker mount the files from our project working directory into the container using a “volume”, so changes we make on the host immediately show up in the container.
This works flawlessly on Linux, where Docker uses native Linux kernel technologies like cgroups when running Linux Containers (as opposed to Windows Containers) to avoid having to use a VM, meaning for instance that I/O operations are, as always on Linux, blazing fast. On Mac (and Windows) it's another story: we need a VM, and suddenly mounting files from host to guest is much more complicated. Essentially Docker needs to sync the files from host => VM => container.
For users of compiled languages like Java (Scala/..), C#, Go, this is normally not a problem: the application is compiled and doesn't touch the disk more than needed. But for things like compilers, which need to read and write, or for what affects most PHP applications, dynamic programming languages that interpret the code and hence typically need to read, or at least stat, a lot of files at runtime, slow disk I/O quickly becomes a major problem.
When Docker for Mac launched over a year ago to replace Docker Machine, it was thought to be aimed at solving the I/O problems that have existed with VMs for years and are typically worked around by using NFS or rsync. However, it was perhaps more meant to solve the other issues traditional VM systems bring, like I/O events for file watchers (sass/less/TypeScript/babel/.. generators and transpilers), and permission issues when matching users on host and guest systems.
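To get a rough feel for the "stat a lot of files" cost described above, a small script along these lines can be run once against a directory on the host and once against the same directory as mounted inside the container. This is only a minimal sketch and not the method used for the numbers in this post; the function name `time_stat_calls` is made up for illustration, and the timing includes the directory walk itself, not just the stat calls.

```python
import os
import time


def time_stat_calls(root):
    """Walk `root`, stat every file found, and return (file_count, elapsed_seconds).

    The elapsed time covers both listing the directories and the stat calls,
    which is roughly what an interpreter checking file freshness has to pay.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            count += 1
    return count, time.perf_counter() - start


# Example: measure the current working directory.
files, seconds = time_stat_calls(".")
print(f"stat'ed {files} files in {seconds:.3f}s")
```

Comparing the two timings gives an indication of how much overhead the mount adds for read-mostly workloads, although a real PHP request does more than stat files.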
So instead it actually made I/O performance far worse by aiming to guarantee security sandboxing between host and guest, leaving Docker for Mac largely unusable for development use with for instance PHP applications.
PHP Application used for testing
For those not familiar with eZ and eZ Platform, it’s a PHP application using Symfony Framework, specifically a full-fledged CMS. So it’s a somewhat heavy application, and while that's something we are improving from release to release, especially in v2 later this year, it will always be heavier than for instance WordPress.
This is especially true when Symfony is configured in development mode, which offers detailed profiling, configuration change detection, support for running in non-cached mode, advanced logging, and extra information to help boost developer productivity.
On top of this, the backend in eZ Platform v1 is written as a rich JS application. Combined with a traditional (solid but chatty) REST API, this means several backend requests are needed to browse through the backend interface, so any slowness in the backend is multiplied several times, making it far worse.
All this makes eZ Platform a prime candidate to show just how bad performance on Docker for Mac is...
While I could have done A/B testing or similar, given the factors above, with the backend UI making several requests to the backend machine, user-perceived performance, i.e. the time it takes to start and finish showing changes on a page, is more relevant, even if I measure it rather unscientifically here.
For comparison two machines are used:
- “Older Macbook Pro”
  - 2011 13” Macbook Pro with 2.3 GHz dual-core i5, 8 GB RAM & SSD
- “Moderately fast iMac”
  - 2011 iMac with 3.1 GHz quad-core i5, 8 GB RAM & SSD
  - Trivia: A new top-of-the-line 2016 13” Macbook Pro with dual core now has almost the same multicore performance in Geekbench. I mention this since the file system issues with Docker for Mac are mostly CPU bound, as the osxfs solution used handles the whole thing in software.
Docker for Mac is for these tests configured to use 2 CPUs and 3 GB RAM in both cases.
- If configured to use less, the performance numbers were noticeably worse!
Docker prod container
For comparison here is performance when running with code built into container:
Older Macbook Pro
Moderately fast iMac
*Once the UI has loaded, i.e. on subsequent operations. UI loading itself took roughly 2s.
Docker dev container
Prior setup (not using :cached mount option)
Older Macbook Pro
Moderately fast iMac
4-6s (Very slow)
*Once the UI has loaded, i.e. on subsequent operations. UI loading itself took several minutes here...
**Performance has been even worse in some Docker for Mac versions, but either way it is catastrophically slow.
Setup with Docker 17.04 (using :cached mount option)
Docker 17.04 adds a couple of new mount options for volumes. In themselves they don’t do much yet, but they add semantic meaning that Docker for Mac/Windows can take advantage of to cache mount points more aggressively where applicable. This is still work in progress, but it already gives a noticeable speedup once the application has been loaded a couple of times:
Older Macbook Pro
Moderately fast iMac
3-6s (Very slow)
1-3s (Slow but almost ok’ish for dev use)
Backend* performance /w SYMFONY_ENV=PROD
*Once the UI has loaded, i.e. on subsequent operations. UI loading itself took roughly 10-20s.
Pull request for the change above can be found here, including instructions on how to run.
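For readers who just want to see where the option goes: the mount option is appended as a third, colon-separated field on the volume specification passed to `docker run -v`. The sketch below only assembles such a command line; it is not taken from the pull request, and the host path, container path, and image name `my-php-image` are hypothetical placeholders.

```python
def docker_run_command(host_dir, container_dir, image, consistency="cached"):
    """Build a `docker run` argument list that bind-mounts host_dir with a mount option.

    The volume spec has the form host_path:container_path:option,
    where the option here defaults to "cached".
    """
    volume = f"{host_dir}:{container_dir}:{consistency}"
    return ["docker", "run", "--rm", "-v", volume, image]


# Hypothetical paths and image name, for illustration only.
cmd = docker_run_command("/Users/me/ezplatform", "/var/www", "my-php-image")
print(" ".join(cmd))
# prints: docker run --rm -v /Users/me/ezplatform:/var/www:cached my-php-image
```

The same `host:container:cached` string can be used as an entry under `volumes:` in a Compose file.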
On the testing done here
The proof of concept done here does not fully represent how :cached should be used in the end; these tests were done with the whole project directory (including cache) mounted as one :cached volume. However, future Docker for Mac versions are going to add support for :delegated as well, meaning we should at least split out the mount points we write to from those that are purely read (code).
Further reading on the concepts here.
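As a rough illustration of what splitting the mount points could look like, the sketch below generates one volume entry for the read-mostly project root and separate entries for the directories the application writes to. It is an assumption-laden sketch, not a tested setup: the write-oriented option is spelled `delegated` in Docker's documentation, the directory names `var/cache` and `var/logs` are hypothetical examples, and the function name `split_mounts` is made up for this post.

```python
import posixpath


def split_mounts(host_root, container_root, writable_dirs):
    """Return volume specs: project root as :cached, write-heavy subdirs as :delegated."""
    # Code is mostly read by the container, so the root gets the read-oriented option.
    mounts = [f"{host_root}:{container_root}:cached"]
    for rel in writable_dirs:
        # Directories the container writes to get their own nested mount.
        host_path = posixpath.join(host_root, rel)
        container_path = posixpath.join(container_root, rel)
        mounts.append(f"{host_path}:{container_path}:delegated")
    return mounts


# Hypothetical project layout, for illustration only.
for spec in split_mounts("/Users/me/ezplatform", "/var/www", ["var/cache", "var/logs"]):
    print(spec)
```

Each resulting string can be passed to `docker run -v` or listed under `volumes:` in a Compose file. Whether nested mounts like this actually perform better would need to be measured once the option is supported.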
Relevance to eZ Platform v2
Since v2 of eZ Platform is mentioned here, it is probably worth mentioning that the performance numbers here will benefit from the efforts we are making there as well, from features such as:
- Reducing server round trips:
  - REST API include/push (Client side hint for getting several entities at once)
  - Usage of Symfony for more parts of the UI (More logic is pre-rendered on server side, simplifying extensibility for Symfony developers)
- Reducing file system reads and writes:
  - Improvements in Symfony itself in 3.x and towards 4.x
  - Move from Stash cache to Symfony Cache
- Improvements to the Docker setup
  - Work is under way together with Sébastien Morel (@Plopix) to add things like memcached and varnish to the bundled docker setup, reducing the reliance on file based caches
.. and other upcoming improvements to the architecture, a lot to show at this year's upcoming eZ Conference in London ;)
Reflection on Docker progress
A few years later there finally seems to be some movement on this, and it seems like we might be close to having a working development setup for containers across all platforms, simplifying the lives of development teams everywhere, especially when getting started, including here at eZ.
So what took so long? For all I know it could be that because Docker Inc is mainly using Go they didn't anticipate this issue, or that they somehow had to prioritize other topics, for instance Swarm, over making sure all developers could use Docker for local development. A questionable order of priority perhaps, but it could be that they are in a hurry to show they can start to make money.
So is this solving everything? Not yet, there are still scenarios that are very slow or buggy:
- Instability: In the backend tests there would sometimes be a request to refresh the user session before doing anything else. At times this could stall for quite some time, and the only explanation I could find was either that the I/O cache had expired or that the writes to the session on disk in this setup were causing it.
- General I/O performance: The performance overhead is still too large. Even when cached, for instance Netbook / Macbook Air users still won’t be able to use this despite the tweaks done here; they are still way better off using Linux natively or in a full VM. The same goes for everyone when the cache is cold or disabled.
All in all this is a great step in the right direction, but for Docker to be ready for the greater majority of users there are still some improvements needed. Close, but no cigar, yet.