If you're seeing slow build times on macOS or Windows, using batect's caches as well as volume mount options such as
cached might help
Docker requires features only found in the Linux kernel, and so on macOS and Windows, Docker Desktop runs a lightweight Linux virtual machine to host Docker. However, while this works perfectly fine for most situations, there is some overhead involved in operations that need to work across the host / virtual machine boundary, particularly when it comes to mounting files or directories into a container from the host.
While the throughput of mounts on macOS and Windows is generally comparable to native file access within a container, the latency performing I/O operations such as opening a file handle can often be significant, as these need to cross from the Linux VM hosting Docker to the host OS and back again.
There are two ways to improve the performance of file I/O when using batect:
The performance penalty of mounting a file or directory from the host machine does not apply to Docker volumes, as these remain entirely on the Linux VM hosting Docker. This makes them perfect for directories such as caches where persistence between task runs is required, but easy access to their contents is not necessary.
batect makes this simple to configure. In your container definition, add a mount to
For example, for a typical Node.js application, to cache the
node_modules directory in a volume, include the following in your configuration:
containers: build-env: image: "node:13.8.0" volumes: - local: . container: /code - type: cache name: app-node-modules container: /code/node_modules working_directory: /code
batect uses a cache initialisation container to prepare volumes for use as caches. This process ensures that volumes used as caches are readable by containers running with run as current user mode enabled, and that new caches are empty the first time they are used.
(Mounting an empty volume into a container copies the contents of the container's directory into the volume, so the cache initialisation process adds an empty
file to the volume to prevent this behaviour and ensure that the volume is effectively empty.)
To make it easier to share caches between builds on ephemeral CI agents, you can instruct batect to use directories instead of volumes, and then use these
directories as a starting point for subsequent builds. Run batect with
--cache-type=directory to enable this behaviour, then save and restore the
.batect/caches directory between builds.
Using mounted directories instead of volumes has no performance impact on Linux.
The performance penalty described above does not apply when mounting directories into Windows containers. batect therefore always uses directory mounts for
caches on Windows containers, even if
--cache-type=volume is specified on the command line.
This section only applies to macOS-based hosts, and is only supported by Docker version 17.04 and higher.
cached mode is harmless for other host operating systems.
For situations where a cache is not appropriate (eg. mounting your code from the host into a build environment), specifying the
cached volume mount option
can result in significant performance improvements.
(Before you use this option in another context, you should consult the documentation to understand the implications of it.)
For example, instead of defining your container like this:
containers: build-env: image: "ruby:2.4.3" volumes: - local: . container: /code working_directory: /code
containers: build-env: image: "ruby:2.4.3" volumes: - local: . container: /code options: cached # This enables 'cached' mode for the /code mount working_directory: /code
Setting this option will not affect Linux or Windows hosts, so it's safe to commit and share this in a project where some developers use macOS and others use Linux or Windows.
Database schema and test data¶
Try to do as much work as possible at image build time, rather than doing it every time the container starts
A significant amount of time during integration or journey testing with a database can be taken up by preparing the database for use - setting up the schema (usually with some kind of migrations system) and adding the initial test data can take quite some time, especially as the application evolves over time.
One way to address this is to bake the schema and test data into the Docker image used for the database, so that this setup cost only has to be paid when building the image or when the setup changes, rather than on every test run. The exact method for doing this will vary depending on the database system you're using, but the general steps that would go in your Dockerfile are:
- Copy schema and test data scripts into container
- Temporarily start database daemon
- Run schema and data scripts against database instance
- Shut down database daemon
Make sure signals such as SIGTERM and SIGKILL are being passed to the main process
If you notice that post-task cleanup for a container is taking longer than expected, and that container starts the main process from a shell script, make sure that signals such as SIGTERM and SIGKILL are being forwarded to the process. (Otherwise Docker will wait 10 seconds for the application to respond to the signal before just terminating the process.)
For example, instead of using:
#! /usr/bin/env bash /app/my-really-cool-app --do-stuff
#! /usr/bin/env bash exec /app/my-really-cool-app --do-stuff