Overview
Docker builds images automatically from a Dockerfile. Although Dockerfile syntax is simple, writing one that keeps images small and builds fast takes practice and experience. This section introduces some best practices to help you write efficient Dockerfiles.
Excluding Files through .dockerignore
When Docker builds an image, it sends every file in the build context (typically the directory that contains the Dockerfile) to the Docker daemon. Files that the build does not need can be excluded with a .dockerignore file, which shortens the build and keeps those files out of the image. The syntax of .dockerignore is similar to that of .gitignore, as shown in the example below:
.git/
node_modules/
This example excludes the .git and node_modules folders in the Dockerfile directory.
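To see the effect from the Dockerfile side, here is a minimal sketch. The /app destination is an assumption chosen to match the later examples in this section.

```dockerfile
FROM ubuntu
# COPY . copies everything in the build context except paths matched by
# .dockerignore. With the two entries above, .git/ and node_modules/ are
# never sent to the Docker daemon, so they cannot end up in /app.
COPY . /app
```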
A Container Only Runs a Single App
Although Docker supports running multiple processes, such as running the front-end, back-end, and database all in one Docker container, this approach can lead to some issues:
1. Long build time, as modifying one app causes the entire system to be rebuilt.
2. Large image size.
3. Different apps require different resources, leading to resource wastage when scaling.
Therefore, it is recommended to split the service into separate apps, then build and deploy an image for each app. For example, a service that depends on Node.js and MySQL would originally install both in a single Dockerfile:
RUN apt-get install -y nodejs mysql-server
After splitting, the Node.js and MySQL services can be deployed separately:
The Dockerfile for the Node.js service includes:
RUN apt-get install -y nodejs
The Dockerfile for the MySQL service includes:
RUN apt-get install -y mysql-server
Avoid Installing Unnecessary Packages
Avoid installing additional or unnecessary packages; this reduces image complexity, decreases image size, and shortens build time. When a package needs to be updated, use "apt-get install -y xxx" to upgrade only the specified package. "apt-get upgrade" upgrades every installed package, which makes the build unpredictable and can produce inconsistent images, so its use should be minimized.
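As a sketch of a minimal install step, assuming a Debian- or Ubuntu-based image in which only nodejs is needed: the --no-install-recommends option of apt-get installs the named package and its required dependencies but skips packages that are only recommended.

```dockerfile
FROM ubuntu:16.04
# Install only the named package and its hard dependencies.
# --no-install-recommends skips optional "recommended" packages.
RUN apt-get update \
    && apt-get install -y --no-install-recommends nodejs
```

Some packages rely on their recommended extras for optional features, so it is worth checking that the app still works after adding the option.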
Reducing Image Layers and Utilizing Cache
Docker images are layered, and each instruction in the Dockerfile creates a new image layer. These layers are cached and reused. When an instruction in the Dockerfile is changed, a copied file is modified, or a variable specified at build time is changed, the cache for the corresponding layer becomes invalid, and so does the cache for every subsequent layer.
For the following Dockerfile:
FROM ubuntu
ADD . /app
RUN apt-get update
RUN apt-get install -y nodejs
RUN cd /app && npm install
# Run npm start from the directory that contains package.json
WORKDIR /app
CMD npm start
You can merge the two RUN commands related to apt-get. This removes one image layer and prevents apt-get install from working with stale package lists when the apt-get update layer is served from the cache. Moving the apt-get command ahead of ADD also means that source code changes no longer invalidate the apt-get layer cache. The adjusted result is:
FROM ubuntu
RUN apt-get update && apt-get install -y nodejs
ADD . /app
RUN cd /app && npm install
# Run npm start from the directory that contains package.json
WORKDIR /app
CMD npm start
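The same reasoning can be applied to npm install. The following is a sketch rather than a definitive layout; it assumes the project keeps a package.json at its root and that node_modules is listed in .dockerignore. Copying only the dependency manifest before the rest of the source lets the npm install layer stay cached until package.json changes.

```dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y nodejs
WORKDIR /app
# Copy only the dependency manifest first, so the npm install layer is
# reused until package.json itself changes.
COPY package.json ./
RUN npm install
# Source code changes invalidate the cache only from this point on.
COPY . .
CMD npm start
```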
Deleting Redundant Files Created by Installation Instructions
Suppose a RUN instruction refreshes the apt-get package lists and then downloads, unpacks, and installs some software packages. The package lists fetched by apt-get update are saved in the /var/lib/apt/lists/ directory, but they are not required in the Docker image when running the app. It is recommended to delete them to prevent the Docker image from becoming too large. The deletion must happen in the same RUN instruction, because files deleted by a later instruction still occupy space in the earlier layer. For example:
# Delete the files created by apt-get update
RUN apt-get update \
    && apt-get install -y nodejs \
    && rm -rf /var/lib/apt/lists/*
Specifying Base Image Tags
When an image tag is not specified, the "latest" tag is used by default, so FROM ubuntu is equivalent to FROM ubuntu:latest. Because "latest" moves to a new image whenever the image is updated, the same Dockerfile can be built on different base images over time, which may cause the build to fail. Unless you specifically require the latest version of the base image, it is recommended to specify a particular image tag, such as "FROM ubuntu:16.04".
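One way to keep the tag pinned while still making upgrades deliberate is to pass it as a build argument. This is a minimal sketch; the argument name UBUNTU_TAG is hypothetical, and ARG before FROM requires a Docker version that supports it (17.05 or later).

```dockerfile
# An ARG declared before FROM can be referenced in the FROM line.
# UBUNTU_TAG is a hypothetical name; 16.04 is the pinned default.
ARG UBUNTU_TAG=16.04
FROM ubuntu:${UBUNTU_TAG}
```

Running docker build with --build-arg UBUNTU_TAG=18.04 would then select a different base image without editing the Dockerfile, while a plain docker build keeps using the pinned default.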
Selecting an Appropriate Base Image
Choose the base image that best suits each app. For example, if you only need to run a Node.js program, you can use the node image instead of the ubuntu image. Moreover, by using the more minimal alpine variant, you can reduce the image size even further. For example:
FROM node:7-alpine
ADD . /app
RUN cd /app && npm install
# Run npm start from the directory that contains package.json
WORKDIR /app
CMD npm start
Using Multi-Stage Builds
A multi-stage build uses multiple FROM statements in one Dockerfile, each of which starts a new build stage. Because the final stage contains only the dependencies and files it needs, the final image is smaller, and builds can be faster. The following example demonstrates a two-stage build process. In stage I, the source code is compiled. In stage II, the executable file from stage I is copied into an empty scratch image and run there, effectively reducing the final image size.
### Stage I
FROM golang:1.16 AS builder
WORKDIR /go/src
COPY myapp.go ./
# CGO_ENABLED=0 produces a statically linked binary that can run in scratch
RUN CGO_ENABLED=0 go build -o myapp myapp.go
### Stage II
FROM scratch
WORKDIR /server
### Reference the executable file from Stage I
COPY --from=builder /go/src/myapp ./
CMD ["./myapp"]
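The same pattern can be sketched for the Node.js app used earlier in this section. This is an illustration under stated assumptions, not a definitive setup: it assumes package.json defines a "build" script that writes its output to /app/dist and that the entry point is dist/index.js, both of which are hypothetical.

```dockerfile
# Stage I: install all dependencies, including devDependencies, and build.
FROM node:7-alpine AS builder
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .
# Assumption: package.json has a "build" script that writes to /app/dist.
RUN npm run build

# Stage II: runtime image with production dependencies only.
FROM node:7-alpine
WORKDIR /app
COPY package.json ./
RUN npm install --production
COPY --from=builder /app/dist ./dist
# Assumption: the built entry point is dist/index.js.
CMD ["node", "dist/index.js"]
```

Build tools and devDependencies stay in the first stage, so only the build output and production dependencies reach the final image.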