
Guide on Best Practices for Dockerfiles and Docker Images

30 Dec 2022 . category: Docker . Comments
#Server #Docker


The following are some best practices for creating Dockerfiles and Docker images.

1. Exclude unnecessary files and folders with a .dockerignore file

The .dockerignore file is used to exclude files and folders that are not relevant to the build. It uses exclusion patterns similar to the .gitignore file used by git.

Here is an example of a .dockerignore file:

    # The following content will be ignored
    */temp*
    */*/temp*
    temp?

The above .dockerignore file causes the following behavior:

  • */temp*: Excludes files and directories whose names start with temp in any immediate subdirectory of the root. For example, /somedir/temporary.txt and the directory /somedir/temp are excluded.
  • */*/temp*: Excludes files and directories starting with temp from any subdirectory that is two levels below the root. For example, /somedir/subdir/temporary.txt is excluded.
  • temp?: Excludes files and directories in the root directory whose names are a one-character extension of temp. For example, /tempa and /tempb are excluded.
Here is another example:

    *.md
    !README.md

All markdown files except README.md are excluded from the context.

    *.md
    !README*.md
    README-secret.md

No markdown files are included in the context except README files other than README-secret.md.

2. Use multi-stage builds

Multi-stage builds allow you to drastically reduce the size of the final Docker image without struggling to reduce the number of intermediate layers and files.

Since an image is built during the final stage of the build process, you can minimize image layers by leveraging build cache.

For example, if your build contains several layers, you can order them from the less frequently changed to the more frequently changed; a sketch of a multi-stage Dockerfile follows the list below.

  • Install tools you need to build your application
  • Install or update the dependencies
  • Generate your application
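
As an illustration, here is a minimal multi-stage Dockerfile sketch for a Node.js application. The yarn build step and the dist output folder are assumptions that will differ from project to project; the point is that only the generated output reaches the final image, so build tools and intermediate files do not.

    # syntax=docker/dockerfile:1
    # Build stage: install all dependencies and generate the application
    FROM node:18-alpine AS build
    WORKDIR /app
    COPY package.json yarn.lock ./
    RUN yarn install
    COPY . .
    RUN yarn build

    # Final stage: install only production dependencies and copy the generated output
    FROM node:18-alpine
    WORKDIR /app
    COPY package.json yarn.lock ./
    RUN yarn install --production
    COPY --from=build /app/dist ./dist
    CMD ["node", "dist/index.js"]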

3. Don’t install unnecessary packages

Avoid installing unnecessary or extra packages; this reduces complexity, dependencies, file sizes, and build times.
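
For instance, on Debian-based base images you can tell apt-get to skip recommended packages so that only what you explicitly list gets installed; the curl package below is just a placeholder:

    RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        && rm -rf /var/lib/apt/lists/*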

4. Decouple applications

Decouple applications into multiple containers, as this makes it easier to scale horizontally and to reuse containers.

For example, a typical web application can have three separate containers, each with its own unique image, to manage the database, web application, and cache in a decoupled manner.
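
A rough docker-compose.yml sketch of that layout could look like the following; the service names and image tags are illustrative assumptions, not a prescription:

    services:
      web:
        build: .             # web application built from this project's Dockerfile
        ports:
          - "8000:8000"
        depends_on:
          - db
          - cache
      db:
        image: postgres:15   # database in its own container
      cache:
        image: redis:7       # cache in its own container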

5. Sort multi-line arguments

Sort multi-line arguments alphanumerically whenever possible. This reduces duplication of packages and makes them easier to update later. It also makes PRs easier to read and review.

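For instance, a RUN instruction with its packages split one per line and sorted alphanumerically is easy to scan, diff, and update; the package list below is only an illustration:

    RUN apt-get update && apt-get install -y \
        bzr \
        cvs \
        git \
        mercurial \
        subversion \
        && rm -rf /var/lib/apt/lists/*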

6. Security scanning of Docker images

It is a good practice to scan for security vulnerabilities using the docker scan command.

To scan Docker images, you must be logged in to Docker Hub.

    docker scan --login
    docker scan image-name

To scan the Docker image named getting-started:

    docker scan getting-started

7. Image layering

Using the docker image history command, it is possible to see the command that was used to create each layer within an image.

    docker image history getting-started

Each line in the output of the above command represents a layer. The output also shows the size of each layer, which helps diagnose large images.

8. Layer caching

It is possible to decrease the build times of container images. One must know that once a layer changes, all downstream layers have to be recreated.

Look at the following Dockerfile:

    # syntax=docker/dockerfile:1
    FROM node:18-alpine
    WORKDIR /app
    COPY . .
    RUN yarn install --production
    CMD ["node", "src/index.js"]

The drawback of the above Dockerfile is that whenever there is any change in the project, the COPY command copies all the files from our directory into the image, and then all the dependencies are installed again.

This can be solved with the following Dockerfile:

    # syntax=docker/dockerfile:1
    FROM node:18-alpine
    WORKDIR /app
    COPY package.json yarn.lock ./
    RUN yarn install --production
    COPY . .
    CMD ["node", "src/index.js"]

With the above Dockerfile, if there are any changes to the project files, only the last two layers, starting from COPY . ., are rebuilt. The layer that installs the dependencies runs only if there are changes to the package.json or yarn.lock file.

9. Do not use your Dockerfile as a build script

The Dockerfile should never be used as a build script, because doing so makes builds unnecessarily long.

We should use the ADD instruction to copy the files necessary for compilation into the image before it starts running commands. This helps keep the Dockerfile short and lets us manage any dependencies required for compilation separately from the Dockerfile.
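
A rough sketch of that idea for a Python project (the file names and base image are assumptions) adds the dependency manifest first, runs the installation against it, and only then adds the rest of the sources:

    FROM python:3.11-slim
    WORKDIR /app
    # Add only the files needed to resolve dependencies before running any commands
    ADD requirements.txt ./
    RUN pip install --no-cache-dir -r requirements.txt
    # Add the rest of the sources afterwards
    ADD . .
    CMD ["python", "app.py"]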

10. Use ENV to define environment variables

If you have a value that needs to be set as an environment variable inside your container, and possibly overridden from outside of it, define it using ENV.
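
A minimal sketch, assuming a variable called APP_PORT (the name and default value are only illustrative):

    # Set a default value inside the image
    ENV APP_PORT=8000

At run time, the value can still be overridden from outside the container, for example with docker run -e APP_PORT=9000 image-name.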

11. Commit your Dockerfile to the repository

Committing the Dockerfile to the repository helps you reference it later without remembering all of the commands.

12. Be mindful of the base image and its image size

Extraneous layers and code increase the Docker image's size, which makes the container start up more slowly. So, include only the packages and scripts that are actually useful. If something does not seem necessary to include in the base image, try to find a way to install it when the container starts up instead.

13. Do not expose secrets

Never copy sensitive information into the Dockerfile. You can use the .dockerignore file to prevent sensitive files from being copied into the image. Another way is to use an environment file to store the sensitive information.
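
For example, instead of baking credentials into the image, you can keep them in an environment file that is listed in .dockerignore and pass it only at run time; the .env and image-name names below are placeholders:

    # .env is listed in .dockerignore, so it never ends up inside the image
    docker run --env-file .env image-name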

14. Do not expose the unnecessary ports

Exposing unnecessary ports can make critical services reachable from the outside world, leaving them vulnerable and open to attack. So, expose only the ports your service actually needs.

Create an EXPOSE entry in the Dockerfile for each port the service needs to make available.

     EXPOSE 8000

This entry exposes port 8000 of the container.
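
Note that EXPOSE on its own only documents which port the container listens on; to actually reach the service from the host, the port still has to be published when the container is run, for example (image-name is a placeholder):

    docker run -p 8000:8000 image-name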

15. Use a specific Docker image version

When you use the latest tag to build your image, the underlying image may change over time and break your application later, because the latest tag is unpredictable and may cause unexpected behavior.

So, instead of choosing a random latest image, pin the version of the image that you want to use. When a new version is released, a developer can test it and then update the image version.
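
For example, pin the base image to an explicit tag instead of relying on latest; the exact version shown here is only illustrative:

    # Unpredictable: the image behind this tag changes over time
    # FROM node:latest

    # Predictable: an explicit version that is upgraded deliberately
    FROM node:18.17.1-alpine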

16. Use small-sized official images

A smaller image reduces the storage space used in the image repository, which also reduces storage usage on the deployment server. It also makes pulling and pushing images to and from the repository faster. So, choose small-sized official images that fulfill the requirements of the application.
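
For instance, the slim or alpine variants of official images are usually much smaller than the full variants; a sketch:

    # Full variant of the official image, noticeably larger
    # FROM python:3.11

    # Slim variant of the same official image, much smaller
    FROM python:3.11-slim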

17. Use the least privileged user

By default, a Dockerfile uses the root user, which introduces a security issue, and there is usually no reason to run containers with root privileges. When a container runs with root privileges, it effectively holds those privileges over the underlying host and its processes.

So, the best practice is to create a dedicated user and a dedicated group in the Docker image to run the application. Use the USER directive to specify the user that runs the container.

Some images come with a generic user bundled in, so we don't need to create a separate dedicated user and group. For example, the node.js image contains a generic node user.
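
A minimal sketch of creating and switching to a dedicated user in an alpine-based image; the appuser and appgroup names are assumptions:

    FROM node:18-alpine
    WORKDIR /app
    COPY . .
    RUN yarn install --production
    # Create a dedicated group and user, then drop root privileges
    RUN addgroup -S appgroup && adduser -S appuser -G appgroup
    USER appuser
    CMD ["node", "src/index.js"]

Alternatively, for the node image, USER node switches to the bundled generic user.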


Tapan B.K. | Full Stack Software Engineer

Tapan B.K. is a Full Stack Software Engineer. In his spare time, Tapan likes to watch movies and visit new places.