Quick Refresher on Dockerfiles: Creating Efficient Container Images

TL;DR:
A Dockerfile is a simple yet powerful script that defines how to build a Docker image. By using a series of instructions, it ensures consistent, automated, and reproducible builds—making it a cornerstone of modern DevOps workflows.


Why Dockerfiles Matter
If you’ve ever deployed an app and thought, “It worked on my machine,” Dockerfiles are your antidote. These text-based blueprints define everything needed to build a container image, from the base OS to runtime commands. Whether you’re working on a personal project or a complex Kubernetes deployment, understanding Dockerfiles is essential to building reliable, portable applications.

In this post, we’ll break down the structure of a Dockerfile, explore key instructions, and share best practices to help you write cleaner, more efficient builds.


What Is a Dockerfile?

A Dockerfile is a plain text file that contains a list of instructions Docker uses to assemble an image. Think of it as a recipe: each line adds an ingredient or performs a step, eventually producing a container-ready image.

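The syntax is straightforward: each line is an instruction followed by its arguments (INSTRUCTION arguments). Instruction names are not case-sensitive, but they are written in uppercase by convention. Comments start with #, and long commands can be split across lines using \.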

Here’s a quick example for a Python app:

FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

This defines a lightweight Python environment, copies your app files, installs dependencies, and sets the default command.


Common Dockerfile Instructions (and What They Do)

Each instruction in a Dockerfile serves a specific purpose. Here are the most commonly used ones:

  • FROM: Specifies the base image (e.g., FROM ubuntu:20.04).
  • RUN: Executes commands in a new image layer (e.g., installing packages).
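  • COPY / ADD: Transfers files into the image. Use COPY for straightforward copies; ADD can also fetch remote URLs and automatically extracts local tar archives.
  • CMD / ENTRYPOINT: Define what runs when the container starts. ENTRYPOINT sets the executable; CMD supplies the default command, or default arguments to ENTRYPOINT, and is replaced by any arguments passed to docker run.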
  • ENV: Defines environment variables.
  • EXPOSE: Documents which ports the container listens on.
  • VOLUME: Creates mount points for persistent or shared data.
  • LABEL: Adds metadata (e.g., version, maintainer).
  • USER: Sets the user for subsequent RUN instructions and for the container’s main process.
  • WORKDIR: Sets the working directory for subsequent instructions.
  • ARG: Defines build-time variables.
  • ONBUILD: Registers instructions that run later, when the image is used as the base for another build.
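
To see how several of these instructions fit together, here is a minimal sketch that extends the earlier Python example. The appuser account, port 8000, the --port flag, and the label value are illustrative assumptions, not something the example app is known to support:

FROM python:3.9-slim
# Metadata only; the value is a placeholder.
LABEL version="1.0"
WORKDIR /app
COPY . .
# Install dependencies and create an unprivileged user (hypothetical name).
RUN pip install --no-cache-dir -r requirements.txt \
 && useradd --create-home appuser
# Later instructions and the container process run as this user.
USER appuser
# Documents the port; it does not publish it.
EXPOSE 8000
# ENTRYPOINT fixes the executable; CMD supplies default arguments.
ENTRYPOINT ["python", "app.py"]
CMD ["--port", "8000"]

With this layout, arguments given after the image name in docker run replace only the CMD part, so the container still starts python app.py.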

For a full list of instructions and syntax, refer to the official Dockerfile reference.


Best Practices for Writing Dockerfiles

Good Dockerfiles aren’t just functional—they’re efficient, secure, and maintainable. Here are some best practices to keep in mind:

1. Minimize Layers

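Instructions such as RUN, COPY, and ADD each add a filesystem layer to the image; most other instructions only record metadata. Combine related commands into a single RUN so that temporary files, such as package lists, are removed in the same layer that created them: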

RUN apt-get update && apt-get install -y \
    curl \
    git \
 && rm -rf /var/lib/apt/lists/*

2. Use .dockerignore

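Just like .gitignore, a .dockerignore file tells Docker which paths to exclude from the build context, the set of files sent to the builder. Excluded files cannot be picked up by COPY . ., which keeps the image lean and keeps secrets out of it. Typical entries are .git, __pycache__/, and .env.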

3. Leverage Build Cache

Docker caches layers to speed up rebuilds. When the inputs to one instruction change, that layer and every layer after it is rebuilt, so order your instructions from least to most likely to change to reuse as much of the cache as possible.
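
As a rough sketch of that ordering, the Python example from earlier can be rearranged so the dependency install is cached separately from the source code. It assumes, as that example does, that dependencies are listed in requirements.txt:

FROM python:3.9-slim
WORKDIR /app
# Dependencies change rarely: copy only the manifest first so this layer stays cached.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Source code changes often: copy it last so edits only invalidate layers from here on.
COPY . .
CMD ["python", "app.py"]

With this order, editing app.py reuses the cached pip install layer, while a change to requirements.txt triggers a fresh install.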

4. Use Official Base Images

Start with trusted, minimal base images such as python:3.9-slim or alpine. Fewer preinstalled packages mean a smaller attack surface and fewer vulnerabilities to patch.

5. Keep Images Small

Remove unnecessary dependencies and clean up after installs. Smaller images are faster to build, transfer, and deploy.
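
One common way to leave build-time files out of the final image is a multi-stage build. The following is only a sketch under assumptions: it reuses app.py and requirements.txt from the earlier example, and the /opt/venv path is chosen here purely for illustration:

# Stage 1: install dependencies into a virtual environment.
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN python -m venv /opt/venv \
 && /opt/venv/bin/pip install --no-cache-dir -r requirements.txt
# Stage 2: start from a clean base and copy in only what is needed at runtime.
FROM python:3.9-slim
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY . .
# Put the virtual environment first on PATH so python resolves to it.
ENV PATH="/opt/venv/bin:$PATH"
CMD ["python", "app.py"]

The saving is largest when the builder stage needs extra tooling, such as compilers, because nothing installed there reaches the final image unless it is copied explicitly.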

For more practical tips and command examples, check out this handy Docker, Kubernetes, and Swarm crib sheet.


Dockerfiles in the Real World

Dockerfiles are foundational in CI/CD pipelines because the same file drives the image build in every environment. In Kubernetes, pods run container images, and a Dockerfile is typically how the custom images for them are built. Because each deployment pulls the same image, every pod gets the same runtime environment, which reduces “it works on my machine” bugs.

Imagine deploying a Python microservice to Kubernetes. With a Dockerfile, you can:

  • Define the Python version and dependencies
  • Set environment variables for different stages (dev, staging, prod)
  • Package the app into a lightweight, portable container

This level of control is what makes Dockerfiles so powerful in modern software development.
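
The environment-variable point can be sketched with ARG and ENV. The variable name APP_ENV and its default of dev are assumptions for illustration, and the app itself would need to read the variable:

FROM python:3.9-slim
# Build-time value, overridable with --build-arg (hypothetical name).
ARG APP_ENV=dev
# Persist it into the image so the running container can read it.
ENV APP_ENV=${APP_ENV}
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]

Building with docker build --build-arg APP_ENV=prod . would bake prod into the image. In Kubernetes it is often simpler to keep one image and set the variable in the pod spec instead, so treat this as one option among several.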


Key Takeaways

  • Dockerfiles define how to build container images using a series of clear, repeatable instructions.
  • Common instructions include FROM, RUN, COPY, CMD, and ENV, among others.
  • Best practices help keep images small, secure, and efficient.
  • Dockerfiles are essential for reproducibility in CI/CD and Kubernetes environments.
  • Resources like the Dockerfile reference and crib sheet offer valuable guidance.

Conclusion

A well-crafted Dockerfile is more than just a build script—it’s a blueprint for stability, scalability, and speed. Whether you’re shipping code to production or spinning up a dev environment, mastering Dockerfiles is a skill that pays off across the entire software lifecycle.

Ready to take your Dockerfiles to the next level? Start by auditing your current ones for unnecessary layers, explore multi-stage builds, and don’t forget to check out the official Dockerfile documentation for deeper insights.

I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.
