ML Zoomcamp 2023 – Deploying Machine Learning Models – Part 6

  1. Environment management: Docker
    1. Why do we need Docker?
    2. Running a Python image with Docker
    3. Dockerfile – general information
    4. Dockerfile for churn app
    5. Building a Docker image
    6. Running a Docker image

Environment management: Docker

Why do we need Docker?

Docker is a powerful tool that addresses many challenges in modern software development and deployment. It allows us to isolate entire applications from the rest of the system’s processes and dependencies, providing a level of encapsulation and portability that is invaluable in today’s computing landscape.

One of Docker’s core advantages is its ability to create containers, which are lightweight, portable environments. These containers can include not only the application itself but also all the necessary libraries, dependencies, and even a different operating system distribution. This isolation ensures that a single system can host multiple containers, each running its own set of services and applications.

By placing different services in separate containers, we achieve complete isolation between them. These services are unaware of each other’s presence, as each one operates in its own container, believing it is the sole process running on the system. This isolation greatly simplifies software management, maintenance, and troubleshooting.

Moreover, Docker allows us to define specific environments with precision. For example, we can create a container with Ubuntu 18.04, configure it with different system libraries, and specify particular Python versions. Docker’s flexibility in environment definition ensures consistency across development, testing, and production stages.

Perhaps the most compelling feature of Docker is its portability. Once we’ve encapsulated a service within a Docker container, we can easily move it between different environments, from a local development machine to the Cloud or any other deployment target. This consistent packaging and deployment process saves time and reduces potential deployment issues.

In summary, Docker revolutionizes the way we manage and deploy applications by providing isolation, flexibility, and portability. It has become an essential tool in modern software development, empowering developers to build, test, and deploy software reliably across various platforms and environments.

Running a Python image with Docker

To get started with running a Python image in Docker, you can search for available Python images on Docker Hub by navigating to ‘hub.docker.com/_/python’.

Once there, you’ll find various tags for Python images that you can choose from. Let’s select a specific version, such as ‘3.8.12-slim’.

To run the chosen Python image, you can use the following command:

docker run -it --rm python:3.8.12-slim

Here’s what each part of the command does:

  • -it: This combines -i (keep STDIN open) and -t (allocate a pseudo-terminal), letting you interact with the container’s terminal.
  • --rm: This flag automatically removes the container once it exits, so it doesn’t linger on your system.

If the specified Python version isn’t already present on your system, Docker will automatically download it from the internet. Running the command will grant you access to the Python terminal within the container.

Additionally, you can access the container’s terminal directly by overriding the entrypoint. The entrypoint is the default command that gets executed when you run the container. Here’s how you can get a bash shell instead:

docker run -it --rm --entrypoint=bash python:3.8.12-slim

Running this command will open a terminal within the container, allowing you to interact with it directly.

Dockerfile – general information

A Dockerfile is a script-like text file used in Docker to build container images. It contains a series of instructions that Docker follows to create a reproducible and self-contained environment for running applications.

Here are some key aspects of a Dockerfile:

  1. Base Image: You start by specifying a base image, which serves as the foundation for your container. Base images are typically lightweight Linux distributions or specific application images.
  2. Instructions: Dockerfiles consist of various instructions, each responsible for a specific task. Common instructions include FROM (to set the base image), RUN (to execute commands inside the container during build), COPY (to copy files into the container), and CMD (to specify the default command to run when the container starts).
  3. Layering: Docker images are built in layers. Each instruction in the Dockerfile creates a new layer. Layers are cached, and if nothing changes in a layer, Docker can reuse it from cache, making builds faster.
  4. Environment Configuration: You can set environment variables, configure ports, define working directories, and perform other setup tasks to configure the container environment.
  5. File Copying: You can copy files from the host system into the container image using the COPY instruction. This is often used to add application code, configuration files, or other assets.
  6. Container Execution Command: The CMD instruction specifies the default command to run when the container is started. It’s often used to launch the primary application within the container.
  7. Best Practices: Following best practices in Dockerfile design can help create efficient and secure images. This includes minimizing the number of layers, avoiding installing unnecessary packages, and keeping images small.

Dockerfiles are a crucial component of Docker’s philosophy of containerization. They allow developers to define the entire environment needed to run an application, ensuring consistency across different environments and simplifying deployment and scaling.
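The building blocks listed above can be sketched in a minimal, hypothetical Dockerfile — the flask dependency and ‘app.py’ are placeholders for illustration, not files from the course:

```dockerfile
# Base image: a slim Debian-based Python distribution
FROM python:3.8.12-slim

# RUN executes a command at build time, creating a new cached layer
RUN pip install flask

# Set the working directory for all subsequent instructions
WORKDIR /app

# Copy application code from the build context into the image
COPY app.py ./

# Default command executed when the container starts
CMD ["python", "app.py"]
```

Because each instruction produces a layer, placing rarely-changing steps (like dependency installation) before frequently-changing ones (like copying source code) lets Docker reuse cached layers on rebuilds.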

Dockerfile for churn app

Let’s examine the following Dockerfile for the churn app, along with explanations of the steps involved:

WORKDIR /app

This instruction sets the working directory to ‘/app’ within the Docker container. If the directory doesn’t exist, it creates it and navigates to it, similar to the ‘cd’ command.

COPY ["Pipfile", "Pipfile.lock", "./"]

Here, we copy the ‘Pipfile’ and ‘Pipfile.lock’ from the local directory into the ‘/app’ directory within the Docker container.

RUN pipenv install --system --deploy

Instead of creating a virtual environment within the Docker container (as a plain RUN pipenv install would), this instruction installs the required dependencies directly into the system environment. Since the container itself already provides isolation, this is often preferred to keep the image smaller and simpler. The --deploy flag additionally makes the build fail if ‘Pipfile.lock’ is out of sync with the ‘Pipfile’, which guards against irreproducible builds.
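For context, the Pipfile that these commands consume might look roughly like the following — the packages and pinned versions here are illustrative, not the course’s actual dependency list:

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
scikit-learn = "==1.0.2"
flask = "*"
gunicorn = "*"

[requires]
python_version = "3.8"
```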

However, two important pieces are still missing. First, we should declare which port the service inside the container listens on. We do this with the ‘EXPOSE’ instruction, which documents that the container uses port 9696 (on its own, it does not publish the port to the host):

EXPOSE 9696

Finally, when running the container, we need to publish the port by mapping it between the host machine and the container. This is done using the ‘-p’ option, where ‘9696:9696’ (host port : container port) indicates that port 9696 on the host machine should be mapped to port 9696 within the container.

FROM python:3.8.12-slim

RUN pip install pipenv

WORKDIR /app
COPY ["Pipfile", "Pipfile.lock", "./"]

RUN pipenv install --system --deploy

COPY ["predict.py", "model_C=1.0.bin", "./"]

EXPOSE 9696

ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "predict:app"]

Building a Docker image

To build the Docker container, use the following command:

docker build -t zoomcamp-test .

This command instructs Docker to build an image tagged as ‘zoomcamp-test’ using the current directory as the build context.

Running a Docker image

To run the container, use the following command:

docker run -it --rm -p 9696:9696 zoomcamp-test

The port mapping established by -p 9696:9696 publishes container port 9696 on port 9696 of the host machine. This enables our ‘test.py’ script running on the host to reach the churn service inside the Docker container through localhost:9696.
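As a sketch, a ‘test.py’ on the host could call the containerized service like this — the customer fields and the ‘/predict’ route are assumptions about the churn app, not taken verbatim from it:

```python
import json
from urllib import request

# Hypothetical customer record; the field names must match the
# features the churn model in predict.py was trained on.
customer = {
    "contract": "two_year",
    "tenure": 12,
    "monthlycharges": 19.7,
}

def predict(customer, url="http://localhost:9696/predict"):
    """POST the customer as JSON to the service and decode the JSON reply."""
    body = json.dumps(customer).encode("utf-8")
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the container running: print(predict(customer))
```

Because the container’s port 9696 is published on the host, the script talks to localhost as if the service were running natively.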
