I first worked with Docker containers a few months ago and, from the very first image pull, I noticed how robust and easy to use the platform is.
I’m currently running WSL2 with the Ubuntu distribution offered in the Microsoft Store. Installing WSL2 and a Linux distro is also easy if you follow the very detailed Microsoft documentation at: Install Windows Subsystem for Linux (WSL) on Windows 10 | Microsoft Docs
Since then, almost all my data science tooling runs in containers, including a Python environment, Jupyter and JupyterLab, TensorFlow, MySQL and even a Docker Compose setup with Airflow and Postgres, with images pulled as-is from Docker Hub.
There are usually a few tricks to getting some of these tools to run well in a container, but once you understand how a Docker container works, problems become easy to fix. This is especially true when you need to persist your data or settings: containers are ephemeral, and anything written to a container’s own filesystem is lost when the container is removed, unless it lives on a mounted volume or host folder.
In this article I will show a few simple steps to create your own Docker image, either starting from an empty dockerfile or modifying an existing one.
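To make the persistence point concrete, here is a minimal sketch of a bind mount. The helper name run_with_bind_mount and the notebooks folder are hypothetical, and jupyter/base-notebook is used only as a convenient example image. The idea is that files written under the mounted path live on the host rather than inside the container.

```shell
# Hypothetical helper: start a throwaway Jupyter container whose work
# folder is bind-mounted from the host, so notebooks outlive the container.
run_with_bind_mount() {
  host_dir="$1"            # host folder that will hold the data (placeholder)
  mkdir -p "$host_dir"
  # --rm removes the container when it stops; the host folder is untouched.
  docker run -d --rm -p 8888:8888 \
    -v "$host_dir":/home/jovyan/work \
    jupyter/base-notebook
}

# Example call (requires Docker to be running):
# run_with_bind_mount "$PWD/notebooks"
```

With this setup, anything saved under /home/jovyan/work inside the container should still be in the notebooks folder after the container is gone.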
You can find the customized dockerfile and other stuff in my GitHub repo: ruginski/DSDocker
What is Docker?
Docker is an amazing platform that uses OS-level virtualization: containers share the host’s kernel instead of each running a full guest operating system, which makes them much lighter and more flexible than virtual machines. It helps automate the deployment of applications both on premises and in the cloud.
The image below shows a comparison between standard VMs and Docker containers:
What is a dockerfile?
A dockerfile is simply a text file that contains all the commands and instructions that will be used by the docker build tool to create the image.
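As an illustration of what such a file can look like, the sketch below writes a small dockerfile into a scratch folder. It is a hypothetical example built on an Ubuntu 22.04 base, not one of the files used later in this article; the folder name dockerfile-demo and the choice of JupyterLab are assumptions made only for the example.

```shell
# Write a small, hypothetical dockerfile into a scratch folder.
mkdir -p dockerfile-demo
cat > dockerfile-demo/Dockerfile <<'EOF'
# Base image to start from
FROM ubuntu:22.04

# Install Python and pip, then clean the apt cache to keep the image small
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Install JupyterLab with pip
RUN pip3 install --no-cache-dir jupyterlab

# Document the port the server listens on
EXPOSE 8888

# Command executed when a container starts from this image
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
EOF

# Show the file that docker build would read
cat dockerfile-demo/Dockerfile
```

Each instruction (FROM, RUN, EXPOSE, CMD) is read top to bottom by docker build, and each RUN adds a new layer to the image.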
The Jupyter Docker Stacks project publishes a set of ready-to-run images, each with its own dockerfile, such as:
- jupyter/base-notebook: the base for all the other images.
- jupyter/scipy-notebook: packages from the scientific Python ecosystem.
- jupyter/tensorflow-notebook: Python libraries for deep learning.
- jupyter/r-notebook: packages for R development.
The complete list and descriptions can be found at: Selecting an Image — docker-stacks.
Building the image
So, let’s see how I built my custom Docker image from an existing dockerfile taken from jupyter/docker-stacks.
In my case, I merged the dockerfiles of jupyter/minimal-notebook and jupyter/scipy-notebook, as they already contain most of the libraries, dependencies and commands I need. The original scipy-notebook uses minimal-notebook as its base, which in turn uses base-notebook as its base, and I was looking for a single dockerfile that did not depend on so many other images.
Step 1 — Create the dockerfile
Create a folder that will contain the dockerfile and any other files needed. The folder can be placed inside the WSL2 distro or on your host PC. Let’s keep it simple and create it on the host PC as C:\docker\ds-notebook
Copy the dockerfile that you want to use as a base into this folder. In this case, let’s copy the one from docker-stacks/scipy-notebook.
Open it with your preferred text editor; I suggest VS Code.
Step 2 — Make your changes
Now, all you have to do is edit the file according to your needs. You can add or remove OS updates and libraries, among many other things.
As previously mentioned, I merged jupyter/minimal-notebook and jupyter/scipy-notebook; besides that, all I did was add more libraries and make a few other changes. The final version can be found at: DSDocker/ds-notebook
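To give an idea of what a typical edit looks like, the sketch below appends one extra layer to a scratch copy of a dockerfile. This is not the list of changes from DSDocker: the edit-demo folder and the plotly and xgboost packages are placeholders chosen only for illustration.

```shell
# Scratch copy standing in for the dockerfile being edited.
mkdir -p edit-demo
printf 'FROM jupyter/scipy-notebook\n' > edit-demo/Dockerfile

# Append one extra layer; plotly and xgboost are placeholder packages.
cat >> edit-demo/Dockerfile <<'EOF'

# Extra libraries on top of what the base image already ships
RUN pip install --no-cache-dir plotly xgboost
EOF

# Show the edited file
cat edit-demo/Dockerfile
```

In practice you would make the same kind of change directly in your editor; adding a RUN line near the end of the file keeps the earlier layers cached between builds.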
Step 3 — Build the image
Now the fun part: let’s actually build the image!
Open your WSL console and head to the folder we’ve previously created (the C: drive is mounted under /mnt/c inside WSL2):
cd /mnt/c/docker/ds-notebook
Make sure Docker Desktop is up and running before proceeding!
Type the following command:
docker build -t ds-notebook .
Note: You can replace “ds-notebook” with the name you want for your image, as long as it is lowercase.
The build process will start. The docker build command downloads and installs all the dependencies and libraries, so grab a coffee and come back in a few minutes.
Step 4 — Check if the image was built
Run the docker images command and check that your new image is listed:
docker images
Step 5 — Start the container
Use the docker run command as below to start your new container:
docker run -d -p 8888:8888 -e JUPYTER_RUNTIME_DIR=/tmp -v "$PWD":/home/jovyan --name DS-Notebook ds-notebook:latest
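Once the container is running in the background, Jupyter normally writes a login URL with an access token to the container log. The helper below is a small sketch for pulling that URL out; the function name notebook_url is hypothetical, and it assumes the log contains a line of the form http://127.0.0.1:8888/...token=..., which may vary between image versions.

```shell
# Hypothetical helper: print the tokenized login URL found in the log
# of a running Jupyter container (defaults to the DS-Notebook container).
notebook_url() {
  # Jupyter logs to stderr, so merge stderr into stdout before filtering.
  docker logs "${1:-DS-Notebook}" 2>&1 \
    | grep -o 'http://127\.0\.0\.1:8888/[^ ]*token=[^ ]*' \
    | sort -u
}

# Example call (requires the container started above):
# notebook_url
```

Paste the printed URL into your browser on the host; port 8888 is reachable there because of the -p 8888:8888 mapping in the docker run command.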