Run Your Data Science Workload in Containers with Jupyter Notebook

Tirth Patel
May 28, 2021

Suppose you are building a UI application and deploying it in a Docker container. If you want the application to display its user interface on your local machine while it runs inside the container, you have to connect the container's display to the display of the local machine. Here, the goal is to run Jupyter Notebook, which runs in a browser, inside a Docker container.

Suppose you want to run the Firefox browser inside a Docker container. First, let's look at the types of applications and how they run. There are two main types of applications that can be containerized:

  1. Applications that run as a background service, e.g. a web server
  2. GUI applications that run in the foreground

Any GUI application of this kind needs an X server to draw on, which is already available on a Linux host that runs a graphical desktop.

Let's understand what the X server is.

The X server is a single binary executable, Xorg (on RHEL 8 it is typically installed as /usr/bin/Xorg; older releases used /usr/X11R6/bin/Xorg), that dynamically loads any necessary X server modules at runtime from its modules directory (/usr/lib64/xorg/modules/ on 64-bit RHEL 8; /usr/X11R6/lib/modules/ on older releases). Some of these modules are loaded automatically by the server, while others are optional. The X server's configuration files are stored in the /etc/X11/ directory, and its main configuration file, when present, is /etc/X11/xorg.conf.

But when we launch a container, no X server is configured inside it. What we can do instead is use the host's X server:

  • We can share the host's X server credentials with the container by mounting the host's .Xauthority file as a volume.
--volume="$HOME/.Xauthority:/root/.Xauthority:rw"

We pass this as an option to docker run so that the container can authenticate to the X server of the Docker host, which is RHEL 8 in our case.

  • We have to share the host's DISPLAY environment variable with the container.
--env="DISPLAY"
  • We also have to run the container on the host network, so that it can reach the host's X server.
--net=host

Now we run the container with these options.

docker run -it --name jupyteros --net=host --env="DISPLAY" --volume="$HOME/.Xauthority:/root/.Xauthority:rw" centos:latest
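
Before starting the container, it can help to confirm that the host actually has a display and a cookie file to share. The following is a minimal sketch, not part of the original setup: check_x11_prereqs is a hypothetical helper name, and it assumes the cookie file sits at $HOME/.Xauthority, the same path the docker run command mounts.

```shell
# Hypothetical helper: verify that the host has an X display and an
# Xauthority file before they are shared with a container.
check_x11_prereqs() {
    # Assumption: the cookie file is the one mounted by the docker run command.
    xauth_file="$HOME/.Xauthority"

    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; there is no X session to share" >&2
        return 1
    fi
    if [ ! -f "$xauth_file" ]; then
        echo "Xauthority file not found: $xauth_file" >&2
        return 1
    fi
    echo "DISPLAY=$DISPLAY, Xauthority=$xauth_file"
}
```

If check_x11_prereqs succeeds on the host, the docker run command above should have both a display and a cookie file to pass into the container.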

Next, we can install some basic utilities, such as ncurses and net-tools (for checking the IP address).

Now run yum install python3 to install Python inside the container.

To use Jupyter Notebook we need a browser, so here I install Firefox inside the container with yum install firefox

Next, let's install Jupyter by running pip3 install jupyter

Let us first create a workspace directory in which we can create a Jupyter notebook to train our model.

Now let's transfer our dataset from RHEL 8, i.e. the Docker host, to the container.

Go to the RHEL 8 terminal and run the command below.

docker cp <SOURCELOCATION>  <CONTAINERNAME>:<DESTINATIONLOCATION>
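
As a concrete illustration of the placeholders, here is a small sketch. The helper copy_dataset, the file name dataset.csv, and the /workspace/ path are hypothetical; jupyteros is the container name used in the docker run command earlier.

```shell
# Hypothetical helper: copy a file from the Docker host into a container,
# refusing to run if the source path does not exist.
copy_dataset() {
    src="$1"        # path on the Docker host
    container="$2"  # container name or ID
    dest="$3"       # destination path inside the container

    if [ ! -e "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi
    docker cp "$src" "$container:$dest"
}
```

On the host, this could be called as copy_dataset ./dataset.csv jupyteros /workspace/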

If we now go to the workspace in the container, we can see that the dataset has been transferred.

We need libraries such as pandas, NumPy, and scikit-learn to carry out data science work. Let's install them using pip3.

pip3 install pandas numpy scikit-learn
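
To confirm that the install worked, one option is to try importing each library. The sketch below is a convenience rather than part of the original steps: missing_modules is a hypothetical helper, and note that scikit-learn is imported under the name sklearn.

```shell
# Hypothetical helper: print the name of every module that python3
# fails to import; prints nothing when all imports succeed.
missing_modules() {
    for mod in "$@"; do
        python3 -c "import $mod" 2>/dev/null || echo "$mod"
    done
}
```

Inside the container, missing_modules pandas numpy sklearn should print nothing if all three libraries are importable.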

Our environment is now ready inside the Docker container. Run the command jupyter notebook --allow-root (the --allow-root flag is needed because we are running as root inside the container).

Now create a new Python 3 notebook.

You can now do all the pre-processing and analysis and create a model.

Thanks for reading :)
