Introduction to Docker | Tech
by Saumya Shah
Notes from Introductory course on Docker:
Introduction:
What are containers?
- Group of processes running in isolation (processes run on same shared kernel)
- Isolated by having its own set of namespaces
- Monitored by Control groups or cgroups
VMs vs Containers
- VMs run a hypervisor, run full blown OS on same host and are heavy, slow to start
- Containers - no full OS, only OS specific files as per need of applications, thus are fast to start and light weight
- Containers give benefits of isolation that VM have, but at lower cost
- Containers can be run on top of VMs, run on base OS/Kernel -> Better resource utilization.
- Containers provide standard developer to operations interface.
- Containers include everything they need to run the processes within them. You need not install additional dependencies on the host.
What is docker?
- Namespaces and cgroups have been since longer but not used because tooling wasn’t good enough.
- Docker is tooling to package using containers and integrating in CI/CD pipelines.
- “Works on my machine problems” no longer exist. No environmental drift problems as images can be shipped easily.
- Docker provides ecosystem for containers.
My first container:
You need to have docker installed.
docker container run -t Ubuntu top
docker container run
command runs a container with Ubuntu image by using the top command. -t flag allocates pseudo-TTY which you need for the top command. This will provide the list of processes running inside the container as the top command does. Due to isolation by namespaces, you only see one process I.e. top process as nothing else is running in the container.
We are using Ubuntu image but it doesn’t have its own kernel, it is shared with the base.
In new terminal, see the list of containers using docker container ls
command.
Next, run docker container exec -it <container id> bash
to run bash inside that container. -it flag lets you enter with interactive mode while allocating pseudo-terminal.
docker container exec
This allows you to enter a running container’s namespaces with a new process.
You can now run all the ubuntu commands here. Try ps -ef
and see the running processes in the container.
Use exit
command to exit the container.
Press CTRL+C to clean the container running the top process.
Namespaces are features of Linux systems but docker works on Mac and Windows too because Docker product has embedded a Linux Subsystem.
On docker store and docker hub, you can find many Docker images. Store images are certified by Docker and Certified images are enterprise ready.
Running multiple containers
Start by using nginx image from the docker store.
docker container run —detach —publish 8080:80 —name ngnix ngnix
detach flag runs this in the background. publish flag publishes port 80 (default for nginx) in the container using port 8080 on host. NET namespace gives processes their own network stack. name flag assigns a name, if you don’t one will be assigned randomly. It makes it easier so you don’t have to use container Id.
NGINX is lightweight web server, you can access it at http://localhost:8080
Next run a MongoDB server. Instead of default latest version, let’s use version 3.4.
docker container run —detach —publish 8081:27017 —name mongo mongo:3.4
You can find it similarly at http://localhost:8081
You can list the containers using the docker container ls
command and further inspect using docker container inspect <container id>
command.
Removing the containers
Stop the containers using docker container stop <container id>
command. You only need to provide first few letters of id until they are unique. Next run docker system prune
to remove the stopped containers.
Making Custom Docker images
Docker Images
- Docker image is a tar file of file system of the container.
- Can be shared and distributed to other users with other ends
- Many containers can be instantiated with an image
- Docker images are downloaded from docker hub only once, later it is reused.
- Docker registry is where we can pull and push images, Docker Hub is one such central registry, just like GitHub.
- Docker hub has many prepackaged images to use, like we did.
- Within enterprise, we can use private registry. You can self-host one too. Registry is available as a docker image too.
Building an image
- Create a “Dockerfile” with list of instructions to construct the container.
- run
docker build -f Dockerfile
Understanding docker layers
- Each line in the docker file is a new layer on image. They are cached, built on previous layers. If you change last layer, all the previous unedited layers remain same and only last one is pushed/rebuilt.
- Organize the docker file such that more dynamic files are added at the end. So that rebuilding is fast as previous ones are cached.
- Image layers are read-only.
- Union File system: Merge image layers into single file system for each container. Handles conflicts and separates read-only underlying image layers and writable layers which can be edited automatically I.e. thin read-write layer on top of containers.
- Copy on write: Copies files up to the writable container layer
- Both of these help use resources optimally.
Building and pushing
- Make a python file. Enter the following command in terminal.
`echo ‘from flask import Flask
app = Flask(name)
@app.route(“/”) def hello(): return “hello world!”
if name == “main”: app.run(host=”0.0.0.0”)’ > app.py `
Install flask using pip and run the python file locally.
2. Create a docker image
Create a file named Dockerfile and copy the following:
FROM python:3.6.1-alpine
RUN pip install flask
CMD ["python","app.py"]
COPY app.py /app.py
- FROM line indicates the starting image to build the layers on top of. We use alpine (Linux distro) because of its lightweight nature. Be aware of the base image being used, if not certified, it can be a malware.
- If no tag (like 3.6.1-alpine) is specified, it shall use latest.
- Next line, RUN command installs the flask package as we did before. RUN commands are executed during build time and are added to the layers of your image.
- CMD is the command executed during start of the container. Here we are just running the app.py file. You can specify only one CMD per Dockerfile, if there are more, only the last one will take effect.
- The last line copies the file to the local directory, where you build the image. Notice that it is the last line because that is the file likely going to be modified a lot. Also notice, even if it is after CMD line, CMD is only executed during the start of the container and not during the build.
3. Build the image
Run docker image build -t python-hello-world .
Notice the period at the end of the command. Check if the image is built using docker image ls
command.
4. Running the image
Next, run the image using docker run -p 5001:5000 -d python-hello-world
-p flag maps the port 5000 inside the container to host. Check https:/localhost:5001 in browser to see result.
Check container log outputs using docker container logs <container id>
5. Pushing the image to registry
Create an account on docker hub, if you don’t have one already. Then login in terminal docker login
. Tag the image with your username as docker tag python-hello-world <username>/python-hello-world
And lastly push using docker push <username>/python-hello-world
.
You can check the upload by visiting your profile on Docker Hub.
Next, try to make some changes in app.py like changing the “hello world” to something else. Rebuild the image using docker image build -t <username>/python-hello-world
and push the same. Notice “using cache” while building and “layer already exists” while pushing. Use
docker diff <container id>
to see which files have been pulled up to the container level (thin read write layer). You can also use docker image history python-hello-world
command to look closely at the layers.
Container Orchestration
For production, face problems of scheduling applications, scaling, updates, fault tolerance, Zero downtime deployments, etc.
Container Orchestration is a project that helps with the following to solve the above problems:
- Cluster management
- Scheduling
- Service discovery
- Replication
- Health management
- Declare desired state
- Active reconciliation, actively rescheduling containers on failed nodes
E.g. Kubernetes, Docker Swarm, MESOS, etc. Also hosted services like Amazon AWS ECS, Microsoft Azure, IBM cloud container service, etc.
Other features:
- Docker swarm: special mode activated by command
docker swarm init
. You can use —advertise-addr flag to specify the address which other nodes will use to join swarm. It generates a token for other nodes to join. - To run containers on swarm, we need to create service, which is an abstraction representing multiple containers of the same image deployed across the cluster.
- —publish flag while using swarm has routing mesh which takes care of the requests coming in on the respective port I.e. the requests are sent to port on any node that is running a container for that service.
- You can scale up using —replica flag.
- If a node goes down, it reconciles containers on remaining nodes.
- Have multiple managers is a good idea to accommodate manager node failure. Manager nodes implement raft consensus algorithm, and with more than 50% of the nodes agreement, the state is stored for the cluster.
- There can be thousands of worker nodes
- To use GUI with docker for linux
xhost +
sudo docker run -it --rm -ti --net=host --ipc=host -v /tmp/.X11-unix:/tmp/.X11-unix --env="QT_X11_NO_MITSHM=1" --gpus all -e DISPLAY=$DISPLAY <image_name>
- To save changes to the container as an image, use docker commit:
sudo docker commit [CONTAINER_ID] [new_image_name]
That’s it for the day.