Docker - Overview


Docker

Docker is a platform designed to make it easier to create, deploy, and run applications using containers. Containers allow developers to package an application with all its necessary parts, such as libraries and dependencies, and ship it as one package. This ensures that the application will run in the same way regardless of where the container is being run, providing consistency across different stages of the development lifecycle and across different environments.

Dockerfiles

A Dockerfile is a text file of instructions that Docker reads to build a container image automatically. The instructions specify the base image to start from, the software to install, the files to copy into the image, the environment variables to set, and other settings such as the default command. Once you have a Dockerfile, you can use the docker build command to create a Docker image from it.
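
As a rough illustration of the instructions described above, a minimal Dockerfile might look like the sketch below. The base image (python:3.11-slim), the file names requirements.txt and experiment.py, and the /app directory are assumptions chosen for the example, not part of any lab setup.

```dockerfile
# Base image to start from (an assumed choice for illustration)
FROM python:3.11-slim

# Environment variable to set: print Python output immediately
ENV PYTHONUNBUFFERED=1

# Working directory inside the image (hypothetical path)
WORKDIR /app

# Copy the dependency list first so this layer can be reused
# from the build cache when only the code changes
COPY requirements.txt .

# Software to install
RUN pip install --no-cache-dir -r requirements.txt

# Files to copy into the image (experiment.py is a hypothetical script)
COPY experiment.py .

# Default command run when a container starts from this image
CMD ["python", "experiment.py"]
```

With this file in the current directory, a command such as docker build -t kdd-example . would build an image from it, where kdd-example is a made-up image name.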

Docker Images

A Docker image is a lightweight, stand-alone, executable software package that contains everything needed to run a piece of software, including the code, runtime, system tools, system libraries, and settings. It is created from a set of instructions specified in a Dockerfile. Once created, this image can be used to instantiate Docker containers that run the packaged software.

Docker Containers

A Docker container is a runnable instance of a Docker image. It combines the image's read-only contents (the application and its environment) with a thin writable storage layer, and it runs in isolation from other processes on top of the host operating system's kernel.

Why Use Docker at the KDD Lab?

Environment Consistency

Research often requires replicable and consistent environments. Docker ensures that the software environment is consistent across different stages of research, from development to production, reducing the "it works on my machine" problem.

Isolation

If multiple experiments or processes need to be run simultaneously, Docker containers can isolate them, ensuring they don't interfere with each other.

Version Control for Environments

Docker images can be versioned with tags, and each image also has an immutable content digest, which can be crucial for research reproducibility. A researcher who wants to return to an earlier version of an experiment environment can do so by running the image with the corresponding tag.

Resource Efficiency

Unlike virtual machines, which each run a full guest operating system, Docker containers share the host OS kernel, which makes them lightweight to start and run. This is particularly useful in labs where computing resources are limited and must be used efficiently.

Easy Distribution

Researchers can share their Docker images with colleagues, ensuring that the recipient has the exact same environment. This is beneficial for collaborative research.

Safety and Experimentation

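Containers allow researchers to experiment with less risk. If something breaks inside a container, the damage is normally confined to that container and does not affect the host machine or other containers, provided the container has not been given privileged access or writable access to host directories. This safety net can encourage more adventurous testing and experimentation.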

Collaboration with Industry

If the lab collaborates with industry partners, Docker provides a means to package research outputs in a manner that's easily deployable in industry settings.

Last updated by rotclanny on Oct 7, 2023