A container is a lightweight, self-contained software unit: it bundles an application's code with the dependencies the application needs to run reliably. Running one requires OS-level virtualisation support, while the container itself packages the code, runtime, system tools, libraries, and settings.
An NVIDIA Container is one of the NVIDIA processes and services that run on a PC after the NVIDIA drivers are installed, where it appears as nvcontainer.exe. NVIDIA also uses containers for developing, testing, and benchmarking, and they ease the deployment of High-Performance Computing (HPC) applications and deep learning (DL) frameworks.
These containers are especially useful in data centres because they encapsulate an application's dependencies, which gives reproducible and reliable execution of applications and services. NVIDIA now offers GPU-accelerated containers as well.
How to Run a GPU-Supported Container Runtime?
The NVIDIA Container Runtime is a container runtime that is aware of Graphics Processing Unit (GPU) hardware. Because it is compatible with the Open Containers Initiative (OCI) specification used by Docker, CRI-O, and other container technologies, it simplifies building and deploying GPU-accelerated applications across the container runtime ecosystem.
However, NVIDIA's next-gen GPU-aware container runtime must first be integrated with the container technology in use, on a system where the NVIDIA drivers are installed.
Environment Variables Required
The NVIDIA Container Runtime uses certain environment variables to identify GPU-accelerated containers and to decide what to expose to them. These include the following:
- NVIDIA_VISIBLE_DEVICES: Controls which GPUs are accessible inside the container (by default, all GPUs are accessible)
- NVIDIA_DRIVER_CAPABILITIES: Controls which driver capabilities (driver features such as graphics or compute) are exposed to the container
- NVIDIA_REQUIRE_*: Defines constraints on the configurations the container supports (such as compute capability, driver features, or minimum CUDA version)
Once these environment variables are detected, GPU support is enabled for the container. If none are detected, the default behaviour of 'runc' is used.
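As a minimal sketch of how these variables might be passed on the command line, the helper below prints a `docker run` command instead of executing it. It assumes the runtime has been registered with Docker under the name "nvidia"; the function name `nvidia_run_cmd` and the image name `example/cuda-image` are hypothetical.

```shell
#!/bin/sh
# Sketch only: the image name is hypothetical, and the runtime is assumed
# to be registered with Docker under the name "nvidia".

# Print (rather than execute) a `docker run` command that sets the
# NVIDIA variables, so the command can be reviewed before use.
nvidia_run_cmd() {
    devices="$1"        # NVIDIA_VISIBLE_DEVICES, e.g. all, none, 0 or 0,1
    capabilities="$2"   # NVIDIA_DRIVER_CAPABILITIES, e.g. compute,utility
    image="$3"          # container image to start
    printf 'docker run --rm --runtime=nvidia'
    printf ' -e NVIDIA_VISIBLE_DEVICES=%s' "$devices"
    printf ' -e NVIDIA_DRIVER_CAPABILITIES=%s' "$capabilities"
    printf ' %s nvidia-smi\n' "$image"
}

# Expose only the first GPU, with the compute and utility capabilities.
nvidia_run_cmd 0 compute,utility example/cuda-image
```

NVIDIA_REQUIRE_* constraints are usually set inside the image rather than on the command line, which is why the sketch leaves them out.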
Enable GPU Support for Docker
Docker is a popular container technology, and the NVIDIA Container Runtime makes it GPU-aware. Developers register the runtime with Docker and select it when a container is created, which exposes NVIDIA GPUs to the applications in the container.
Before installing, make sure the system meets the following prerequisites:
- Docker is installed: go through the instructions for getting Docker and check which version is supported on the system
- The cuda-drivers package is installed, either through the package manager or directly from the driver download site
(Setting up the Docker Engine manually is also possible, provided the instructions are followed)
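Registering the runtime with Docker can be sketched as a daemon configuration file. The example below writes such a file to a scratch location for review. The install path `/usr/bin/nvidia-container-runtime`, the runtime name "nvidia", and the helper name `write_daemon_json` are assumptions, so check them against the actual installation.

```shell
#!/bin/sh
# Sketch only: /usr/bin/nvidia-container-runtime is an assumed install
# path; check where the package placed the binary on your system.

# Write a Docker daemon configuration that registers a runtime named
# "nvidia". The target file is passed in, so the result can be reviewed
# before it is copied to /etc/docker/daemon.json.
write_daemon_json() {
    target="$1"
    cat > "$target" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
}

# Write to a scratch file and show it, leaving /etc/docker untouched.
write_daemon_json "${TMPDIR:-/tmp}/daemon.json.example"
cat "${TMPDIR:-/tmp}/daemon.json.example"
```

After the reviewed file is in place as /etc/docker/daemon.json and the Docker daemon has been restarted, `docker info` should list the new runtime on its "Runtimes" line. If a daemon.json already exists, merge the "runtimes" entry into it rather than overwriting the file.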
Enable GPU Support for CRI-O
CRI-O is a lightweight container runtime for Kubernetes that implements the Container Runtime Interface (CRI).
Like Docker, CRI-O works with OCI-compatible runtimes, so it can use the NVIDIA Container Runtime to run GPU-accelerated applications on Kubernetes. With a 'container' available, the integration happens via a plugin.
Enable GPU Support for Linux Containers (LXC)
LXC is an operating system (OS) level virtualisation tool for creating and managing system containers, which behave much like lightweight Virtual Machines (VMs) but share the host's kernel. It runs several isolated Linux system containers on a single control host using only one Linux kernel.
Its main advantage is support for unprivileged containers, which are needed for deployment in HPC environments. It can also import Docker images, works across multiple Linux distributions, and fits into the GPU-aware NVIDIA Container Runtime ecosystem. Unprivileged containers matter mostly to users who do not have administrative rights to run containers.
To enable GPU support for LXC, add the following repositories before installing LXC and skopeo (a tool it depends on):
$ sudo add-apt-repository ppa:ubuntu-lxc/lxc-stable
$ sudo apt-add-repository ppa:projectatomic/ppa
To find more information on the installation of LXC, follow the LXC template here.
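Once LXC and skopeo are installed, creating a container from a Docker image might look like the sketch below, which prints the commands rather than running them. It assumes the LXC OCI template is available. The container name "cuda", the image URL, and the helper name `lxc_oci_cmds` are illustrative, and the image may need an explicit tag.

```shell
#!/bin/sh
# Sketch only: the container name and the image URL are illustrative,
# and the LXC OCI template is assumed to be installed.

# Print the commands that create and start a container from a Docker
# image through the LXC OCI template (which relies on skopeo).
lxc_oci_cmds() {
    name="$1"   # name of the LXC container to create
    url="$2"    # image location, e.g. docker://nvidia/cuda
    printf 'lxc-create -t oci -n %s -- -u %s\n' "$name" "$url"
    printf 'lxc-start -n %s -F\n' "$name"
}

lxc_oci_cmds cuda docker://nvidia/cuda
```

The sketch covers only creating and starting the container; it does not configure the GPU hook itself.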
- How does a container get 'docker-ed' to enable GPU support?
Ans: Docker uses an OCI runtime (runc). The NVIDIA Container Runtime adds nvidia-container-runtime-hook as a prestart hook at the runc layer, and this hook brings libnvidia-container into the Docker flow.
Then, at container creation time, the NVIDIA environment variables determine which GPUs and driver features are exposed to the container, at which point it is 'docker-ed', that is, integrated with GPU support enabled.
- Where can I get started with the NVIDIA GPU-accelerated Container Runtime?
Ans: The NVIDIA Container Runtime supports container technologies like Docker, LXC, CRI-O, etc. To get started, developers package their GPU-accelerated applications together with their dependencies into a single unit. This package is then passed on to a deployment environment (like Docker, LXC, etc.)
Hence, follow here to do the same and activate the NVIDIA GPU-aware container runtime ecosystem.
- Are there any other methods of installing NVIDIA containers?
Ans: Yes, there are. You can use NVIDIA GPU Cloud (NGC) Containers. The NVIDIA NGC™ catalogue hosts a range of GPU-optimised containers for DL, ML, visualisation, and HPC applications. The main advantage is that almost all NVIDIA systems are pre-configured to run NGC.
Register here at the NGC catalogue and follow the steps the website gives to install NGC containers.