Creation of Environment
This documentation has been prepared to assist with the implementation of the Axolotl training and fine-tuning platform
Environment Creation
The first step in the process is ensuring our machine is optimised to use the NVIDIA platform and its GPUs
Base Distribution: Ubuntu 20.04
We are going to use Ubuntu 20.04 as the base image because:
-This version is a long-term support (LTS) release
-Ubuntu is well supported by NVIDIA
-It works well as a remote instance, giving us access to high-powered GPUs
-These remote environments allow all team members to access our platform
Integrated Development Environment (IDE)
We recommend using Visual Studio Code (VS Code) for our Integrated Development Environment
Support for Remote Development: VS Code offers remote development support, which is crucial for accessing and managing virtual machines with powerful GPUs
Integrated Terminal and Docker Support: The integrated terminal in VS Code enables direct interaction with command-line tools, essential for managing Docker containers and executing model training scripts.
Extensive Language Support: Large language model development often involves multiple programming languages (like Python, C++). VS Code supports a wide range of languages and their specific tooling, which is critical for such multifaceted development.
Version Control Integration: With built-in Git support, VS Code makes it easier to track and manage changes in code.
Virtual Machine Requirements:
-Docker*
-CUDA (Version 12.1)*: parallel computing platform and programming model
-NVIDIA NGC: NVIDIA's catalogue of GPU-optimised Docker containers, accessed via the NVIDIA Container Toolkit
-NVIDIA CUDA Toolkit*: compiler for CUDA, translates CUDA code into executable programs
-GCC: the compiler required for development using the CUDA Toolkit
-GLIBC: the GNU Project's implementation of the C standard library. Includes facilities for basic file I/O, string manipulation, mathematical functions, and various other standard utilities.
*Please note, Continuum's base virtual machine installation script installs Docker, the NVIDIA Container Toolkit and CUDA Driver 12.1
Check the virtual machine is ready
To ensure that the virtual machine is set up for the training and fine-tuning of large language models, follow the instructions below:
Check the installation of the NVIDIA CUDA Toolkit
What is the CUDA Toolkit?
The NVIDIA CUDA Toolkit provides a development environment for creating high performance GPU-accelerated applications.
With the CUDA Toolkit, you can develop, optimize, and deploy your applications on GPU-accelerated systems.
The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to deploy your application.
We will be installing the NVIDIA CUDA Toolkit, version 12.1.
To see whether the CUDA Toolkit is installed, we can check for its core compiler, the NVIDIA CUDA Compiler (NVCC).
Nvidia CUDA Compiler (NVCC) is a part of the CUDA Toolkit. It is the compiler for CUDA, responsible for translating CUDA code into executable programs.
NVCC takes high-level CUDA code and turns it into a form that can be understood and executed by the GPU. It handles the partitioning of code into segments that can be run on either the CPU or GPU, and manages the compilation of the GPU parts of the code.
First, to check whether NVCC is installed and which version you have, run:
nvcc --version
If the NVIDIA CUDA Toolkit has been installed, the output will look like this:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
You should see release 12.1, which indicates that CUDA Toolkit version 12.1 has been successfully installed.
If NVCC is not installed, then go ahead and install the CUDA Toolkit 12.1.
The CUDA Toolkit download website is located here:
The web application at this site will ask you to define your installation set up.
For our base virtual machine this will be:
Operating System: Linux
Architecture: x86_64 (64-bit)
Distribution: Ubuntu
Version: 20.04
Installer Type: deb (local)

The detailed instructions and an explanation of the process are below:
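As a sketch of the deb (local) flow, the commands for CUDA 12.1.0 on Ubuntu 20.04 are reproduced below from the download page at the time of writing; verify the exact filenames there before running, since they change between point releases. The script name `cuda_12_1_install.sh` is ours, and writing the commands to a file first lets you review them before executing:

```shell
# Sketch of the deb (local) install flow for CUDA 12.1.0 on Ubuntu 20.04.
# Filenames/URLs are from the CUDA download page -- confirm them before use.
cat > cuda_12_1_install.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
EOF
# Syntax-check the script before anyone runs it
bash -n cuda_12_1_install.sh && echo "script OK -- review it, then run: bash cuda_12_1_install.sh"
```

Note that this local installer bundles driver 530.30.02, which matches the driver version shown in the nvidia-smi outputs later in this guide.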
Post installation, check that NVCC has been installed successfully and verify its version by running:
nvcc --version
If the NVIDIA CUDA Toolkit has been installed, the output will be:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Feb__7_19:32:13_PST_2023
Cuda compilation tools, release 12.1, V12.1.66
Build cuda_12.1.r12.1/compiler.32415258_0
If NVCC is pointing to an older version of CUDA despite upgrading to 12.1, you will need to follow the instructions below. CUDA must be on your PATH!
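A common cause is an older toolkit appearing earlier on PATH. A minimal fix, assuming the default /usr/local/cuda-12.1 install prefix (adjust the path if yours differs), is:

```shell
# Put CUDA 12.1 first on PATH and expose its libraries for the current shell
export PATH="/usr/local/cuda-12.1/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
# Persist the change for future shells
echo 'export PATH="/usr/local/cuda-12.1/bin${PATH:+:${PATH}}"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH="/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"' >> ~/.bashrc
```

After re-running `nvcc --version`, the reported release should now be 12.1.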
CUDA Reference Materials*
Please find the NVIDIA CUDA Installation Guide for Linux
If you are interested, you can familiarise yourself with CUDA Best Practices:
Below is a summary of CUDA Programming
Check the installation of the NVIDIA Container Toolkit
The NVIDIA Container Toolkit enables users to build and run GPU-accelerated containers.
The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.
This allows you to use NVIDIA containers in Docker.
Background and Explanation

The NVIDIA Container Toolkit is designed to integrate NVIDIA GPUs into containerised applications. It's compatible with various container runtimes and consists of several components:
NVIDIA Container Runtime (nvidia-container-runtime): An OCI-compliant runtime for Docker or containerd, enabling the use of NVIDIA GPUs in containers.
NVIDIA Container Runtime Hook (nvidia-container-toolkit / nvidia-container-runtime-hook): A component executing prestart scripts to configure GPU access in containers.
NVIDIA Container Library and CLI (libnvidia-container1, nvidia-container-cli): These provide a library and CLI for automatically configuring containers with NVIDIA GPU support, independent of the container runtime.
The toolkit's architecture allows for integration with various container runtimes like Docker, containerd, cri-o, and lxc. Notably, the NVIDIA Container Runtime is not required for cri-o and lxc.
The toolkit comprises main packages: nvidia-container-toolkit, nvidia-container-toolkit-base, libnvidia-container-tools, and libnvidia-container1, with specific dependencies between them.
Older packages like nvidia-docker2 and nvidia-container-runtime are now deprecated and merged into the nvidia-container-toolkit.
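To see which of these packages are present on your machine, a quick dpkg query works (the package names are those listed above):

```shell
# List installed NVIDIA container packages; print a note if none are found
dpkg -l 2>/dev/null | grep -E 'nvidia-container-toolkit|libnvidia-container' \
  || echo "no NVIDIA container packages found"
```

If the deprecated nvidia-docker2 package shows up here, it can be replaced by nvidia-container-toolkit as described above.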
The Architecture

Key functionalities
NVIDIA Container Runtime Hook: Implements a runC prestart hook, configuring GPU devices in containers based on the container's config.json.
NVIDIA Container Runtime: A wrapper around runC, modifying the OCI runtime spec for GPU support.
NVIDIA Container Toolkit CLI: Offers utilities for configuring runtimes and generating Container Device Interface (CDI) specifications.
For installation, the nvidia-container-toolkit package is generally sufficient.
The toolkit's packages are available on GitHub, useful for both online and air-gapped installations. The repository also hosts experimental releases of the software.
Check for Docker
Before installing, ensure you have Docker installed:
docker --version
The output should look similar to this:
Docker version 26.0.2, build 3c863ff
To install the NVIDIA Container Toolkit, follow the instructions below:
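The apt-based install is sketched below, following the NVIDIA Container Toolkit documentation at the time of writing (verify the commands against the current guide before running). The script name `nvidia_ctk_install.sh` is ours; writing the commands to a file first lets you review them:

```shell
# Sketch of the apt install flow for the NVIDIA Container Toolkit
cat > nvidia_ctk_install.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
# Add NVIDIA's apt repository and signing key
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
EOF
# Syntax-check the script before anyone runs it
bash -n nvidia_ctk_install.sh && echo "script OK -- review it, then run: bash nvidia_ctk_install.sh"
```

The `nvidia-ctk runtime configure` step is what makes `--runtime=nvidia` available to Docker in the verification command below.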
Post installation, you can check to see if the Container Toolkit is working:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
The output should look like this:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-80GB On | 00000000:00:05.0 Off | 0 |
| N/A 35C P0 55W / 400W| 5MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
Compatibility Testing
The CUDA development environment relies on tight integration with the host development environment, including the host compiler and C runtime libraries, and is therefore only supported on Ubuntu versions that have been qualified for the CUDA Toolkit release.
Now that we have installed the NVIDIA CUDA Toolkit and the NVIDIA Container Toolkit, we need to ensure our virtual machine is compatible with these installations.
Compatibility is critical
The material below provides instructions on how to ensure the NVIDIA Drivers are compatible with the host system.
With the NVIDIA CUDA Toolkit installed, we need to ensure the host machine is compatible with this Toolkit.
Compatibility between CUDA 12.1 and the host development environment
This table lists the kernel versions, default GCC (GNU Compiler Collection) versions, and GLIBC (GNU C Library) versions for two different LTS (Long-Term Support) releases of Ubuntu.
Ubuntu 22.04 LTS: kernel 5.15.0-43, GCC 11.2.0, GLIBC 2.35
Ubuntu 20.04 LTS: kernel 5.13.0-46, GCC 9.3.0, GLIBC 2.31
Check the Kernel compatibility
To check the kernel version of your Ubuntu 20.04 system, use the uname command in the terminal. With different options, uname provides various pieces of system information, including the kernel version.
To get the kernel version, type the following command and press Enter:
uname -r
The output should be this on a typical Ubuntu WSL2 distribution:
5.15.133.1-microsoft-standard-WSL2
or this on a typical Ubuntu 20.04 virtual machine
5.4.0-167-generic
As you can see here, the first Linux kernel, 5.15.133.1, is newer than the 5.13.0-46 kernel qualified for CUDA 12.1 on Ubuntu 20.04, and works with the installed toolkit.
The second Linux kernel, 5.4.0-167, is the Ubuntu 20.04 GA kernel; kernels shipped with a qualified Ubuntu release are likewise supported in practice.
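Rather than eyeballing version strings, the comparison can be automated with a small helper built on `sort -V` (the 5.4.0 baseline below is the Ubuntu 20.04 GA kernel, an assumption you may want to adjust):

```shell
# version_ge A B -> succeeds when version A >= version B (sort -V compares numerically)
version_ge() { [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]; }

# Strip the "-generic" / "-microsoft-standard-WSL2" suffix before comparing
kernel="$(uname -r | cut -d- -f1)"
if version_ge "$kernel" "5.4.0"; then
  echo "kernel $kernel meets the 5.4.0 baseline"
else
  echo "kernel $kernel is older than 5.4.0 -- check CUDA compatibility"
fi
```

The same helper can be reused for the GCC and GLIBC checks that follow.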
Check GNU Compiler Compatibility
NVIDIA CUDA Libraries work in conjunction with GCC (GNU Compiler Collection) on Linux systems.
GCC is commonly used for compiling the host (CPU) part of the code, while CUDA tools like nvcc (NVIDIA CUDA Compiler) are used for compiling the device (GPU) part of the code.
The CUDA Toolkit includes wrappers and libraries that facilitate the integration between the CPU and GPU parts of the code.
NVIDIA provides compatibility information for specific versions of GCC, especially on Linux systems where GCC is a common choice for compiling the host code.
The CUDA runtime libraries, which are installed separately, are sufficient for running CUDA applications on systems with compatible NVIDIA GPUs.
The gcc compiler is required for development using the CUDA Toolkit
To reiterate - when developing applications that use both CPU and GPU, developers might use GCC for compiling the CPU part of the code, while CUDA tools (like nvcc - NVIDIA CUDA Compiler) are used for compiling the GPU part.
The CUDA toolkit often includes compatibility information with specific versions of GCC, especially on Linux systems, where GCC is a common choice for compiling the host code.
Run the following command to check the installed version of GCC:
gcc --version
The first line of the output will show the version number. Ensure it is compatible with the default GCC version listed in the table above for your Ubuntu version.
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2)
9.4.0 <--- This is the GCC version number
If you do not have GCC installed, install it via the build-essential package:
sudo apt update && sudo apt install build-essential
Post installation of build essentials, check the GCC version you have:
gcc --version
The output should now prove you have GCC installed:
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2)
This version, 9.4, should work with CUDA 12.1, which was qualified against GCC 9.3.0 on Ubuntu 20.04. GCC point releases are backward compatible, so 9.4 is fine.
Check GLIBC Compatibility
The GNU C Library, commonly known as glibc, is an important component of GNU systems and Linux distributions.
GLIBC is the GNU Project's implementation of the C standard library. It provides the system's core libraries. This includes facilities for basic file I/O, string manipulation, mathematical functions, and various other standard utilities.
To check the GLIBC version:
ldd --version
The first line of the output will show the version number:
ldd (Ubuntu GLIBC 2.31-0ubuntu9.12) 2.31 <--This is the version
Compare this with the GLIBC version in the table above.
This GLIBC version, 2.31, matches the version qualified for the NVIDIA CUDA Toolkit on Ubuntu 20.04.
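The three host checks above (kernel, GCC, GLIBC) can be printed in one go; this is a hypothetical convenience snippet, not part of the toolkit:

```shell
# One-shot summary of the host-compatibility checks
echo "Kernel: $(uname -r)"
echo "GCC:    $(gcc -dumpfullversion 2>/dev/null || echo 'not installed')"
echo "GLIBC:  $(ldd --version | head -n1 | awk '{print $NF}')"
```

Compare each line against the Ubuntu compatibility table above.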
With the host system confirmed compatible with the NVIDIA CUDA Toolkit, the next step is to check that the installations themselves have been successful.
Process for checking installations have been successful
First, check your Ubuntu version and ensure it matches Ubuntu 20.04, our designated Linux operating system:
lsb_release -a
Then, verify that your system is based on the x86_64 architecture. Run:
uname -m
The output should be:
x86_64
To check if your system has a CUDA-capable NVIDIA GPU, run
nvidia-smi
You should see an output like this, which details the NVIDIA Drivers installed and the CUDA Version.
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-80GB On | 00000000:00:05.0 Off | 0 |
| N/A 36C P0 56W / 400W| 4MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 1314 G /usr/lib/xorg/Xorg 4MiB |
+---------------------------------------------------------------------------------------+
If this output is not visible, we must install the NVIDIA Drivers
A full analysis
If you would like a full printout of your system features all at once, enter this command into the terminal:
echo "Machine Architecture: $(uname -m)" && \
echo "Kernel Name: $(uname -s)" && \
echo "Kernel Release: $(uname -r)" && \
echo "Kernel Version: $(uname -v)" && \
echo "Hostname: $(uname -n)" && \
echo "Operating System: $(uname -o)" && \
echo "----" && \
cat /proc/version && \
echo "----" && \
echo "CPU Information:" && cat /proc/cpuinfo | grep 'model name' | uniq && \
echo "----" && \
echo "Memory Information:" && cat /proc/meminfo | grep 'MemTotal' && \
echo "----" && \
lsb_release -a 2>/dev/null && \
echo "----" && \
echo "NVCC Version:" && nvcc --version
The output from the terminal will provide all the information necessary to check your system for compatibility.
Installation of .NET SDK - required for Polyglot Notebooks
Test Compatibility
Below are some scripts you can create to test for compatibility.
These scripts will verify that both your CPU and GPU are correctly processing CUDA code, and that there are no compatibility issues between the installed GCC version and the CUDA Toolkit version you are using.
Remember: Compatibility between the GCC version and the CUDA Toolkit is crucial. Make sure the GCC version you choose is compatible with your CUDA Toolkit version.
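A minimal example of such a test, assuming nvcc is on PATH and a GPU is attached, compiles and runs a tiny vector-addition kernel. Because nvcc drives the host GCC underneath, a GCC/CUDA mismatch surfaces here as a compile error; the file name `vector_add.cu` is ours:

```shell
# Write a minimal CUDA program, then compile and run it with nvcc
cat > vector_add.cu <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

// Element-wise addition on the GPU
__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 256;
    float ha[256], hb[256], hc[256];
    for (int i = 0; i < n; ++i) { ha[i] = (float)i; hb[i] = 2.0f * i; }
    float *da, *db, *dc;
    cudaMalloc(&da, n * sizeof(float));
    cudaMalloc(&db, n * sizeof(float));
    cudaMalloc(&dc, n * sizeof(float));
    cudaMemcpy(da, ha, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, n * sizeof(float), cudaMemcpyHostToDevice);
    add<<<1, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, n * sizeof(float), cudaMemcpyDeviceToHost);
    // Verify the GPU result against the CPU expectation
    for (int i = 0; i < n; ++i)
        if (hc[i] != ha[i] + hb[i]) { printf("FAIL at %d\n", i); return 1; }
    printf("PASS: host compiler and GPU are cooperating\n");
    return 0;
}
EOF
# Only attempt the compile where nvcc is available
if command -v nvcc >/dev/null 2>&1; then
  nvcc vector_add.cu -o vector_add && ./vector_add
else
  echo "nvcc not on PATH -- install the CUDA Toolkit first"
fi
```

A PASS here confirms the CPU/GPU split compilation described earlier is working end to end.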
Where are you now?
We have now created a deep learning development environment optimised for NVIDIA GPUs, with compatibility across key components.
We have so far:
-Installed CUDA Toolkit and Drivers
-Set up the NVIDIA Container Toolkit to allow access to NVIDIA Docker containers
-Ensured host compatibility by verifying that components such as GCC (GNU Compiler Collection) and GLIBC (GNU C Library) are compatible with the CUDA version
-Created a compatibility check script to detect compatibility issues
With these components in place, your environment is tailored for deep learning development. It supports the development and execution of deep learning models, leveraging the computational power of GPUs for training and inference tasks.
==> With the environment established for NVIDIA GPUs, the next step is creating the virtual environment for Axolotl and installing the code base