Ubuntu 22.04 Setup Guide¶
Overview¶
Ubuntu 22.04 LTS (Jammy Jellyfish) is an excellent choice for Orion nodes due to its stability, long-term support, and extensive hardware compatibility.
Base Node Configuration¶
System Update¶
Begin with a system update to ensure all packages are current:
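For example:

```shell
# Refresh package indexes and upgrade all installed packages
sudo apt-get update && sudo apt-get upgrade -y
```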
Swap Configuration¶
Disable swap for optimal Kubernetes performance:
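For example:

```shell
# Turn off all active swap devices immediately (lasts until the next reboot)
sudo swapoff -a
```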
To make this change persistent across reboots, comment out any swap entries in /etc/fstab:
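One way to do this is with a sed one-liner (review /etc/fstab afterwards to confirm the change):

```shell
# Comment out any fstab line that has a "swap" field
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
```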
Docker Installation¶
Docker installation includes containerd and other packages required for Kubernetes, with pre-configured container runtimes:
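A minimal sketch using the packages in Ubuntu's own archives (your environment may use Docker's upstream repository instead):

```shell
# docker.io pulls in containerd and runc as dependencies
sudo apt-get install -y docker.io

# Enable both daemons so they start on boot
sudo systemctl enable --now docker containerd
```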
Recommended Packages¶
These additional utilities enhance functionality for networking, storage, and system management:
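As an illustration, a representative set of such utilities (this specific package list is an assumption; adjust it for your nodes):

```shell
# Networking, storage (NFS/iSCSI), and general administration tools
sudo apt-get install -y \
  curl wget net-tools \
  nfs-common open-iscsi \
  htop jq
```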
GPU Configuration¶
Follow these steps to prepare your system for GPU-enabled workstations and rendering nodes.
Update System for GPU Support¶
Ensure your system has the latest updates before proceeding with GPU setup:
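For example:

```shell
sudo apt-get update && sudo apt-get upgrade -y
```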
Verify GPU Hardware Detection¶
Confirm that your system recognizes the GPU hardware:
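For example (either check works; the device nodes under /dev/dri are what the later steps rely on):

```shell
# List PCI display controllers
lspci -nn | grep -Ei 'vga|3d|nvidia'

# Check that the kernel has created video device nodes
ls -la /dev/dri/
```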
Video Device Mounted
If this passes, you should see the following output:
No Output
If you do not see this output, the kernel is not recognizing the GPU. This is usually because the kernel is missing the correct firmware or the kernel itself needs to be updated.
Kernel Update
If you are updating the kernel, make sure you have a backup of your system and that you are prepared for a reboot.
To update the firmware and kernel, run the following commands:
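A sketch of one common approach, assuming the Ubuntu firmware bundle and the hardware-enablement (HWE) kernel (verify these package names against your release before running):

```shell
# Refresh the firmware bundle in case GPU firmware is missing
sudo apt-get install --reinstall -y linux-firmware

# Install the HWE kernel for newer hardware support
sudo apt-get install -y linux-generic-hwe-22.04

# Reboot into the new kernel
sudo reboot
```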
NVIDIA Driver Installation¶
To install the NVIDIA drivers, follow these steps. You can also reference the Ubuntu NVIDIA Documentation directly.
1. Install Ubuntu Drivers Helper¶
Use Ubuntu's built-in driver detection and installation tool:
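For example:

```shell
# Install the helper if it is not already present
sudo apt-get install -y ubuntu-drivers-common

# List the drivers detected for your hardware
ubuntu-drivers devices

# Install the recommended driver, then reboot
sudo ubuntu-drivers autoinstall
sudo reboot
```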
2. Verify Driver Installation¶
After the system restarts, verify that the drivers are installed correctly:
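Run:

```shell
nvidia-smi
```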
Drivers Installed
If this passes, you should see output similar to the following, with the details for your GPU model:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01             Driver Version: 535.216.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080        On  | 00000000:00:10.0 Off |                  N/A |
|  0%   33C    P8              13W / 215W |      1MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
Confirm GPU device mounting:
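```shell
ls -la /dev/dri/
```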
Video Device Mounted Properly With NVIDIA Drivers
If this passes, you should see output similar to the following, listing your card(s) and a render device:
total 0
drwxr-xr-x 3 root root 120 Jan 30 18:12 .
drwxr-xr-x 18 root root 3.4K Jan 30 18:13 ..
drwxr-xr-x 2 root root 100 Jan 30 18:12 by-path
crw-rw---- 1 root video 226, 0 Jan 30 18:12 card0
crw-rw---- 1 root video 226, 1 Jan 30 18:12 card1
crw-rw---- 1 root render 226, 128 Jan 30 18:12 renderD128
NVIDIA Container Toolkit Installation¶
Now we need to configure the NVIDIA Container Toolkit. This allows us to run GPU-enabled containers, which Kubernetes then consumes as a container runtime via the CRI.
1. Add NVIDIA Container Toolkit Repository¶
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
2. Install NVIDIA Container Toolkit¶
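With the repository added, install the toolkit package:

```shell
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```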
3. Configure Runtime Environments¶
For containerd (required for Kubernetes):
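```shell
# Add the nvidia runtime to containerd's configuration and restart it
sudo nvidia-ctk runtime configure --runtime=containerd
sudo systemctl restart containerd
```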
If Docker is installed:
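```shell
# Add the nvidia runtime to /etc/docker/daemon.json and restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```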
4. Verify Container Toolkit Installation (Optional)¶
To confirm the NVIDIA Container Toolkit is working correctly:
Testing the NVIDIA Container Toolkit
If you would like to test the NVIDIA Container Toolkit, run the following commands. This is not required for the installation, but it is a good way to verify that the toolkit is installed correctly.
Working Toolkit
If you see the output of nvidia-smi, then the toolkit is installed correctly.
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
INFO[0000] Loading config from /etc/docker/daemon.json
INFO[0000] Wrote updated config to /etc/docker/daemon.json
INFO[0000] It is recommended that docker daemon be restarted.
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
de44b265507a: Pull complete
Digest: sha256:80dd3c3b9c6cecb9f1667e9290b3bc61b78c2678c02cbdae5f0fea92cc6734ab
Status: Downloaded newer image for ubuntu:latest
Thu Jan 30 23:44:20 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01             Driver Version: 535.216.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080        On  | 00000000:00:10.0 Off |                  N/A |
|  0%   30C    P8              12W / 215W |      1MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
Next Steps¶
After completing this setup, proceed to Orion cluster deployment by following our On-Prem Installation Guide.
For further assistance, contact Juno Support.