Development Environment Setup (Required)
To build, develop, and ship NWAVE consistently across multiple GPU platforms, we provide preconfigured Docker environments for developing PyTorch extensions with HIP, supporting both:
- AMD GPUs (ROCm)
- NVIDIA GPUs (CUDA)
These environments ensure:
- Identical toolchains across platforms
- Consistent ROCm / CUDA / HIP versions
- Proper GPU access and permissions
- A reproducible setup for all developers
Warning
Using these environments is strongly recommended. Deviating from them will likely result in build issues, missing symbols, or platform-specific incompatibilities.
The Dockerfiles are derived from the official ROCm examples repository and are available in Neuronova's GitHub repositories if you are an authorized client.
What’s Included
Each environment comes fully set up with:
- Ubuntu base image
- HIP development tools and libraries
- Support for ROCm (AMD) and CUDA (NVIDIA)
- Anaconda Python distribution
- A ready-to-use Conda environment (PyTorch)
- CMake and standard development tools
- Correct GPU device access, IPC, and shared memory configuration
Prerequisites
Before building or running the containers, make sure you have:
- Docker installed
- AMD users:
- ROCm-compatible GPU
- ROCm drivers installed on the host
- NVIDIA users:
- CUDA-compatible GPU
- NVIDIA drivers installed on the host
Building the Docker Images
User ID and Group ID Configuration (Important)
To allow the container to access GPUs correctly and to read/write files in a shared host directory, the container user (developer) must match your host user ID (UID) and render group ID (GID).
Get your render group GID
getent group render | cut -d: -f3
Get your UID
id -u $USER
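The two lookups above can be combined into a short script. The fallback value 109 below is used only because it is the default already present in the Dockerfile, in case no render group exists on your host:

```shell
# Capture the host UID and the render group GID for the Dockerfile ARGs.
# Falls back to 109 (the Dockerfile default) if no render group exists.
HOST_UID="$(id -u)"
RENDER_GID="$(getent group render | cut -d: -f3)"
RENDER_GID="${RENDER_GID:-109}"
echo "UID=${HOST_UID} GID=${RENDER_GID}"
```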
Now take the appropriate Dockerfile:
- hip-libraries-rocm-ubuntu-630.Dockerfile (if using an AMD GPU)
- hip-libraries-cuda-ubuntu-630.Dockerfile (if using an NVIDIA GPU)
Look for these entries in the file and update them with the values obtained in the previous steps:
#### USER CONFIG
ARG GID=109 # render group ID
ARG UID=1000 # user ID
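If you prefer not to edit the Dockerfile, the same `ARG` values can be overridden at build time with Docker's standard `--build-arg` flag (shown here for the AMD image; the same flags apply to the CUDA one):

```shell
# Override UID/GID at build time instead of editing the Dockerfile.
docker build . \
    -f hip-libraries-rocm-ubuntu-630.Dockerfile \
    -t kernel_devel_amd_630 \
    --build-arg UID="$(id -u)" \
    --build-arg GID="$(getent group render | cut -d: -f3)"
```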
AMD ROCm Configuration
GPU Architecture Selection
PyTorch ROCm binaries are officially tested only on a subset of GPUs.
To enable all GPUs within a given architecture, you must set the correct override in hip-libraries-rocm-ubuntu-630.Dockerfile:
#### SELECT GPU
# RDNA / RDNA 2:
# ENV HSA_OVERRIDE_GFX_VERSION=10.3.0
# RDNA 3 / RDNA 3.5 (default):
ENV HSA_OVERRIDE_GFX_VERSION=11.0.0
Choose the value matching your hardware.
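If you are unsure which architecture your GPU reports, `rocminfo` on the host usually prints the gfx target, assuming the ROCm tools are installed:

```shell
# Print the gfx target(s) of the installed AMD GPU(s).
# gfx103x corresponds to RDNA 2 (override 10.3.0);
# gfx110x corresponds to RDNA 3 (override 11.0.0).
rocminfo | grep -o 'gfx[0-9a-f]*' | sort -u
```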
Build the AMD Image
docker build . \
-f hip-libraries-rocm-ubuntu-630.Dockerfile \
-t kernel_devel_amd_630
NVIDIA CUDA Configuration
The NVIDIA image is based on CUDA base containers and supports HIP-on-CUDA.
Build the NVIDIA Image
docker build . \
-f hip-libraries-cuda-ubuntu-630.Dockerfile \
-t kernel_devel_nvidia_630
Running the Containers
We recommend creating a shared workspace directory on the host, for example:
/home/<username>/DockerMem/<container-name>
This directory is mounted inside the container at:
/home/workspaces
allowing seamless file sharing between host and container.
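For example, the host-side directory can be created ahead of time; the container name nwave-dev below is only a placeholder:

```shell
# Create the shared workspace directory on the host before starting
# the container; "nwave-dev" is a placeholder container name.
WORKSPACE="${HOME}/DockerMem/nwave-dev"
mkdir -p "${WORKSPACE}"
echo "Workspace ready at ${WORKSPACE}"
```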
Running on AMD ROCm
To enable GPU access, ROCm requires explicit device mappings and security options.
docker run \
--mount type=bind,source=/abs/path/to/your/shared/folder,target=/home/workspaces \
-ti \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
--device=/dev/kfd \
--device=/dev/dri \
--group-add video \
--ipc=host \
--shm-size 16G \
--name <container-name> \
kernel_devel_amd_630
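Once the container is up, a quick sanity check is to run rocminfo inside it, using the container name chosen above:

```shell
# Sanity check: list the GPU agents visible inside the running container.
docker exec <container-name> rocminfo
```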
Running on NVIDIA CUDA
Install NVIDIA Container Toolkit
Restart Docker after installation:
systemctl restart docker
Run the NVIDIA Container
docker run \
--mount type=bind,source=/abs/path/to/your/workspace,target=/home/workspaces \
-ti \
--gpus all \
--shm-size 16G \
--name <container-name> \
kernel_devel_nvidia_630
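As with the AMD image, it is worth confirming GPU visibility right after startup, using the container name chosen above:

```shell
# Sanity check: confirm the container can see the NVIDIA GPU(s).
docker exec <container-name> nvidia-smi
```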
Using the Environment
Activate the preinstalled Conda environment:
source activate PyTorch
This command must be run every time the NWAVE library is used. If using VS Code, when attaching to a running container, select the PyTorch environment so that the editor correctly uses the Docker environment.
Installing NWAVE
NWAVE is distributed as a .whl file. To install it in the configured Docker environment, run the following inside the container:
source activate PyTorch
pip install nwavesdk-0.0.1a0-cp310-cp310-linux_x86_64.whl
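To verify the installation, you can check that PyTorch sees the GPU and that the wheel imports cleanly. Note that the import name nwavesdk is an assumption inferred from the wheel filename and may differ:

```shell
# Verify GPU visibility from PyTorch and that the SDK imports cleanly.
# NOTE: the import name "nwavesdk" is assumed from the wheel filename.
python -c "import torch; print('GPU available:', torch.cuda.is_available())"
python -c "import nwavesdk; print('nwavesdk imported OK')"
```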
Troubleshooting
GPU Not Detected
- AMD: Verify ROCm installation and the presence of /dev/kfd and /dev/dri
- NVIDIA: Verify NVIDIA Container Toolkit and driver installation
Permission Issues
- Ensure your user is in the docker group
- Confirm UID/GID match between host and container
- Check permissions on the shared workspace directory
Contributing
For issues, feature requests, or questions, either raise a GitHub issue on the release repository or send an email to giuseppe@neuronovatech.com.