rocm
License information
ROCm™ is made available by Advanced Micro Devices, Inc. under the open source license identified in the top-level directory for the library in the repository on Github.com (Portions of ROCm are licensed under MITx11 and UIL/NCSA. For more information on the license, review the license.txt in the top-level directory for the library on Github.com).
User documentation (central installation)
This is ROCm™ installed in a non-standard location, which may have consequences for programs that rely on the standard location for ROCm™.
We also cannot guarantee that these modules are always compatible with the HPE Cray Programming Environment, as each version of the CPE is developed for particular ROCm™ versions.
-   The only modules officially supported by the current AMD GPU driver at the time of writing (February 2026) are the `6.2.2`, `6.2.4` and `6.4.4` modules. Older modules may still be present on the system as a full clean-up is nearly impossible, but modules older than ROCm™ 6.1 will likely not be fully functional and there is nothing the LUMI User Support Team can do about that. The `6.2.2` module is only there for historical reasons as it was installed before `6.2.4` became available.
-   The ROCm modules have some PDF documentation in some subdirectories of `$EBROOTROCM/share/doc`. The `EBROOTROCM` environment variable is defined after loading the module.
-   The `6.2.2` and `6.2.4` modules can be used with `PrgEnv-amd` but come without a matching `amd/6.2.2` or `amd/6.2.4` module. It is sufficient to load the `rocm/6.2.2` or `rocm/6.2.4` module after the `PrgEnv-amd` module (or `cpeAMD` module) to enable this ROCm version also for the compiler wrappers in that programming environment.
-   The `6.2.2` and `6.2.4` modules are not compatible with the CCE 17.0.1 compilers (in the 23.09 version of the programming environment) due to an incompatibility between LLVM 17, on which the CCE is based, and LLVM 18 from ROCm™ 6.2. The only supported programming environments are `PrgEnv-gnu` (or `cpeGNU`) and `PrgEnv-amd` (or `cpeAMD`).
-   Since ROCm™ 6.2, hipSolver depends on SuiteSparse. If an application depends on hipSolver, it is the user's responsibility to load the SuiteSparse module that corresponds to the CPE they wish to use (cpeAMD or cpeGNU). Note that the SuiteSparse module needs to be loaded before the `rocm` module, or the regular `rocm` module for the toolchain will be used (see the sketch below).
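A minimal sketch of the intended load order (the stack version and SuiteSparse module name are placeholders; use `module spider SuiteSparse` to find a version matching your toolchain):

```bash
# Sketch only: stack and SuiteSparse versions are placeholders.
module load LUMI/25.03 partition/G
module load SuiteSparse/<version>-cpeGNU-25.03   # load SuiteSparse BEFORE rocm
module load rocm/6.2.4
```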
Note that using ROCm™ in containers is still subject to the same driver compatibility problems as when using these modules. Though containers solve the problem of ROCm™ being installed in a non-standard path (which was needed for the modules as the standard path is already occupied by a different ROCm™ version), they will not solve any problem caused by running a newer version of ROCm™ on a too old driver (and there may also be problems running an old version of ROCm™ on a too new driver).
User documentation (user installation)
There are some additional ROCm™ modules that are not pre-installed on the system because they are meant for very specialised use or are not fully compatible with the system software.
At the time of writing (February 2026), the following user-installable modules are compatible with the GPU driver on LUMI:
-   `rocm/6.1.3` is meant to be used with the 23.09 compilers. It is the last version of ROCm™ based on LLVM 17, the LLVM version used in the Cray compilers in 23.09, and the only such version compatible with the GPU driver that became available after the January 2026 system update.
-   `rocm/7.0.3` is offered as-is. RCCL does not work and there is also no matching MPI library. ROCm™ 7 support will only come with the 26.03 version of the HPE Cray Programming Environment, but it remains to be seen how the RCCL issues can get solved.
User documentation (singularity container)
Future of these containers
It is not clear yet if we will be able to offer equivalent containers in the future. The containers developed before the system update of January 2026 were based on SUSE Linux and hence very close to the system. The LUMI AI Factory has since taken over some of the development of containers mainly meant for AI use on LUMI, but these use a different basis (containers based on Ubuntu), so it may not be possible to, e.g., inject parts from the HPE Cray Programming Environment if needed. See also the LUMI AI Factory AI Software Environment documentation.
The rocm container is developed by AMD specifically for LUMI and contains the necessary parts to explore ROCm. Its use is rather limited because, at the moment, the methods that can be used to build upon an existing container are rather limited on LUMI due to security concerns with certain functionality needed for that. The containers can however be used as a base image for cotainr, and it is also possible in some cases to extend them using the so-called SingularityCE "unprivileged proot build" process.
It is entirely normal that some features in some of the containers will not work. Each version of the GPU driver supports only particular ROCm™ versions. E.g., the driver from ROCm™ 6.3.4 is only guaranteed to support ROCm™ versions between 6.1 and 7.0, and hence problems can be expected with older or newer ROCm™ versions. There is nothing LUMI support can do about it. Only one driver version can be active on the system, and installing a newer version also depends on other software on the system and is not as trivial as it would be on a PC.
Use via EasyBuild-generated modules
The EasyBuild installation with the EasyConfigs mentioned below will do three things:
-   It will copy the container to your own file space. We realise containers can be big, but it ensures that you have complete control over when a container is removed. We will remove a container from the system when it is not sufficiently functional anymore, but the container may still work for you.
    If you prefer to use the centrally provided container, you can remove your copy after loading the module with `rm $SIF` followed by reloading the module. This is however at your own risk.
-   It will create a module file. When loading the module, a number of environment variables will be set to help you use the module and to make it easy to swap the module with a different version in your job scripts. A sketch of typical use of these variables is given after this list.
    -   `SIF` and `SIFROCM` both contain the name and full path of the singularity container file.
    -   `SINGULARITY_BIND` will mount all necessary directories from the system, including everything that is needed to access the project, scratch and flash file systems.
-   It will create the `runscripts` subdirectory in the installation directory that can be used to store scripts that should be available in the container, and the `bin` subdirectory for scripts that run outside the container. Currently there is one script outside the container: `start-shell` will start a bash session in the container, and can take arguments just as bash. It is provided for consistency with planned future extensions of some other containers, but really doesn't do much more than calling bash in the container and passing it the arguments that were given to the command.

Note that the installation directory is fully erased when you re-install the container module using EasyBuild. So if you choose to use it to add scripts, make sure you store them elsewhere also so that they can be copied again if you rebuild the container module for some reason.
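For reference, a minimal sketch of typical use once such a module is installed (the module versions are examples taken from the EasyConfig list further down; `SIF` and `SINGULARITY_BIND` are set by the module itself):

```bash
# Sketch only: module versions are examples.
module load LUMI/24.03
module load rocm/6.2.2-singularity-20241007
singularity exec $SIF rocminfo   # run a ROCm command inside the container
```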
Installation via EasyBuild
To install the container with EasyBuild, follow the instructions in the
EasyBuild section of the LUMI documentation, section "Software",
and use the dummy partition `container`, e.g.:
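A minimal sketch of that procedure (the EasyConfig name is an example; see the list of provided EasyConfigs further down this page):

```bash
# Sketch: install a user copy of one of the container modules.
module load LUMI partition/container EasyBuild-user
eb rocm-6.2.2-singularity-20241007.eb
```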
To use the container after installation, the EasyBuild-user module is not needed nor
is the container partition. The module will be available in all versions of the LUMI stack
and in the CrayEnv stack
(provided the environment variable EBU_USER_PREFIX points to the right location).
Direct access
The ROCm containers are available in the following subdirectories of /appl/local/containers:
-   `/appl/local/containers/sif-images`: Symbolic links to the latest version of the container for each ROCm version provided. Those links can change without notice!
-   `/appl/local/containers/tested-containers`: Tested containers provided as a Singularity `.sif` file and a docker-generated tarball. Containers in this directory are removed quickly when a new version becomes available.
-   `/appl/local/containers/easybuild-sif-images`: Singularity `.sif` images used with the EasyConfigs that we provide. They tend to be available for a longer time than in the other two subdirectories.
If you depend on a particular version of a container, we recommend that you copy the container to
your own file space (e.g., in /project) as there is no guarantee the specific version will remain
available centrally on the system for as long as you want.
When using the containers without the modules, you will have to take care of the bindings as some system files are needed for, e.g., RCCL. The recommended minimal bindings are:
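(The binding lists in the code blocks below are reconstructed from comparable LUMI container documentation and should be treated as assumptions; verify them against the `SINGULARITY_BIND` value set by the corresponding EasyBuild module.)

```
# Assumed minimal bindings; verify on the system.
-B /var/spool/slurmd,/opt/cray,/usr/lib64/libcxi.so.1
```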
or, for those containers where MPI still fails to load due to a missing libjansson,
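```
# Assumed variant with libjansson added (library path is an assumption).
-B /var/spool/slurmd,/opt/cray,/usr/lib64/libcxi.so.1,/usr/lib64/libjansson.so.4
```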
and the bindings you need to access the files you want to use from /scratch, /flash and/or /project.
You can get access to your files on LUMI in the regular location by also using the bindings
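```
# Assumed standard LUMI file system mount points; bind only what you need.
-B /pfs,/scratch,/projappl,/project,/flash,/appl
```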
Note that the list of recommended bindings may change after a system update.
Using the images as base image for cotainr
We recommend using these images as the base image for cotainr if you want to
build a container with cotainr
that needs ROCm. You can use the --base-image=<my base image> flag of the cotainr command
to indicate the base image that should be used.
If you do so, please make sure that the GPU software you install from conda-forge or via pip
with cotainr is compatible with the version of ROCm in the container that you use as the base
image.
PyTorch with cotainr (click to expand)
Note that this is an old example that still needs updating with versions of software appropriate for the current GPU drivers, but it shows the idea.
To start, create a YAML file to tell cotainr which software should be installed.
As an example, consider the file below, which we name `py312_rocm603_pytorch.yml`:
```yaml
name: minimal_pytorch
channels:
  - conda-forge
dependencies:
  - filelock=3.15.4
  - fsspec=2024.9.0
  - jinja2=3.1.4
  - markupsafe=2.1.5
  - mpmath=1.3.0
  - networkx=3.3
  - numpy=2.1.1
  - pillow=10.4.0
  - pip=24.0
  - python=3.12.3
  - sympy=1.13.2
  - typing-extensions=4.12.2
  - pip:
      - --extra-index-url https://download.pytorch.org/whl/rocm6.0/
      - pytorch-triton-rocm==3.0.0
      - torch==2.4.1+rocm6.0
      - torchaudio==2.4.1+rocm6.0
      - torchvision==0.19.1+rocm6.0
```
Now we are ready to generate a new Singularity `.sif` file with this definition:
```bash
module load LUMI/24.03
module load cotainr
cotainr build my-new-image.sif --base-image=/appl/local/containers/sif-images/lumi-rocm-rocm-6.0.3.sif --conda-env=py312_rocm603_pytorch.yml
```
As we are using a PyTorch wheel for ROCm 6.0, we use the container image for ROCm 6.0.3.
You're now ready to use the new image with the direct access method. As in this example we installed
PyTorch, the information on the PyTorch page in this guide is also very
relevant. And if you understand very well what you're doing, you may even adapt one of the EasyBuild
recipes for the PyTorch containers to use your new image and install the wrapper scripts etc. that
those modules provide (pointing EasyBuild to your image with the --sourcepath flag of the eb
command).
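As a rough illustration only (the project account, Slurm partition, bindings and test command below are placeholders, not recommendations), such an image can be used directly with `singularity exec` in a job:

```bash
# Sketch only: account, partition and bindings are placeholders.
srun --account=project_465000000 --partition=small-g --gpus-per-node=1 --time=00:10:00 \
    singularity exec -B /pfs,/scratch,/projappl,/project,/flash,/appl \
    my-new-image.sif \
    python -c 'import torch; print(torch.cuda.is_available())'
```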
Pre-installed modules (and EasyConfigs)
To access module help and find out for which stacks and partitions the module is
installed, use module spider rocm/<version>.
EasyConfig:
-   rocm/6.2.4 (EasyConfig: rocm-6.2.4.eb)
    This version may be useful as an upgrade in 24.03 or a downgrade of the default ROCm in 25.03, and is known to solve some rare issues when using the Cray compilers in 25.03.
-   rocm/6.4.4 (EasyConfig: rocm-6.4.4.eb)
    This version of ROCm™ is the base version for the experimental 25.09 stack and is available in `CrayEnv` and `LUMI/25.09`.
User-installable modules (and EasyConfigs)
Install with the EasyBuild-user module:
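A minimal sketch of the usual procedure (the stack version and EasyConfig name are examples; see the list below):

```bash
# Sketch: stack and EasyConfig are examples.
module load LUMI/23.09 partition/G
module load EasyBuild-user
eb rocm-6.1.3.eb
```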
To access module help after installation and get reminded for which stacks and partitions the module is installed, use `module spider rocm/<version>`.
EasyConfig:
-   EasyConfig rocm-6.1.3.eb, will build rocm/6.1.3
    EasyConfig meant to be used with the 23.09 compilers. It is the last version of ROCm™ based on LLVM 17, which is the LLVM version used in CCE 17.0.1 in CPE 23.09.
    Install in `LUMI/23.09` `partition/G`.
-   EasyConfig rocm-7.0.3.eb, will build rocm/7.0.3
    ROCm™ 7.0.3, offered as-is. RCCL does not work and there is no matching MPI library (GTL and Fortran support are broken for sure, more may be broken). LUST cannot fix any issues with this and will also not develop a software stack for it unless there would be a Cray PE for it at some time. It is likely only useful for single-GPU runs on LUMI.
    Installation in a LUMI software stack in `partition/G`, but don't use it with any other software in that stack as it may not work.
Singularity containers with modules for binding and extras
Install with the EasyBuild-user module in partition/container:
To access module help after installation, use `module spider rocm/<version>`.
EasyConfig:
-   EasyConfig rocm-6.1.3-singularity-20241004.eb, will provide rocm/6.1.3-singularity-20241004
-   EasyConfig rocm-6.2.2-singularity-20241007.eb, will provide rocm/6.2.2-singularity-20241007
Technical documentation (central installation)
EasyBuild
ROCm 4.5.2 (archived)
The EasyConfig unpacks the official RPMs and copies them to the installation directory. This is a temporary setup so that the users that have access to the Early Access Platform can compile their code from the login node.
ROCm 5.2.5 and 5.3.3
-   Unpacked from RPMs like the previous version, but uses an EasyBlock to ease the creation of the EasyConfigs.
ROCm 5.4.6, 5.6.1 and 6.2.2
-   Unpacked from RPMs, but with an additional step to set the RPATH of the libraries and avoid using the system ROCm libraries if the module is not loaded.
-   The 5.4.6 and 6.2.2 modules were developed later than the 5.6.1 module and were made to work around some problems we observed with 5.6.1 at that time. The 6.2.2 version was chosen as it was the latest version of ROCm officially supported by the driver on the system at that time.
    One difference with the 5.6.1 version is that there is no equivalent `amd` module. Instead, some additional environment variables are set in the `rocm/5.4.6` and `6.2.2` modules so that if you load one of them AFTER loading the `PrgEnv-amd` module, the compiler wrappers will still use the compilers from `rocm/5.4.6` or `6.2.2`.
-   The 6.2.2 version is not compatible with CCE 17.x due to an LLVM incompatibility.
-   Documentation:
    -   [ROCm 5.4.6 documentation](https://rocm.docs.amd.com/en/docs-5.4.3/)
    -   [ROCm 5.6.1 documentation](https://rocm.docs.amd.com/en/docs-5.6.1/)
    -   [ROCm 6.2.2 documentation](https://rocm.docs.amd.com/en/docs-6.2.2/)
ROCm 6.2.4 and 6.4.4
-   As the previous ROCm EasyConfigs, but with support for the address sanitizer and debug symbols also. However, the libraries for the address sanitizer and debug symbols need to be activated with `LD_PRELOAD`; a hypothetical sketch of the mechanism follows after the documentation links below.
-   Documentation:
    -   [ROCm 6.2.4 documentation](https://rocm.docs.amd.com/en/docs-6.2.4/)
    -   [ROCm 6.4.4 documentation (6.4.3 as this is the closest available)](https://rocm.docs.amd.com/en/docs-6.4.3/)
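The exact libraries to preload are not documented here; the following is a purely hypothetical illustration of the mechanism (all paths are placeholders; inspect the installation under `$EBROOTROCM` to locate the actual ASAN/debug variants):

```bash
# Hypothetical sketch only: the names and location of the ASAN/debug library variants
# depend on how the module was built; check under $EBROOTROCM first.
module load rocm/6.2.4
ls $EBROOTROCM
export LD_PRELOAD=<path-to-asan-or-debug-variant-of-the-library>.so
srun ./my_hip_program
```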
Technical documentation (user EasyBuild installation)
ROCm 6.1.3 and 7.0.3
-   As the previous ROCm EasyConfigs, but with support for the address sanitizer and debug symbols also. However, the libraries for the address sanitizer and debug symbols need to be activated with `LD_PRELOAD`.
-   Documentation:
    -   [ROCm 6.1.3 documentation (6.1.2 as that is the closest available)](https://rocm.docs.amd.com/en/docs-6.1.2/)
    -   [ROCm 7.0.3 documentation (7.0.2 as this is the closest available)](https://rocm.docs.amd.com/en/docs-7.0.3/)
Archived EasyConfigs
The EasyConfigs below are not directly available on the system for installation. Users are advised to use the newer ones; these archived ones are unsupported. They are still provided as a source of information should you need it, e.g., to understand the configuration that was used for earlier work on the system.
-   Archived EasyConfigs from LUMI-SoftwareStack - previously centrally installed software
-   Archived EasyConfigs from LUMI-EasyBuild-containers - previously available singularity containerised software
    -   EasyConfig rocm-5.4.5-singularity-20231110.eb, with module rocm/5.4.5-singularity-20231110 (with docker definition)
    -   EasyConfig rocm-5.4.5-singularity-20240124.eb, with module rocm/5.4.5-singularity-20240124
    -   EasyConfig rocm-5.4.5-singularity-20240207.eb, with module rocm/5.4.5-singularity-20240207
    -   EasyConfig rocm-5.4.6-singularity-20231110.eb, with module rocm/5.4.6-singularity-20231110 (with docker definition)
    -   EasyConfig rocm-5.4.6-singularity-20240124.eb, with module rocm/5.4.6-singularity-20240124
    -   EasyConfig rocm-5.4.6-singularity-20240207.eb, with module rocm/5.4.6-singularity-20240207
    -   EasyConfig rocm-5.5.1-singularity-20231110.eb, with module rocm/5.5.1-singularity-20231110 (with docker definition)
    -   EasyConfig rocm-5.5.1-singularity-20240124.eb, with module rocm/5.5.1-singularity-20240124
    -   EasyConfig rocm-5.5.1-singularity-20240207.eb, with module rocm/5.5.1-singularity-20240207
    -   EasyConfig rocm-5.5.3-singularity-20231108.eb, with module rocm/5.5.3-singularity-20231108 (with docker definition)
    -   EasyConfig rocm-5.5.3-singularity-20240124.eb, with module rocm/5.5.3-singularity-20240124
    -   EasyConfig rocm-5.5.3-singularity-20240207.eb, with module rocm/5.5.3-singularity-20240207
    -   EasyConfig rocm-5.6.0-singularity-20240315.eb, with module rocm/5.6.0-singularity-20240315
    -   EasyConfig rocm-5.6.1-singularity-20231108.eb, with module rocm/5.6.1-singularity-20231108 (with docker definition)
    -   EasyConfig rocm-5.6.1-singularity-20240124.eb, with module rocm/5.6.1-singularity-20240124
    -   EasyConfig rocm-5.6.1-singularity-20240207.eb, with module rocm/5.6.1-singularity-20240207
    -   EasyConfig rocm-5.7.1-singularity-20240124.eb, with module rocm/5.7.1-singularity-20240124
    -   EasyConfig rocm-5.7.1-singularity-20240207.eb, with module rocm/5.7.1-singularity-20240207
    -   EasyConfig rocm-5.7.3-singularity-20241004.eb, with module rocm/5.7.3-singularity-20241004
    -   EasyConfig rocm-6.0.3-singularity-20241004.eb, with module rocm/6.0.3-singularity-20241004