
in your scripts. Example job scripts (without -l option for Ares compatibility) are available on this page: Sample scripts


Python, external libraries, and machine learning on Helios

As noted in the previous sections, nodes with GH-200 GPU superchips have CPUs with the arm64 architecture, so they require modules built specifically for this architecture.

When working with Python, do NOT use Anaconda for virtual environment management. Conda environments ship with their own Python interpreter installations, which may have compatibility issues with the ARM architecture.
To create virtual environments, please use the standard Python venv module.

For deep learning applications, we provide a special module, `ML-bundle/24.06a`, containing software often used by AI libraries. Always load this module before installing or building any packages, and before running Python programs that rely on GPUs on GH-200 nodes.
IMPORTANT: Load this module as the first step of every job, before activating any virtual environment.

We also provide a custom pip repository with popular machine learning packages, pre-built in different versions with GPU support for the ARM architecture.
Packages from this repository can be installed directly via pip by specifying the correct name and version tag of a package. To see the available libraries and their tags, list the contents of the repository with the command:
ls /net/software/aarch64/el8/wheels/ML-bundle/24.06a/simple/

Example script for creating a venv, installing packages, and running jobs.
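
The script itself is not reproduced in this section, so the following is only a minimal sketch of the workflow described above, assuming a SLURM batch job. The `#SBATCH` values, the venv location, `PKG_NAME`, `PKG_TAG`, `my_script.py`, and the helper `pip_spec` are hypothetical placeholders, not names taken from Helios. The sketch also assumes that pip can see the custom repository once `ML-bundle/24.06a` is loaded.

```shell
#!/bin/bash
#SBATCH --job-name=venv-example   # placeholder
#SBATCH --time=00:30:00           # placeholder
#SBATCH --gres=gpu:1              # placeholder; add the partition and account options of your grant

# Hypothetical placeholders: replace with your own values.
VENV_DIR="${VENV_DIR:-$HOME/venvs/ml-env}"
PKG_NAME="${PKG_NAME:-some-package}"     # a name listed in the custom repository
PKG_TAG="${PKG_TAG:-some-version-tag}"   # a version tag listed for that package
SCRIPT="${SCRIPT:-my_script.py}"

# Build a pip requirement specifier of the form name==tag.
pip_spec() {
    printf '%s==%s\n' "$1" "$2"
}

main() {
    set -e  # stop at the first failing step

    # 1. Load ML-bundle first, before any virtual environment is activated.
    module load ML-bundle/24.06a

    # 2. Create the environment with the standard venv module (first run only).
    if [ ! -d "$VENV_DIR" ]; then
        python3 -m venv "$VENV_DIR"
    fi

    # 3. Activate the environment only after the module has been loaded.
    source "$VENV_DIR/bin/activate"

    # 4. Install the chosen package by name and version tag.
    pip install "$(pip_spec "$PKG_NAME" "$PKG_TAG")"

    # 5. Run the Python program.
    python "$SCRIPT"
}

# Run the workflow only inside a SLURM job, where SLURM_JOB_ID is set.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    main
else
    echo "Submit this script with sbatch; nothing was run." >&2
fi
```

The order of steps 1 and 3 is the important part: the module is loaded before the environment is activated, as required above. Installing packages inside the job keeps the sketch short; in practice the environment could also be prepared once in a separate job and only activated in later ones.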

Python multiprocessing - a word of caution

More information

Helios follows the configuration and usage patterns of Prometheus. The Prometheus documentation can be found here: https://kdm.cyfronet.pl/portal/Prometheus:Basics