
Preliminary access essentials

Disclaimer

Athena is still under development; despite our best efforts, Athena might experience unscheduled outages or even data loss.

Support

Please get in touch with the PLGrid Helpdesk (https://helpdesk.plgrid.pl/) regarding any difficulties in using the cluster.

For important information and announcements, please follow this page and the messages displayed at login.

Access to Athena

Computing resources on Athena are assigned based on PLGrid computing grants. To perform computations on Athena, you need to obtain a computing grant through the PLGrid Portal (https://portal.plgrid.pl/) and apply for Athena access through the PLGrid Portal (https://aplikacje.plgrid.pl/service/dostep-do-klastra-athena-w-osrodku-cyfronet/).

Work on Athena's storage is still underway, so there is currently no dedicated storage for performing I/O-intensive computations. For the time being, please use the $MEMFS ramdisk functionality as the scratch space (https://kdm.cyfronet.pl/portal/Prometheus:Podstawy#Przestrze.C5.84_dyskowa_w_pami.C4.99ci_operacyjnej_MEMFS). Additionally, in the current setup, long-term storage is provided by Ares. Thus, to use the group directory storage on Athena, you need a grant with storage resources on Ares and access to Ares. Performing high-I/O computations on group space is strictly forbidden!

If your grant is active and you have applied for the service access, the request should be accepted within about half an hour. Please report any issues through the helpdesk.

Machine description

Available login nodes:

  • ssh <login>@athena.cyfronet.pl

Note that Athena uses PLGrid accounts and grants. Make sure to request the "Athena access" service in the PLGrid Portal.

Athena is built with an InfiniBand HDR interconnect and nodes of the following specification:

| Partition | Number of nodes | CPU | RAM | Accelerator |
| --- | --- | --- | --- | --- |
| plgrid-gpu-a100 | 48 | 128 cores, 2x AMD EPYC 7742 64-Core Processor @ 2.25 GHz | 1024 GB | 8x NVIDIA A100-SXM4-40GB |

Job submission

Athena uses the Slurm resource manager. Jobs should be submitted to the following partitions:

| Name | Time limit | Account suffix | Remarks |
| --- | --- | --- | --- |
| plgrid-gpu-a100 | 48h | -gpu-a100 | GPU A100 partition. |
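A minimal batch script for the plgrid-gpu-a100 partition could look like the sketch below. The grant name "plgexample" and the job name are hypothetical; substitute your own allocation name as shown by hpc-grants.

```shell
#!/bin/bash
# Minimal sketch of a GPU job script; "plgexample" is a hypothetical grant name
#SBATCH -J gpu-test
#SBATCH -A plgexample-gpu-a100
#SBATCH -p plgrid-gpu-a100
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

# Show which GPU was assigned to the job
nvidia-smi
```

Submit the script with sbatch, e.g. `sbatch job.sh`.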

MEMFS RAM storage

MEMFS uses RAM to create a temporary disk for the duration of the job. This space is the fastest storage available and should be used to store temporary files. In order to use MEMFS, please add the "-C memfs" parameter to your job specification. For example, use the following directive in your batch script: #SBATCH -C memfs

A storage volume will be set up for your job and referenced by the $MEMFS environment variable. Please note that memory allocated to MEMFS storage counts towards the total memory allocated for your job, as declared through "--mem" or "--mem-per-cpu".

Caution: When using MEMFS for file storage, be aware of the following limitations:

  • this method is only available for single-node jobs
  • the total amount of memory consumed by your job, including any MEMFS storage, must not exceed the value declared through "--mem".
  • this method may only be used if the total memory requirements of your job (including MEMFS storage) do not exceed the memory available on a single node (1024 GB per standard Athena node).
  • when using MEMFS, it is recommended to request the allocation of a full node for your job.
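Putting the above together, a single-node MEMFS job could be sketched as follows. The grant name and memory figure are hypothetical; size "--mem" to cover both your computation and the files you place in $MEMFS.

```shell
#!/bin/bash
# Sketch of a single-node job using MEMFS as scratch; grant name is hypothetical
#SBATCH -A plgexample-gpu-a100
#SBATCH -p plgrid-gpu-a100
#SBATCH -N 1
#SBATCH --exclusive            # a full node is recommended when using MEMFS
#SBATCH --mem=512G             # must cover the job's memory use plus MEMFS files
#SBATCH -C memfs               # request the MEMFS ramdisk

cd "$MEMFS"                    # $MEMFS points at the job's RAM-backed storage
# ... run the I/O-intensive computation here, writing temporaries to $MEMFS ...
# Copy results out before the job ends; $MEMFS is removed when the job finishes
```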

Accounts and computing grants

Athena uses a new scheme for naming the Slurm accounts of GPU computing grants. GPU computing grants using A100 resources use the grantname-gpu-a100 suffix. Please mind that sbatch -A grantname won't work on its own; you need to add the -gpu-a100 suffix! Available computing grants, with their respective account names (allocations), can be viewed using the hpc-grants command.
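For example, assuming a hypothetical grant named "plgexample", checking the allocation name and submitting with the correct suffix looks like this:

```shell
# List available grants together with their Slurm account names (allocations)
hpc-grants

# Submitting with the bare grant name fails; the -gpu-a100 suffix is required.
# "plgexample" and "job.sh" are hypothetical names:
sbatch -A plgexample-gpu-a100 -p plgrid-gpu-a100 job.sh
```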

Resources allocated on Athena are not normalized: 1 hour of GPU time equals 1 hour spent using a GPU.

Storage

Available storage spaces are described in the following table:

| Location | Location in the filesystem | Description |
| --- | --- | --- |
| $HOME | /net/people/plgrid/<login> | Storage for your own applications and configuration files |
| $SCRATCH | /net/tscratch/people/<login> | High-speed storage for short-lived data used in computations. Data older than 30 days can be deleted without notice. It is best to rely on the $SCRATCH environment variable. |
| $PLG_GROUPS_STORAGE/<group name> | /net/pr2/projects/plgrid/<group name> | Long-term storage for data kept for the duration of the computing grant. This space is provided by Ares storage; if you need permanent space for data, please apply for storage on the Ares cluster. |
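A common pattern is to stage input from long-term group storage into a faster space once, compute there, and copy the results back at the end. The sketch below assumes hypothetical group, file, and application names:

```shell
#!/bin/bash
# Sketch of staging data between storage spaces; all names are hypothetical
#SBATCH -A plgexample-gpu-a100
#SBATCH -p plgrid-gpu-a100

# Copy input from group storage once (avoid heavy I/O on group space)
cp "$PLG_GROUPS_STORAGE/plggexample/input.dat" "$SCRATCH/"
cd "$SCRATCH"
./my_app input.dat > results.out   # hypothetical application

# Copy results back to long-term storage before the job ends
cp results.out "$PLG_GROUPS_STORAGE/plggexample/"
```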

System Utilities

Please use the following commands to interact with the account and storage management system:

  • hpc-grants - shows available grants, resource allocations
  • hpc-fs - shows available storage
  • hpc-jobs - shows currently pending/running jobs
  • hpc-jobs-history - shows information about past jobs

Software

The module tree on Athena is not yet supported. For the time being, please install your own software in the $HOME directory.
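One way to keep self-installed software under $HOME is a private Python virtual environment; the directory name below is just an example:

```shell
# Create a private Python environment under $HOME (the path is an example)
python3 -m venv "$HOME/venvs/myenv"

# Activate it; packages installed with pip now go into $HOME, not system-wide
source "$HOME/venvs/myenv/bin/activate"
python -c 'import sys; print(sys.prefix)'
```

The same idea applies to compiled software: configure builds with an install prefix under $HOME (e.g. ./configure --prefix=$HOME/.local).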

More information

Athena follows Prometheus' configuration and usage patterns. Prometheus documentation can be found here: https://kdm.cyfronet.pl/portal/Prometheus:Basics
