...

Computing resources on Ares are assigned based on PLGrid computing grants (more information can be found here: Obliczenia w PLGrid). To perform computations on Ares you need to obtain a computing grant and also apply for the Ares access service through the PLGrid portal (https://aplikacje.plgrid.pl/service/dostep-do-klastra-ares-w-osrodku-cyfronet/).

If your grant is active and you have applied for service access, the request should be accepted in about half an hour. Please report any issues through the helpdesk.

...

| Partition | Number of nodes | CPU | RAM | Proportional RAM for one CPU | Proportional RAM for one GPU | Proportional CPUs for one GPU | Accelerator |
|---|---|---|---|---|---|---|---|
| plgrid (includes plgrid-long) | 532 | 48 cores, Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz | 192GB | 3850MB | n/a | n/a | |
| plgrid-bigmem | 256 | 48 cores, Intel(R) Xeon(R) Platinum 8268 CPU @ 2.90GHz | 384GB | 7700MB | n/a | n/a | |
| plgrid-gpu-v100 | 9 | 32 cores, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz | 384GB | n/a | 46000MB | 4 | 8x Tesla V100-SXM2 |

...

Ares uses the Slurm resource manager. Jobs should be submitted to the following partitions:

| Name | Timelimit | Resource type (account suffix) | Access requirements | Description |
|---|---|---|---|---|
| plgrid | 72h | -cpu | Generally available. | Standard partition. |
| plgrid-testing | 1h | -cpu | Generally available. | High priority, testing jobs, limited to 1 running job. |
| plgrid-now | 12h | -cpu | Generally available. | The highest priority, interactive jobs, limited to 1 running or queued job. |
| plgrid-long | 168h | -cpu | Requires a grant with a maximum job runtime of 168h. | Used for jobs with extended runtime. |
| plgrid-bigmem | 72h | -cpu-bigmem | Requires a grant with CPU-BIGMEM resources. | Resources used for jobs requiring an extended amount of memory. |
| plgrid-gpu-v100 | 48h | -gpu | Requires a grant with GPGPU resources. | GPU partition. |

If you are unsure how to properly configure your job on Ares, please consult this guide: Job configuration

Accounts and computing grants

Ares uses a new naming scheme for CPU and GPU computing accounts, which are supplied with the -A parameter of the sbatch command. Currently, accounts are named in the following manner:

| Resource | Account name |
|---|---|
| CPU | grantname-cpu |
| CPU bigmem nodes | grantname-cpu-bigmem |
| GPU | grantname-gpu |

Please mind that sbatch -A grantname won't work on its own. You need to add the -cpu, -cpu-bigmem, or -gpu suffix! Available computing grants, with respective account names (allocations), can be viewed by using the hpc-grants command.
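The suffix rule can be illustrated with a small helper that composes the sbatch command line for a given partition. This is only a sketch: the function name sbatch_command, the grant name grantname, and the script name job.sh are hypothetical placeholders, and the partition-to-suffix mapping is copied from the partition table on this page.

```python
import shlex

# Account suffix required by each partition (copied from the partition table).
PARTITION_SUFFIX = {
    "plgrid": "-cpu",
    "plgrid-testing": "-cpu",
    "plgrid-now": "-cpu",
    "plgrid-long": "-cpu",
    "plgrid-bigmem": "-cpu-bigmem",
    "plgrid-gpu-v100": "-gpu",
}


def sbatch_command(grant_name, partition, script):
    """Build an sbatch command line using the suffixed account name.

    Hypothetical helper: it only formats a string and does not submit anything.
    """
    suffix = PARTITION_SUFFIX[partition]  # raises KeyError for unknown partitions
    account = grant_name + suffix
    return shlex.join(["sbatch", "-A", account, "-p", partition, script])


# "grantname" and "job.sh" are placeholders for your own grant and job script.
print(sbatch_command("grantname", "plgrid-gpu-v100", "job.sh"))
```

The account names actually available to you should still be checked with the hpc-grants command.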

Resources allocated on Ares don't use normalization, which was used on Prometheus and previous clusters. 1 hour of CPU time equals 1 hour spent on a computing core with a proportional amount of memory (consult the table above). The billing system accounts for jobs that use more memory than the proportional amount. If a job uses more memory per allocated CPU than the proportional amount, it will be billed as if it had used more CPUs. The billed amount can be calculated by dividing the memory used by the proportional memory per core and rounding the result up to the nearest integer. Jobs on CPU partitions are always billed in CPU hours.

The same principle applies to GPU resources, where the GPU-hour is the billing unit, and proportional memory per GPU and proportional CPUs per GPU are defined (consult the table above).

The cost of running a job can be expressed as a simple algorithm for CPUs:

Code Block
cost_cpu    = job_cpus_used * job_duration
cost_memory = ceil(job_memory_used/memory_per_cpu) * job_duration
final_cost  = max(cost_cpu, cost_memory)
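As a worked example, the CPU formula can be written as a small Python function. This is a minimal sketch: the function and parameter names are illustrative, memory is assumed to be the total job memory in MB, and duration is assumed to be in hours, so the result is in CPU hours.

```python
import math


def cpu_job_cost(cpus_used, memory_used_mb, duration_hours, memory_per_cpu_mb):
    """Billed CPU hours for a job on a CPU partition (illustrative sketch)."""
    cost_cpu = cpus_used * duration_hours
    # Memory above the proportional amount is billed as additional CPUs.
    cost_memory = math.ceil(memory_used_mb / memory_per_cpu_mb) * duration_hours
    return max(cost_cpu, cost_memory)


# 4 CPUs and 32000 MB for 10 hours on plgrid (3850 MB per CPU):
# the memory corresponds to ceil(32000 / 3850) = 9 CPUs, so 90 CPU hours are billed.
print(cpu_job_cost(4, 32000, 10, 3850))  # prints 90
```

Requesting no more than the proportional memory per CPU keeps the bill equal to the CPU time actually allocated.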

and for GPUs, where each GPU has a proportional amount of memory and a proportional number of CPUs:

Code Block
cost_gpu    = job_gpus_used * job_duration
cost_cpu    = ceil(job_cpus_used/cpus_per_gpu) * job_duration
cost_memory = ceil(job_memory_used/memory_per_gpu) * job_duration
final_cost  = max(cost_gpu, cost_cpu, cost_memory)
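The GPU formula can be sketched the same way. The default values below (4 CPUs and 46000 MB per GPU) are taken from the plgrid-gpu-v100 row of the hardware table, reading its memory figure as megabytes, which is an assumption; the function and parameter names are illustrative.

```python
import math


def gpu_job_cost(gpus_used, cpus_used, memory_used_mb, duration_hours,
                 cpus_per_gpu=4, memory_per_gpu_mb=46000):
    """Billed GPU hours for a job on a GPU partition (illustrative sketch)."""
    cost_gpu = gpus_used * duration_hours
    # CPUs and memory above the proportional amounts are billed as additional GPUs.
    cost_cpu = math.ceil(cpus_used / cpus_per_gpu) * duration_hours
    cost_memory = math.ceil(memory_used_mb / memory_per_gpu_mb) * duration_hours
    return max(cost_gpu, cost_cpu, cost_memory)


# 1 GPU, 8 CPUs and 40000 MB for 5 hours:
# 8 CPUs correspond to ceil(8 / 4) = 2 GPUs, so 10 GPU hours are billed.
print(gpu_job_cost(1, 8, 40000, 5))  # prints 10
```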

...

| Location | Location in the filesystem | Purpose |
|---|---|---|
| $HOME | /net/people/plgrid/<login> | Storing own applications and configuration files. Limited to 10GB. |
| $SCRATCH | /net/ascratch/people/<login> | High-speed storage for short-lived data used in computations. Data older than 30 days can be deleted without notice. It is best to rely on the $SCRATCH environment variable. |
| $PLG_GROUPS_STORAGE/<group name> | /net/pr2/projects/plgrid/<group name> | Long-term storage for data living for the period of computing grant. Should be used for storing significant amounts of data. |

Current usage, capacity and other storage attributes can be checked by issuing the hpc-fs command.
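Because data in $SCRATCH older than 30 days can be deleted without notice, it may help to list files approaching that age before they disappear. The sketch below is an illustration only: the function name files_older_than is hypothetical, and it assumes that file age is judged by modification time, which this page does not state.

```python
import os
import time
from pathlib import Path


def files_older_than(root, days=30, now=None):
    """Yield files under root whose modification time is more than `days` days ago.

    Assumption: age is measured by mtime; the cluster's cleanup may use another criterion.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path


if __name__ == "__main__":
    # Rely on the $SCRATCH environment variable rather than a hard-coded path.
    scratch = os.environ.get("SCRATCH")
    if scratch:
        for old_file in files_older_than(scratch):
            print(old_file)
```

Files worth keeping can then be moved to the group storage under $PLG_GROUPS_STORAGE.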

...

and the environment can be purged by:

module purge

Sample job scripts

Example job scripts are available on this page: Sample scripts

More information

Ares follows Prometheus' configuration and usage patterns. Prometheus documentation can be found here: https://kdm.cyfronet.pl/portal/Prometheus:Basics