Configuration of cryoSPARC environment
License
Each user should obtain their own license from https://cryosparc.com/download
To get access to the cryoSPARC installation on the Athena cluster:
- Make sure you have active access to the Athena cluster.
- Create a ticket at the PLGrid Helpdesk and apply for membership in the plggcryospar team.
- Log in to the Athena login node.

Log in to Athena login node
```
ssh <login>@athena.cyfronet.pl
```
Load the cryoSPARC module using the command:

Set cryoSPARC environment
```
module add <module_name>/<version>
```
cryoSPARC versions
The following modules containing different cryoSPARC versions are currently available on the Athena cluster:
- cryoSPARC/4.7.0
- cryoSPARC/4.6.2 (default)
- cryoSPARC/4.5.3-240807
- cryoSPARC/4.5.1
- cryoSPARC/4.4.1-240110
- cryoSPARC/4.4.0-231114
- cryoSPARC/4.3.0
- cryoSPARC/4.2.1-230621
- cryoSPARC/4.2.1-230403
- cryoSPARC/4.2.0-230302
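For instance, a typical session might look like the following (module systems usually resolve a bare module name to the marked default, but when in doubt, specify the version explicitly):

```
# Load a specific cryoSPARC version explicitly
module add cryoSPARC/4.7.0
# Or load without a version — this typically resolves to the
# marked default (4.6.2 at the time of writing)
module add cryoSPARC
```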
Stalling jobs
CryoSPARC version 4.6.0 (not listed above) introduced a regression, causing some jobs (especially 2D Classification) to stall and never finish.
The newer version, 4.6.2, should fix this issue, but if you encounter such behavior, please let us know via the Helpdesk.
Run the cryoSPARC configuration script.
It will configure your cryoSPARC environment, create a new user in the cryoSPARC database, and configure default cluster lanes.
Pass your license ID, e-mail, password (which will be used to log in to the cryoSPARC web app), and first and last name as arguments to the script.

Configure cryoSPARC
```
cryosparc_configuration --license <license-id> --email <your-email> --password '<password>' --firstname <first-name> --lastname <last-name>
```
Note that the password has to be enclosed in single quotes so that the configuration script can properly handle special characters like '$'.
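For illustration, a hypothetical invocation with example values (the license ID, e-mail, and names below are made-up placeholders, not real credentials) might look like:

```
# All values below are placeholders — substitute your own
cryosparc_configuration \
  --license 12345678-abcd-ef01-2345-6789abcdef01 \
  --email jane.doe@example.com \
  --password 'My$ecretPa55' \
  --firstname Jane \
  --lastname Doe
```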
Slurm account for configuration
The cryosparc_configuration script runs an srun job underneath, so one may need to specify a Slurm account to run this job. It can be done with the command:
```
export SLURM_ACCOUNT=<account_name>
```
where <account_name> should be a grant name with a '-gpu-a100' suffix: <grant_name>-gpu-a100
More details about the account naming scheme can be found on the documentation page.
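For example, assuming a hypothetical grant named plgcryoem2024, the export would be:

```
# plgcryoem2024 is a made-up grant name — use your own grant here
export SLURM_ACCOUNT=plgcryoem2024-gpu-a100
```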
Access problems
In the case of a "cryosparc_configuration: command not found" error, run in the terminal:
```
newgrp plggcryospar
```
to start a new subshell with the permissions of the plggcryospar team.
Accounts/grants management
Athena uses a new account naming scheme for computing grants as specified on the documentation page.
Our cryoSPARC tools provide a utility, cryosparc-accounts, with which one can specify which account/grant should be used for cryoSPARC jobs. Please follow the instructions provided by the `cryosparc-accounts help` command:
```
cryosparc-accounts show    prints currently selected account for cryoSPARC jobs
cryosparc-accounts set     sets account/grant to be used by cryoSPARC jobs to ACCOUNT_NAME
cryosparc-accounts clear   clears custom account/grant for cryoSPARC jobs
cryosparc-accounts help    prints this message
```
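A hypothetical session might look as follows (assuming, per the help text above, that `set` takes the account name as its argument; the grant name is a placeholder):

```
# Check which account cryoSPARC jobs currently use
cryosparc-accounts show
# Point cryoSPARC jobs at a specific grant account (placeholder name)
cryosparc-accounts set plgcryoem2024-gpu-a100
# Revert to the default account selection
cryosparc-accounts clear
```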
- Your cryoSPARC master setup is already done. All cryoSPARC master instances should be run in batch jobs.
cryoSPARC master job
The cryoSPARC master process must not be run on the login nodes of the Athena cluster. It should be run in the plgrid-services partition through the Slurm job described below.
Automated cryoSPARC master in batch job
The cryoSPARC master should be started through a batch job.
The batch script is located at $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm. The $CRYOSPARC_ADDITIONAL_FILES_DIR variable is available when the cryoSPARC module is loaded into the environment.
Load cryoSPARC module
Loading module
```
module add cryoSPARC/<version>
```
Submit job
cryosparc-master job submission
```
sbatch $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm
```
cryoSPARC master job
There should be only one job running the cryoSPARC master in the plgrid-services partition per user.
Check whether the job has started:
Check job status
```
squeue -j <JobID>
```
Common states of jobs
- PD - PENDING - job is awaiting resource allocation,
- R - RUNNING - job currently has an allocation and is running,
- CF - CONFIGURING - job has been allocated resources but is waiting for them to become ready for use (e.g., booting),
- CG - COMPLETING - job is in the process of completing; some processes on some nodes may still be active.
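For example, to list all of your own jobs rather than one specific job ID, standard Slurm filters can be used:

```
# List all of your jobs
squeue -u $USER
# Or check one specific job, e.g. a hypothetical master job ID
squeue -j 49145683
```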
Make a tunnel
In your directory, print the job log file:

Listing of job's log
```
cat cryosparc-master-log-<JobID>.txt
```
where `<JobID>` is your batch job ID, which is displayed after you submit the job, e.g. `cat cryosparc-master-log-49145683.txt`.
It will show you something like this:
Example of job log
```
Copy/Paste this in your local terminal to ssh tunnel with remote
-----------------------------------------------------------------
ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@athena.cyfronet.pl
-----------------------------------------------------------------
Then open a browser on your local machine to the following address
------------------------------------------------------------------
localhost:40100
------------------------------------------------------------------
```
Execute the given command in another shell on your local computer to create the tunnel:

Tunneling
```
ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@athena.cyfronet.pl
```
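If you prefer the tunnel to run in the background instead of occupying a terminal, ssh's standard -f option can be combined with the same command:

```
# -f backgrounds ssh after authentication; -N still means "no remote command"
ssh -f -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@athena.cyfronet.pl
```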
- Log in to the cryoSPARC web application - open `localhost:40100` in your browser.
To gracefully stop the cryoSPARC master in a batch job, send signal "2" using the scancel command:

Ending master job
```
scancel --batch --signal=2 <JobID>
```
In case the job isn't stopped by the user with the scancel command, it will be ended gracefully by sending signal "2" just before the maximum time is reached (this is done through the #SBATCH --signal=B:2@240 directive in the script). The cryoSPARC master instance should not be stopped while cryoSPARC jobs are running or queued; otherwise, jobs may fail or behave unexpectedly.
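If you no longer have the job ID at hand, a sketch like the following can recover it before stopping the master (the job-name filter is an assumption — adjust it to whatever name squeue shows for your master job):

```
# Find the master job ID (--name filter is a guess; check `squeue -u $USER` output)
squeue -u $USER --name=cryosparc-master
# Then stop it gracefully, substituting the ID you found
scancel --batch --signal=2 <JobID>
```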
Storage space
We recommend storing project directories and input files in group storage space. Athena shares the same group storage (pr2) with Ares, and access to it can be obtained through the PLGrid grant system (ACK Cyfronet HPC Storage - STORAGE-01 resource).
Please note that Athena's SCRATCH space is high-performance SSD-based storage dedicated to short-lived data in IO-intensive tasks and is limited to 25TB per user. Any data stored in SCRATCH for over 60 days can be deleted without notice. Therefore, it is not suitable for storing large and permanent data.
SSD cache
Athena SCRATCH space is based on high-performance NVMe drives, so enabling the "Cache on SSD" option in all jobs that support it is highly recommended. Performing such jobs without an SSD cache enabled can lead to much longer computing times.
The SSD cache is configured to be located in the $SCRATCH/.cryosparc/cache directory, with a 10TB quota and the lifetime of the cryosparc-master instance. As SCRATCH space on Athena is limited to 25TB per user, please try to use no more than 15TB for other data so cryoSPARC can fully use its cache space, especially if you work on very large datasets.
More details about cryoSPARC SSD cache functionality can be found in the official documentation.
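To keep an eye on how much of the quota the cache currently occupies, a simple check of the directory mentioned above suffices:

```
# Summarize current cache usage against the 10TB quota
du -sh $SCRATCH/.cryosparc/cache
```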
Cluster lanes
Default lanes
There are six default cluster lanes available on the Athena cluster.
- athena-plgrid-12h - primary lane dedicated for CPU and GPU jobs, with GPU jobs limited to 12 hours and CPU jobs to 72 hours.
- athena-plgrid-24h - similar to athena-plgrid-12h, but with maximum GPU job duration extended to 24 hours.
- athena-plgrid-6h - similar to athena-plgrid-12h, but with the maximum GPU job duration shortened to 6 hours (versions >= 4.4.0).
- athena-plgrid-bigmem-12h - same as athena-plgrid-12h but with doubled memory size (versions >= 4.2.1-230403).
- athena-plgrid-bigmem-24h - same as athena-plgrid-24h but with doubled memory size.
- athena-plgrid-bigmem-6h - same as athena-plgrid-6h but with doubled memory size (versions >= 4.4.0)
Bigmem lanes
As using longer or bigmem lanes may lead to longer queue waiting times and unnecessary resource reservations, it is recommended to conduct calculations on regular lanes whenever possible.
In version 4.2.0, default memory requirements have been readjusted for specific job types (like Non-uniform refinement and Topaz train), so they should fit in regular lanes in more cases.
If even the memory amount available with athena-plgrid-bigmem lanes is insufficient for your needs, please use the custom variables described below. For cryoSPARC versions lower than 4.4.1, consider creating an additional custom lane or contact support.
Custom variables
Starting from version 4.4.0, we began to introduce additional options to support more flexible job configuration. Currently, the following variables are supported by all of the default cluster lanes:
- max_hours - (versions >= 4.4.0) - overrides the default maximum job time for the selected cluster lane. With this variable, one can adjust the maximum job time to the expected execution time of a job. Shorter time reservations may improve queue waiting times. Remember that the scheduling system will kill the job if it isn't finished in time.
- mem_mult - (versions >= 4.4.1) - overrides the default memory multiplier for the selected cluster lane. The default setting of mem_mult is 1 for ordinary lanes and 2 for bigmem lanes. For example, setting mem_mult to 4 will reserve twice as much memory as a bigmem lane with default settings.
- slurm_account - (versions >= 4.4.1) - can be used to specify the Slurm account for the job. This variable overrides the account specified with the cryosparc-accounts utility.
- notification_email - (versions >= 4.6.2) - the email address specified in this variable will be used to send notifications about job status.
Details on how to use custom variables can be found in cryoSPARC official documentation.
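Custom variables are entered per job (or per lane) in the cryoSPARC interface and end up substituted into the cluster submission script. As a purely hypothetical sketch of how such a template could consume them (the actual Athena templates may differ), the relevant Slurm directives might look like:

```
# Hypothetical fragment of a cluster submission template — cryoSPARC replaces
# the {{ ... }} placeholders before handing the script to Slurm
#SBATCH --time={{ max_hours }}:00:00
#SBATCH --account={{ slurm_account }}
#SBATCH --mail-user={{ notification_email }}
#SBATCH --mail-type=END,FAIL
```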
In case you need access to new cluster lanes or cluster lane features on the older cryoSPARC version, please contact the Helpdesk.
Creating additional lanes
You can create additional cluster lanes to fulfill your specific requirements:
1. Start the cryosparc-master instance as usual (if not already running).
2. Attach to the cryosparc-master instance job:

Interactive job
```
srun -N1 -n1 --jobid <job_id> --pty /bin/bash
```
3. Load the cryoSPARC environment using modules:

Load cryoSPARC environment
```
module add cryoSPARC/<version>
```
4. Copy the cluster config cluster_info.json and the script template cluster_script.sh from $CRYOSPARC_ADDITIONAL_FILES_DIR to your working directory:

Copy files
```
cp $CRYOSPARC_ADDITIONAL_FILES_DIR/cluster_info.json .
cp $CRYOSPARC_ADDITIONAL_FILES_DIR/cluster_script.sh .
```
5. Modify the files accordingly (see the sketch after this procedure):
   - in the config cluster_info.json, change the name of the lane/cluster to avoid overwriting the default athena-* lanes,
   - in cluster_script.sh, change --time, --mem, or other parts of the script template.
6. Run the command to add the lane:

Run command
```
cryosparcm cluster connect
```
7. Repeat steps 5 and 6 to create other lanes if necessary.
8. End the interactive job:

End interactive job
```
exit
```
For more details, please refer to the official cryoSPARC documentation.
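As an illustration of step 5, the lane rename in cluster_info.json can even be scripted (the lane name below is a placeholder, and the assumption is that the lane name lives under a "name" key in that file — verify against your copy):

```
# Rename the lane so it does not clobber the default athena-* lanes
# ("my-custom-lane" is a placeholder — pick your own)
sed -i 's/"name"[[:space:]]*:[[:space:]]*"[^"]*"/"name": "my-custom-lane"/' cluster_info.json
# Then adjust limits in cluster_script.sh by hand, e.g. the --time / --mem lines
```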
External utilities
Topaz
The Topaz utility is installed on the Athena cluster and available for cryoSPARC Topaz jobs. The latest version of Topaz officially supported by cryoSPARC is 0.2.4, but version 0.2.5 is also expected to work. Please use one of the following paths to the topaz executable:
```
/net/software/v1/software/topaz/0.2.4/bin/topaz
/net/software/v1/software/topaz/0.2.5/bin/topaz
```
MotionCor2
The most recent version of MotionCor2 supported by cryoSPARC (as of cryoSPARC version 4.5.1) is 1.4.5. Please use the following path to the MotionCor2 executable:
/net/software/v1/software/MotionCor2/1.4.5-GCCcore-11.3.0/bin/motioncor2
deepEMhancer
Version 0.15 of deepEMhancer is available on the Athena cluster. Please use the following path to the deepEMhancer executable:
/net/software/v1/software/deepEMhancer/0.15/bin/deepemhancer.sh
DeepEMhancer models are located in the following directory:
/net/software/v1/software/deepEMhancer/0.15/deepEMhancerModels/production_checkpoints
Maintenance procedures
cryoSPARC database backup
A backup of the cryoSPARC database will be automatically created every seven or more days during cryosparc-master startup in the $SCRATCH/.cryosparc/database_backup directory, and three previous backups will be kept by default. A different location and other options can be specified in the $HOME/.cryosparc/cyfronet file.
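To verify that backups are being created as expected, listing the backup directory is enough:

```
# Newest backups first
ls -lt $SCRATCH/.cryosparc/database_backup
```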
Please refer to the official cryoSPARC documentation for more details about the cryoSPARC database backup and how to restore it.
Upgrading to a newer version
cryoSPARC receives frequent updates and patches (see https://cryosparc.com/updates). For each new version, a new module cryoSPARC/<version> is created. To upgrade your instance to a new version, you will have to:
- End all running cryoSPARC jobs.
- Shut down the cryosparc-master instance:
```
scancel --batch --signal=2 <cryosparc_master_jobid>
```
- Load module with the new version of cryoSPARC.
- Start a new cryosparc-master instance as usual.
- (Optional) Manually update all your custom lanes (specifically, the worker_bin_path entry in the cluster_info.json file should be adjusted and the lane re-added). Default lanes will be updated automatically.
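Put together, a hypothetical upgrade session could look like this (the job ID and target version below are placeholders):

```
# 1. (After finishing all cryoSPARC jobs) stop the running master gracefully
scancel --batch --signal=2 49145683
# 2. Swap in the new module version
module purge
module add cryoSPARC/4.7.0
# 3. Start a fresh master instance
sbatch $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm
```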
Migrating from Ares cluster
Athena is a computing cluster designed to handle tasks that heavily use GPU resources and require the highest IO performance. With 384 NVIDIA A100 GPUs and access to SSD-based SCRATCH storage, we expect computing and queue waiting times to be much shorter than on Ares. For users with the largest cryoSPARC databases, we also expect the web interface to become more responsive. All of that should result in a vastly improved overall experience when using cryoSPARC, and thus, migrating from Ares is highly recommended.
The cryoSPARC setup on the Athena cluster is very similar to the one already established on Ares (and previously on Prometheus), with the main difference being that SSD caching mechanisms are enabled.
As Athena shares group storage space (pr2) with Ares, it is easy to migrate existing projects from Ares via built-in cryoSPARC importing and exporting utilities that were highly improved in the 4.0 release. For details, please refer to the official guide on migrating projects to the new cryoSPARC instance.
With the new installation of cryoSPARC being well-tested in production, Athena is now the main cluster on which we support cryoSPARC. New versions are expected to be introduced only on Athena.