Table of Contents

Configuration of cryoSPARC environment

Info
titleLicense

Each user should have their own license obtained from https://cryosparc.com/download and apply for membership in the plggcryospar team in Portal PLGrid.

To get access to the cryoSPARC installation at the Prometheus cluster:

  1. Apply for membership in the plggcryospar team in Portal PLGrid and, through Helpdesk PLGrid, ask for registration in Cyfronet's internal cryoSPARC users' database and for a dedicated port for access to the cryoSPARC master.
  2. Log in to the Prometheus login node.

    Code Block
    languagebash
    titleLog into Prometheus login node
    ssh <login>@pro.cyfronet.pl


  3. Load the cryoSPARC module using the command

    Code Block
    languagebash
    titleSet cryoSPARC environment
    module add plgrid/apps/cryosparc/3.1


  4. Run the cryoSPARC configuration script. It will configure your cryoSPARC environment, create your user in the cryoSPARC database, and configure two lanes for external jobs: prometheus-gpu, which uses the plgrid-gpu partition for GPU jobs, and prometheus-gpu-v100, which uses the plgrid-gpu-v100 partition. Both lanes use the plgrid partition for CPU-only jobs. As arguments for the script, pass your license ID, your e-mail and password (they will be used to log in to the cryoSPARC web app), and your first and last name.

    Code Block
    languagebash
    titleConfigure cryoSPARC
    cryosparc_configuration --license <XXXX> --email <your-email> --password <password> --firstname <Givenname> --lastname <Surname> 


    Info
    titleAccess problems

    In the case of a "cryosparc_configuration: command not found" error, run in the terminal

    Code Block
    languagebash
    newgrp plggcryospar

    to start a new subshell with the permissions of the plggcryospar team.
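
    You can also verify that your plggcryospar membership is active in the current session (standard Linux commands; newly granted membership in Portal PLGrid may need some time to propagate to the cluster):

    Code Block
    languagebash
    titleCheck group membership
    id -nG | grep -w plggcryospar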


    Info
    titleAccess to GPU partitions

    To use GPUs on the Prometheus cluster, you must apply for GPU resources at Portal PLGrid in your computational grant. At Prometheus, two kinds of GPGPUs are available: Nvidia Tesla K40 XL in the plgrid-gpu partition and Nvidia Tesla V100 in the plgrid-gpu-v100 partition; you have to specify in your grant which of them you would like to use.

    To check whether you have access to a given partition, run the command below on the Prometheus login node and check whether your PLGrid computational grant is on the AllowAccounts list:

    • partition plgrid-gpu

      Code Block
      languagebash
      scontrol show partition plgrid-gpu | grep Accounts | grep <PLGrid grant name>


    • partition plgrid-gpu-v100

      Code Block
      languagebash
      scontrol show partition plgrid-gpu-v100 | grep Accounts | grep <PLGrid grant name>


    If you do not have access to one or both of the above partitions, check your PLGrid computational grant details at Portal PLGrid. If your grant lists GPU resources and access to the required queue or queues is still not possible, please contact Helpdesk at https://helpdesk.plgrid.pl.
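
    If you want to check both partitions at once, a small helper along these lines can be used (a sketch only; pass your PLGrid grant name as the first argument):

    Code Block
    languagebash
    titleCheck both GPU partitions
    #!/bin/bash
    # Usage: ./check_gpu_access.sh <PLGrid grant name>
    grant="$1"
    for part in plgrid-gpu plgrid-gpu-v100; do
      # extract the AllowAccounts field and look for the grant name in it
      if scontrol show partition "$part" | grep -o 'AllowAccounts=[^ ]*' | grep -q "$grant"; then
        echo "$part: grant $grant found in AllowAccounts"
      else
        echo "$part: grant $grant NOT found in AllowAccounts"
      fi
    done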


  5. Your cryoSPARC master setup is now done. All subsequent cryoSPARC master instances should be run in batch jobs.


Warning
titlecryoSPARC master job

The cryoSPARC master process must not be run on the login nodes of the Prometheus cluster. It should be run in the plgrid-services partition through the SLURM job described below.

...

The cryoSPARC master can be started through a batch job.

Code Block
languagebash
titlecryosparc-master.slurm
#!/bin/bash
#SBATCH --partition plgrid-services
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 14
#SBATCH --mem 10GB
#SBATCH --time 14-0
#SBATCH -C localfs
#SBATCH --dependency=singleton
#SBATCH --job-name cryosparc-master
#SBATCH --output cryosparc-master-log-%J.txt
#SBATCH --signal=B:2@240


## Load environment for cryoSPARC
module add plgrid/apps/cryosparc/3.1

## get tunneling info
ipnport=$CRYOSPARC_BASE_PORT
ipnip=$(hostname -i)
user=$USER

## print tunneling instructions to cryosparc-master-log-<JobID>.txt
echo -e "
    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    ssh -o ServerAliveInterval=300 -N -L $ipnport:$ipnip:$ipnport ${user}@pro.cyfronet.pl
    -----------------------------------------------------------------

    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    localhost:$ipnport
    ------------------------------------------------------------------
    "

## start a cryoSPARC master server
cryosparcm restart

## keep the job running until "scancel --batch --signal=2 <JobID>" is sent by the user or SLURM sends signal 2 at the end of the requested walltime, then stop cryoSPARC master gracefully
source $CRYOSPARC_GLOBAL_ADDITIONAL_FILES_DIR/cryosparc_additional_bash_functions.sh
loop-and-stop

The above script is located at $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm. The $CRYOSPARC_ADDITIONAL_FILES_DIR variable is available when the cryoSPARC module is loaded into the environment. You can copy the script into your working directory:

Code Block
languagebash
titleSLURM script template copy command
cp $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm .


  1. Submit the job

    Code Block
    languagebash
    titlejob submission
    sbatch cryosparc-master.slurm


    Warning
    titlecryoSPARC master job

    There should be only one job per user running the cryoSPARC master in the plgrid-services partition.


  2. Check whether the job has started.

    Code Block
    languagebash
    titlejobs status
    squeue -j <JobID>


  3. Common states of jobs

    • PD - PENDING - Job is awaiting resource allocation.
    • R - RUNNING - Job currently has an allocation and is running.
    • CF - CONFIGURING - Job has been allocated resources, but is waiting for them to become ready for use (e.g. booting). On Prometheus, the CF state can last up to 8 minutes when the allocated nodes have been in power-save mode.
    • CG - COMPLETING  - Job is in the process of completing. Some processes on some nodes may still be active.
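
    If you only want to see the current state of your master job, squeue's standard format option can be used, for example:

    Code Block
    languagebash
    titlejob state check
    # %i = job id, %j = job name, %T = state, %M = time used, %L = time left
    squeue -j <JobID> -o "%i %j %T %M %L"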
  4. Make a tunnel

    In your working directory, display the job log file:

    Code Block
    languagebash
    titleListing of job's log
    cat cryosparc-master-log-<JobID>.txt

    where `<JobID>` is the job ID displayed when you submit the job, e.g. `cat cryosparc-master-log-49145683.txt`

    It will show you something like this:

    Code Block
    languagebash
    titleExample of job log
    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@pro.cyfronet.pl
    -----------------------------------------------------------------
    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    localhost:40100
    ------------------------------------------------------------------


  5. In another shell on your local computer, execute the given command to create the tunnel:

    Code Block
    languagebash
    titleTunneling
    ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@pro.cyfronet.pl
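
    If you prefer not to keep this terminal occupied, the same tunnel can be started in the background with OpenSSH's standard -f option (a variant of the command above, using the same example values):

    Code Block
    languagebash
    titleBackground tunnel (optional)
    ssh -o ServerAliveInterval=300 -f -N -L 40100:172.20.68.193:40100 plgusername@pro.cyfronet.pl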


  6. Log into the cryoSPARC web application - open `localhost:40100` in your browser.
  7. To gracefully stop the cryoSPARC master in a batch job, send signal "2" using the scancel --batch --signal=2 <JobID> command.

    Code Block
    languagebash
    titleEnding master job
    scancel --batch --signal=2 <JobID>

    If the job is not stopped by the user through scancel, it will be ended gracefully by sending signal "2" just before the maximal walltime (this is done by the #SBATCH --signal=B:2@240 directive in the script).
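
    To confirm that the master job has ended, you can list your jobs filtered by the job name set in the script (standard SLURM options):

    Code Block
    languagebash
    titlecheck that the master job has ended
    squeue -u $USER -n cryosparc-master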

Adding optional lanes

You can create additional lanes with a different maximal SLURM job duration or other settings:

  1. Start an interactive job using the command

    Code Block
    languagebash
    titleInteractive job
    srun -p plgrid-services --nodes=1 --ntasks=1 --mem=5GB --time=0-1 --pty bash


  2. Load the cryoSPARC environment using modules

    Code Block
    languagebash
    titleLoad cryoSPARC environment
    module add plgrid/apps/cryosparc/3.2


  3. Copy the cluster config cluster_info.json and the script template cluster_script.sh from the $CRYOSPARC_ADDITIONAL_FILES_DIR directory to your working directory

    Code Block
    languagebash
    titleCopy files
    cp $CRYOSPARC_ADDITIONAL_FILES_DIR/cluster_info.json .
    cp $CRYOSPARC_ADDITIONAL_FILES_DIR/cluster_script.sh .


  4. Modify the files accordingly (you can first inspect the copied templates as shown below)
    1. in cluster_info.json, change the name of the lane/cluster to avoid overwriting the default prometheus* lanes
    2. in cluster_script.sh, change --time, --partition, or other parts of the script template as needed
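
    For example, to see which values you are about to change, you can inspect the copied templates first (a read-only check; the exact fields and template contents come from the site-provided files):

    Code Block
    languagebash
    titleinspect copied lane templates
    grep '"name"' cluster_info.json
    grep -E -- '--(time|partition)' cluster_script.sh
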
  5. Start the cryoSPARC master

    Warning
    titlecryoSPARC master job

    There should be only one job per user running the cryoSPARC master. Therefore, stop the job with the cryoSPARC master before this step.


    Code Block
    languagebash
    titlerun cryoSPARC master
    cryosparcm restart



  6. Run the command cryosparcm cluster connect <name-of-cluster-from-cluster_info.json> to add the lane/cluster

    Code Block
    languagebash
    titleadd lane
    cryosparcm cluster connect <name-of-cluster-from-cluster_info.json>


  7. Repeat the above points to create another lane if necessary
  8. Stop cryoSPARC master

    Code Block
    languagebash
    titlestop cryoSPARC master
    cryosparcm stop


  9. End interactive job

    Code Block
    languagebash
    titleend interactive job
    exit


Upgrading to a newer version

cryoSPARC is frequently updated with new releases and patches (see https://cryosparc.com/updates). For each new version, a new module plgrid/apps/cryosparc/<version> is created and made the default after testing (including, among other tests, the standard benchmark). To upgrade to a new version, one has to

  1. End all running cryoSPARC jobs.
  2. Update cryosparc-master.slurm script (or check that it uses the default cryoSPARC module).
  3. Manually update all created lanes (both those created during configuration and those added later). The script described in the Known problems and issues section can be used to automate this process.
  4. Restart cryoSPARC master server.
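
To see which cryoSPARC module versions are currently installed on Prometheus and which one is the default, the standard module command can be used:

Code Block
languagebash
titlelist available cryoSPARC modules
module avail plgrid/apps/cryosparc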

Known problems and issues