...

Code Block
languagebash
titlecryosparc-master.slurm
#!/bin/bash
#SBATCH --partition plgrid-services
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 4
#SBATCH --mem 10GB
#SBATCH --time 14-0
#SBATCH -C localfs
#SBATCH --dependency=singleton
#SBATCH --job-name cryosparc-master
#SBATCH --output cryosparc-master-log-%J.txt
#SBATCH --signal=B:2@240


## Load environment for cryoSPARC
module add plgrid/apps/cryosparc

## get tunneling info
ipnport=$CRYOSPARC_BASE_PORT
ipnip=$(hostname -i)
user=$USER

## print tunneling instructions to cryosparc-master-log-<JobID>.txt
echo -e "
    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    ssh -o ServerAliveInterval=300 -N -L $ipnport:$ipnip:$ipnport ${user}@pro.cyfronet.pl
    -----------------------------------------------------------------

    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    localhost:$ipnport
    ------------------------------------------------------------------
    "

## start a cryoSPARC master server
cryosparcm restart

## keep the job running until the user sends "scancel --batch --signal=2 <JobID>" or SLURM sends the same signal shortly before the requested walltime ends (see --signal=B:2@240 above)
source $CRYOSPARC_GLOBAL_ADDITIONAL_FILES_DIR/cryosparc_additional_bash_functions.sh
loop-and-stop

The script above is located at $CRYOSPARC_ADDITIONAL_FILES_DIR/cryosparc-master.slurm. The $CRYOSPARC_ADDITIONAL_FILES_DIR variable is available once the cryoSPARC module is loaded into the environment. You can copy the script to your working directory.
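
The real `loop-and-stop` function ships with the module and is not reproduced on this page. As a rough illustration of how such a function could react to signal 2, here is a minimal sketch; the names `loop_and_stop_sketch`, `on_sigint` and `STOP_CMD` are hypothetical, and the assumption is that the master is stopped with `cryosparcm stop`.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the module's actual loop-and-stop implementation.

# Command used to stop the master (assumption). It can be overridden so the
# sketch can be tried on a machine without cryoSPARC.
STOP_CMD=${STOP_CMD:-"cryosparcm stop"}

keep_running=1

# Handler for signal 2 (SIGINT): leave the loop and stop the master.
on_sigint() {
    keep_running=0
    $STOP_CMD
}

# Keep the batch job alive until SIGINT arrives.
loop_and_stop_sketch() {
    trap on_sigint INT
    while [ "$keep_running" -eq 1 ]; do
        # Sleep in the background and wait on it: the shell runs a trap only
        # after the current foreground command returns, whereas `wait`
        # returns as soon as a trapped signal is received.
        sleep 600 &
        sleep_pid=$!
        wait "$sleep_pid"
        # Clean up the sleeper if the wait was interrupted by the signal.
        kill "$sleep_pid" 2>/dev/null
    done
}
```

In a batch script the function would be called as the last command. Waiting on a background `sleep` rather than sleeping in the foreground is what lets the handler run promptly when `scancel --batch --signal=2 <JobID>` is issued.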

...

  1. Submit job

    Code Block
    languagebash
    titlejob submission
    sbatch cryosparc-master.slurm


    Warning
    titlecryoSPARC master job

    Each user should run only one cryoSPARC master job in the plgrid-services partition.


  2. Check whether the job has started

    Code Block
    languagebash
    titlejobs status
    squeue -j <JobID>


  3. Common states of jobs

    • PD - PENDING - Job is awaiting resource allocation.
    • R - RUNNING - Job currently has an allocation and is running.
    • CF - CONFIGURING - Job has been allocated resources but is waiting for them to become ready for use (e.g. booting). On Prometheus the CF state can last up to 8 minutes when the nodes have been in power save mode.
    • CG - COMPLETING - Job is in the process of completing. Some processes on some nodes may still be active.
  4. Make a tunnel

    In your working directory, print the job log file:

    Code Block
    languagebash
    titleListing of job's log
    cat cryosparc-master-log-<JobID>.txt

    where `<JobID>` is the job ID that sbatch prints when you submit the job, e.g. `cat cryosparc-master-log-49145683.txt`

    It will show you something like this:

    Code Block
    languagebash
    titleExample of job log
    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@pro.cyfronet.pl
    -----------------------------------------------------------------
    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    localhost:40100
    ------------------------------------------------------------------


  5. In another shell on your local computer, run the given command to create the tunnel:

    Code Block
    languagebash
    titleTunneling
    ssh -o ServerAliveInterval=300 -N -L 40100:172.20.68.193:40100 plgusername@pro.cyfronet.pl


  6. Log in to the cryoSPARC web application: open `localhost:40100` in your browser.
  7. To gracefully stop the cryoSPARC master running in the batch job, send signal "2" with the scancel --batch --signal=2 <JobID> command.

    Code Block
    languagebash
    titleEnding master job
    scancel --batch --signal=2 <JobID>

    If the user does not stop the job with scancel, SLURM stops it gracefully by sending signal "2" about 240 seconds before the time limit is reached (this is configured by the #SBATCH --signal=B:2@240 line in the script).
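
Parts of steps 2 to 5 can be scripted. The sketch below defines two small helpers: one translates the compact state code reported by squeue into the descriptions listed in step 3, the other extracts the ssh tunnel command from the master job log. The function names `describe_state` and `tunnel_command_from_log` are hypothetical and are not provided by the cryoSPARC module.

```shell
#!/bin/bash
# Illustrative helpers; the function names are hypothetical.

# Translate a compact squeue state code into the description from step 3.
describe_state() {
    case "$1" in
        PD) echo "PENDING: awaiting resource allocation" ;;
        R)  echo "RUNNING: job has an allocation and is running" ;;
        CF) echo "CONFIGURING: resources allocated, waiting for them to become ready" ;;
        CG) echo "COMPLETING: job is in the process of completing" ;;
        *)  echo "other state: $1" ;;
    esac
}

# Print the ssh tunnel command found in a master job log file.
tunnel_command_from_log() {
    # Keep the first line whose first word is "ssh" and drop its indentation.
    sed -n 's/^[[:space:]]*\(ssh .*\)$/\1/p' "$1" | head -n 1
}

# Possible use on the cluster (commented out because it requires SLURM):
#   jobid=$(sbatch --parsable cryosparc-master.slurm)   # prints only the job ID
#   describe_state "$(squeue -h -j "$jobid" -o %t)"     # -h: no header, %t: compact state
#   tunnel_command_from_log "cryosparc-master-log-${jobid}.txt"
```

The command printed by `tunnel_command_from_log` is meant to be copied to your local computer and run there, as in step 5; running it on the cluster would not create a usable tunnel.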

Adding optional lanes

You can create additional lanes for other maximum durations of SLURM jobs:

...