Initial Setup
Prerequisites
Python
module load migration-testing-spack/python/3.10.13
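You can confirm that the module loaded the expected interpreter:
python3 --version    # should report Python 3.10.13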
Install Toil
Option 1 - Install Toil in a Python Virtual Environment [Recommended for first-time users]
Create a Python virtual environment
python3 -m venv venv
This will create a Python virtual environment folder called venv in your current working directory.
Activate the virtual environment
source venv/bin/activate
Update pip in the virtual environment
pip install --upgrade pip
This step only needs to be done once per virtual environment
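To confirm the virtual environment is active, check that python and pip now resolve inside the venv folder:
which python     # should print a path ending in venv/bin/python
pip --version    # should report the pip installed in the venv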
Install Toil in the virtual environment
Install the latest version of Toil:
pip install toil[cwl]
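To verify the installation, check that the package and its CWL runner entry point are available (a quick sanity check; the exact version output will vary):
pip show toil            # confirms the package is installed in the venv
which toil-cwl-runner    # the CWL runner entry point installed by toil[cwl]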
Option 2 - Alternative Installation Methods
An alternative installation method using conda is included with the pluto-cwl repo here.
Install Node.js
Check whether Node is installed on your system with which node.
If Node is not installed, create a node folder in your shared workspace (/work/... on juno). In the node folder, run:
wget https://nodejs.org/dist/v17.8.0/node-v17.8.0-linux-x64.tar.xz
tar xvfJ node-v17.8.0-linux-x64.tar.xz
export PATH=$PATH:$PWD/node-v17.8.0-linux-x64/bin
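The export above only lasts for the current shell session. A quick check, plus one way to persist the change (assuming your login shell is bash):
node --version    # should print v17.8.0
echo "export PATH=\$PATH:$PWD/node-v17.8.0-linux-x64/bin" >> ~/.bashrc    # optional: persist across sessions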
Singularity
module load migration-testing/singularity/3.7.1
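Confirm the module loaded correctly:
singularity --version    # should report 3.7.1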
Set up the Singularity cache by following this link: Setup Singularity cache
Environment Variables
The following environment variables should be set before running a CWL pipeline
Required
export CWL_SINGULARITY_CACHE=[cache]
: as shown above, a directory with read/write access where Singularity containers are cached
export SINGULARITYENV_LC_ALL=en_US.UTF-8
: sets the locale inside the Singularity containers
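For example, a minimal sketch of setting the required variables, assuming a hypothetical cache directory in your shared workspace (replace the path with your own read/write directory):
mkdir -p /work/your_group/singularity_cache    # hypothetical path; use your own shared workspace
export CWL_SINGULARITY_CACHE=/work/your_group/singularity_cache
export SINGULARITYENV_LC_ALL=en_US.UTF-8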
Recommended
If you have an SLA or specific LSF or SLURM requirements, you can configure Toil to pass extra arguments to its worker jobs.
TOIL_SLURM_ARGS
: arguments passed to sbatch for the SLURM batch system
Example: export TOIL_SLURM_ARGS='--partition test01'
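Multiple sbatch arguments can be combined in the same string, for example (standard sbatch flags shown; adjust the values to your site):
export TOIL_SLURM_ARGS='--partition test01 --time=24:00:00'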
Optional
The environment variables here are for informational purposes and are not recommended for first-time users.
If you are automatically generating the cwl-cache:
SINGULARITY_CACHEDIR
: the directory where Singularity caches pulled images
If you are running a large pipeline, you might need to set up Docker Hub authentication so you do not hit the rate limit.
SINGULARITY_DOCKER_USERNAME
: Docker Hub login username
SINGULARITY_DOCKER_PASSWORD
: Docker Hub login password
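For example, with placeholder credentials (the toil-cwl-runner invocation below forwards these to worker jobs via --preserve-environment):
export SINGULARITY_DOCKER_USERNAME=your_dockerhub_user        # placeholder; use your Docker Hub login
export SINGULARITY_DOCKER_PASSWORD=your_dockerhub_password    # placeholder; avoid committing this to scripts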
Create the demo CWL and YAML files.
hello.cwl:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: ['echo']
stdout: stdout.txt
requirements:
  DockerRequirement:
    dockerPull: alpine:latest
inputs:
  message:
    type: string
    inputBinding:
      position: 1
outputs:
  stdout_txt:
    type: stdout
hello.yaml:
message: Hello world!
The hello.cwl file is also available on the file system:
/juno/work/ci/first_time_setup/hello.cwl
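Before submitting to the cluster, you can optionally sanity-check the CWL with cwltool, which is installed as a dependency of toil[cwl]:
cwltool --validate hello.cwl    # checks the document for errors without running it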
Pull the Singularity container and then run the CWL
Insert the Toil command into a SLURM script, such as hello-world.slurm:
#!/bin/bash
#SBATCH --job-name=toil_cwl
#SBATCH --partition=test01 # Replace with the correct partition name
#SBATCH --time=02:00:00 # Wall time (hh:mm:ss)
#SBATCH -o single.%j.out
#SBATCH -e single.%j.err
source venv/bin/activate # Activate your virtual environment
mkdir -p singularity_cache tmp    # -p avoids an error if the directories already exist
# Export the queue name
export TOIL_SLURM_ARGS='--partition test01'
# Navigate to your workflow directory (if needed)
#cd /path/to/your/workflow
# Run your workflow
export TMPDIR=$PWD/tmp
export CWL_SINGULARITY_CACHE=$PWD/singularity_cache
export SINGULARITYENV_LC_ALL=en_US.UTF-8
toil-cwl-runner \
    --batchSystem slurm \
    --jobStore file:job-store-name \
    --singularity \
    --disableCaching \
    --logFile cwltoil.log \
    --logDebug \
    --preserve-environment SINGULARITY_DOCKER_USERNAME SINGULARITY_DOCKER_PASSWORD TMPDIR \
    --outdir hello-world \
    hello.cwl hello.yaml
Run the command:
sbatch hello-world.slurm
The standard output and standard error are directed to single.%j.out and single.%j.err, where "%j" is replaced by the job number.
If you get the response Hello world!, you are all set. Congrats, you finished your first CWL run!
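The echoed message is captured in stdout.txt inside the directory set by --outdir:
cat hello-world/stdout.txt    # should print: Hello world!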