# Initial Setup

## Prerequisites

* [Python 3](#python)
* [Toil](#install-toil)
* [Node.js](#install-node.js)
* [Singularity](#singularity)

## Python

{% tabs %}
{% tab title="Iris" %}
`module load migration-testing-spack/python/3.10.13`
{% endtab %}

{% tab title="Juno" %}
`module load python/3.7.1`
{% endtab %}
{% endtabs %}

## Install Toil

### Option 1 - Install Toil in Python Virtual Environment \[Recommended for first time users]

#### Create a python virtual environment

```
python3 -m venv venv
```

This will create a Python virtual environment folder called `venv` in your current working directory.

#### Activate the virtual environment

```
source venv/bin/activate
```

{% hint style="info" %}
You need to activate the virtual environment in order to use it. Remember this command, as you will need to run it at the start of any session where you use Toil.
{% endhint %}
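To confirm whether the environment is active, you can check the `VIRTUAL_ENV` variable that the activate script sets (a minimal sketch):

```shell
# The venv activate script sets $VIRTUAL_ENV; `deactivate` unsets it again.
if [ -n "$VIRTUAL_ENV" ]; then
    echo "Active virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment active; run: source venv/bin/activate"
fi
```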

#### Update pip in the virtual environment

```
pip install --upgrade pip
```

{% hint style="success" %}
This step only needs to be done once per virtual environment
{% endhint %}
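You can verify the upgrade and check which interpreter `pip` belongs to with a quick sketch like this (once the venv is active, the reported paths should point inside the `venv` directory):

```shell
# Show the pip version and the interpreter that will be used.
python3 -m pip --version
command -v python3
```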

#### Install Toil in the virtual environment

{% tabs %}
{% tab title="Iris" %}
**Install the latest version of Toil**

`pip install toil[cwl]`
{% endtab %}

{% tab title="Juno" %}
**Install toil version 5.4.3**

```
pip install "git+https://github.com/mskcc/toil.git@5.4.3#egg=toil[cwl]"
```

**Modify process.py file**

```
<virtualenv_location>/lib/python3.7/site-packages/cwltool/process.py
```

```
# change shutil.copy2(src, dst) -> shutil.copy(src, dst);
# below is a git diff snippet comparing the stock and modified cwltool/process.py

--- a/venv/lib/python3.7/site-packages/cwltool/process.py
+++ b/compute/juno/bic/ROOT/work/voyager/prod_ridgeback2/lib/python3.7/site-packages/cwltool/process.py
@@ -394,7 +394,7 @@ def relocateOutputs(
                     os.unlink(dst)
                 shutil.copytree(src, dst)
             else:
-                shutil.copy2(src, dst)
+                shutil.copy(src, dst)
```
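If you prefer not to edit the file by hand, the same one-line change can be applied with `sed` (a sketch; the path below assumes the venv layout from the steps above, so adjust it to your own install, and a `.bak` backup of the original is kept):

```shell
# Replace the shutil.copy2 call with shutil.copy, keeping a .bak backup.
# The path is an example; point it at your own virtualenv.
PROC="venv/lib/python3.7/site-packages/cwltool/process.py"
if [ -f "$PROC" ]; then
    sed -i.bak 's/shutil\.copy2(src, dst)/shutil.copy(src, dst)/' "$PROC"
fi
```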

{% endtab %}
{% endtabs %}

### Option 2 - Alternative Installation Methods

An alternative installation method using `conda` is included with the `pluto-cwl` repo [here](https://github.com/mskcc/pluto-cwl#installation--setup).

#### Resources:

* [official installation documentation](https://toil.readthedocs.io/en/latest/gettingStarted/install.html)

## Install Node.js

Check if Node is installed on your system with `which node`.
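A slightly fuller check that also reports the installed version (a minimal sketch):

```shell
# Report the Node.js version if present, otherwise say it is missing.
if command -v node >/dev/null 2>&1; then
    node --version
else
    echo "node not found on PATH"
fi
```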

If Node is not installed, create a `node` folder in your shared workspace (`/work/...` on Juno).

In the `node` folder run:

```
wget https://nodejs.org/dist/v17.8.0/node-v17.8.0-linux-x64.tar.xz
tar xvfJ node-v17.8.0-linux-x64.tar.xz
export PATH=$PATH:$PWD/node-v17.8.0-linux-x64/bin
```

{% hint style="info" %}
Make sure to add the Node path to your `.profile` so you don't need to set it every time you log in to Juno. You can get the full Node path by running `which node`.
{% endhint %}
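The step above can be sketched as follows (assuming the install location from the commands above; adjust `NODE_BIN` to your own path):

```shell
# Append the Node bin directory to PATH in ~/.profile, once only.
NODE_BIN="$PWD/node-v17.8.0-linux-x64/bin"
if ! grep -qs "$NODE_BIN" "$HOME/.profile"; then
    echo "export PATH=\$PATH:$NODE_BIN" >> "$HOME/.profile"
fi
```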

## Singularity

{% tabs %}
{% tab title="Iris" %}

```
module load migration-testing/singularity/3.7.1
```

{% endtab %}

{% tab title="Juno" %}

```
module load singularity/3.7.1
```

{% endtab %}
{% endtabs %}

#### Set up the Singularity cache:

Follow this link to set up the cache:

{% content-ref url="initial-setup/setup-singularity-cache" %}
[setup-singularity-cache](https://mskcc-ci.gitbook.io/cmo-informatics-operations/common-operations/initial-setup/setup-singularity-cache)
{% endcontent-ref %}

### Environment Variables

The following environment variables should be set before running a CWL pipeline:

#### Required

* `export CWL_SINGULARITY_CACHE=[cache]`: as shown above, a directory with read/write access where Singularity containers are cached
* `export SINGULARITYENV_LC_ALL=en_US.UTF-8`
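Put together, the required setup looks like this (the cache path below is just an example; use any directory you have read/write access to):

```shell
# Required environment for CWL pipeline runs; the cache location is an
# example and can be any read/write directory you own.
export CWL_SINGULARITY_CACHE="$HOME/singularity_cache"
export SINGULARITYENV_LC_ALL=en_US.UTF-8
mkdir -p "$CWL_SINGULARITY_CACHE"

# Sanity check before launching a pipeline: both variables must be set.
env | grep -E '^(CWL_SINGULARITY_CACHE|SINGULARITYENV_LC_ALL)='
```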

#### Recommended

If you have an SLA or specific LSF or SLURM requirements, you can configure Toil to pass extra arguments to its worker jobs.

{% tabs %}
{% tab title="SLURM" %}
`TOIL_SLURM_ARGS`: Arguments for sbatch for the SLURM batch system

Example: `export TOIL_SLURM_ARGS='--partition test01'`&#x20;
{% endtab %}

{% tab title="LSF" %}
`TOIL_LSF_ARGS`: Extra args added to the bsub on TOIL jobs

Example: `export TOIL_LSF_ARGS='-sla CMOPI'`
{% endtab %}
{% endtabs %}

#### Optional

The environment variables below are listed for reference only and are not recommended for first-time users.

* If you are automatically generating the CWL cache:
  * `SINGULARITY_CACHEDIR`
* If you are running a large pipeline, you might need to set up Docker Hub authentication so you do not hit the rate limit:
  * `SINGULARITY_DOCKER_USERNAME`: Docker Hub login username
  * `SINGULARITY_DOCKER_PASSWORD`: Docker Hub login password

**Create the demo CWL and YAML files**

hello.cwl:

```
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: ['echo']
stdout: stdout.txt
requirements:
  DockerRequirement:
    dockerPull: alpine:latest

inputs:
  message:
    type: string
    inputBinding:
      position: 1

outputs:
  stdout_txt:
    type: stdout
```

{% file src="https://548551270-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MXY9KcVjrKoEWe0OQVs%2Fuploads%2Fm90DTEMPF30EILqimQhy%2Fhello.cwl?alt=media&token=35c7f8d1-3417-426b-a0b3-e50aa670a760" %}

hello.yaml:

```
message: Hello world!
```

**The file is also on the file system:**

{% tabs %}
{% tab title="Juno" %}
`/juno/work/ci/first_time_setup/hello.cwl`
{% endtab %}
{% endtabs %}

**Pull Singularity container and then run the CWL**

{% tabs %}
{% tab title="Iris" %}

1. Insert the Toil command into a SLURM script, such as `hello-world.slurm`

```
#!/bin/bash
#SBATCH --job-name=toil_cwl
#SBATCH --partition=test01         # Replace with the correct partition name
#SBATCH --time=02:00:00            # Wall time (hh:mm:ss)
#SBATCH -o single.%j.out
#SBATCH -e single.%j.err

source venv/bin/activate           # Activate your virtual environment
mkdir singularity_cache
mkdir tmp

# Export queue name
export TOIL_SLURM_ARGS='--partition test01'

# Navigate to your workflow directory (if needed)
#cd /path/to/your/workflow

# Run your workflow
export TMPDIR=$PWD/tmp
export CWL_SINGULARITY_CACHE=$PWD/singularity_cache
export SINGULARITYENV_LC_ALL=en_US.UTF-8

toil-cwl-runner \
--batchSystem slurm \
--jobStore file:job-store-name \
--singularity \
--disableCaching \
--logFile cwltoil.log \
--logDebug \
--preserve-environment SINGULARITY_DOCKER_USERNAME SINGULARITY_DOCKER_PASSWORD TMPDIR \
--outdir hello-world \
hello.cwl hello.yaml
```

2. Run command `sbatch hello-world.slurm`
3. The standard output and standard error are directed to `single.%j.out` and `single.%j.err`, where `%j` is replaced by the job number
{% endtab %}

{% tab title="Juno" %}
```
module load singularity/3.7.1
source venv/bin/activate
mkdir singularity_cache
mkdir tmp
export TMPDIR=$PWD/tmp
export CWL_SINGULARITY_CACHE=$PWD/singularity_cache
export SINGULARITYENV_LC_ALL=en_US.UTF-8

/juno/work/ci/docker_extract/update_cache.sh -s $CWL_SINGULARITY_CACHE -c hello.cwl

toil-cwl-runner --preserve-environment CWL_SINGULARITY_CACHE SINGULARITYENV_LC_ALL --singularity hello.cwl --message "Hello World"

cat stdout.txt
```

{% endtab %}
{% endtabs %}

{% hint style="success" %}
If you get the response `Hello World`, you are all set! Congrats, you have finished your first CWL run!
{% endhint %}
