Parabricks on the Genomics Compute Cluster

How to run Parabricks, the licensed GPU version of GATK 4, on the Genomics Compute Cluster on Quest.

NVIDIA’s Clara Parabricks is a licensed GPU version of GATK 4 that runs 10x faster than the open-source CPU version of GATK. It is available to genomics researchers at Northwestern who are members of the Genomics Compute Cluster. To run the CPU version of GATK 4, load the gatk/4.1.0 module. Information on running the Parabricks GPU version of GATK 4 is below.

Checking out Parabricks Licenses

When running Parabricks, your job requires one license for each GPU card it runs on. The Genomics Compute Cluster (GCC) has two Parabricks licenses. To check out a Parabricks license for your job, include the license directive (-L) in your job submission command:
sbatch -L parabricks:2 /projects/b1042/Parabricks_Training/dv.sh
In this example, two Parabricks licenses are checked out for the job. The scheduler tracks checked-out licenses, and your job will not start until the licenses it requests are available.
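
Because only one or two licenses can be requested, the count can be validated before a job is submitted. The following is a minimal sketch, not part of the Quest tooling: the function name parabricks_sbatch_cmd is hypothetical, and it only prints the sbatch command so that it can be inspected before being run.

```shell
# Hypothetical helper: print the sbatch command that checks out N Parabricks
# licenses for a job script. The GCC has two licenses, so N must be 1 or 2.
parabricks_sbatch_cmd() {
    local n_licenses="$1" job_script="$2"
    case "$n_licenses" in
        1|2) ;;
        *)
            echo "error: license count must be 1 or 2, got '$n_licenses'" >&2
            return 1
            ;;
    esac
    echo "sbatch -L parabricks:${n_licenses} ${job_script}"
}

# Example: print the submission command used in this article.
parabricks_sbatch_cmd 2 /projects/b1042/Parabricks_Training/dv.sh
```

Printing the command rather than running it keeps the sketch safe to try on a login node; the output can be pasted into the shell once it looks right.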

You can run your Parabricks job on one or two GPU cards, which requires one or two Parabricks licenses respectively. To check whether Parabricks licenses are currently available, run:
/projects/genomicsshare/software/parabricks_avail.sh
Parabricks licenses currently available on Quest:
2
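
If the license count is needed inside a script, it can be read from the availability script's output. This is a sketch that assumes the output always has the shape shown above (a message line followed by a line containing only the count); the helper names licenses_available and have_licenses are hypothetical.

```shell
# Hypothetical helper: read the availability script's output on stdin and
# print the license count, taken from the last line that contains only digits.
# Prints nothing if no such line is found.
licenses_available() {
    awk '/^[[:space:]]*[0-9]+[[:space:]]*$/ { n = $1 } END { if (n != "") print n }'
}

# Hypothetical helper: succeed if at least $1 licenses are reported on stdin.
have_licenses() {
    local needed="$1" available
    available="$(licenses_available)"
    [ -n "$available" ] && [ "$available" -ge "$needed" ]
}

# On Quest this could be used as follows (not run here):
#   /projects/genomicsshare/software/parabricks_avail.sh | have_licenses 2 \
#       && echo "two licenses are free"
```

A check like this is informational only. The scheduler still holds a job until its licenses are available, so submitting without checking is also safe.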

Running Parabricks on Quest 

Parabricks requires Python and Singularity, so load the following Quest modules before running it:
module load python/anaconda3.6
module load singularity
The Parabricks pbrun executable is installed at /projects/genomicsshare/software/parabricks/pbrun.
Here’s an example of running the help command, which returns command options for Parabricks:
/projects/genomicsshare/software/parabricks/pbrun --help
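
The module loads and the pbrun path can be collected into a small setup snippet for reuse in job scripts. This is a sketch under stated assumptions: the function names load_parabricks_modules and check_pbrun are hypothetical, and the guards are there only to give a clear error message when the snippet is run somewhere other than Quest.

```shell
# Path to pbrun as given in this article.
PBRUN=/projects/genomicsshare/software/parabricks/pbrun

# Load the modules Parabricks needs. The guard fails with a clear message
# on systems where the "module" command is not defined.
load_parabricks_modules() {
    if ! type module >/dev/null 2>&1; then
        echo "error: 'module' command not found; are you on Quest?" >&2
        return 1
    fi
    module load python/anaconda3.6
    module load singularity
}

# Confirm that pbrun is present and executable before using it.
check_pbrun() {
    if [ ! -x "$PBRUN" ]; then
        echo "error: pbrun not found or not executable at $PBRUN" >&2
        return 1
    fi
}

# On Quest, the steps above would then be chained as (not run here):
#   load_parabricks_modules && check_pbrun && "$PBRUN" --help
```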

Sample submission scripts are in /projects/b1042/Parabricks_Training. Researchers do not have write permission in that directory, so launch these job submission scripts from your own projects directory.
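
Since job output is written relative to the directory a job is launched from, it can help to confirm that the launch directory is writable and is not the training directory. The following is a minimal sketch; the helper name is_valid_launch_dir is hypothetical, and it expects an absolute path.

```shell
# Directory holding the sample scripts (read-only for researchers).
TRAINING_DIR=/projects/b1042/Parabricks_Training

# Hypothetical helper: succeed only if DIR (an absolute path) is a writable
# directory that is not the training directory or one of its subdirectories.
is_valid_launch_dir() {
    local dir="$1"
    case "$dir" in
        "$TRAINING_DIR"|"$TRAINING_DIR"/*)
            echo "error: launch from your own projects directory, not $TRAINING_DIR" >&2
            return 1
            ;;
    esac
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "error: $dir is not a writable directory" >&2
        return 1
    fi
}

# Example use on Quest, checking the current directory before submitting
# (not run here):
#   is_valid_launch_dir "$PWD" && sbatch -L parabricks:2 "$TRAINING_DIR/dv.sh"
```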

FASTQ to BAM example script

cd to your projects directory before submitting this example script: 
sbatch -L parabricks:2 /projects/b1042/Parabricks_Training/fq2bam_quest.sh

DeepVariant example script

cd to your projects directory before submitting; the output of the DeepVariant test script will be written to a new sub-directory called “deepvariant”.
sbatch -L parabricks:2 /projects/b1042/Parabricks_Training/dv.sh 

For additional information about running Parabricks on Quest, please see the Quest Parabricks Training video, or reach out to quest-help@northwestern.edu.

Keywords: research computing, parabricks, genomics, genomics compute cluster, nvidia, clara, quest, gatk
Doc ID: 113323
Owner: Research Computing
Group: Northwestern
Created: 2021-08-30 11:30 CDT
Updated: 2021-09-03 11:20 CDT
Sites: Northwestern
CleanURL: https://kb.northwestern.edu/parabricks