Designed to scale, these tools deliver highly accurate results in high-performance computing environments. They support DNA, RNA, and single-cell workflows that run from base calling through tertiary analysis.
Basics of the Clara Parabricks container
Clara Parabricks is a software suite developed by NVIDIA for accelerated genomic analysis. It serves three core NGS applications: germline variant analysis, somatic variant identification, and RNA-Seq analysis. The general objective is to cut compute time by at least an order of magnitude while producing equivalent outputs and lowering analysis costs.
Clara Parabricks, a sophisticated suite of genomic analysis tools, is available on AWS as an AWS Marketplace AMI. It performs well across a wide range of instance types and can be used out of the box to cover basic bioinformatics requirements.
At present, the Clara Parabricks accelerated analysis tools start from a FASTQ file and cover alignment, variant calling, and expression analysis, along with quality control tools for the various outputs. The suite of 33 tools supports somatic, germline, and RNA-Seq pipelines end to end. The modular design of the tools makes it easy to address the specific demands of a project.
Because the pipelines are accelerated, customers can run several variant callers to extract maximum information from their data while still producing results quickly and at a lower cost than with traditional baseline software. For example, the HaplotypeCaller tool from GATK and the DeepVariant tool from Google can each produce a VCF from the same dataset. Researchers can then take the union of the two call sets to reduce the false-negative rate, or the intersection to reduce the false-positive rate. A typical 30x WGS sample can be processed within an hour on an AWS p4d.24xlarge instance, which can run both variant callers simultaneously.
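The idea of combining two call sets can be sketched with standard shell tools. This is only an illustration under simplifying assumptions: the two small VCFs are made-up stand-ins for HaplotypeCaller and DeepVariant output, the file names are hypothetical, and variants are matched naively on chromosome, position, reference allele, and alternate allele. Real comparisons usually need variant normalization and a dedicated tool.

```shell
# Sketch: compare the calls in two VCFs by site and allele.
# The VCF contents below are invented example data, not real caller output.
workdir=$(mktemp -d)

printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\nchr2\t50\t.\tG\tA\n' > "$workdir/haplotypecaller.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr2\t50\t.\tG\tA\nchr2\t75\t.\tT\tC\n' > "$workdir/deepvariant.vcf"

# Reduce a VCF to sorted, unique CHROM:POS:REF:ALT keys, skipping header lines.
variant_keys() {
    awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' "$1" | LC_ALL=C sort -u
}

variant_keys "$workdir/haplotypecaller.vcf" > "$workdir/hc.keys"
variant_keys "$workdir/deepvariant.vcf" > "$workdir/dv.keys"

# Intersection: calls made by both callers (aims at fewer false positives).
LC_ALL=C comm -12 "$workdir/hc.keys" "$workdir/dv.keys" > "$workdir/intersection.keys"

# Union: calls made by either caller (aims at fewer false negatives).
LC_ALL=C sort -u "$workdir/hc.keys" "$workdir/dv.keys" > "$workdir/union.keys"

shared=$(wc -l < "$workdir/intersection.keys" | tr -d ' ')
either=$(wc -l < "$workdir/union.keys" | tr -d ' ')
echo "shared=$shared either=$either"
```

With the example data, two variants are reported by both callers and four by at least one of them.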
You interact with the Parabricks NGC container through the pbrun command-line tool, which transparently downloads, provisions, and runs the container on the host.
The steps below walk through installing and configuring pbrun on your local machine.
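As a rough sketch of what a pbrun session might look like, the snippet below prints an alignment step followed by the two variant callers instead of running them, so it can be reviewed on a machine without GPUs. The file names are hypothetical, and the subcommands and flags (fq2bam, haplotypecaller, deepvariant, --ref, --in-fq, --in-bam, --out-bam, --out-variants) are written as they appear in the Parabricks documentation to the best of my knowledge. Check them against the help output of the installed version before use.

```shell
# Hypothetical input and output names; replace with real paths.
REF="reference.fasta"
FQ1="sample_1.fastq.gz"
FQ2="sample_2.fastq.gz"
BAM="sample.bam"

# Print the planned commands rather than executing them (dry run).
plan_pipeline() {
    echo "pbrun fq2bam --ref $REF --in-fq $FQ1 $FQ2 --out-bam $BAM"
    echo "pbrun haplotypecaller --ref $REF --in-bam $BAM --out-variants sample.haplotypecaller.vcf"
    echo "pbrun deepvariant --ref $REF --in-bam $BAM --out-variants sample.deepvariant.vcf"
}

plan_pipeline
```

Printing first keeps the sketch safe to run anywhere; once the flags are confirmed, the printed lines can be executed on the GPU host.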
To download the Parabricks installer, visit the NVIDIA Parabricks developer page and register for a free 90-day license. Once you have completed registration, a link to download the installer package is sent to the email address you provided. Download the package to your local machine and, if needed, copy it to the GPU server using a file transfer client.
Change to the directory containing the Parabricks installer archive (parabricks.tar.gz in the commands below), then extract it and enter the extracted directory.
# tar -xzf parabricks.tar.gz
# cd parabricks
Install the Parabricks suite
installer.py must be run to install pbrun on the host system. The most common installation is summarized below; for a comprehensive list of installation options, see the Parabricks installation manual.
This Parabricks 2.5.0 image includes nvidia-container-toolkit, nvidia-docker2, and the Parabricks 2.5.0 NGC container image, all of which are required for the Parabricks installation to function properly.
# sudo ./installer.py --ngc --container docker
Check the pbrun version to ensure that the installation was successful.
# pbrun --version
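Before or after running the installer, it can help to confirm that the expected commands are on the PATH. The helper below is a minimal sketch: check_commands is a hypothetical name, and treating docker and nvidia-smi as prerequisites is an assumption for a Docker-based GPU host, not something taken from the installer itself.

```shell
# Report whether each named command is on PATH.
# Returns 0 if all are found and 1 if any is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list for a Docker-based install; adjust to the actual setup.
check_commands docker nvidia-smi pbrun || echo "One or more tools are missing; see the list above."
```

If pbrun is reported as missing after installation, the installer output is the first place to look.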
Why Should You Use NVIDIA Parabricks?
The analysis of genomic data requires a significant amount of computing power, and the use of genomic data in precision medicine is hampered by considerable time and cost constraints. The NVIDIA Parabricks Genomics Analysis Toolkit addresses this by providing GPU-accelerated genomic analysis: data analysis that used to take days can now be completed in under an hour.
Parabricks is a software suite for secondary analysis of DNA data from next-generation sequencing (NGS). One of its most significant advantages is speed, which comes from NVIDIA GPU acceleration: 30x WGS data for an entire human genome can be analyzed in around 45 minutes, compared to around 30 hours with conventional software. The output matches that of commonly used software, which makes it straightforward to check the results for correctness.
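The runtimes quoted above imply a speedup that is easy to work out. The snippet below is just that arithmetic, using the approximate figures from the text (30 hours versus 45 minutes); the variable names are illustrative.

```shell
# Approximate figures from the text: 30 hours for the baseline, 45 minutes with Parabricks.
baseline_minutes=$((30 * 60))
parabricks_minutes=45

# Integer division is exact here: 1800 / 45 = 40.
speedup=$((baseline_minutes / parabricks_minutes))
echo "Approximate speedup: ${speedup}x"
```

A factor of about 40 is consistent with the stated goal of at least an order of magnitude improvement in compute time.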