modules and utility scripts for processing scRNA data
Installation requires Nextflow, a workflow manager with its own domain-specific language (DSL) that enables reproducible parallelization of common bioinformatics tasks. Nextflow provides a portable executable that is built on Groovy and requires Java (JDK) to run. The executable can be installed as follows:
wget -qO- https://get.nextflow.io | bash
or
curl -s https://get.nextflow.io | bash
More specific installation instructions can be found in the Nextflow documentation.
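Before running the installer, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch, not part of this repository: the function name check_prereqs is hypothetical, and the choice of tools to check (java, git) is an assumption based on the requirements described above.

```shell
# Hypothetical helper: report which of the named tools are available on PATH.
# Returns 0 if every tool is found, 1 if any is missing.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool list: Java for Nextflow itself, git for cloning this repository.
check_prereqs java git || echo "Install the missing tools before continuing."
```

Once the installer has finished, `./nextflow -version` run from the download directory should print the installed version.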
git clone https://github.com/GaitiLab/scRNA-utils.git
cd scRNA-utils
git checkout main
git pull
Modules represent individual processes for dedicated single-cell tasks, ranging from executing cellranger to running scrublet doublet detection on a series of matrices. They are designed to be run individually or as part of a larger workflow/pipeline.
Modules can be run using the following generic command:
nextflow run scRNA-utils/modules/{module_selection}/
where module_selection is the name of the specific module to be run. Currently, the available modules can be found in the modules directory:
cellranger count on either a directory (recursive or not) of FASTQ files, or a sample sheet with samples and their file outputs specified. See below for more information.
split-pipe --mode all and/or split-pipe --mode comb. See below for more information.
Within the modules directory are two basic pipelines for processing scRNA-seq data from raw FASTQ files:
Below are the links to the specific user documentation for each type of scRNA-seq data.
parseBio split-pipe analysis pipeline
split-pipe instructions using Nextflow
cellranger count pipeline for 10X scRNA
cellranger instructions using Nextflow
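To see which modules a local checkout provides before running one, the modules directory can be listed directly. This is a rough sketch under one assumption: that each module lives in its own subdirectory of scRNA-utils/modules, as the generic command above suggests. The function names list_modules and module_command are hypothetical, and module_command only prints the command rather than running it.

```shell
# Hypothetical helper: print one module name per line.
# Assumes each module is a subdirectory of the modules directory.
list_modules() {
  modules_dir="${1:-scRNA-utils/modules}"
  if [ ! -d "$modules_dir" ]; then
    echo "modules directory not found: $modules_dir" >&2
    return 1
  fi
  for path in "$modules_dir"/*/; do
    # Skip the unexpanded glob when there are no subdirectories.
    [ -d "$path" ] || continue
    basename "$path"
  done
}

# Hypothetical helper: print (without running) the generic command for a module.
module_command() {
  echo "nextflow run scRNA-utils/modules/$1/"
}
```

Calling `list_modules` from the directory that contains the clone prints the module names, and `module_command <name>` shows the matching command to copy and run.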
Workflows represent more complex, linked series of processes. Currently there is one workflow in development for toggling between ParseBio and 10X scRNA data. The workflow can be found in the workflows directory and enables the following behaviour:
--method as either split-pipe or cellranger.
Currently under development.
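As an illustration of the toggle, a small wrapper could validate the --method value before launching anything. This is only a sketch: the function name workflow_command is hypothetical, and the path scRNA-utils/workflows/ is an assumption, since the workflow is still under development and its entry point is not documented here. The function prints the command instead of running it.

```shell
# Hypothetical helper: build the workflow command for a given --method value.
# The workflow path is an assumption for illustration; check the workflows
# directory for the real entry point.
workflow_command() {
  method="$1"
  case "$method" in
    split-pipe|cellranger)
      echo "nextflow run scRNA-utils/workflows/ --method $method"
      ;;
    *)
      echo "unsupported --method: '$method' (expected split-pipe or cellranger)" >&2
      return 1
      ;;
  esac
}
```

Rejecting unknown values up front avoids starting a Nextflow run that would fail later on an unrecognised method.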