Snakemake pipeline that corrects for ambient RNA expression in single-cell RNA sequencing (scRNA-seq) data using DecontX.
This project contains the code to run a two-phase islet decontamination protocol, as well as configurations and wrapper scripts to run the pipeline on sample data.
- Clone the GitHub repo and cd into the repo directory.
# clone the repo
git clone https://github.com/CollinsLabBioComp/islet_decontamination.git
# set code base path
SNK_REPO="$(pwd)/islet_decontamination"
cd "${SNK_REPO}"
- Launch the Docker app and download the Docker image:
docker pull letaylor/sc_decontx:latest
- Run the sample data:
chmod +x ./run_docker.sh
./run_docker.sh
The expected input data is a folder containing standard 10x outputs. Each folder should contain the following standard folders:
[sample]/outs/filtered_feature_bc_matrix
[sample]/outs/raw_feature_bc_matrix
with each [raw/filtered]_feature_bc_matrix folder containing barcodes.tsv.gz, features.tsv.gz, and matrix.mtx.gz. For reference, see the provided sample data.
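The layout described above can be checked before launching the pipeline. The following is a minimal sketch, not part of the repo: the function name check_10x_sample and the example sample path are hypothetical, and it assumes only the folder and file names listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): verify that one sample
# folder has the 10x layout described above. Prints each missing file to
# stderr and returns non-zero if anything is missing.
check_10x_sample() {
    local sample_dir="$1"
    local status=0
    local kind file path
    for kind in filtered raw; do
        for file in barcodes.tsv.gz features.tsv.gz matrix.mtx.gz; do
            path="${sample_dir}/outs/${kind}_feature_bc_matrix/${file}"
            if [ ! -f "${path}" ]; then
                echo "missing: ${path}" >&2
                status=1
            fi
        done
    done
    return "${status}"
}

# Example call (the sample path is an assumption; replace it with your own):
# check_10x_sample ./data/my_sample && echo "layout OK"
```

Because the function only tests for file existence, it will not catch truncated or corrupt files; it is meant as a quick sanity check of folder structure.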
To configure to use your own data:
- Update workflow/src/threeprime.yaml to change the run ID (name) and specify the sample IDs (samples). Note: if your 10x output directory format differs, you may need to update input_dir_basename and input_path_format to match.
- Place the 10x output folder (containing a minimum of outs/, see here for reference) in the ./data/ folder. Alternatively, you can modify input_dir_base and input_path_format so the base (i.e. data/) points to your parent directory containing all samples.