CirSeq_Human

Description

CirSeq_Human is a pipeline that generates consensus sequences from tandem repeat sequence reads from the human genome. It accepts sequence data as gzipped FASTQ files (extension .fastq.gz) and requires a reference genome indexed for HISAT2 and a genome annotation file indicating splice junctions. CirSeq_Human generates the following output files:

consensus.fastq.gz

Contains all generated consensus sequences in FASTQ format. Per-base quality scores can be converted to estimated error probabilities using the formula: 10**(3*Quality score/-10).
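
The conversion above can be sketched in Python as follows. This is a minimal illustration, not part of the pipeline: the function names are hypothetical, and the Phred+33 offset used to decode the FASTQ quality string is an assumption about how consensus.fastq.gz is encoded. The factor of 3 is taken directly from the formula above.

```python
def quality_to_error_probability(quality_score):
    """Apply the README formula: 10**(3*Quality score/-10)."""
    return 10 ** (3 * quality_score / -10.0)


def decode_quality_string(quality_string, offset=33):
    """Turn a FASTQ quality string into integer scores.

    offset=33 is an ASSUMPTION (Phred+33 encoding); adjust it if the
    consensus file uses a different encoding.
    """
    return [ord(character) - offset for character in quality_string]


# "I" decodes to 40 and "5" decodes to 20 under Phred+33.
scores = decode_quality_string("I5")
error_probabilities = [quality_to_error_probability(q) for q in scores]
print(scores, error_probabilities)
```

Under this formula a quality score of 20 corresponds to an estimated error probability of 10**-6, and a score of 40 to 10**-12.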
consensus_alignment.bam

Contains all consensus sequences that map to the user-supplied reference sequence. Data is in BAM format.
RepeatCopyDistribution.txt

Contains the counts of reads with 3-14 repeat copies. This distribution can aid in diagnosing issues with sequencing library preparation. The columns in this file are: repeat copy number, count of reads with that copy number.
RepeatLengthDistribution.txt

Contains the counts of reads with repeats 25-99 bases long. This distribution can aid in diagnosing issues with sequencing library preparation. The columns in this file are: repeat length (in bases), count of reads with that length.
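
As a rough sketch of how the two-column distribution files (RepeatCopyDistribution.txt and RepeatLengthDistribution.txt) might be inspected, the snippet below reads the counts and converts them to fractions of all reads. It assumes the columns are separated by whitespace, which this README does not state, and it skips any line that does not parse as two integers. The function names are hypothetical and the example counts are invented purely for illustration.

```python
def read_two_column_distribution(lines):
    """Parse 'value count' lines into a dict {value: count}.

    ASSUMPTION: columns are whitespace-separated. Lines that do not
    contain exactly two integers (e.g. a header) are skipped.
    """
    counts = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            value, count = int(fields[0]), int(fields[1])
        except ValueError:
            continue
        counts[value] = count
    return counts


def to_fractions(counts):
    """Convert raw counts to fractions of the total read count."""
    total = float(sum(counts.values()))
    if total == 0:
        return {value: 0.0 for value in counts}
    return {value: count / total for value, count in counts.items()}


# Invented example lines, NOT real pipeline output.
example_lines = ["3\t600", "4\t300", "5\t100"]
counts = read_two_column_distribution(example_lines)
fractions = to_fractions(counts)
print(fractions)
```

With a real output file, the same function can be given an open file handle, for example `with open("RepeatCopyDistribution.txt") as handle: counts = read_two_column_distribution(handle)`.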
Length_Copy_Distribution.txt

Contains the counts of reads for each combination of repeat copy number (3-14) and repeat length (25-99 bases). This distribution can aid in diagnosing issues with sequencing library preparation. The columns in this file are: repeat copy number, repeat length (in bases), count of reads with that copy number and length.
ProcessingStats.txt

Summarizes important statistics of the sequencing data processing that may aid in evaluating the quality of sequencing libraries and diagnosing problems with library preparation and sequencing.

System requirements

The following packages are prerequisites for running CirSeq_Human:

Python (version 2.7.5)
Cython (version 0.19.1)
NumPy (version 1.7.1)
SciPy (version 0.13.3)
HISAT2 (version 2.1.0)

NOTE 1: Cython requires a C compiler. On OS X, this may require installing Xcode.

NOTE 2: HISAT2 must be in the PATH.

Setup

Execute the following command in the script directory to compile the ConsensusGeneration module:

python setup.py build_ext --inplace

Usage

.[Script directory]/run.sh [output directory] [indexed reference genome] [core] [Script directory] [splicing info] [gzipped FASTQ file(s)]
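
The argument order above can be sketched as a small Python helper that assembles the command line before it is run. This is an illustration only: the function name and all example paths are hypothetical, `[core]` is assumed to mean the number of CPU cores to use, and the script directory is passed a second time because the usage line lists it as the fourth argument.

```python
import os


def build_cirseq_command(script_dir, output_dir, reference_index,
                         cores, splicing_info, fastq_files):
    """Assemble the run.sh argument list in the order given under Usage."""
    fastq_files = list(fastq_files)
    if not fastq_files:
        raise ValueError("at least one gzipped FASTQ file is required")
    for path in fastq_files:
        # The Description states that input files use the .fastq.gz extension.
        if not path.endswith(".fastq.gz"):
            raise ValueError("expected a .fastq.gz file, got: " + path)
    return [
        os.path.join(script_dir, "run.sh"),
        output_dir,
        reference_index,
        str(cores),        # ASSUMPTION: [core] is a CPU core count
        script_dir,        # the usage line repeats the script directory here
        splicing_info,
    ] + fastq_files


# All paths below are hypothetical placeholders.
command = build_cirseq_command(
    script_dir="/opt/cirseq_human",
    output_dir="/data/run1/output",
    reference_index="/data/reference/genome_index",
    cores=4,
    splicing_info="/data/reference/splice_sites.txt",
    fastq_files=["/data/run1/sample_1.fastq.gz"],
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.check_call(command)` once the placeholder paths are replaced with real ones.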

PelechanoLab/CirSeq_Human
