Home
jts edited this page Sep 20, 2011 · 17 revisions
SGA is a de novo assembler designed to assemble large genomes from high-coverage short-read data. It is built as a modular set of programs that are combined to form an assembly pipeline. A description of the SGA design is found here.
When first learning SGA, it is highly recommended that you run one of the example assemblies in the src/examples directory to become familiar with the flow of data through the program. A page containing frequently asked questions can be found here.
The major subcommands of SGA are:
- preprocess - Prepare a set of sequence reads for assembly
- index - Build the FM index for a set of sequence reads
- merge - Merge two indices together. This can be used to build a distributed indexing pipeline.
- overlap - Find overlaps between reads to construct a string graph
- fm-merge - Efficiently merge reads that can be unambiguously assembled
- correct - Correct base calling errors in a set of reads
- filter - Remove duplicate and low quality sequences
- assemble - Construct contigs from a string graph
Detailed usage information for each command is printed by the --help option. For example, this command prints the options for the index subprogram:
sga index --help
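To review the options of every major subcommand in one pass, the same --help call can be wrapped in a small loop. This is only a convenience sketch: the variable `SGA_SUBCOMMANDS` and the function `print_all_help` are names made up for this example, and it assumes the `sga` executable is on your PATH.

```shell
#!/bin/sh
# Major SGA subcommands, as listed on this page.
# (SGA_SUBCOMMANDS and print_all_help are illustrative names, not part of SGA.)
SGA_SUBCOMMANDS="preprocess index merge overlap fm-merge correct filter assemble"

# Print the --help text of each subcommand, separated by a header line.
print_all_help() {
    for cmd in $SGA_SUBCOMMANDS; do
        echo "==== sga $cmd ===="
        sga "$cmd" --help
    done
}

# Only call sga if it is actually installed; otherwise report and continue.
if command -v sga >/dev/null 2>&1; then
    print_all_help
else
    echo "sga not found on PATH; install SGA first" >&2
fi
```

Redirecting the output to a file (for example by running the script with `> sga-help.txt`) gives a single reference document to read alongside the example assemblies.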