Blast2Sam
Pierre Lindenbaum edited this page Dec 19, 2013
##Motivation
Convert BLASTN XML output to SAM.
##Compilation
See also Compilation.
$ ant blast2sam
##Usage
$ java -jar dist/blast2sam.jar -r ref.fa (stdin|blastn.xml)
##Options
Option | Description |
---|---|
-r (file) | FASTA reference file indexed with Picard. Required. |
-o (file.bam) | Output filename. Default: SAM to stdout. |
-p (int expected size) | Input is an interleaved list of forward and reverse sequences (paired-end reads). |
-h | Print help (this screen) and exit. |
-v | Print version and exit. |
-L (level) | Log level. One of java.util.logging.Level. Optional. |
##Source Code
Main code at: https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/blast2sam/BlastToSam.java
##Example
The following Makefile downloads a reference, generates some paired FASTQ files, aligns them with blastn and converts the BLAST XML output to SAM:
BLASTN=/commun/data/packages/ncbi/ncbi-blast-2.2.28+/bin/blastn
SAMTOOLS=/commun/data/packages/samtools-0.1.19
JVARKIT=/home/lindenb/src/jvarkit-git/dist/
SHELL=/bin/bash

.PHONY:all reads clean

all: out.sam

out.sam: ref.fa ref.fa.fai out.read1.fq out.read2.fq
	paste \
		<(cat out.read1.fq | paste - - - - | cut -f 1,2 ) \
		<(cat out.read2.fq | paste - - - - | cut -f 1,2 ) |\
	tr "\t" "\n" |\
	sed 's/^@/>/' |\
	${BLASTN} -subject ref.fa -dust no -outfmt 5 | \
	java -jar ${JVARKIT}/blast2sam.jar -r ref.fa -p 500 |\
	${SAMTOOLS}/samtools view -Sh -f 2 - > $@

reads: out.read1.fq out.read2.fq

out.read1.fq out.read2.fq: ref.fa ref.fa.fai
	${SAMTOOLS}/misc/wgsim -d 100 -N 500 -1 50 -2 50 $< out.read1.fq out.read2.fq > /dev/null

ref.fa:
	curl -k -o $@ "https://raw.github.com/lindenb/genomehub/master/data/rotavirus/rf/rf.fa"

ref.fa.fai: ref.fa
	${SAMTOOLS}/samtools faidx $<

clean:
	rm -f ref.fa.fai ref.fa out.sam
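The `paste | tr | sed` steps of the `out.sam` rule turn the two FASTQ files into one interleaved FASTA stream (read 1, then read 2 of each pair), which is the layout the `-p` option expects. The same step can be sketched as a standalone shell function without bash process substitution. The name `interleave_fastq_as_fasta` is hypothetical, and the sketch assumes plain four-line FASTQ records listed in the same order in both files:

```shell
# interleave_fastq_as_fasta R1.fq R2.fq
# Hypothetical helper (not part of jvarkit): prints read 1 then read 2 of
# each pair as FASTA records, dropping the '+' and quality lines.
interleave_fastq_as_fasta() {
    awk -v mate_file="$2" '
        # Line 1 of every 4-line FASTQ record is the name, line 2 the sequence.
        NR % 4 == 1 { name1 = $0 }
        NR % 4 == 2 {
            seq1 = $0
            # Read the matching name and sequence from the mate file.
            if ((getline name2 < mate_file) <= 0) exit 1
            if ((getline seq2 < mate_file) <= 0) exit 1
            # Skip the "+" and quality lines of the mate record.
            getline skip < mate_file
            getline skip < mate_file
            # Replace the leading "@" with ">" to get FASTA headers.
            print ">" substr(name1, 2); print seq1
            print ">" substr(name2, 2); print seq2
        }
    ' "$1"
}
```

Its output could then be piped into `blastn ... -outfmt 5` and `blast2sam.jar -p ...` as in the rule above.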
###Output
@HD VN:1.4 SO:unsorted
@SQ SN:RF01 LN:3302
@SQ SN:RF02 LN:2687
@SQ SN:RF03 LN:2592
@SQ SN:RF04 LN:2362
@SQ SN:RF05 LN:1579
@SQ SN:RF06 LN:1356
@SQ SN:RF07 LN:1074
@SQ SN:RF08 LN:1059
@SQ SN:RF09 LN:1062
@SQ SN:RF10 LN:751
@SQ SN:RF11 LN:666
@RG ID:g1 LB:blast DS:blast SM:blast
@PG ID:0 PN:blastn VN:BLASTN_2.2.28+
@PG ID:1 PN:com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam PP:0 VN:3365d9b714aa43d4fba44bfbf102a179a1f1573f CL:-r ref.fa -p 500
RF01_445_573_0:0:0_0:0:0_0/1 83 RF01 524 40 50= = 445 -30 GTGCCTTGGTACACCATATTTATTTACTGTTGAAGCTACTATAGTGAATA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_445_573_0:0:0_0:0:0_0/2 163 RF01 445 40 50= = 524 30 AATGCAGTTATGTTCTGGTTGGAAAAACATGAAAATGACGTTGCTGAAAA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_1193_1294_1:0:0_1:0:0_1/1 83 RF01 1245 40 38=1X11= = 1193 -3 CCATTACATGCATATTCTTTTTAGTCGAAAAAATTGTCATTCTACCAAAT JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_1193_1294_1:0:0_1:0:0_1/2 163 RF01 1193 40 4=1X45= = 1245 3 CTGGATTACTATCAATGTCATCAGCGTCGAATGGTGAATCAAGACAACTA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_638_718_1:0:0_0:0:0_2/1 83 RF01 669 40 50= = 638 18 ATGACAGTACTATCAGTTCTCTCGCAATTAAATAATCTTCATGAGAAAAA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_638_718_1:0:0_0:0:0_2/2 163 RF01 638 40 4=1X45= = 669 -18 CAAAATCTTCAATTGAAATGCTGATGTCAGTTTTTTCTCATGAAGATTAT JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_1404_1584_0:0:0_2:0:0_3/1 99 RF01 1404 40 50= = 1535 179 ATTTATCTTACCATATGAATATTTCATAGCACAACATGCTGTAGTTGAAA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_1404_1584_0:0:0_2:0:0_3/2 147 RF01 1535 40 1S42=1X6= = 1404 -179 NGACACGTCTGTATATAGTACCATAGAGTTATTAGATAAAAAGGGTGTAA #JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:86.0662 BE:f:1.62562e-21 RG:Z:g1 NM:i:0 BS:f:46
RF01_284_373_0:0:0_1:0:0_5/1 99 RF01 284 40 50= = 324 89 TAGTAAAATATGCAAAAGGTAAGCCGCTAGAAGCAGATTTGACAGTGAAT JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_284_373_0:0:0_1:0:0_5/2 147 RF01 324 40 8=1X41= = 284 -89 AAAGTTCATATGTTATCTTGTTATTTTCATAATCCAACTCATTCACTGTC JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_1704_1823_1:0:0_0:0:0_7/1 83 RF01 1774 40 50= = 1704 -21 ATTGAATTCGCTGCTTTCGTCTGCTTCTCTCCTGACGCTACAGCCCCATA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
RF01_1704_1823_1:0:0_0:0:0_7/2 163 RF01 1704 40 5=1X44= = 1774 21 ACAGAGGCAAATTAATCTAATGGATTCATACGTTCAAATACCAGATGGTA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_689_741_1:0:0_1:0:0_8/1 83 RF01 692 40 19=1X30= = 689 46 TGCCAGAGTCGATCTATTATAATATGACAGTACTATCAGTTCTCTCGCAA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_689_741_1:0:0_1:0:0_8/2 163 RF01 689 40 30=1X19= = 692 -46 TAATTGCGAGAGAACTGATAGTACTGTCATCTTCTAATAGATCGACTCTG JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:87.9128 BE:f:4.51982e-22 RG:Z:g1 NM:i:0 BS:f:47
RF01_532_688_0:0:0_1:0:0_9/1 99 RF01 532 40 50= = 639 156 ATAGTAGCTTCAACAGTAAATAAATATGGTGTACCAAGGCACAACGCGAA JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ BB:f:93.4528 BE:f:9.71473e-24 RG:Z:g1 NM:i:0 BS:f:50
(...)
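The second column of each record is the SAM FLAG. The values seen above (83, 163, 99, 147) all have the "properly paired" bit (0x2) set, which is what `samtools view -f 2` selects in the Makefile. As a rough illustration, the bits can be decoded with shell arithmetic. The function name `decode_sam_flag` and the lowercase labels are shorthand chosen for this sketch; the bit values follow the SAM specification:

```shell
# decode_sam_flag FLAG
# Illustrative helper: prints a comma-separated list of the bits set in FLAG.
decode_sam_flag() {
    flag=$1
    out=""
    for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
                16:reverse 32:mate_reverse 64:first_in_pair 128:second_in_pair \
                256:secondary 512:qc_fail 1024:duplicate 2048:supplementary; do
        bit=${pair%%:*}     # numeric bit value before the colon
        name=${pair#*:}     # label after the colon
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_sam_flag 83    # paired,proper_pair,reverse,first_in_pair
decode_sam_flag 163   # paired,proper_pair,mate_reverse,second_in_pair
```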
##See also
.
##History
- 2013: Creation