sga-align: prepareReads: Cannot parse record #121

Open
sjackman opened this issue Jul 25, 2016 · 5 comments

Comments

@sjackman
Contributor

sga-align -t 64 --name pe400 hsapiens-contigs.fa pe400.fa.gz
…
Completed Task = 'indexContigs' 
Task enters queue = 'prepareReads' 
Cannot parse record >HISEQ1:93:H2YHMBCXX:1:1101:1165:2015 at /gsc/btl/linuxbrew/bin/sga-deinterleave.pl line 63, <IN> line 2.

The file pe400.fa.gz is interleaved paired-end reads. The first 8 lines are:

>HISEQ1:93:H2YHMBCXX:1:1101:1165:2015 ec:Z:0_0:1_0_1:0_0
TTATACAAAGAATTAAGAACAAAAGTGAAATTGAATATTTTTTAATTGCTCTAAAAGTTAATGGACTATTTAAAACAAAAATTATAAAAATATGTTTATACCATTAATAGAAGTAAAATATATAAAACCATGGAATAACACACAGACTAGGAGGACTTGGGAATATGCTGTTACATTGCATATTAAGTGGTATTATATTATTTGAAGTTAGATTTATTAACAATTACAGAGCTAATTTTTTTTTTAAAAA
>HISEQ1:93:H2YHMBCXX:1:1101:1165:2015 ec:Z:0_0:1_0_1:0_0
CTGACATCTTTCTGGCATCCTTAAAAGCCCTGGCTTTTAAGCATAACTTCTTGACCTACTTGTTCCCTTCCTGAGCATGAGAGCAGTGGTGACTCAGGAACAGGAAAGGCAGACCACAGTGGTGACAGTGTTTTCCTCAAAGAGGATTTATACCTGTTTTTTTAAAAAAAAAATTAGCTCTGTAATTGTTAATAAATCTAACTTCAAATAATATAATACCACTTAATATGCAATGTAACAGCATATTCCC
>HISEQ1:93:H2YHMBCXX:1:1101:1157:2041 ec:Z:0_0:3_0_3:0_0
GACCCGGTCCTGCGATTTGTCCCGTTGTAGACCTGGGAACAGGCAGGCGGGAACTGGGGGCTTTACTGGGGGATTTGAGGCTGGGGAGGGGGAGGGAGCAAATGTCATGGCTGGCTCGCTCAAGCATCCAGGGAACCGAAGCTAAGCGCATCCTGACGGGCTTTTAAAATGACATTGATTAGGACAAGCTGTTCCCAACCCCAGTAAGAGTTAATCTGCCTGTTAATCAAGGCACTAAGGGGCTCAATGC
>HISEQ1:93:H2YHMBCXX:1:1101:1157:2041 ec:Z:0_0:29_0_28:2_0
CCCCGGGCAGCGGTTTTCCCCGCTAGCCAGGTTTGGAAGTCACCCTCTGTGAGACTGGGTTAGGAAGTGACGAAAAGCGCCGAATTGTTTTCAAATTGAAAATACTTTTTTTTTTTTTTTTGGAGATAGCGCTGACAAATATATGGGATCCCGGCTTTTGATCCCTGGCTGCCGCCTCTGTTCTCCTGTCGCTAATAAAACTCGCATTGAGCCCCTTAGTGCCTTGATTAACAGGCAGATTAACTCTTAC
@jts
Owner

jts commented Jul 25, 2016

What variant of FASTQ is that? I don't recognise the SAM-like key/value
pair.


@sjackman
Contributor Author

It's produced by BFC (https://github.com/lh3/bfc).

@jts
Owner

jts commented Jul 25, 2016

Is it safe to assume that the first record is always the first end of the
pair? Alternatively, you could use the uncorrected reads in scaffolding,
which I typically recommend anyway.


@sjackman
Contributor Author

Yes, the first record is always the first read of the pair / mate-pair. FR orientation for PE and RF orientation for MP. Good suggestion. If there's no easy workaround for using the corrected reads, I'll use the uncorrected reads.
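
One possible workaround for keeping the corrected reads, sketched under assumptions: because the first record of each pair is always the first read, mates can be labeled purely by position. The sketch assumes that the deinterleaving step distinguishes mates by a /1 or /2 suffix on the read name, which this thread does not confirm. The function name `label_mates` is hypothetical and not part of SGA or BFC.

```python
def label_mates(lines):
    """Append /1 or /2 to read names in interleaved FASTA.

    Assumption: records alternate first read, second read.
    Only header lines are counted, so wrapped sequences are fine.
    """
    out = []
    record_index = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Split the read name from any trailing comment,
            # such as the ec:Z: tag shown in the excerpt above.
            name, _, comment = line[1:].partition(" ")
            mate = record_index % 2 + 1  # 1 for even records, 2 for odd
            header = f">{name}/{mate}"
            if comment:
                header += " " + comment
            out.append(header)
            record_index += 1
        else:
            out.append(line)
    return out


# Hypothetical miniature input, not taken from pe400.fa.gz.
example = [">r1 ec:Z:0_0", "ACGT", ">r1 ec:Z:0_0", "TTGA"]
print("\n".join(label_mates(example)))
```

Whether sga-align then accepts the relabeled file would still need to be checked against the actual sga-deinterleave.pl parsing rules.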

@sjackman
Contributor Author

I instead aligned the reads using bwa mem:

bwa mem -t32 -p contigs.fa reads.fa.gz | samtools view -F2304 -b -o reads.bam -
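
To unpack that command: `-p` tells bwa mem that the input file holds interleaved paired-end reads, and `-F2304` makes samtools view drop any record with one of the given flag bits set. The value 2304 is the sum of the SAM flag bits for secondary (256) and supplementary (2048) alignments, so only primary alignment records remain. A small sketch of the flag arithmetic follows; the names `EXCLUDE_MASK` and `is_kept` are illustrative only.

```python
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY  # 2304, the value passed to -F


def is_kept(flag, exclude_mask=EXCLUDE_MASK):
    """Mimic the -F filter: keep a record only if no excluded bit is set."""
    return (flag & exclude_mask) == 0


# Example flags: 99 is a properly paired first read,
# 355 is the same flag with the secondary bit added.
for flag in (99, 355):
    print(flag, is_kept(flag))
```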
