Error with gaps in fasta file #28

Open
AleSR13 opened this issue Jul 30, 2021 · 7 comments

AleSR13 commented Jul 30, 2021

Hello! I am trying to run CanSNPer2 on three F. tularensis samples. With one sample it works as expected. However, with the other two I get this error:

Run 1 alignments to references using progressiveMauve
2021-07-30 10:55:06,036 CanSNPer2 [WARNI]  Input sequence is not free of gaps, replace gaps with N and retry!!
2021-07-30 10:55:06,047 CanSNPer2 [WARNI]  Input sequence is not free of gaps, replace gaps with N and retry!!
2021-07-30 10:55:06,056 CanSNPer2 [WARNI]  Input sequence is not free of gaps, replace gaps with N and retry!!
2021-07-30 10:55:06,060 CanSNPer2 [WARNI]  Input sequence is not free of gaps, replace gaps with N and retry!!
2021-07-30 10:55:06,525 CanSNPer2 [WARNI]  Input sequence is not free of gaps, replace gaps with N and retry!!
Traceback (most recent call last):
  File ".../conda/envs/cansnper/lib/python3.9/site-packages/CanSNPer2/modules/CanSNPer2.py", line 350, in run
    logger.warning("Mauve error skip {sample}".format(q))
KeyError: 'sample'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../conda/envs/cansnper/bin/CanSNPer2", line 10, in <module>
    sys.exit(main())
  File "/mnt/scratch_dir/hernanda/conda/envs/cansnper/lib/python3.9/site-packages/CanSNPer2/CanSNPerTree.py", line 164, in main
    CanSNPer2_obj.run(database=args.database)
  File ".../conda/envs/cansnper/lib/python3.9/site-packages/CanSNPer2/modules/CanSNPer2.py", line 407, in run
    raise CanSNPer2Error("A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)")
CanSNPer2.modules.CanSNPer2.CanSNPer2Error: 'A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)'
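
A note on the first traceback: the `KeyError: 'sample'` looks like a separate logging bug rather than a problem with the input. The template uses the named field `{sample}`, but `str.format` is called with a positional argument, so the intended "Mauve error skip ..." message is never logged. A minimal sketch of the failure and two possible fixes follows; `q` here is a hypothetical stand-in, since the traceback does not show its real value:

```python
template = "Mauve error skip {sample}"
q = "my_sample1.fasta"  # hypothetical value; the real content of q is not shown in the traceback

# Reproduces the failure: a named field needs a keyword argument.
try:
    template.format(q)
    raised = False
except KeyError as err:
    raised = True
    missing_field = err.args[0]

# Two possible fixes: pass the value by keyword, or use a positional field.
fixed_keyword = template.format(sample=q)
fixed_positional = "Mauve error skip {}".format(q)
```

Fixing this would only repair the log message; the underlying progressiveMauve failure would presumably still be reported by the `CanSNPer2Error` raised afterwards.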

I saw in #15 that a function had been implemented to correct for these gaps, but either it is not working or I don't know how to activate it. I also tried to simply look for the gaps and replace them, but if I do:

grep "-" samples/my_sample1.fasta

I cannot find any dashes. Am I missing something?
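
One way to double-check is to scan only the sequence lines for anything that is not a nucleotide code, since characters other than "-" (spaces, dots, stray symbols) could also be involved. The sketch below is not part of CanSNPer2; the function names, the IUPAC character set, and the default gap characters are assumptions made for illustration:

```python
from collections import Counter

# Assumption: IUPAC nucleotide codes are the only characters expected in sequence lines.
IUPAC = set("ACGTURYSWKMBDHVN")

def unexpected_characters(fasta_path):
    """Count characters in sequence lines that are not IUPAC nucleotide codes."""
    counts = Counter()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                continue  # header lines are skipped here
            for char in line.rstrip("\n"):
                if char.upper() not in IUPAC:
                    counts[char] += 1
    return counts

def replace_gaps(fasta_path, out_path, gap_chars="-."):
    """Write a copy of the FASTA file with gap characters in sequence lines replaced by N."""
    table = str.maketrans({char: "N" for char in gap_chars})
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            # Headers are copied unchanged; only sequence lines are translated.
            dst.write(line if line.startswith(">") else line.translate(table))
```

If `unexpected_characters("samples/my_sample1.fasta")` comes back empty, the sequence lines are probably not the cause, and the warning may simply be a generic message printed whenever progressiveMauve fails.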

Btw, I installed CanSNPer2 through conda using this YAML file:

name: cansnper
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - cansnper2=2.0.6
  - ete3

@CarolineOhrman
Contributor

Hi AleSR13,
Sorry for not replying until now; I completely missed this.

I don't know whether you still have this problem, but I will try to answer from my perspective.

We are trying to find a replacement for progressiveMauve for the alignments in CanSNPer2 but have not implemented one yet. In my experience it is not always easy to know why progressiveMauve crashes; dashes are only one possible cause, and "-" isn't always the problem. Try reformatting the headers in your FASTA file to something simple like ">1", ">2", etc. Headers that are too long or contain special characters can sometimes cause the same problem. Does it work with other genomes? Try downloading a public genome and let me know whether it has the same problem, or could you send me the FASTA file so I can try?

Kind regards
Caroline
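
The header renaming suggested above can be sketched as follows. This is an illustration only, not a CanSNPer2 feature; `simplify_headers` is a hypothetical helper that writes a renamed copy and returns a mapping so the original names can be recovered later:

```python
def simplify_headers(fasta_path, out_path):
    """Rename FASTA records to >1, >2, ... and return {new_id: original_header}."""
    mapping = {}
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                new_id = str(len(mapping) + 1)
                mapping[new_id] = line[1:].rstrip("\n")  # keep the original header text
                dst.write(">" + new_id + "\n")
            else:
                dst.write(line)  # sequence lines are copied unchanged
    return mapping
```

Writing to a new file leaves the original assembly untouched, so the run can be repeated on either version for comparison.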

habix87 commented Jan 22, 2024

Hi! I am trying to run CanSNPer2 on one F. tularensis sample, with no luck:

Run 1 alignments to references using progressiveMauve
Traceback (most recent call last):
  File "/home/habix87/miniforge3/envs/myenvname/lib/python3.6/site-packages/CanSNPer2/modules/CanSNPer2.py", line 350, in run
    logger.warning("Mauve error skip {sample}".format(q))
KeyError: 'sample'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/habix87/miniforge3/envs/myenvname/bin/CanSNPer2", line 10, in <module>
    sys.exit(main())
  File "/home/habix87/miniforge3/envs/myenvname/lib/python3.6/site-packages/CanSNPer2/CanSNPerTree.py", line 164, in main
    CanSNPer2_obj.run(database=args.database)
  File "/home/habix87/miniforge3/envs/myenvname/lib/python3.6/site-packages/CanSNPer2/modules/CanSNPer2.py", line 407, in run
    raise CanSNPer2Error("A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)")
CanSNPer2.modules.CanSNPer2.CanSNPer2Error: 'A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)'

@CarolineOhrman
Contributor

Hi! Could you please specify which database you are using (your own or one downloaded from CanSNPer2-data), the references, and the command that produces this error? Alternatively, if you are able to share the sequence, I can give it a try.

progressiveMauve is prone to errors when the sequence ID contains characters like "-". One thing to test is renaming the sequence IDs to something like ">mygenome1", ">mygenome2", and so on.

Kind regards
Caroline
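
To see which sequence IDs might be affected before renaming anything, the headers can be listed and checked. The sketch below is an assumption-laden illustration: the thread does not define which characters are safe, so "letters, digits and underscores only" and the 30-character length limit are arbitrary choices, and `problematic_headers` is a hypothetical helper:

```python
import re

# Assumption: IDs made only of letters, digits and underscores count as "simple";
# the thread does not give an exact set of allowed characters.
SIMPLE_ID = re.compile(r"^[A-Za-z0-9_]+$")

def problematic_headers(fasta_path, max_length=30):
    """Return (line_number, header) pairs whose header is long or is not a simple ID."""
    flagged = []
    with open(fasta_path) as handle:
        for number, line in enumerate(handle, start=1):
            if not line.startswith(">"):
                continue
            header = line[1:].rstrip("\n")
            # max_length is an arbitrary illustrative threshold, not a progressiveMauve limit.
            if len(header) > max_length or not SIMPLE_ID.match(header):
                flagged.append((number, header))
    return flagged
```

An empty result for a file that still fails would suggest the headers are not the cause.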

habix87 commented Jan 23, 2024

Hi, I realised that I had installed cansnper instead of cansnper2, so I installed cansnper2 and tried again. However, I am still getting an error message. I'm using francisella_tularensis.db downloaded from CanSNPer2 and I have already renamed the sequence ID to >mygenome1.

(cansnper2) habix87@DESKTOP-1L3NIPV:~$ CanSNPer2 --database downloaded_database.db fastadir/*.fasta --summary
Run 1 alignments to references using progressiveMauve
2024-01-23 13:51:29,664 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,676 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,679 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,682 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,685 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,685 CanSNPer2 [WARNI] Mauve ran into an error with sequence fastadir/RZ272.fasta may contain a dash, replace with N characters and retry mauve
2024-01-23 13:51:29,707 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,709 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,715 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,719 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
2024-01-23 13:51:29,722 CanSNPer2 [WARNI] WARNING progressiveMauve finished with a exitcode: 11
Traceback (most recent call last):
  File "/home/habix87/miniforge3/envs/cansnper2/lib/python3.6/site-packages/CanSNPer2/modules/CanSNPer2.py", line 350, in run
    logger.warning("Mauve error skip {sample}".format(q))
KeyError: 'sample'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/habix87/miniforge3/envs/cansnper2/bin/CanSNPer2", line 10, in <module>
    sys.exit(main())
  File "/home/habix87/miniforge3/envs/cansnper2/lib/python3.6/site-packages/CanSNPer2/CanSNPerTree.py", line 164, in main
    CanSNPer2_obj.run(database=args.database)
  File "/home/habix87/miniforge3/envs/cansnper2/lib/python3.6/site-packages/CanSNPer2/modules/CanSNPer2.py", line 407, in run
    raise CanSNPer2Error("A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)")
CanSNPer2.modules.CanSNPer2.CanSNPer2Error: 'A file did not run correctly exit CanSNPer2 (use --keep_going to continue with next file!)'
(cansnper2) habix87@DESKTOP-1L3NIPV:~$

habix87 commented Jan 24, 2024

> Hi! Could you please specify which database you are using (your own or one downloaded from CanSNPer2-data), the references, and the command that produces this error? Alternatively, if you are able to share the sequence, I can give it a try.
>
> progressiveMauve is prone to errors when the sequence ID contains characters like "-". One thing to test is renaming the sequence IDs to something like ">mygenome1", ">mygenome2", and so on.
>
> Kind regards
> Caroline

How can I share the sequence with you?

@CarolineOhrman
Contributor

Zip the file and upload it here or send it to my email caroline.ohrman[at]foi.se

habix87 commented Jan 29, 2024

Hi, I have already shared the sequence. Have you had a chance to try it yet?
