
Applications #45

Open
QuentinPerriere opened this issue May 2, 2024 · 12 comments

Comments

@QuentinPerriere

QuentinPerriere commented May 2, 2024

Hello,
I hope you can help me with this:

Can I use NanoCaller on my fastq.gz files generated with ONT MinION technology?
Can I use it to detect variants in fungi?

@kaichop
Contributor

kaichop commented May 2, 2024 via email

Yes, it can work, but you may need to adjust the ploidy setting.

On Thu, May 2, 2024 at 6:23 AM QuentinPerriere wrote: Hello, I hope that you can help me with this: Can I use nanocaller on my fastq.gz files generated using ONT minion technology and a flow Cell R10.4.1? Can I use it to detect variants in fungus?

@QuentinPerriere
Author

Thank you. Which parameter exactly do I have to take into account?

@umahsn
Collaborator

umahsn commented May 2, 2024

You can use --haploid_genome to run haploid models.

@emilydolivo97

emilydolivo97 commented May 2, 2024

Sorry for interfering in this issue:
If I have a diploid organism, does that cause a problem? How can I set the ploidy?

@QuentinPerriere
Author

I think that by default it's haploid, so when you use this argument, NanoCaller will no longer consider it haploid.
@umahsn, correct me if I'm wrong, please.

@umahsn
Collaborator

umahsn commented May 2, 2024 via email

@QuentinPerriere
Author

@umahsn, the model was trained on the human genome, whereas in my case I'm working on fungi. Will this cause any problems? I ask because I'm not detecting the variants that I'm supposed to detect.

@umahsn
Collaborator

umahsn commented May 8, 2024

Hi,

I tested NanoCaller on a fungal dataset and found a problem with calling variants in regions of relatively low depth, where coverage may drop by a factor of 10 or so compared to the neighboring few kbp. Are you having a similar problem? I am working on a fix for this issue and will post an update soon.

@emilydolivo97

Yes, exactly. I don't find the expected variants.

@umahsn
Collaborator

umahsn commented May 14, 2024

I have added an option to disable coverage normalization, --disable_coverage_normalization, which is recommended for high-coverage samples such as amplicon sequencing or ultra-deep microbial samples if you are using the haploid model. The problem can arise when a candidate site has, let's say, 100X coverage while the average coverage within 1-2 kbp is 1000X: the NanoCaller haploid model then makes a less confident variant prediction because coverage at that site is low relative to the surrounding region. Coverage normalization is usually very helpful in whole-genome sequencing datasets, where low-coverage regions are usually tandem repeats or low-complexity regions in which variant calls may not be reliable, and normalization takes that into account. However, it may not be necessary for ultra-deep coverage samples.

This update is in the GitHub repo only, so you would need to use git pull to get the latest changes. I will add it to the next release if this fixes the problem for you.
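To make the 100X versus 1000X example concrete, here is a rough illustration of how low a site can look relative to its neighborhood. This is only a toy calculation, not NanoCaller's actual normalization code; the function name `relative_coverage`, the window size, and the depth profile are all made up for the example.

```python
from statistics import mean


def relative_coverage(depths, site, window=1000):
    """Depth at `site` divided by the mean depth of its neighbors.

    Neighbors are the positions within `window` bp on either side,
    excluding the site itself. Hypothetical helper for illustration only.
    """
    lo = max(0, site - window)
    hi = min(len(depths), site + window + 1)
    neighbours = depths[lo:site] + depths[site + 1:hi]
    if not neighbours:
        raise ValueError("no neighbouring positions to compare against")
    avg = mean(neighbours)
    if avg == 0:
        raise ValueError("neighbouring coverage is zero")
    return depths[site] / avg


# Toy profile: 1000X everywhere except one candidate site at 100X.
depths = [1000] * 2001
depths[1000] = 100
ratio = relative_coverage(depths, site=1000)
print(ratio)  # 0.1
```

A ratio this far below 1 is the situation described above, where the site looks under-covered compared to its surroundings even though 100X is plenty of reads in absolute terms.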

@emilydolivo97

@umahsn, thank you for taking the time to answer my question. Based on your previous response, since I'm dealing with fungi (a diploid organism), I don't need to use the --haploid_genome parameter.

Your answer:
By default NanoCaller assumes diploid genome for all chromosomes if no
ploidy is specified. If you use --haploid_genome flag then it will use
haploid model and genotype predictions. We suggested using haploid model
assuming your fungus sample is in a haploid life cycle. If not, please
ignore the --haploid_genome flag and use default parameters.

What should I do in this case, please?

@umahsn
Collaborator

umahsn commented May 29, 2024

If you are processing a diploid organism, do not use the --haploid_genome parameter; keep the default parameters. Use --disable_coverage_normalization if your sample was processed with amplicon or targeted sequencing. For whole-genome sequencing, you do not need --disable_coverage_normalization.
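The advice in this thread can be summarized as a small decision helper. This is just a sketch: the function `nanocaller_flags` and its arguments are hypothetical and not part of NanoCaller. Only the two flag names come from the comments above, and the returned flags would be appended to whatever NanoCaller command you already run.

```python
def nanocaller_flags(haploid: bool, targeted: bool) -> list[str]:
    """Pick the optional flags discussed in this thread.

    haploid:  True if the sample should be called with the haploid model.
    targeted: True for amplicon, targeted, or ultra-deep samples;
              False for ordinary whole-genome sequencing.
    Hypothetical helper, not part of NanoCaller itself.
    """
    flags = []
    if haploid:
        # Default is diploid; this switches to the haploid model.
        flags.append("--haploid_genome")
    if targeted:
        # Only available from the GitHub repo at the time of this thread.
        flags.append("--disable_coverage_normalization")
    return flags


# Diploid whole-genome sample: keep the defaults, no extra flags.
print(nanocaller_flags(haploid=False, targeted=False))
# Diploid amplicon or targeted sample.
print(nanocaller_flags(haploid=False, targeted=True))
```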
