
predict a single class for all bases in a non-repeat subsequence #53

Open
williamstark01 opened this issue Aug 24, 2022 · 0 comments
williamstark01 commented Aug 24, 2022

I notice that the model does a good job of predicting a repeat but struggles to replicate the rest of the sequence. Here is the first part of each subsequence from a sample prediction:

AGAACCTATTATTTGCATGA🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑TAGAAGAAACCTGTATTTTTTTCATCA
CGAAATTTATTATTTATATA🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑TAAAAAAAATTTATATTTTTTTTATTA

I realize that we don't need this functionality from the model, as we only need to know that these subsequences contain no repeat. Would it make sense, then, to predict a single additional class for all bases in non-repeat subsequences, making the prediction and output of the model look like this?

AGAACCTATTATTTGCATGA🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑TAGAAGAAACCTGTATTTTTTTCATCA
____________________🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑🥑___________________________

(Or any other character to represent the absence of a repeat.)

Would that be easy to test?
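
As a rough illustration of the label change proposed above, here is a minimal sketch in Python. It assumes the per-base labels are available as a plain string in which nucleotide letters mark non-repeat positions and any other symbol (such as 🥑) marks a repeat. The function name `collapse_non_repeat`, the `BASES` set, and the `_` placeholder are hypothetical and not taken from the project code.

```python
# Hypothetical helper: collapse all nucleotide labels into one "no repeat" class.
# Assumption: labels are a string with one symbol per base, where nucleotide
# letters mean "not in a repeat" and anything else is a repeat class symbol.

BASES = set("ACGTN")      # assumed nucleotide alphabet
NON_REPEAT_SYMBOL = "_"   # placeholder; any other character would work


def collapse_non_repeat(labels: str, non_repeat_symbol: str = NON_REPEAT_SYMBOL) -> str:
    """Replace every nucleotide label with a single non-repeat symbol.

    Repeat symbols are left unchanged, so the output has the same length
    as the input and the repeat positions stay aligned.
    """
    return "".join(
        non_repeat_symbol if symbol.upper() in BASES else symbol
        for symbol in labels
    )


if __name__ == "__main__":
    example = "ACGT🥑🥑TTA"
    print(collapse_non_repeat(example))  # prints ____🥑🥑___
```

Applying this to the existing target strings before training would reduce the output vocabulary to the repeat classes plus one extra class, which may be a quick way to test the idea without changing the model architecture.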
