Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RAPPAS accumulates floating-point rounding errors during computation #9

Open
nromashchenko opened this issue Sep 9, 2022 · 0 comments
Labels
bug Something isn't working

Comments

@nromashchenko
Copy link
Member

nromashchenko commented Sep 9, 2022

RAPPAS may miscalculate the scores of k-mers in a window for k sufficiently high if many k-mers are alive in this window. I found examples (D652 dataset, k=10, o=1.5) where the scores computed by RAPPAS v1.21 differ from the real values (computed manually) by up to 1e-5 (in non-log values). While it does not seem to be a lot, the compound effect of those little differences while placing queries produces placements that are different compared to RAPPAS2. (for the examples I found, XPAS scores are much closer to the real values).

This happens in src/core/algos/WordExplorer_v3.java:

currentLogSum+=session.parsedProbas.getPP(nodeId, i, j);
...
currentLogSum-=session.parsedProbas.getPP(nodeId, i, j);

where the class variable currentLogSum accumulates the rounding error from adding and subtracting the same value. The error is the higher the more k-mers are alive in the window (i.e., the number of times we change the variable is O(|alive k-mers|))

The easy fix is to make currentLogSum a local variable (change it only O(k) times instead).

@nromashchenko nromashchenko added the bug Something isn't working label Sep 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant