Releases: WillFilipski/entro.py
Major Fixes
It has come to my attention that the .count() function (upon which this module relies heavily) is shockingly slow: it scans the whole sequence again for every pair it is asked to count. Replacing it saves an O(n) pass per pair, so fixing it has become a necessity.
In addition, I found out that .count() doesn't actually count all of the doublets in a given sequence. To test this, run sum(entro.fij(n, sequence)) and you will find it is not equal to 1. In the sequence "111", the .count() function counts only one doublet "11" even though there are two, because it skips overlapping matches. The newer Counter() class from collections fixes this, while also reducing run time.
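The difference is easy to see in a few lines. This is only a minimal sketch of the idea, not the code shipped in entro.py: the function name doublet_frequencies is hypothetical, and it assumes the sequence is a plain Python string.

```python
from collections import Counter


def doublet_frequencies(sequence):
    """Relative frequency of every overlapping doublet in `sequence`.

    Hypothetical helper for illustration; not part of entro.py.
    """
    # zip(sequence, sequence[1:]) yields every adjacent pair, overlaps included,
    # so the whole sequence is walked exactly once.
    pairs = Counter(a + b for a, b in zip(sequence, sequence[1:]))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()}


seq = "111"
naive = seq.count("11")           # str.count skips overlapping matches
freqs = doublet_frequencies(seq)  # counts both "11" doublets
print(naive, freqs)
```

Because every adjacent pair is counted exactly once, the frequencies returned here sum to 1 for any sequence with at least two symbols.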
The silver lining of replacing .count() is that the collections module is excellent. I have removed all dependencies on input alphabets (previously defined as n = ["e", "t", "c"]), as the code should be able to work the alphabet out itself from the input sequence.
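As a rough illustration of why the explicit alphabet is no longer needed, the sketch below derives it from the sequence with Counter. The names infer_alphabet and symbol_frequencies are hypothetical and are not taken from entro.py; the sequence is again assumed to be a plain string.

```python
from collections import Counter


def infer_alphabet(sequence):
    """Sorted list of the distinct symbols that occur in `sequence`.

    Hypothetical helper for illustration; not part of entro.py.
    """
    return sorted(Counter(sequence))


def symbol_frequencies(sequence):
    """Relative frequency of each symbol, keyed by the inferred alphabet."""
    counts = Counter(sequence)
    total = len(sequence)
    return {symbol: counts[symbol] / total for symbol in sorted(counts)}


alphabet = infer_alphabet("etcetce")
print(alphabet)
print(symbol_frequencies("eett"))
```

One consequence of this design is that symbols which never appear in the input are simply absent from the result, rather than listed with a frequency of zero.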
This update brings massive run-time improvements and corrections to the calculations. So while the code is not as pretty to look at anymore, it is far more functional. My honest recommendation: if you did ANY calculations at all with the previous version, they are likely wrong and should be redone! Whoops!
My plan for the next update is to get triplet frequencies and the associated informational parameters working. Hopefully in the future I can add frequencies for n-tuples of any length. Stay tuned!
Full Release
Code is now modularized and usable. Follow the instructions in the updated README.md.
Initial release
v0.1.0 Added documentation for functions.