I stumbled upon this paper and would like to reproduce some of the results in table 1.
However, when running the code as indicated in the README, the values I get are quite far from the reported ones.
Should it be possible to reproduce the results in table 1 with this codebase?
If yes, what arguments are necessary to get these results?
Concretely, I tried to reproduce the ZINC results by following the README (as closely as possible).
After setting up the environment and downloading the zinc250k.csv file from moflow, I was able to run the data_preprocess.py script.
After downloading the models, I managed to run the following scripts (if I remember correctly):
However, it might be that I already had to fix the mflow import statements at this stage and ran the generate_prop_ranges.py script at this point.
After creating the zinc250k.txt file from zinc250k.csv and running generate_prop_ranges.py, I should have been able to run calculate_statistics_single_prop.py --mani_range 1, although this might also have required some changes to the original code already.
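For anyone retracing these steps, the CSV-to-text conversion can be sketched roughly as below. This is only a minimal sketch under assumptions, not the repository's own procedure: it assumes that zinc250k.csv has a column named smiles and that zinc250k.txt should contain one SMILES string per line. The function name csv_to_smiles_txt and its parameters are hypothetical.

```python
import csv
from pathlib import Path


def csv_to_smiles_txt(csv_path, txt_path, smiles_column="smiles"):
    """Write one CSV column to a text file, one entry per line.

    Assumption: the CSV has a header row containing `smiles_column`.
    Returns the number of lines written.
    """
    csv_path, txt_path = Path(csv_path), Path(txt_path)
    # Create the output directory if it does not exist yet.
    txt_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with csv_path.open(newline="") as src, txt_path.open("w") as dst:
        for row in csv.DictReader(src):
            # Strip whitespace or newlines embedded in the field.
            smiles = row[smiles_column].strip()
            if smiles:
                dst.write(smiles + "\n")
                count += 1
    return count
```

It could then be called as csv_to_smiles_txt("zinc250k.csv", "zinc250k.txt"), with both paths adjusted to wherever the scripts expect the files.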
After some further modifications (most notably by creating directories that were missing for the code to work), I also managed to run the random and largest baselines as follows:
This allowed me to run calculate_statistics_single_prop.py on these baselines as well.
All of this eventually provided me with the following results:
whereas table 1 (together with tables 5 and 6) in the paper seems to suggest something closer to
Any chance you could provide me with some papers (or explain the discrepancies)?
Thanks for your interest in our paper! We refactored the code before releasing it. At first glance, the results make sense: ChemSpacE outperforms the baseline methods by a large margin, as they are very simple. I will try to find some time to look through it, but I think the results are not very surprising, despite being different from what we reported in the paper.