5. Evaluation
leADS can be evaluated using a pre-trained model (see Training). A pre-trained model ("leADS.pkl") is made available to users; it was trained on the Enzyme Commission (EC) number indices with embeddings (biocyc21_Xe.pkl) and the pathway indices (biocyc21_y.pkl).
Note: As before, make sure to put the source code leADS (Installing leADS) into the same directory as explained in Download files. Additionally, create log and result folders (if you have not already created them during pathway prediction) in the same leADS_materials/ directory. The final structure should look like this:
leADS_materials/
├── objectset/
│ └── ...
├── model/
│ └── ...
├── dataset/
│ └── ...
├── result/
│ └── ...
└── leADS/
└── ...
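The folder setup described above can also be scripted. The following is a minimal sketch, not part of leADS itself: the helper name `ensure_output_dirs` is hypothetical, and the folder names `log` and `result` are taken from the note above.

```python
from pathlib import Path


def ensure_output_dirs(root):
    """Create log/ and result/ under `root` if they are missing.

    `root` is assumed to be the leADS_materials/ directory.
    Existing folders are left untouched.
    """
    root = Path(root)
    folders = []
    for name in ("log", "result"):
        folder = root / name
        # parents=True also creates `root` itself when it does not exist yet
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders


# Example call (the path is a placeholder):
# ensure_output_dirs("/path/to/leADS_materials")
```

Because `exist_ok=True` is set, running the helper again after pathway prediction has already created the folders is harmless.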
For all experiments, open a terminal, navigate to the src folder in the leADS directory, and then run the commands. To display leADS's running options, use: python main.py --help. The help output is intended to be self-contained.
The essential input files used for evaluation are two matrices: [DATANAME]_X*.pkl and [DATANAME]_y.pkl.
Note: Data such as "[DATANAME]_Xe.pkl", "[DATANAME]_Xa.pkl", or "[DATANAME]_X.pkl" can be used for evaluation, provided leADS was trained using the corresponding data.
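Before starting an evaluation run, it can help to confirm that both input files exist and describe the same number of samples. The sketch below is an illustration only. It assumes each file holds a single pickled array-like object (anything with a `shape` attribute or a length) as its first pickled item. If the project's files store additional metadata before the matrix, the loading step would need adjusting. The helper names `n_rows` and `check_inputs` are hypothetical. Only unpickle files from a source you trust.

```python
import pickle
from pathlib import Path


def n_rows(obj):
    """Return the number of rows of an array-like object."""
    shape = getattr(obj, "shape", None)
    return shape[0] if shape is not None else len(obj)


def check_inputs(dspath, x_name, y_name):
    """Check that both files exist in `dspath` and have matching row counts.

    Assumption: the first pickled object in each file is the matrix.
    Returns the shared number of rows.
    """
    counts = []
    for name in (x_name, y_name):
        path = Path(dspath) / name
        if not path.is_file():
            raise FileNotFoundError(f"missing input file: {path}")
        with path.open("rb") as handle:
            counts.append(n_rows(pickle.load(handle)))
    if counts[0] != counts[1]:
        raise ValueError(
            f"{x_name} has {counts[0]} rows but {y_name} has {counts[1]}"
        )
    return counts[0]


# Example call (the path is a placeholder):
# check_inputs("/path/to/dataset", "biocyc21_Xe.pkl", "biocyc21_y.pkl")
```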
python main.py \
--evaluate \
--pred-labels \
--soft-voting \
--X-name "[DATANAME]_X*.pkl" \
--y-name "[DATANAME]_y.pkl" \
--file-name "[save file name]" \
--dspath "[absolute path to the dataset directory (e.g. dataset)]" \
--rspath "[absolute path to the result directory (e.g. result)]" \
--batch 50 \
--num-jobs 2
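When the same evaluation is repeated for several datasets, the command above can be assembled programmatically. This is a minimal sketch under stated assumptions: the flags are copied from the command above, while the helper name `build_eval_command`, the save name `my_eval`, and the example paths are hypothetical placeholders.

```python
import shlex


def build_eval_command(dataname, x_variant, file_name, dspath, rspath,
                       batch=50, num_jobs=2):
    """Build the evaluation command as an argument list.

    `x_variant` selects the feature matrix: "X", "Xa", or "Xe".
    It should match the data leADS was trained on.
    """
    if x_variant not in ("X", "Xa", "Xe"):
        raise ValueError(f"unknown feature variant: {x_variant}")
    return [
        "python", "main.py",
        "--evaluate", "--pred-labels", "--soft-voting",
        "--X-name", f"{dataname}_{x_variant}.pkl",
        "--y-name", f"{dataname}_y.pkl",
        "--file-name", file_name,
        "--dspath", dspath,
        "--rspath", rspath,
        "--batch", str(batch),
        "--num-jobs", str(num_jobs),
    ]


cmd = build_eval_command(
    "biocyc21", "Xe", "my_eval",
    "/abs/path/to/dataset", "/abs/path/to/result",
)
# Print a shell-ready string; pass `cmd` to subprocess.run from the src folder to execute it.
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems when paths contain spaces.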
[XXX]
[XXX]
[XXX]. Run the following command:
python main.py --train --train-labels --calc-ads --ads-percent 0.7 --acquisition-type "psp" --top-k 50 --ssample-input-size 0.7 --ssample-label-size 2000 --calc-subsample-size 1000 --lambdas 0.01 0.01 0.01 0.01 0.01 10 --penalty "l21" --X-name "biocyc21_Xe.pkl" --y-name "biocyc21_y.pkl" --model-name "leADS_retrained_1" --batch 50 --max-inner-iter 5 --num-epochs 10 --num-models 3 --num-jobs 2
After running the command, the output will be saved to the result/ folder. A short description of the output is given in the [XXX] above. The tree structure for the folder with the outputs will look like this:
leADS_materials/
├── objectset/
│ └── ...
├── model/
│ ├── leADS.pkl
│ └── ...
├── dataset/
│ └── ...
├── result/
│ ├── [XXX]
│ └── ...
└── leADS/
└── ...
[XXX]. Run the following command:
python main.py --train --train-labels --calc-ads --ads-percent 0.7 --acquisition-type "entropy" --ssample-input-size 0.7 --ssample-label-size 2000 --calc-subsample-size 1000 --lambdas 0.01 0.01 0.01 0.01 0.01 10 --penalty "l21" --X-name "biocyc21_Xe.pkl" --y-name "biocyc21_y.pkl" --model-name "leADS_retrained_2" --batch 50 --max-inner-iter 5 --num-epochs 10 --num-models 3 --num-jobs 2
After running the command, the output will be saved to the result/ folder. A short description of the output is given in the [XXX] above. The tree structure for the folder with the outputs will look like this:
leADS_materials/
├── objectset/
│ └── ...
├── model/
│ ├── leADS.pkl
│ └── ...
├── dataset/
│ └── ...
├── result/
│ ├── [XXX]
│ └── ...
└── leADS/
└── ...