5. Evaluation

Overview

leADS can be evaluated using a pre-trained model (see Training). A pre-trained model ("leADS.pkl") is available to users; it was trained on the Enzyme Commission (EC) number indices with embeddings (biocyc21_Xe.pkl) and the pathway indices (biocyc21_y.pkl).

Note: As before, place the leADS source code (Installing leADS) in the same directory, as explained in Download files. Additionally, create log and result folders in the same leADS_materials/ directory (if you have not already created them during pathway prediction). The final structure should look like this:

leADS_materials/
        ├── objectset/
        │       └── ...
        ├── model/
        │       └── ...
        ├── dataset/
        │       └── ...
        ├── result/
        │       └── ...
        └── leADS/
                └── ...
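
The log and result folders can also be created from Python instead of by hand. The following is a minimal sketch, assuming you pass the path to your own leADS_materials/ directory; the helper name ensure_dirs is hypothetical and not part of leADS.

```python
from pathlib import Path


def ensure_dirs(root, names=("log", "result")):
    """Create the named sub-folders under root if they are missing.

    Hypothetical helper for illustration; not part of leADS.
    """
    root = Path(root)
    created = []
    for name in names:
        folder = root / name
        # exist_ok=True leaves an already existing folder untouched
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling ensure_dirs("leADS_materials") more than once is harmless, so it can be run before both pathway prediction and evaluation.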

For all experiments, open a terminal, navigate to the src folder in the leADS directory, and run the commands from there. To display leADS's running options, use: python main.py --help. The help text should be self-explanatory.

Input:

The essential input files for evaluation are two matrices: [DATANAME]_X*.pkl and [DATANAME]_y.pkl.

Note: Data such as "[DATANAME]_Xe.pkl", "[DATANAME]_Xa.pkl", or "[DATANAME]_X.pkl" can be used for evaluation, provided leADS was trained using the corresponding data.
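
Before running an evaluation, it can help to confirm that the two input matrices are actually present in the dataset directory. The sketch below only checks file names against the [DATANAME]_X*.pkl and [DATANAME]_y.pkl patterns described above; it does not open the files. The helper name find_inputs is hypothetical and not part of leADS.

```python
from pathlib import Path


def find_inputs(dataset_dir, dataname):
    """Return (list of X matrix paths, y matrix path or None) for a data name.

    Hypothetical helper for illustration; not part of leADS.
    """
    dataset_dir = Path(dataset_dir)
    # Matches e.g. biocyc21_X.pkl, biocyc21_Xe.pkl, biocyc21_Xa.pkl
    x_files = sorted(dataset_dir.glob(f"{dataname}_X*.pkl"))
    y_file = dataset_dir / f"{dataname}_y.pkl"
    return x_files, (y_file if y_file.is_file() else None)
```

If the returned list is empty or the second value is None, the corresponding file is missing from the dataset directory.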

Command:

python main.py \
--evaluate \
--pred-labels \
--soft-voting \
--X-name "[DATANAME]_X*.pkl" \
--y-name "[DATANAME]_y.pkl" \
--file-name "[save file name]" \
--dspath "[absolute path to the dataset directory (e.g. dataset)]" \
--rspath "[absolute path to the result directory (e.g. result)]" \
--batch 50 \
--num-jobs 2
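
When the evaluation is run repeatedly with different inputs, the command above can be assembled in Python rather than typed by hand. This is a sketch under stated assumptions: it uses only the flags shown in the command above, the function names build_eval_command and run_evaluation are hypothetical, and it assumes main.py is launched from the leADS src folder as described earlier.

```python
import subprocess
import sys


def build_eval_command(x_name, y_name, file_name, dspath, rspath,
                       batch=50, num_jobs=2):
    """Assemble the evaluation command as an argument list.

    Hypothetical helper; it only mirrors the flags shown in the command above.
    """
    return [
        sys.executable, "main.py",
        "--evaluate", "--pred-labels", "--soft-voting",
        "--X-name", x_name,
        "--y-name", y_name,
        "--file-name", file_name,
        "--dspath", str(dspath),
        "--rspath", str(rspath),
        "--batch", str(batch),
        "--num-jobs", str(num_jobs),
    ]


def run_evaluation(cmd, src_dir):
    """Run the command from the leADS src folder; raises on a non-zero exit."""
    return subprocess.run(cmd, cwd=src_dir, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. The values supplied for the file names and paths are your own; nothing in this sketch fills them in.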

Argument descriptions:

[XXX]

Output:

[XXX]

Examples

Example 1:

[XXX]. Run the following command:

python main.py \
--train \
--train-labels \
--calc-ads \
--ads-percent 0.7 \
--acquisition-type "psp" \
--top-k 50 \
--ssample-input-size 0.7 \
--ssample-label-size 2000 \
--calc-subsample-size 1000 \
--lambdas 0.01 0.01 0.01 0.01 0.01 10 \
--penalty "l21" \
--X-name "biocyc21_Xe.pkl" \
--y-name "biocyc21_y.pkl" \
--model-name "leADS_retrained_1" \
--batch 50 \
--max-inner-iter 5 \
--num-epochs 10 \
--num-models 3 \
--num-jobs 2

After running the command, the output will be saved to the result/ folder. A short description of the output is given in the [XXX] above. The tree structure for the folder with the outputs will look like this:

leADS_materials/
        ├── objectset/
        │       └── ...
        ├── model/
        │       ├── leADS.pkl
        │       └── ...
        ├── dataset/
        │       └── ...
        ├── result/
        │       ├── [XXX]
        │       └── ...
        └── leADS/
                └── ...

Example 2: