PerMetrics is a Python library of performance metrics for machine learning models. We aim to implement all performance metrics for problems such as regression, classification, and clustering, so that users in every field can access the metrics they need as quickly as possible. The library currently provides 111 metrics (47 regression, 20 classification, and 44 clustering).
Please include these citations if you plan to use this library:
- LaTeX:
@article{Thieu_PerMetrics_A_Framework_2024,
  author = {Thieu, Nguyen Van},
  doi = {10.21105/joss.06143},
  journal = {Journal of Open Source Software},
  month = mar,
  number = {95},
  pages = {6143},
  title = {{PerMetrics: A Framework of Performance Metrics for Machine Learning Models}},
  url = {https://joss.theoj.org/papers/10.21105/joss.06143},
  volume = {9},
  year = {2024}
}
- APA:
Thieu, N. V. (2024). PerMetrics: A Framework of Performance Metrics for Machine Learning Models. Journal of Open Source Software, 9(95), 6143. https://doi.org/10.21105/joss.06143
Install the current PyPI release:
$ pip install permetrics
After installation, you can import PerMetrics like any other Python module:
$ python
>>> import permetrics
>>> permetrics.__version__
The example below shows a compact way to use the library: it computes several metrics in a single call, such as root mean squared error (RMSE) and mean absolute error (MAE), and returns them in a dictionary keyed by metric name.
from permetrics import RegressionMetric
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
evaluator = RegressionMetric(y_true, y_pred)
results = evaluator.get_metrics_by_list_names(["RMSE", "MAE", "MAPE", "R2", "NSE", "KGE"])
print(results["RMSE"])
print(results["KGE"])
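As a sanity check on what two of these metrics measure, the standard textbook definitions of MAE and RMSE can be written out by hand. This is a minimal sketch using only the Python standard library; the helper names `mae` and `rmse` are hypothetical and are not part of the PerMetrics API, and the library's own results may differ in rounding or output type.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |y_true - y_pred|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Same data as in the example above
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

print(mae(y_true, y_pred))   # 0.5
print(rmse(y_true, y_pred))  # square root of 0.375, roughly 0.612
```

Comparing these hand-computed values with `results["MAE"]` and `results["RMSE"]` is a quick way to confirm that the inputs were passed in the intended order.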
If your y_true and y_pred data have multiple columns and you want a separate metric value for each output column, you can do this in PerMetrics in any of the following ways:
import numpy as np
from permetrics import RegressionMetric
y_true = np.array([[0.5, 1], [-1, 1], [7, -6]])
y_pred = np.array([[0, 2], [-1, 2], [8, -5]])
evaluator = RegressionMetric(y_true, y_pred)
## The 1st way
results = evaluator.get_metrics_by_dict({
    "RMSE": {"multi_output": "raw_values"},
    "MAE": {"multi_output": "raw_values"},
    "MAPE": {"multi_output": "raw_values"},
})
## The 2nd way
results = evaluator.get_metrics_by_list_names(
    list_metric_names=["RMSE", "MAE", "MAPE", "R2", "NSE", "KGE"],
    list_paras=[{"multi_output": "raw_values"},] * 6
)
## The 3rd way
result01 = evaluator.RMSE(multi_output="raw_values")
result02 = evaluator.MAE(multi_output="raw_values")
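To make the multi-output case concrete, the sketch below computes MAE and RMSE column by column in plain Python. It assumes that "raw_values" means one value per output column, with no averaging across columns; that reading is an assumption here, so check it against the documentation. The helper names are hypothetical and are not part of the PerMetrics API.

```python
import math


def column_errors(y_true, y_pred):
    """Yield, for each output column, the list of errors (true - predicted)."""
    for true_col, pred_col in zip(zip(*y_true), zip(*y_pred)):
        yield [t - p for t, p in zip(true_col, pred_col)]


def mae_per_column(y_true, y_pred):
    """One MAE value per output column."""
    return [sum(abs(e) for e in errs) / len(errs)
            for errs in column_errors(y_true, y_pred)]


def rmse_per_column(y_true, y_pred):
    """One RMSE value per output column."""
    return [math.sqrt(sum(e * e for e in errs) / len(errs))
            for errs in column_errors(y_true, y_pred)]


# Same data as in the example above: 3 samples, 2 output columns
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

print(mae_per_column(y_true, y_pred))   # [0.5, 1.0]
print(rmse_per_column(y_true, y_pred))  # first value is roughly 0.645, second is 1.0
```

Each returned list has one entry per column, so its length should match the number of columns in y_true.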
More complicated cases are covered in the examples folder. You can also read the documentation for more detailed installation instructions, explanations, and examples.
There are many ways to contribute to the development of PerMetrics, and you are welcome to join in! For example, you can report problems or request features on the issues page. To make contributing easier, please check the guidelines in the CONTRIBUTING.md file.
- Official source code repository
- Official document
- Download releases
- Issue tracker
- Notable changes log
- Official discussion group
- Currently, there is widespread confusion among frameworks around the world about the notation R, R2, and R^2.
- Please read the file R-R2-Rsquared.docx to understand the differences between them and why this confusion exists.
- My recommendation is to denote the Coefficient of Determination as COD or R2, while the squared Pearson's Correlation Coefficient should be denoted as R^2 or RSQ (as in Excel).
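The difference between the two quantities is easiest to see with predictions that are perfectly correlated with the targets but shifted by a constant. The sketch below uses the standard textbook definitions (coefficient of determination as 1 - SS_res / SS_tot, and the squared Pearson correlation); the function names are hypothetical and are not claimed to match the PerMetrics API.

```python
def coefficient_of_determination(y_true, y_pred):
    """COD (R2): 1 - SS_res / SS_tot. It can be negative for poor predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def squared_pearson(y_true, y_pred):
    """RSQ (R^2): squared Pearson correlation. It always lies in [0, 1]."""
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov ** 2 / (var_true * var_pred)


y_true = [3, -0.5, 2, 7]
# Predictions with a constant bias of +10: perfectly correlated, but far off
y_biased = [t + 10 for t in y_true]

print(squared_pearson(y_true, y_biased))               # 1.0 up to rounding
print(coefficient_of_determination(y_true, y_biased))  # strongly negative
```

The squared Pearson correlation ignores the bias entirely, while the coefficient of determination penalizes it, which is why the two should not share one name.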
Developed by: Thieu @ 2023