Roadmap for Geochemistry π
Can (Sany) He edited this page Oct 13, 2024 · 39 revisions

- CLI Pipeline
  - New Mode
    - Network analysis
    - Time Series Analysis with statistical resampling🌟
  - New algorithms
    - Local outlier factor algorithm in anomaly detection mode🌟
    - Mean shift algorithm in clustering mode
  - New application functions
    - Precision-recall vs. threshold diagram in classification mode🌟
  - New features
    - Customized mean normalization🌟
  - Refactoring
    - Allow users to submit a batch processing file
    - Use enum classes in place of string-value comparisons
    - Use @property to access and set instance data
    - Unique identifier in the output data🌟
  - Other
    - Silence dependency downloading on first launch
    - Download by one click with input data on the desktop🌟
  - New Mode
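The two refactoring items above (enum classes in place of string comparisons, and `@property` for instance data) could look roughly like the following sketch. `Mode`, `Workflow`, and `parse_mode` are hypothetical names chosen for illustration; they are assumptions and are not taken from the Geochemistry π codebase.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical mode names; the project's real identifiers may differ."""

    REGRESSION = "regression"
    CLASSIFICATION = "classification"
    CLUSTERING = "clustering"


def parse_mode(text: str) -> Mode:
    """Convert user input to a Mode once, at the boundary.

    Enum lookup by value raises ValueError on a typo, instead of letting
    a misspelled string silently fail a later `==` comparison.
    """
    return Mode(text.strip().lower())


class Workflow:
    """Hypothetical workflow object holding a mode and its input data."""

    def __init__(self, mode: Mode) -> None:
        self._mode = mode
        self._data = None

    @property
    def data(self):
        """Read access to the instance's data."""
        return self._data

    @data.setter
    def data(self, value) -> None:
        # Validation lives in one place rather than at every call site.
        if value is None:
            raise ValueError("data must not be None")
        self._data = value

    def is_supervised(self) -> bool:
        # Membership test on enum members, not on raw strings.
        return self._mode in (Mode.REGRESSION, Mode.CLASSIFICATION)
```

Converting strings to enum members at the input boundary means the rest of the code compares members by identity, so a misspelled mode fails early with a clear error.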
- Documentation
  - Rewrite Add New Model to Framework
  - Rewrite Regression Model example
  - Fix the warnings and errors when compiling docs using Sphinx
  - Add testing procedure
  - [] Add the details of the hyperparameters in AutoML
  - [] Add the basic use of the Sphinx framework

- Web Development
  - Launch chatbot for online docs🌟
  - UI
    - Product page
    - Introduction page
  - Frontend
    - Set up frontend code in Zhejiang University cloud
  - Backend
    - Research and demo the CLI product in web portal

- Thermodynamic Modeling
  - Testing
    - Exhaust all options automatically
- Web Development
  - Single sign-on via DDE
  - UI design
    - Draft of training pipeline
    - Figma version
  - Web Page
    - Main Page
    - Introduction Page
    - Product Page
  - Storage Research
    - SQLAlchemyStore and FilesystemStore
    - Object Storage Service via Alibaba Cloud
    - MySQL via Alibaba Cloud
  - MLflow Integration
    - MLflow YML file error
    - Management of multiple users' resources using OSS and MySQL
  - CI/CD
    - Code formatting of JS code is controlled by the frontend

- Pipeline
  - The storage of the predicted value of the training data
  - Optimization of the loading of built-in application data
  - Dropping rows with missing values in selected column
  - Summary directory for all results
  - Prediction for the training set
  - Algorithm
    - Abnormal detection
      - Isolation forest algorithm addition
    - Decomposition
      - 2D scatter diagram
      - Heatmap diagram
      - Contour diagram
    - Regression
      - Formula showing error fix in linear regression
      - Ridge regression addition
    - Clustering
      - Affinity propagation addition

- Documentation
  - Make developer-related documents
    - Append the corresponding operations on Windows in the developer section
  - Mind map of all options
  - Online docs layout mismatch
  - Add citation info
  - Update installation manual
  - Update clustering algorithm example
  - Add abnormal detection example

- Pipeline
  - Missing value process section
    - Provide three ways to deal with missing values
      - Keep the missing values. Subsequently, only the models that support missing values are available.
      - Drop the rows with missing values.
      - Impute the missing values with one of the imputation techniques.
  - Fixed random state for all models
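The three missing-value options listed above (keep, drop, impute) can be illustrated with a minimal standard-library sketch. The function names, the list-of-dicts row format, and the choice of mean imputation are assumptions made for illustration; the roadmap only says "one of the imputation techniques" and does not specify how the pipeline implements any of them.

```python
from statistics import mean
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]  # None marks a missing value (assumption)


def keep_missing(rows: List[Row]) -> List[Row]:
    """Option 1: leave missing values in place (copy only)."""
    return [dict(r) for r in rows]


def drop_missing(rows: List[Row], columns: List[str]) -> List[Row]:
    """Option 2: drop every row with a missing value in `columns`."""
    return [dict(r) for r in rows if all(r.get(c) is not None for c in columns)]


def impute_mean(rows: List[Row], columns: List[str]) -> List[Row]:
    """Option 3: fill missing values with the column mean of observed values."""
    filled = [dict(r) for r in rows]
    for c in columns:
        observed = [r[c] for r in rows if r.get(c) is not None]
        if not observed:
            continue  # nothing to compute a mean from; leave column as is
        fill = mean(observed)
        for r in filled:
            if r.get(c) is None:
                r[c] = fill
    return filled


# Illustrative values only, not real measurements.
rows = [
    {"SiO2": 50.0, "MgO": None},
    {"SiO2": None, "MgO": 8.0},
    {"SiO2": 54.0, "MgO": 6.0},
]
kept = keep_missing(rows)
dropped = drop_missing(rows, ["SiO2", "MgO"])
imputed = impute_mean(rows, ["SiO2", "MgO"])
```

With the first option the data still contains gaps, which is why the roadmap notes that only models supporting missing values remain available afterwards.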

- Algorithm & Function
  - Clustering & Decomposition
    - More common application functions
  - New algorithm
    - Bayesian ridge regression
    - Agglomerative clustering

- Documentation
  - Make developer-related documents
    - Algorithm addition procedure
    - Functionality addition procedure

- Video
  - Machine Learning Lifecycle management

- Access Layer
  - Encapsulate a .exe software version

- Core Components
  - Model inference in customized ML pipeline (CLI version) by transform pipeline

- Documentation
  - Make design diagrams of the whole project

- Pipeline
  - Feature selection
  - CSV data file import
  - Data selection function with null, space and Chinese parentheses detection functionality
  - Feature scaling for unsupervised learning
  - Built-in inference dataset loading
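The data selection item above mentions detecting nulls, spaces, and Chinese parentheses. A rough sketch of such a check follows. The function names and issue labels are hypothetical, and the definitions are assumptions: "null" is taken to mean `None`, NaN, or an empty string; "space" any whitespace character; and "Chinese parentheses" the full-width characters U+FF08 and U+FF09.

```python
import math
import re
from typing import Dict, List, Tuple

# Full-width parentheses commonly produced by Chinese input methods.
FULLWIDTH_PARENS = re.compile("[\uFF08\uFF09]")


def find_issues(value) -> List[str]:
    """Return issue labels for a single cell or column name."""
    if value is None or value == "":
        return ["null"]
    if isinstance(value, float) and math.isnan(value):
        return ["null"]
    text = str(value)
    issues = []
    if any(ch.isspace() for ch in text):
        issues.append("space")
    if FULLWIDTH_PARENS.search(text):
        issues.append("chinese_parentheses")
    return issues


def scan_rows(rows: List[Dict[str, object]]) -> Dict[Tuple[int, str], List[str]]:
    """Map (row index, column name) to the issues found in that cell."""
    report = {}
    for i, row in enumerate(rows):
        for column, value in row.items():
            issues = find_issues(value)
            if issues:
                report[(i, column)] = issues
    return report
```

Reporting the location of each issue, rather than silently cleaning the data, leaves the decision about how to handle it to the user.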

- Algorithm
  - New algorithm
    - KNN classification
    - SGD classification
    - Gradient Boosting classification
    - SGD regression
    - Elastic Net Regression
  - Classification
    - Sample balance
    - Labels customization
    - Multi-class label and binary label training for all classification models
  - Decomposition
    - Reduced data storage for all decomposition models
  - Clustering
    - Silhouette score frequency diagram for all clustering models
    - Two clustering model scores for all clustering models

- Enhancement
  - Lasso regression model with automatic parameter tuning functionality

- Functionality Building
  - Build T-SNE algorithm for decomposition section
  - Build advanced operation for feature engineering section
  - Build CLI command to activate offline server

- Access Layer
  - Build up web portal template
    - Login page
    - Home page
  - Build up RESTful APIs template
    - OAuth2 authorization
    - SQLite database

- ML Task Storage
  - Define and implement the schema to store metadata and artifact
    - Schema specification - MLflow
    - Implementation
      - Scikit-learn & MLflow
      - FLAML & MLflow
      - Ray & MLflow
      - Artifact
  - Formalize the local store format
    - Data Store
    - Image Store
    - Model Store

- Documentation
  - Migrate online documents into the organization repository
  - Optimize and integrate online documents
  - Optimize and resolve Sphinx-related problems
  - Make developer-related documents
    - Git operation
    - Local deployment
    - Docker deployment
  - Make framework-related design drawings
    - System architecture diagram
    - Customized ML pipeline diagram
    - Design pattern diagram
    - Workflow class diagram
    - Storage mechanism diagram
  - Algorithm & Function Progress Form

- Optimization
  - Abstract core functionalities to simplify the pipeline
    - Refactor the application functions of the four modes
    - Build up Mixin class
  - Regulate CI for Python and TypeScript
  - Regulate environment variable management
  - Fix statistical analysis bug
  - Improve World Map Projection by using dynamic importing
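As a rough illustration of the Mixin idea in the optimization items above, shared behavior can be moved into a small class that each mode inherits alongside its base class. All class and method names below are hypothetical; this is a sketch of the general pattern, not the project's actual class hierarchy.

```python
class SummaryMixin:
    """Shared reporting behavior; expects `name` and `results` on the host class."""

    def summary(self) -> str:
        return f"{self.name}: {len(self.results)} result(s)"


class ModeBase:
    """Hypothetical base class for a pipeline mode."""

    name = "base"

    def __init__(self) -> None:
        self.results = []

    def run(self, data):
        raise NotImplementedError


class RegressionMode(SummaryMixin, ModeBase):
    """Hypothetical mode: stores the mean of the input as a stand-in 'result'."""

    name = "regression"

    def run(self, data):
        value = sum(data) / len(data)
        self.results.append(value)
        return value


class ClusteringMode(SummaryMixin, ModeBase):
    """Hypothetical mode: stores the number of distinct values as a stand-in 'result'."""

    name = "clustering"

    def run(self, data):
        value = len(set(data))
        self.results.append(value)
        return value
```

Both modes get `summary()` without duplicating it, which is the kind of simplification the refactoring of the four modes' application functions appears to aim for.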