Roadmap for Geochemistry π

Can (Sany) He edited this page Oct 13, 2024 · 39 revisions

v0.6.0 -> v0.7.0 🎯

  • CLI Pipeline

    • New Mode
      • Network analysis
      • Time Series Analysis with statistical resampling🌟
    • New algorithms
      • Local outlier factor algorithm in anomaly detection mode🌟
      • Mean shift algorithm in clustering mode
    • New application functions
      • Precision-recall vs. threshold diagram in classification mode🌟
    • New features
      • Customized mean normalization🌟
    • Refactoring
      • Allow users to submit batch processing file
      • Use enum classes instead of string-value comparisons
      • Use @property to access and set instance data
      • Unique identifier in the output data🌟
    • Other
      • Silence dependency download output on first launch
      • One-click download with input data on the desktop🌟
  • Documentation

    • Rewrite Add New Model to Framework
    • Rewrite Regression Model example
    • Fix the warnings and errors raised when building the docs with Sphinx
    • Add testing procedure
    • [] Add details of the hyperparameters in AutoML
    • [] Add the basic use of the Sphinx framework
  • Web Development

    • Launch chatbot for online docs🌟
    • UI
      • Product page
      • Introduction page
    • Frontend
      • Set up frontend code on the Zhejiang University cloud
    • Backend
      • Research and demo the CLI product in the web portal
  • Thermodynamic Modeling

  • Testing

    • Exhaust all options automatically
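
The two refactoring items above (enum classes instead of string comparisons, and @property for instance data) can be illustrated with a minimal sketch. The names `Mode`, `Workflow` and `parse_mode` are hypothetical and are not taken from the project's code base; the sketch only shows the general pattern.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical mode names; the real project may use different ones."""

    REGRESSION = "regression"
    CLASSIFICATION = "classification"
    CLUSTERING = "clustering"


class Workflow:
    """Hypothetical container showing @property-based access to instance data."""

    def __init__(self, mode: Mode) -> None:
        self._mode = mode
        self._data = None

    @property
    def mode(self) -> Mode:
        # Read-only: no setter is defined for the mode.
        return self._mode

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, rows) -> None:
        # Validate on assignment instead of trusting every caller.
        if not rows:
            raise ValueError("data must not be empty")
        self._data = list(rows)


def parse_mode(text: str) -> Mode:
    # Convert user input once, at the boundary; Enum raises ValueError on typos.
    return Mode(text.strip().lower())
```

After the conversion, later code compares members (`workflow.mode is Mode.CLUSTERING`) rather than raw strings, so a misspelled mode fails at parse time instead of silently not matching.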

v0.5.0 -> v0.6.0 🏆

  • Web Development

    • Single sign-on via DDE
    • UI design
      • Draft of training pipeline
      • Figma version
    • Web Page
      • Main Page
      • Introduction Page
      • Product Page
    • Storage Research - SQLAlchemyStore and FilesystemStore
      • Object Storage Service via Alibaba Cloud
      • MySQL via Alibaba Cloud
    • MLflow Integration
      • MLflow YML file error
      • Management of multiple users' resources using OSS and MySQL
    • CI/CD
      • Formatting of JS code is controlled by the frontend
  • Pipeline

    • The storage of the predicted value of the training data
    • Optimization of the loading of built-in application data
    • Dropping rows with missing values in selected column
    • Summary directory for all results
    • Prediction for the training set
    • Algorithm
      • Abnormal detection
        • Isolation forest algorithm addition
      • Decomposition
        • 2D scatter diagram
        • Heatmap diagram
        • Contour diagram
      • Regression
        • Fix for the formula display error in linear regression
        • Ridge regression addition
      • Clustering
        • Affinity propagation addition
  • Documentation

    • Make developer-related documents
      • Append the corresponding operations on Windows in the developer section
    • Mind map of all options
    • Online docs layout mismatch
    • Add citation info
    • Update installation manual
    • Update clustering algorithm example
    • Add abnormal detection example

v0.4.0 -> v0.5.0 🏆

  • Pipeline

    • Missing value process section
      • Provide three ways to deal with missing values
        • Keep the missing values. Subsequently, only the models that support missing values are available.
        • Drop the rows with missing values
        • Impute the missing values with one of the imputation techniques.
    • Fixed random state for all models
  • Algorithm & Function

    • Clustering & Decomposition
      • More common application functions
    • New algorithm
      • Bayesian ridge regression
      • Agglomerative clustering
  • Documentation

    • Make developer-related documents
      • Algorithm addition procedure
      • Functionality addition procedure
  • Video

    • Machine Learning Lifecycle management
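
The three missing-value options listed under Pipeline (keep, drop, impute) can be sketched in plain Python. This is an illustration only: `handle_missing` is a hypothetical helper, missing cells are assumed to be `None`, and column-mean imputation stands in for "one of the imputation techniques".

```python
from statistics import mean

MISSING = None  # assumption: missing cells are represented as None


def handle_missing(rows, strategy):
    """Apply one of three strategies to a list of equal-length numeric rows."""
    if strategy == "keep":
        # Leave the gaps; only models that tolerate missing values can follow.
        return [list(r) for r in rows]
    if strategy == "drop":
        # Discard every row that contains at least one missing cell.
        return [list(r) for r in rows if MISSING not in r]
    if strategy == "impute":
        # Mean imputation per column, one of several possible techniques.
        fills = []
        for col in zip(*rows):
            present = [v for v in col if v is not MISSING]
            fills.append(mean(present) if present else MISSING)
        return [
            [fills[i] if v is MISSING else v for i, v in enumerate(r)]
            for r in rows
        ]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, `handle_missing([[1.0, 2.0], [None, 4.0]], "drop")` keeps only the first row, while `"impute"` fills the gap with the mean of the values present in that column.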

v0.3.0 -> v0.4.0 🏆

  • Access Layer

    • Package the software as a .exe version
  • Core Components

    • Model inference in the customized ML pipeline (CLI version) via a transform pipeline
  • Documentation

    • Make design diagrams of the whole project
  • Pipeline

    • Feature selection
    • CSV data file import
    • Data selection function with null, space and Chinese parentheses detection functionality
    • Feature scaling for unsupervised learning
    • Built-in inference dataset loading
  • Algorithm

    • New algorithm
      • KNN classification
      • SGD classification
      • Gradient Boosting classification
      • SGD regression
      • Elastic Net Regression
    • Classification
      • Sample balance
      • Labels customization
      • Multi-class label and binary label training for all classification models
    • Decomposition
      • Reduced data storage for all decomposition models
    • Clustering
      • Silhouette score frequency diagram for all clustering models
      • Two clustering model scores for all clustering models
  • Enhancement

    • Lasso regression model with automatic parameter tuning functionality
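
The roadmap does not spell out what the null, space and Chinese parentheses detection inspects. The sketch below assumes it is applied to column names before data selection. The helper `check_column_name` and its problem labels are hypothetical.

```python
FULLWIDTH_PARENS = "（）"  # U+FF08 and U+FF09, the full-width (Chinese) parentheses


def check_column_name(name):
    """Return a list of problems found in one column name (empty list if clean)."""
    problems = []
    if name is None or name == "":
        # Nothing else can be checked on an empty name.
        problems.append("null")
        return problems
    if any(ch.isspace() for ch in name):
        problems.append("space")
    if any(ch in FULLWIDTH_PARENS for ch in name):
        problems.append("chinese parentheses")
    return problems
```

A name such as `"SiO2"` yields an empty list, whereas `"Al2O3 （wt%）"` is flagged for both the space and the full-width parentheses.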

v0.2.1 -> v0.3.0 🏆

  • Functionality Building

    • Build T-SNE algorithm for decomposition section
    • Build advanced operation for feature engineering section
    • Build CLI command to activate offline server
  • Access Layer

    • Build up web portal template
      • Login page
      • Home page
    • Build up RESTful APIs template
      • OAuth2 authorization
      • SQLite database
  • ML Task Storage

    • Define and implement the schema to store metadata and artifact
      • Schema specification - MLflow
      • Implementation
        • Scikit-learn & MLflow
        • FLAML & MLflow
        • Ray & MLflow
      • Artifact
    • Formalize the local store format
      • Data Store
      • Image Store
      • Model Store
  • Documentation

    • Migrate online document into organization repository
    • Optimize and integrate online documents
    • Resolve Sphinx-related problems
    • Make developer-related documents
      • Git operation
      • Local deployment
      • Docker deployment
    • Make framework-related design drawing
      • System architecture diagram
      • Customized ML pipeline diagram
      • Design pattern diagram
      • Workflow class diagram
      • Storage mechanism diagram
    • Algorithm & Function Progress Form
  • Optimization

    • Abstract core functionalities to simplify the pipeline
      • Refactor the application functions of the four modes
      • Build up Mixin class
    • Regulate CI for Python and TypeScript
    • Regulate environment variable management
    • Fix statistical analysis bug
    • Improve World Map Projection by using dynamic importing
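
Dynamic importing, as mentioned in the last item, usually means loading a heavy optional dependency only when the feature that needs it is called. The sketch below shows that pattern with `importlib`. The names `lazy_import` and `project_points` are hypothetical, and the standard-library `math` module stands in for whatever mapping library the project actually uses.

```python
import importlib


def lazy_import(module_name):
    """Import a module only when it is first needed; return None if unavailable."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def project_points(points, backend="math"):
    # `backend` stands in for a heavy mapping library; "math" keeps the sketch runnable.
    module = lazy_import(backend)
    if module is None:
        raise RuntimeError(f"optional dependency {backend!r} is not installed")
    # Placeholder transform: degrees to radians, not a real map projection.
    return [(module.radians(lon), module.radians(lat)) for lon, lat in points]
```

With this layout, users who never request a map do not pay the import cost, and a missing optional dependency surfaces as a clear error only when the feature is used.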