Commit

Merge pull request #378 from ZJUEarthData/web
perf: improve meanshift-related code and the common functions 'cluster center' and 'cluster label'.
SanyHe committed Aug 31, 2024
2 parents 5a78ac9 + 14c5d15 commit 6374180
Showing 9 changed files with 137 additions and 84 deletions.
9 changes: 6 additions & 3 deletions README.md
@@ -307,18 +307,20 @@ The whole package is under construction and the documentation is progressively e
+ Mengqi Gao (China University of Geosciences, Beijing, China)
+ Chengtu Li(Trenki, Henan Polytechnic University, Beijing, China)
+ Yucheng Yan (Andy, University of Sydney, Australia)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Panyan Weng (The University of Sydney, Australia)

**Product Group**:

+ Yang Lyu (Daisy, Zhejiang University, China)
+ Bailun Jiang (EPSI / Lille University, France)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Panyan Weng (The University of Sydney, Australia)
+ Siqi Yao (Clara, Dongguan University of Technology, China)
+ Zhelan Lin(Lan, Fuzhou University, China)
+ ShuYi Li (Communication University Of China, Beijing, China)
+ Junbo Wang (China University Of Geosciences, Beijing, China)
+ Haibin Wang(Watson, University of Sydney, Australia)
+ Guoqiang Qiu(Elsen, Fuzhou University, China)
+ Yating Dong (Yetta,Dongguan University of Technology,China)
+ Haibin Lai (Michael, Southern University of Science and Technology, China)

## Join Us :)

@@ -398,3 +400,4 @@ More Videos will be recorded soon.
+ Zhenglin Xu (Garry, Jilin University, China)
+ Jianing Wang (National University of Singapore, Singapore)
+ Junchi Liao(Roceda, University of Electronic Science and Technology of China, China)
+ Bailun Jiang (EPSI / Lille University, France)
40 changes: 39 additions & 1 deletion docs/source/For Developer/Add New Model To Framework.md
@@ -865,8 +865,46 @@ Only for those algorithms, they belong to either regression or classification an

## 5. Test Model Workflow Class

After the model workflow class is added, you can test it by running the command `python start_cli_pipeline.py` in the terminal. If the test reports an error, you need to debug and fix it. If there is no error, it can be submitted.

After the model workflow class is added, you can test it by running the command `python start_cli_pipeline.py` in the terminal.

If the pipeline runs successfully, verify the correctness of your modification with three checks:

(1) Check whether the output info in the console is what you expect.

<img width="1347" alt="image" src="https://github.com/user-attachments/assets/6530bd18-d196-4829-997d-08222194a34f">

(2) Check whether the artifacts produced (e.g., datasets, images) are saved properly in the `geopi_output` folder and whether their content is what you expect. The path printed in the console tells you where the `geopi_output` folder is.

<img width="1400" alt="image" src="https://github.com/user-attachments/assets/773e5b61-c45e-4c18-8747-cd2753831f6b">

(3) Check whether the same artifacts (e.g., datasets, images) are saved properly in MLflow. Run the command `mlflow ui --backend-store-uri file:/path/to/geopi_tracking --port PORT_NUMBER` to launch the MLflow web interface, then open the link `http://127.0.0.1:PORT_NUMBER` in your browser. Click the corresponding experiment and the run you created, and check the artifacts accordingly.

<img width="1353" alt="image" src="https://github.com/user-attachments/assets/3ddda308-00e1-4a40-a392-91e0440a5d26">

<img width="1394" alt="image" src="https://github.com/user-attachments/assets/56c4d1b6-2458-4a93-9956-0993d3ffa058">

<img width="1288" alt="image" src="https://github.com/user-attachments/assets/e3ebebdb-2910-4826-a4dd-19a079be0b0d">

For more details on how to use MLflow, you can watch the video below:

MLflow UI user guide - Geochemistry π v0.5.0 [[Bilibili]](https://b23.tv/CW5Rjmo) | [[YouTube]](https://www.youtube.com/watch?v=Yu1nzNeLfRY)
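
Check (2) above can also be done from a script rather than by browsing the folder by hand. The following is a minimal sketch, not part of the framework: `summarize_artifacts` and `find_empty_files` are hypothetical helper names, and a temporary directory stands in for the real `geopi_output` folder. It lists every file with its size and flags empty files, which often indicate an artifact that was created but never written.

```python
from pathlib import Path
import tempfile


def summarize_artifacts(output_dir: Path) -> dict:
    """Map each file path (relative to output_dir) to its size in bytes."""
    return {
        p.relative_to(output_dir).as_posix(): p.stat().st_size
        for p in sorted(output_dir.rglob("*"))
        if p.is_file()
    }


def find_empty_files(summary: dict) -> list:
    """Return the names of files whose size is zero."""
    return [name for name, size in summary.items() if size == 0]


# Demo on a throwaway directory standing in for geopi_output (assumption).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data").mkdir()
    (root / "data" / "result.csv").write_bytes(b"a,b\n1,2\n")
    (root / "plot.png").write_bytes(b"")  # simulate a broken, empty image

    summary = summarize_artifacts(root)
    empty = find_empty_files(summary)

    for name, size in summary.items():
        print(f"{size:>8} bytes  {name}")
    print("Empty files:", empty)
```

To use it on a real run, replace the temporary directory with the `geopi_output` path shown in the console. A non-empty file can still have wrong content, so this only complements opening the artifacts yourself.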

If the pipeline fails to run, you need to debug and fix it. The recommended way is **breakpoint debugging**. In VSCode, open the file `start_cli_pipeline.py` and click the button VSCode provides.

<img width="1396" alt="image" src="https://github.com/user-attachments/assets/3eb2082b-1dca-48cd-9897-089355ff566a">

You can search online for the benefits of **breakpoint debugging**. There are two major ones:

(1) Look up the values of variables in the current stack frame directly.

<img width="1396" alt="image" src="https://github.com/user-attachments/assets/91b45e99-1123-40bc-8190-0f982be695a8">

(2) Create a temporary watch expression (code to debug) and evaluate it in the current stack frame.

<img width="1397" alt="image" src="https://github.com/user-attachments/assets/232f59bd-e48d-40e9-9174-7ebe4e8d2fb2">

After fixing the problem, don't forget to verify the produced artifacts with the three checks described above.
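
If you are not working in VSCode, Python's built-in `breakpoint()` gives the same two benefits through the `pdb` prompt. The snippet below is only an illustrative sketch: `count_labels` is a toy stand-in for a step in a model workflow, and the `GEOPI_DEBUG` environment variable is a hypothetical switch, not something the framework defines.

```python
import os


def count_labels(labels):
    """Toy stand-in for a step inside a model workflow."""
    counts = {}
    for label in labels:
        # Pause only when the hypothetical GEOPI_DEBUG variable is set to "1".
        if os.environ.get("GEOPI_DEBUG") == "1":
            # Opens the pdb prompt in the current stack frame.
            # `p counts` inspects a variable, `p len(counts)` evaluates an
            # expression, `n` steps to the next line, `c` continues.
            breakpoint()
        counts[label] = counts.get(label, 0) + 1
    return counts


result = count_labels([0, 1, 1, 2])
print(result)
```

Without the variable set, the script runs straight through. With it set to `1`, execution stops inside the loop on each iteration, where you can inspect variables and evaluate expressions much like the VSCode variables and watch panels.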

## 6. Completed Pull Request

10 changes: 6 additions & 4 deletions docs/source/Home/Introduction.md
@@ -308,18 +308,20 @@ The whole package is under construction and the documentation is progressively e
+ Mengqi Gao (China University of Geosciences, Beijing, China)
+ Chengtu Li(Trenki, Henan Polytechnic University, Beijing, China)
+ Yucheng Yan (Andy, University of Sydney, Australia)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Panyan Weng (The University of Sydney, Australia)

**Product Group**:

+ Yang Lyu (Daisy, Zhejiang University, China)
+ Bailun Jiang (EPSI / Lille University, France)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Junchi Liao(Roceda, University of Electronic Science and Technology of China, China)
+ Panyan Weng (The University of Sydney, Australia)
+ Siqi Yao (Clara, Dongguan University of Technology, China)
+ Zhelan Lin(Lan, Fuzhou University, China)
+ ShuYi Li (Communication University Of China, Beijing, China)
+ Junbo Wang (China University Of Geosciences, Beijing, China)
+ Haibin Wang(Watson, University of Sydney, Australia)
+ Guoqiang Qiu(Elsen, Fuzhou University, China)
+ Yating Dong (Yetta,Dongguan University of Technology,China)
+ Haibin Lai (Michael, Southern University of Science and Technology, China)

## Join Us :)

@@ -1,5 +1,5 @@
geochemistrypi.data\_mining.model.func.algo\_anomalydetection package
======================================================================
=====================================================================

Module contents
---------------
7 changes: 3 additions & 4 deletions geochemistrypi/data_mining/model/_base.py
@@ -377,11 +377,10 @@ class ClusteringMetricsMixin:
     """Mixin class for clustering metrics."""

     @staticmethod
-    def _get_num_clusters(func_name: str, algorithm_name: str, trained_model: object, store_path: str) -> None:
-        """Get and log the number of clusters."""
-        labels = trained_model.labels_
-        num_clusters = len(np.unique(labels))
+    def _get_num_clusters(labels: pd.Series, func_name: str, algorithm_name: str, store_path: str) -> None:
+        """Get and log the number of clusters. It is only used in those algorithms which don't allow to set the number of cluster in advance."""
         print(f"-----* {func_name} *-----")
+        num_clusters = len(np.unique(labels.to_numpy()))
         print(f"{func_name}: {num_clusters}")
         num_clusters_dict = {f"{func_name}": num_clusters}
         mlflow.log_metrics(num_clusters_dict)
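
The core of the change is that the cluster count is now derived from a `pd.Series` of labels passed in by the caller, rather than read from `trained_model.labels_`. A minimal standalone sketch of that counting step is shown below. The MLflow logging is left out, `count_clusters` is a hypothetical name, and the label values are made up for illustration.

```python
import numpy as np
import pandas as pd


def count_clusters(labels: pd.Series) -> int:
    """Return the number of distinct cluster labels in the series."""
    return len(np.unique(labels.to_numpy()))


# Made-up labels, shaped like what a clustering algorithm such as MeanShift assigns.
labels = pd.Series([0, 0, 1, 2, 2, 1, 0], name="clustering result")
num_clusters = count_clusters(labels)
print(f"Num of Clusters: {num_clusters}")
```

Note that `np.unique` counts every distinct value, so for an algorithm that marks noise with a special label (scikit-learn's DBSCAN uses `-1`), the noise label would be counted as one more cluster unless it is filtered out first.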