Add random seed replicates to predict step #131

jds485 · 2022-08-22T16:59:17Z

Noted in #125, here:

Random draws of samples into the training and testing sets have resulted in different outcomes for training using all available gages vs. using gages only within the region for which testing will be completed. Adding several replicates and comparing the resulting distribution of RMSEs is one way around this problem.

I think the replicates should use the same selected attributes (from correlation and Boruta screening) and the same hyperparameters (to reduce computation time). The only difference between replicates would then be the random draw of samples into the training and testing sets.
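A minimal sketch of the replicate idea, in Python with the standard library only. It is not the pipeline's code: the function names, the synthetic data, and the one-variable least-squares model (a stand-in for the real model with fixed attributes and hyperparameters) are all assumptions made for illustration. Only the seed that drives the train/test split changes between replicates.

```python
import math
import random
import statistics


def split_indices(n, test_fraction, seed):
    """Shuffle 0..n-1 with a seeded generator and split into (train, test)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for the real model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(statistics.fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def replicate_rmses(x, y, seeds, test_fraction=0.2):
    """Refit on a new random split per seed; return {seed: test RMSE}."""
    out = {}
    for seed in seeds:
        train, test = split_indices(len(x), test_fraction, seed)
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        pred = [a + b * x[i] for i in test]
        out[seed] = rmse([y[i] for i in test], pred)
    return out


# Synthetic data for illustration only (not gage data).
data_rng = random.Random(0)
x = [data_rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + data_rng.gauss(0, 1) for xi in x]

rmses = replicate_rmses(x, y, seeds=[1, 2, 3, 4, 5])
values = list(rmses.values())
print("mean test RMSE:", statistics.fmean(values))
print("SD of test RMSE:", statistics.stdev(values))
```

Storing the RMSE by seed keeps each replicate reproducible, and the resulting distributions (for example, all gages vs. regional gages only) can be compared directly.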

With replicate models, we should edit:

- Feature importance plots to display the average value over all replicates (can add error bars).
- Model RMSE comparison barplots (e.g., p6_compare_RMSE_RF_png) to show average test error and average validation error over all replicates (can add error bars).
- Predicted vs. observed scatterplots to show distributions over all replicates (not sure how best to show this; error bars would be too crowded).
- Residual maps to be based on the average residual over replicates.
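The summaries listed above could be computed along these lines. This is a hedged Python sketch, not the project's code: the attribute names, gage IDs, and numbers are made up, and the function names are hypothetical. It assumes each replicate yields one dictionary of feature importances and one dictionary of test residuals, and that a gage appears in the residuals only for replicates in which it was drawn into the test set.

```python
import statistics


def summarize_by_key(replicates):
    """Mean and sample SD of each key across a list of {key: value} dicts."""
    summary = {}
    for key in replicates[0]:
        vals = [rep[key] for rep in replicates]
        sd = statistics.stdev(vals) if len(vals) > 1 else 0.0
        summary[key] = {"mean": statistics.fmean(vals), "sd": sd}
    return summary


def mean_residual_by_gage(residuals_by_replicate):
    """Average each gage's residual over the replicates where it was tested."""
    collected = {}
    for rep in residuals_by_replicate:
        for gage, resid in rep.items():
            collected.setdefault(gage, []).append(resid)
    return {gage: statistics.fmean(vals) for gage, vals in collected.items()}


# Made-up feature importances from three replicates (hypothetical attributes).
importance_by_replicate = [
    {"drainage_area": 0.4, "mean_precip": 0.35},
    {"drainage_area": 0.5, "mean_precip": 0.30},
    {"drainage_area": 0.6, "mean_precip": 0.25},
]
importance_summary = summarize_by_key(importance_by_replicate)

# Made-up test residuals; gage "B" was in the test set for one replicate only.
residuals_by_replicate = [
    {"A": 1.0, "B": -1.0},
    {"A": 3.0},
]
residual_means = mean_residual_by_gage(residuals_by_replicate)

print(importance_summary)
print(residual_means)
```

The mean gives the bar height or map value and the SD gives the error bar. The same `summarize_by_key` helper would apply to test and validation RMSE if those are stored per replicate in the same dictionary form.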


jds485 added the MetricsPaper label Nov 1, 2022

jds485 self-assigned this Nov 1, 2022

jds485 mentioned this issue Nov 21, 2022

Add significance tests for regional vs. national models #166

Open

jds485 assigned cstillwellusgs May 8, 2023
