Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to WQX ref (ft, in, m) and TADA (mass/mass to mass/volume) ref #522

Open
wants to merge 26 commits into
base: develop
Choose a base branch
from

Conversation

wokenny13
Copy link
Collaborator

Kevin has updated the WQX ref to reflect the changes. We decided to keep target units to inches, and fixing those 3 rows for the conversion factors.

I've added additional rows in TADA.PriorityCharConvertRef for common mass/mass to mass/volume conversions.

I ran a random testing TADA_DataRetrieval and found a #/100mL unit with no conversions to CFU/100mL, and added on this as a row to TADA.PriorityCharConvertRef in the csv.

Checking the conversion factors for reasonableness will be good for review, and if these additional rows make sense.

Copy link
Collaborator

@hillarymarler hillarymarler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add conversions to ug/L and mg/L for other mass/mass units. Once you've added those, I will approve this PR as it is working well, just needs a few additions. Nice work!

ug/kg,8/29/2024 12:00,mg/L,1.00E-03,0
#/100mL,8/29/2024 12:00,CFU/100ML,1.00E+00,0
%,8/29/2024 12:00,ug/L,1.00E-03,0
%,8/29/2024 12:00,mg/L,1.00E-06,0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In test data sets, I am still seeing a number of mass/mass units that are not converting because conversion factors to common mass/volume units are not included in TADAPriorityCharConvertRef.

For example: ug/g, mg/kg, ng/g, ng/kg

I think looking at the "Concentration Percentage" units in the WQX MeasureUnit.csv can help generate a list of the missing ones. It would be great to include conversion factors for ug/L and mg/L (as you already have for fg/kg, g/g, ug/kg, etc.)

hillarymarler and others added 7 commits September 4, 2024 15:00
Rank characteristics in pair ref by # of results in data frame
Added comments in code
…-data-t-ph-hardness-dependent-criteria

392 create function to find paired data t ph hardness dependent criteria
- Updates reference tables and example data
- Removes TADA.WQXTargetUnit and adds TADA.Target.ResultMeasure.MeasureUnitCode to required cols
- Update ordering of columns and comparable data ID when transform = FALSE
- Adds TADA.WQXUnitConversionCoefficient and TADA.WQXResultUnitConversion to column ordering (only relevant when transform = FALSE)
- Update documentation
- Allows function to run without autoclean being run first
- found that PPM was "Accepted" unit for "Depth, Bottom". Changed this to "Rejected" in WQX validation table. Reviewed many other depth result units as well as part of this update.
- Updated example data
@cristinamullin
Copy link
Collaborator

cristinamullin commented Sep 6, 2024

The unit conversions (output from TADA_ConvertResultUnits) and the TADA version of the WQX unit reference table look good now. However, the output df from TADA_CreateUnitRef still does not reflect the correct conversion factor (TADA.WQXUnitConversionFactor). See test2 output below.

I am sure this can be fixed (ran out of time today), but I was also thinking that it might not be necessary to include the conversion factors and coefficients in the table produced by TADA_CreateUnitRef, since that information is available to users if they want it via TADA_ConvertResultUnits when transform = FALSE. Either way, it looks like TADA_CreateUnitRef needs additional review/updating to finish off this issue.

image

`
test <- TADA_DataRetrieval(
startDate = "2020-01-01",
endDate = "null",
countrycode = "null",
countycode = "null",
huc = "null",
siteid = "null",
siteType = "null",
characteristicName = "Depth, bottom",
characteristicType = "null",
sampleMedia = "null",
statecode = "null",
organization = "null",
project = "null",
providers = "null",
applyautoclean = FALSE
)

test2 <- TADA_CreateUnitRef(test)

test3 <- TADA_ConvertResultUnits(test, transform = FALSE)

test4 <- TADA_ConvertResultUnits(test)
`

Note: I addressed a few other issues as part of the PR:

  • Updated TADA_FlagResultUnit so it can now run without autoclean being run first
  • I found that PPM was "Accepted" unit for "Depth, Bottom". Changed this to "Rejected" in WQX validation table. Reviewed many other depth result units as well & updated the WQX validation table for those as part of this update.
  • Updated reference tables and example data again
  • Removed TADA.WQXTargetUnit and added TADA.Target.ResultMeasure.MeasureUnitCode to required cols
  • Updated ordering of columns and comparable data ID when transform = FALSE
  • Adds TADA.WQXUnitConversionCoefficient and TADA.WQXResultUnitConversion to column ordering (only relevant when transform = FALSE)
  • Updated documentation

@hillarymarler
Copy link
Collaborator

@cristinamullin - the reason I had left the units in the TADA_CreateUnitRef output was for cases where a user might want to convert to a unit that we don't have in the unit refs. But if that scenario is not going to happen (or extremely unlikely), then I agree we can remove those columns as users can access the conversion info through TADA_ConvertResultUnits when transform = FALSE.

@wokenny13 - do you want to work on the updates needed for TADA_CreateUnitRef? Or would you like me to do that?

@wokenny13
Copy link
Collaborator Author

wokenny13 commented Sep 9, 2024

test <- TADA_DataRetrieval(
startDate = "2020-01-01",
endDate = "null",
countrycode = "null",
countycode = "null",
huc = "null",
siteid = "null",
siteType = "null",
characteristicName = "Depth, bottom",
characteristicType = "null",
sampleMedia = "null",
statecode = "null",
organization = "null",
project = "null",
providers = "null",
applyautoclean = FALSE
)

@hillarymarler I can work on making these updates. I will let you know if I have any questions as I catch up on these comments.

Additional rows for other mass/mass to mass/volume

Removed the two columns for TADA.WQXUnitConversionFactor and
TADA.WQXUnitConversionCoefficient in the function TADA_CreateUnitRef
Changed order of columns to make it more clear on which columns are using the conversion factor
The test_that should re-run the TADA_RandomTestingData pull up to 10 times for testdat1. It will stop and determine if dimensions are the same. if testdat2 is able to run without error, that is if paired parameters in the testdat1 are in its dataframe.
@wokenny13 wokenny13 linked an issue Sep 17, 2024 that may be closed by this pull request
updated comments and styler
@wokenny13
Copy link
Collaborator Author

@hillarymarler @cristinamullin

  • Updated list of mass/mass to mass/volume TADAPriorityCharConvertRef.csv units to be converted to mg/L and UG/L.
  • Rearranged columns of output for TADA_CreateUnitRef. Keeping in the Conversion Factor and Coefficient but hope to make it clear which columns are being referenced with this rearrangement.
  • Included the updates to test-that CriteriaComparison.R in this pull request - see comments in Improve test for TADA_PairForCriteriaCalc #525

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants