Allow `n()` in `data_modify()` #535

strengejacke · 2024-08-27T15:18:59Z

@etiennebacher Maybe, if checks pass, you can merge this PR for the planned release? Just a minor code change.

etiennebacher

Thanks! I think there are cases where this implementation would fail, e.g. if someone passes 1:fun(). Can you check for this?

You have more use cases than me with datawizard, but it seems that we are gradually trying to replace all functions from dplyr, which I thought was not the main objective of datawizard. I don't know if this is needed internally in another package, but if not, do we actually need to implement this instead of using dplyr?

Also, if you want to add more bug fixes/features, I don't plan on releasing before the end of the week.

strengejacke · 2024-08-27T18:55:15Z

> but if not, do we actually need to implement this

You're right, but we had this already for data_summary(), because it's a more often used function, and I thought it should then also work for data_modify(). That's the main reason why I added this. And actually, I don't use dplyr and tidyr anymore, since 99.5% of my tasks can be done with datawizard. :-)

etiennebacher

> But I have no good idea for a use-case for fun()?

I think you can just check that the error message of data_modify(mtcars, b = fun()) is comprehensible.

But before that, please check my other comment.

etiennebacher · 2024-08-28T19:11:10Z

> You're right, but we had this already for data_summary(), because it's a more often used function, and I thought it should then also work for data_modify().

I'm also not a fan of having this in data_summary(). I think it adds confusion for a tidyverse user that could think "why is n() supported but not first() or last()?" and basically we don't have a good answer to that. To me the tidyverse and datawizard are more complements than substitutes: datawizard provides nice statistical functions that can be included in a pipeline, and it also provides dependency-free functions that we need internally in other packages anyway. Adding support for n() doesn't seem to meet any of those cases.

strengejacke · 2024-08-28T21:32:17Z

> > But I have no good idea for a use-case for fun()?
>
> I think you can just check that the error message of data_modify(mtcars, b = fun()) is comprehensible.
>
> But before that, please check my other comment.

But does fun() necessarily result in an error? I'm not sure whether there are use cases of user-defined functions?

strengejacke · 2024-08-28T21:37:27Z

> datawizard provides nice statistical functions that can be included in a pipeline, and it also provides dependency-free functions that we need internally in other packages anyway. Adding support for n() doesn't seem to meet any of those cases.

Yes, that was certainly the initial idea of the package, but a) there are probably other people who use datawizard instead of dplyr/tidyr (e.g., @DominiqueMakowski, at least in the past, never really liked the NSE of tidyverse and prefers working with strings to identify variables etc., same as me) and b) maybe there will be other packages that prefer using datawizard for its minimal dependencies, and then it could be nice to have some features that are also supported by other packages. I personally don't mind implementing "common" features, but what do the other @easystats/maintainers think? Should we have a "stopping rule" for implementing features?

etiennebacher · 2024-08-29T11:03:37Z

> But does fun() necessarily result in an error? I'm not sure whether there are use cases of user-defined functions?

I just meant that we should have a test for the error below:

```r
library(datawizard)

head(mtcars) |>
  data_modify(x = n())
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb x
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 6
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 6
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 6
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 6
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 6
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 6

head(mtcars) |>
  data_modify(x = fun())
#> Error: There was an error in the first expression. Attempt to apply
#>   non-function. Possibly misspelled or not yet defined?
```
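A test along these lines might look like the sketch below. It is only an illustration, not the test that was added in this PR: it assumes testthat and datawizard are installed, the test description is made up, and it only checks that an error is raised, without pinning the exact wording of the message.

```r
# Hypothetical regression test, not taken from the PR itself.
library(testthat)
library(datawizard)

test_that("data_modify() errors when an undefined function is called", {
  # `fun` is deliberately left undefined, as in the reprex above.
  expect_error(data_modify(head(mtcars), x = fun()))
})
```

If the wording of the message should be checked as well, wrapping the call in `expect_snapshot(error = TRUE)` would be one option, at the cost of having to update the snapshot whenever the message changes.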

strengejacke · 2024-09-01T08:44:02Z

@easystats/core-team Any opinions on

#535 (comment)
#535 (comment)

?

rempsyc · 2024-09-11T11:57:32Z

> I personally don't mind implementing "common" features, but what do the other easystats/maintainers think?

Personally, I think the harm in adding extra features is limited (although I can also see Etienne's point about creating confusion). I think the potential benefits of new features outweigh the cons (although they also add to future maintenance and development), so as long as you're motivated to do the associated work, Daniel (and I think you are), I'd say go ahead 🥸

Sometimes in my packages I might want to internally rely on datawizard/easystats and I need dplyr just for one or two functions like n(), then it might be a plus that I have the option to rely on datawizard instead of adding an additional dependency (dplyr) or of rewriting the code myself.

strengejacke requested a review from etiennebacher August 27, 2024 15:20

strengejacke added 2 commits August 27, 2024 17:20

Allow n() in data_modify()

ea3ab59

lintr, styler

6698eba

etiennebacher requested changes Aug 27, 2024

strengejacke and others added 3 commits August 27, 2024 20:22

Update NEWS.md

bae3ea5

Co-authored-by: Etienne Bacher

Update R/data_modify.R

d91a375

Co-authored-by: Etienne Bacher

comments

db57cb6

strengejacke added 3 commits August 27, 2024 21:22

fix test

c700a3b

update rd

6a851aa

modify error msg

df613b8

etiennebacher requested changes Aug 28, 2024

error on invalid function

40cae97

Merge branch 'main' into n_in_data_modify

96d4a71

strengejacke mentioned this pull request Sep 1, 2024

Put vignettes in .Rbuildignore #534

Merged

Merge branch 'main' into n_in_data_modify

6f83f33

strengejacke added 2 commits September 13, 2024 09:54

Merge branch 'main' into n_in_data_modify

3a86c48

Merge branch 'main' into n_in_data_modify

a0dadf8
