Add a vignette about equivalence with tidyverse #183
Codecov Report
@@ Coverage Diff @@
## main #183 +/- ##
=======================================
Coverage 83.79% 83.79%
=======================================
Files 52 52
Lines 3196 3196
=======================================
Hits 2678 2678
Misses 518 518
vignettes/basic_data_wrangling.Rmd
one of its main features is that it has very few dependencies: `stats` and `utils`
(included in base R) and `insight`, which is the core package of the easystats
ecosystem. One drawback of this approach is that not all features of the
`tidyverse` packages are not supported and we will have to rely on base R, or on
This is ambiguous, because we don't rely on poorman anywhere in the code.
vignettes/basic_data_wrangling.Rmd
`data_select()` is the equivalent of `dplyr::select()`. The main difference
between these two functions is that `data_select()` uses two arguments (`select`
and `exclude`) and requires quoted column names, while `dplyr::select()` accepts
No, there are many possibilities, including unquoted column names, but only for a single variable, I think (because we don't use `...` to capture comma-separated, unquoted names). But maybe this is too much to explain here, and for now, we can keep it like this.
Passing thought, maybe we should add in our style guide in easystats the fact that we prefer quoted names rather than non-standard eval (in easystats, as opposed to the tidyverse), because it's easier to program with, because it's flexible, it makes more sense to distinguish names from objects, it's a much more standard programming good practice, and that in general NSE is an eldritch invention
(also, quoted names are much easier to work with for debugging)
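To make the comparison in this thread concrete, here is a minimal sketch of selecting columns with quoted names in `data_select()` versus unquoted names in `dplyr::select()`. It assumes current releases of `datawizard` and `dplyr` are installed; the `mtcars` dataset and the variable name `wanted_cols` are illustrative choices, not taken from the vignette.

```r
# datawizard: quoted column names, passed through `select` or `exclude`
dw_keep <- datawizard::data_select(mtcars, select = c("mpg", "cyl"))
dw_drop <- datawizard::data_select(mtcars, exclude = c("mpg", "cyl"))

# tidyverse: unquoted names captured through `...`
tv_keep <- dplyr::select(mtcars, mpg, cyl)
tv_drop <- dplyr::select(mtcars, -mpg, -cyl)

# Programming with column names held in a variable
# (`wanted_cols` is a hypothetical name used only for this sketch)
wanted_cols <- c("mpg", "cyl")
dw_prog <- datawizard::data_select(mtcars, select = wanted_cols)
tv_prog <- dplyr::select(mtcars, dplyr::all_of(wanted_cols))
```

The last pair illustrates the point about programming with quoted names: a character vector can be handed to `select` directly, whereas the tidyverse call needs a helper such as `all_of()` to mark the variable as holding names.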
Nice draft! I like the structure of this vignette very much!
This is a fantastic start, @etiennebacher! Thanks for working on this. Lovely use of pandoc's fenced div! I am going to make some additions and add a few edits, but overall this is a good structure. I think, once finalized, we can also melt this down a bit and convert it to a JOSS paper (#59).
Thanks @IndrajeetPatil and @strengejacke! Before finishing this vignette, we need to improve
I have added a couple of TODOs for this in the vignette. I will keep updating them as I think about more.
@etiennebacher I am changing the targeted issue for this PR to #130 because that's what the PR is targeting. For #90, I had a very different kind of vignette in mind.
I see, I think a good example of a messy dataset could be to download a CSV file from the World Bank catalogue (not using the package
An example:
colors <- data.frame(
  group = c("a", "b", "c"),
  color = c("black", "forestgreen", "lightblue")
)
scale_color_manual(values = data_extract(colors, "color", name = group))
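As a hedged sketch of what the example above hands to `scale_color_manual()`: with a single selected column and a `name` argument, `data_extract()` should return a vector of colors named by group. The snippet below assumes a current `datawizard` release; the base R line with `setNames()` is added only for comparison and is not part of the original comment.

```r
colors <- data.frame(
  group = c("a", "b", "c"),
  color = c("black", "forestgreen", "lightblue"),
  stringsAsFactors = FALSE
)

# datawizard: pull `color` as a vector, named by the `group` column
dw_values <- datawizard::data_extract(colors, "color", name = "group")

# Base R construction of the same named vector, for comparison
base_values <- stats::setNames(colors$color, colors$group)
```

A named vector like this is the shape `scale_color_manual(values = ...)` uses to map each group level to its color.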
I think it's now ok to review this but it should be merged after #189 because the vignette uses some args of Also, I prefer using
I don't have strong opinions on that, but I personally prefer showcasing data wrangling functions having the
I also feel that we should remove the
@etiennebacher I'd like to merge this soon. I think most of the We can keep updating it as we add more tidyverse equivalents (e.g.
@IndrajeetPatil if my last commit doesn't break anything, I agree to merge this and update it later, I think it's ready
Thank you very much, Étienne!
Closes #130
I put datawizard and tidyverse chunks side by side to easily compare the syntax, for example:
(I wanted to give each chunk a different background color to distinguish between the two syntaxes, but apparently this is not yet supported by pkgdown (r-lib/downlit#149), so I added comments in each chunk instead.)