apply-functions-to-columns solution #5

Open
dpastoor opened this issue Mar 19, 2015 · 7 comments

@dpastoor

(refers to https://github.com/noamross/zero-dependency-problems-r/blob/master/apply-functions-to-columns.md)

seems like a perfect chance to use mapply

This Stack Overflow question discusses how to use mapply like zip in Python

http://stackoverflow.com/questions/9281323/zip-or-enumerate-in-r
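
A minimal sketch of that zip-like pattern, assuming a toy data frame and three placeholder functions (`df`, `funs`, and the column values are made up for illustration, not taken from the linked question). `mapply()` walks the list of functions and the columns in parallel, much like iterating over `zip(funs, columns)` in Python:

```r
# Toy data; the column names and functions are illustrative assumptions.
df <- data.frame(a = c(1, 4, 9), b = c(1.234, 5.678, 9.1011), c = c(-1, 2, -3))
funs <- list(sqrt, round, abs)

# mapply() pairs funs[[1]] with df[[1]], funs[[2]] with df[[2]], and so on.
# SIMPLIFY = FALSE keeps the result as a list with one element per column.
res <- mapply(function(f, col) f(col), funs, df, SIMPLIFY = FALSE)

# Assign the transformed columns back; df[] keeps the data frame and its names.
df[] <- res
```

`SIMPLIFY = FALSE` matters here: without it, `mapply()` may collapse equal-length results into a matrix, which is usually not what you want when the columns have different types.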

@noamross
Owner

OK, a user with Python familiarity might google "python zip in r".

apply() and variants are covered in the SWC R Novice lesson, so perhaps a beginner would think about apply() functions generally, and mapply() is in the reference material. mapply() can also be found via the "See also" part of the lapply() documentation, but not apply().

The 2nd result from searching "apply list of functions to columns of data frame" gets you to a StackOverflow mapply() answer. You need to include "columns" because "r apply list of functions to data frame" doesn't get you there. In general it seems that mapply() is mentioned less in various pages that talk about apply(), sapply(), and lapply().

@dpastoor
Author

frankly, even after playing with examples, mapply can be tough to grasp, IMO. The best chance I would think a user would have in figuring this out is understanding

  • functions as first class objects
  • "iterators" in a general sense.

In that case, as long as max speed or terseness wasn't an issue, there are a multitude of 'easy' ways to solve it.

funs <- list(fun1, fun2, fun3)

for (i in seq_along(df)) {
  # call the i-th function on the i-th column
  df[[i]] <- funs[[i]](df[[i]])
}

but for a beginner with a non-coding background, functions as objects in particular is not something I would anticipate someone picking up naturally, and it looks like it isn't covered in the SWC novice lessons (not surprisingly)
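
To make "functions as first class objects" concrete, here is a small sketch (the list `funs` and vector `x` are hypothetical examples). It shows that a function can be stored in a list like any other value, and that pulling it back out to call it needs double brackets:

```r
# Functions are ordinary objects, so they can be stored in a list.
funs <- list(sqrt, abs)

# Single brackets return a sub-list of length 1, which is not callable.
is_fun_single <- is.function(funs[1])

# Double brackets return the function itself.
is_fun_double <- is.function(funs[[1]])

# So the i-th function is called as funs[[i]](...); here that is abs(x).
x <- c(-4, 9)
out <- funs[[2]](x)
```

The `[` versus `[[` distinction is the part most likely to trip up a beginner, since both forms run without error on their own and only differ in what they return.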

@jennybc

jennybc commented Mar 20, 2015

I would push back on this question, because it clearly doesn't scale. Applying unique functions 1 through n to variables 1 through n in a data.frame, for n of any size, is not a general novice problem. I'd ask: "what are you really trying to do?" I wouldn't just start trying to solve the problem, taking it at face value. It smells like someone describing the step, not the goal.

@dpastoor
Author

Completely agree - that was actually my first impression as well.

It's also never really going to be generalizable/flexible (in the sense that you'd need to write a custom function for each new column) and would likely be unusable across data frames.

Though, there are some situations where you do need to leverage this concept (from reading the source a while back, I believe this is how qplot/ggplot apply various bits to each layer).

@jennybc

jennybc commented Mar 20, 2015

@dpastoor You're right, it's an interesting puzzle. But not a legit novice question, I suspect. So @noamross you'll have to decide how to handle this situation, since it will come up a lot I suspect in other questions. Often novices pose rather thorny programming problems but, if you peel back the onion a bit, you can design the question away by, e.g., helping them use a more natural data structure. Lots of questions about iterating over this and that, in particular, go away, once you pick the right way to store or shape the data in R.

@bbolker

bbolker commented Mar 20, 2015

my answer to this one, for a novice, would be "go ahead and use a loop. Why not? Speed is unlikely to be a serious problem and you'll have a lot easier time understanding what you did."

@noamross
Owner

@jennybc Thanks for the meta-response! I agree. At least a few example questions should illustrate this point. I have the advantage of knowing the questioner, so I might go back and see if we can peel back the onion now and see whether this would be a good example. (Though the question is a couple of years old; they have since become quite the expert and will probably be a SWC instructor soon.)

4 participants