
I'm currently working on writing a large file that will reference several ".R" code scripts to read a number of saved datasets to create a series of plots. I would like to avoid reading the data into my environment to minimize clutter (there are a lot of files).

For this reason, I would like to reference a dataset and pipe it to a "chunk" of ggplot options to create my desired aesthetics, etc. However, it appears that ggplot "forgets" the piped dataset after the first layer and I would like to find out if there is a way to get around this behavior.

There is a similar question here on SO (Referencing piped dataset in ggplot layers for subsetting) but the reference to the dataset occurs in a "geom" step with an explicit call for a data object. Oddly, I have no issues using the suggestions from this other question but they don't work for my particular problem.

In particular, I need to reference the dataset in order to tell ggplot how many axis ticks to create (I would like the data to inform this step so that a new axis will be created if the data changes in the future, e.g. more "years" are added). As an example:

library(ggplot2)
library(magrittr)

dataset <- data.frame(year = c("2014", "2015", "2016"), measure = c(10, 15, 20))

dataset %>%
  ggplot(aes(x = year, y = measure)) +
  geom_bar(stat = "identity") +
  scale_x_continuous(breaks = unique(.$year))

Gives the error

Error in unique(.$year) : object '.' not found

The pipe works fine in the first step (explicitly, 'data = ., ...'), but referencing the object that '.' is a placeholder for doesn't work in "downstream" layers.

Frequently I can use curly bracketing {} to circumvent this issue (for reasons I don't fully understand), but this doesn't work either:

dataset %>%
  ggplot(aes(x = year, y = measure)) +
  geom_bar(stat = "identity") +
  {scale_x_continuous(breaks = unique(.$year))}

I suspect I may be too new to magrittr and ggplot to fully understand why "%>%" and "+" don't seem to play nicely together, but was hoping someone may be able to point me in the right direction. Thanks!

aosmith
G. Vece
  • Simply remove the `scale_x_continuous` call; you are already addressing the issue with your setup of `x = year`. Add a new year and measure to test this. Further, you are calling for a continuous scale for x but providing year as a factor, so `scale_x_discrete` would be better suited. – B Williams Jul 19 '17 at 18:58
  • I agree with @BWilliams, but just in case this is a dummy example and you have a reasonable use case: you were on the right track with curly braces `{...}`, but wrap them around the entire ggplot call, i.e. `{ggplot(...) + ... + scale*(...)}`. Then you can use the `.` dot pronoun to reference the contents of the pipe anywhere in the call (and indeed, you must do so in your first call, `ggplot(data = ., aes(...))`). – Brian Jul 19 '17 at 19:09
  • @B Williams: I see what you are saying, but I would like to explicitly impose the number of ticks, since ggplot's defaults are not always preferable (ex: when only a few years are present, creates "minor ticks" at years "2013.5", "2014.5", etc. that I would like to remove). @Brian This seems to work perfectly! I'm still not sure why exactly the entire call has to be wrapped and, in addition, why the first use of the "." has to appear explicitly in the code when usually it can be omitted (when it is being used as the first argument). In any case, I can't argue with results! – G. Vece Jul 19 '17 at 19:49
  • Wrapping a set of functions in `{...}` creates an "anonymous function", that is, one that isn't named and stored. It's equivalent to `function(.) {somefuns(.)}`. Because of that, the pipe doesn't automatically just stick the result of the pipe into the first argument slot; you have to call it explicitly. – Brian Jul 19 '17 at 21:40
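
For anyone landing here later, a minimal sketch of the whole-call wrapping suggested in the comments above. Since `%>%` binds more tightly than `+`, the unwrapped version pipes `dataset` only into `ggplot()`, and the later `scale_x_continuous()` term is evaluated outside the pipe, where `.` does not exist. Two things here are assumptions rather than part of the question: `year` is stored as numeric (the question stores it as character, which a continuous scale would reject), and the plot is assigned to a hypothetical name `p` so it can be inspected or printed.

```r
library(ggplot2)
library(magrittr)

# Assumption: year is numeric here so that a continuous x scale applies.
dataset <- data.frame(year = c(2014, 2015, 2016), measure = c(10, 15, 20))

# Braces around the entire ggplot call: magrittr binds `.` to `dataset`
# for everything inside, but no longer inserts it as the first argument,
# so `.` has to be passed to ggplot() explicitly.
p <- dataset %>% {
  ggplot(., aes(x = year, y = measure)) +
    geom_bar(stat = "identity") +
    scale_x_continuous(breaks = unique(.$year))
}

# Draw the plot (needed when the result is assigned or run from a script).
print(p)
```

If more years are added to `dataset`, the breaks follow along without the data ever being referenced by name inside the plotting code.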
