TITLE: Writing ggplot2 grobs in a loop to maintain data values
DATE: 2019-10-15
AUTHOR: John L. Godlee
====================================================================


Sometimes I need to make multiple plots in a loop and then arrange
them in a grid.arrange() panel. A recent example was when I had a
bunch of ANCOVA linear models in a list, each with a different
predictor variable. I wanted to create a ggplot() object for each
linear model in a for loop, showing the distribution of points for
the predictor and response variable and slopes for a categorical
grouping variable. I then wanted to put all the plots into a single
grid image and export it, like the image below:

 ![Plots of biomass across different vegetation
clusters](https://johngodlee.xyz/img_full/ggplot_loop/biomass.png)

The code to generate this involves three steps: creating a list to
hold the plot objects, filling the list with a ggplot() call inside
a for loop, where each iteration of the loop is a different linear
model with a different predictor variable, and then passing the
list of plots to do.call("grid.arrange", c(plot_list)) to build the
grid image ready to be exported. To access the data needed for the
x axis of the plot, which changes with each plot/linear model, I
took the variable name from the linear model it is based on with:

   x_var <- rownames(summary(model_list[[i]])$coefficients)[2]

This means the ggplot() for the scatter plots can then be called
like this:

   ggplot() +
       geom_point(aes(x = df[,x_var], y = df[,"bchave_log"]))
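
To make the x_var line concrete, here is a minimal sketch using
inbuilt data. The two models and the mtcars columns are stand-ins I
have made up for illustration, they are not the biomass models from
the original analysis. It assumes the predictor of interest is a
continuous variable and is the first term in each model formula.

```r
# Hypothetical ANCOVA-style models: one continuous predictor each,
# plus a categorical grouping variable (cyl)
df <- mtcars
model_list <- list(
  lm(mpg ~ wt + factor(cyl), data = df),
  lm(mpg ~ hp + factor(cyl), data = df)
)

x_vars <- character(length(model_list))

for (i in seq_along(model_list)) {
  # Row 1 of the coefficient table is "(Intercept)", so row 2 is
  # the first term on the right-hand side of the formula
  x_vars[i] <- rownames(summary(model_list[[i]])$coefficients)[2]
}

print(x_vars)
```

Note that if the first term were a factor, the row name would have a
factor level appended to it and would no longer match a column name
in df.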

You would think this works fine, but ggplot2 does not evaluate the
expressions inside aes() until the plot is built. When the ggplot()
object is called again outside the loop, x_var no longer holds the
value it had in that iteration: it holds whatever the final
iteration left behind, or is nowhere to be found if the loop ran
inside a function. So instead I had to build each ggplot() into a
ggplot grob inside the loop, where x_var still has the right value,
before plotting it outside the loop. Building a grob from a
ggplot() object simply requires wrapping it in
ggplot_gtable(ggplot_build(...)). Below is an example with inbuilt
data to illustrate the point:

   # Packages
   library(ggplot2)
   library(gridExtra)

   # Example data
   df <- mtcars

   # Run a for loop to plot each column against column 1
   plot_list <- list()

   for(i in seq_along(df)){
     plot_list[[i]] <- ggplot() + geom_point(aes(df[,i], df[,1]))
   }

   do.call("grid.arrange", c(plot_list, ncol = 2, nrow = 6))

   ##' All the plots are the same because `df[,i]` is only
   ##' evaluated when the plot is drawn, so it takes the value of
   ##' `i` from the final loop iteration.

   # Fix the ggplot objects in the loop so they maintain variable values
   for(i in seq_along(df)){
     plot_list[[i]] <- ggplot_gtable(ggplot_build(ggplot() +
         geom_point(aes(df[,i], df[,1]))))
   }

   do.call("grid.arrange", c(plot_list, ncol = 2, nrow = 6))

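First, ggplot_build() takes the plot object and produces an object
that can be rendered on its own. This is the step where the aes()
expressions are evaluated, so the data values are fixed from here
on. The result holds a list of data frames, one for each plot layer
(points, lines), and a layout object (named panel in older versions
of ggplot2) containing metadata for the plot like axis limits.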
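
One way to check that building is the step that fixes the values is
to build the same unbuilt plot object twice, changing i in between.
This is only a sketch: it uses layer_data() from ggplot2, which
builds the plot and returns the data frame for one layer, and the
column numbers 3 and 4 are arbitrary choices.

```r
library(ggplot2)

df <- mtcars

i <- 3
p <- ggplot() + geom_point(aes(df[, i], df[, 1]))

# Building now evaluates df[, i] with i = 3
x_at_3 <- layer_data(p, 1)$x

# Stands in for the loop moving on to the next iteration
i <- 4

# Building the very same object again evaluates df[, i] with i = 4
x_at_4 <- layer_data(p, 1)$x

# TRUE only if the plot changed along with i
identical(x_at_3, x_at_4)
```

The plot object p never changed, only i did, yet the x values
follow i. Anything built while i was 3 keeps the column 3 values.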

Then, ggplot_gtable() turns the built plot into a grob which can
display the image and stores it as a gtable. This bit is necessary
because grid.arrange() draws grobs, not the intermediate object
returned by ggplot_build().
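
As a shorter variant of the same idea, ggplot2 also provides
ggplotGrob(), which wraps both steps in one call. The sketch below
is not the code I used for the biomass figure, just the mtcars
example rewritten with that shorthand, and with arrangeGrob() from
gridExtra so the panel is laid out without being drawn straight
away.

```r
library(ggplot2)
library(gridExtra)

df <- mtcars
plot_list <- vector("list", length(df))

for (i in seq_along(df)) {
  # ggplotGrob(p) is shorthand for ggplot_gtable(ggplot_build(p)),
  # so df[, i] is evaluated here, inside the loop
  plot_list[[i]] <- ggplotGrob(
    ggplot() + geom_point(aes(df[, i], df[, 1]))
  )
}

# arrangeGrob() takes the same layout arguments as grid.arrange()
# but returns the panel as an object instead of drawing it
panel <- arrangeGrob(grobs = plot_list, ncol = 2)

# To draw it on the current device:
# grid::grid.draw(panel)
```

Keeping the panel as an object makes it easier to pass to whatever
export function you prefer afterwards.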