redundant_coding.Rmd

```{r echo = FALSE, message = FALSE}
# run setup script
source("_common.R")

library(lubridate)
library(ggridges)
```


# Redundant coding {#redundant-coding}

In Chapter \@ref(color-pitfalls), we have seen that color cannot always convey information as effectively as we might wish. If we have many different items we want to identify, doing so by color may not work. It will be difficult to match the colors in the plot to the colors in the legend (Figure \@ref(fig:popgrowth-vs-popsize-colored)). And even if we only need to distinguish two to three different items, color may fail if the colored items are very small (Figure \@ref(fig:colors-thin-lines)) and/or the colors look similar for people suffering from color-vision deficiency (Figures \@ref(fig:red-green-cvd-sim) and \@ref(fig:blue-green-cvd-sim)). The general solution in all these scenarios is to use color to enhance the visual appearance of the figure without relying entirely on color to convey key information. I refer to this design principle as *redundant coding*, because it prompts us to encode data redundantly, using multiple different aesthetic dimensions.

## Designing legends with redundant coding 

Scatter plots of several groups of data are frequently designed such that the points representing different groups differ only in their color. As an example, consider Figure \@ref(fig:iris-scatter-one-shape), which shows the sepal width versus the sepal length of three different *Iris* species. (Sepals are the outer leafs of flowers in flowering plants.) The points representing the different species differ in their colors, but otherwise all points look exactly the same. Even though this figure contains only three distinct groups of points, it is difficult to read even for people with normal color vision. The problem arises because the data points for the two species *Iris virginica* and *Iris versicolor* intermingle, and their two respective colors, green and blue, are not particularly distinct from each other.

(ref:iris-scatter-one-shape) Sepal width versus sepal length for three different iris species (*Iris setosa*, *Iris virginica*, and *Iris versicolor*). Each point represents the measurements for one plant sample. A small amount of jitter has been applied to all point positions to prevent overplotting. The figure is labeled "bad" because the *virginica* points in green and the *versicolor* points in blue are difficult to distinguish from each other.

```{r iris-scatter-one-shape, fig.cap = '(ref:iris-scatter-one-shape)'}

breaks = c("setosa", "virginica", "versicolor")
labels = paste0("Iris ", breaks)

iris_scatter_base <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, fill = Species, color = Species)) + 
    scale_color_manual(
      values = darken(c("#E69F00", "#56B4E9", "#009E73"), 0.3),
      breaks = breaks,
      labels = labels,
      name = NULL
    ) +
    scale_fill_manual(
      values = c("#E69F0080", "#56B4E980", "#009E7380"),
      breaks = breaks,
      labels = labels,
      name = NULL
    ) +
    scale_x_continuous(
      limits = c(3.95, 8.2), expand = c(0, 0),
      labels = c("4.0", "5.0", "6.0", "7.0", "8.0"),
      name = "sepal length"
    ) +
    scale_y_continuous(
      limits = c(1.9, 4.6), expand = c(0, 0),
      name = "sepal width"
    )

iris_scatter <- iris_scatter_base +
  geom_point(
    size=2.5, shape=21, stroke = 0.5,
    position = position_jitter(
      width = 0.01 * diff(range(iris$Sepal.Length)),
      height = 0.01 * diff(range(iris$Sepal.Width)),
      seed = 3942
    )
  ) +
  theme_dviz_grid() +
  theme(
    legend.title.align = 0.5,
    legend.text = element_text(face = "italic"),
    legend.spacing.y = unit(3.5, "pt"),
    plot.margin = margin(7, 7, 3, 1.5)
  )

stamp_bad(iris_scatter)
```

Surprisingly, the green and blue points look more distinct for people with red--green color-vision-deficiency (deuteranomaly or protanomaly) than for people with normal color vision (compare Figure \@ref(fig:iris-scatter-one-shape-cvd), top row, to Figure \@ref(fig:iris-scatter-one-shape)). On the other hand, for people with blue--yellow deficiency (tritanomaly) the blue and green points look very similar (Figure \@ref(fig:iris-scatter-one-shape-cvd), bottom left). And if we print out the figure in gray-scale (i.e., we *desaturate* the figure), we cannot distinguish any of the iris species (Figure \@ref(fig:iris-scatter-one-shape-cvd), bottom right).

(ref:iris-scatter-one-shape-cvd) Color-vision-deficiency simulation of Figure \@ref(fig:iris-scatter-one-shape).

```{r iris-scatter-one-shape-cvd, fig.width = 5.5*6/4.2, fig.asp = 0.66, fig.cap = '(ref:iris-scatter-one-shape-cvd)'}
iris_scatter_small <- iris_scatter_base +
  geom_point(
    size=.655*2.5, shape=21, stroke = .655*0.5,
    position = position_jitter(
      width = 0.01 * diff(range(iris$Sepal.Length)),
      height = 0.01 * diff(range(iris$Sepal.Width)),
      seed = 3942
    )
  ) +
  theme_dviz_grid(
    .655*14,
    line_size = .85*.5 # make line size a little bigger than mathematically correct
  ) + 
  theme(
    legend.title.align = 0.5,
    legend.text = element_text(face = "italic"),
    legend.spacing.y = grid::unit(.655*3, "pt"),
    plot.margin = margin(18, 1, 1, 1)
  )

cvd_sim2(iris_scatter_small, label_size = 14, label_y = .98, scale = .95)
```

There are two simple improvements we can make to Figure \@ref(fig:iris-scatter-one-shape) to alleviate these issues. First, we can swap the colors used for *Iris setosa* and *Iris versicolor*, so that the blue is no longer directly next to the green (Figure \@ref(fig:iris-scatter-three-shapes)). Second, we can use three different symbol shapes, so that the points all look different. With these two changes, both the original version of the figure (Figure \@ref(fig:iris-scatter-three-shapes)) and the versions under color-vision-deficiency and in grayscale (Figure \@ref(fig:iris-scatter-three-shapes-cvd)) become legible.

(ref:iris-scatter-three-shapes) Sepal width versus sepal length for three different iris species. Compared to Figure \@ref(fig:iris-scatter-one-shape), we have swapped the colors for *Iris setosa* and *Iris versicolor* and we have given each iris species its own point shape.

```{r iris-scatter-three-shapes, fig.cap = '(ref:iris-scatter-three-shapes)'}
iris_scatter2_base <- ggplot(
  iris, aes(x = Sepal.Length, y = Sepal.Width, shape = Species, fill = Species, color = Species)
) +     
    scale_shape_manual(
      values = c(21, 22, 23),
      breaks = breaks,
      labels = labels,
      name = NULL
    ) +
    scale_color_manual(
      values = darken(c("#56B4E9", "#E69F00", "#009E73"), 0.3),
      breaks = breaks,
      labels = labels,
      name = NULL
    ) +
    scale_fill_manual(
      values = c("#56B4E980", "#E69F0080", "#009E7380"),
      breaks = breaks,
      labels = labels,
      name = NULL
    ) +
    scale_x_continuous(
      limits = c(3.95, 8.2), expand = c(0, 0),
      labels = c("4.0", "5.0", "6.0", "7.0", "8.0"),
      name = "sepal length"
    ) +
    scale_y_continuous(
      limits = c(1.9, 4.6), expand = c(0, 0),
      name = "sepal width"
    )

iris_scatter2 <- iris_scatter2_base +
  geom_point(
    size=2.5, stroke = 0.5,
    position = position_jitter(
      width = 0.01 * diff(range(iris$Sepal.Length)),
      height = 0.01 * diff(range(iris$Sepal.Width)),
      seed = 3942)
  ) +
  theme_dviz_grid() +
  theme(
    legend.title.align = 0.5,
    legend.text = element_text(face = "italic"),
    legend.spacing.y = unit(3.5, "pt"),
    plot.margin = margin(7, 7, 3, 1.5)
  )

iris_scatter2
```


(ref:iris-scatter-three-shapes-cvd) Color-vision-deficiency simulation of Figure \@ref(fig:iris-scatter-three-shapes). Because of the use of different point shapes, even the fully desaturated gray-scale version of the figure is legible.

```{r iris-scatter-three-shapes-cvd, fig.width = 5.5*6/4.2, fig.asp = 0.66, fig.cap = '(ref:iris-scatter-three-shapes-cvd)'}
iris_scatter2_small <- iris_scatter2_base +
  geom_point(
    size=.655*2.5, stroke = .655*0.5,
    position = position_jitter(
      width = 0.01 * diff(range(iris$Sepal.Length)),
      height = 0.01 * diff(range(iris$Sepal.Width)),
      seed = 3942)
  ) +
  theme_dviz_grid(
   .655*14,
    line_size = .85*.5 # make line size a little bigger than mathematically correct
  ) + 
  theme(
    legend.title.align = 0.5,
    legend.text = element_text(face = "italic"),
    legend.spacing.y = grid::unit(.655*3, "pt"),
    plot.margin = margin(18, 1, 1, 1)
  )

cvd_sim2(iris_scatter2_small, label_size = 14, label_y = .98, scale = .95)
```

Changing the point shape is a simple strategy for scatter plots but it doesn't necessarily work for other types of plots. In line plots, we could change the line type (solid, dashed, dotted, etc., see also Figure \@ref(fig:common-aesthetics)), but using dashed or dotted lines often yields sub-optimal results. In particular, dashed or dotted lines usually don't look good unless they are perfectly straight or only gently curved, and in either case they create visual noise. Also, it frequently requires significant mental effort to match different types of dash or dot--dash patterns from the plot to the legend. So what do we do with a visualization such as Figure \@ref(fig:tech-stocks-bad-legend), which uses lines to show the change in stock price over time for four different major tech companies?

(ref:tech-stocks-bad-legend) Stock price over time for four major tech companies. The stock price for each company has been normalized to equal 100 in June 2012. This figure is labeled as "bad" because it takes considerable mental energy to match the company names in the legend to the data curves. Data source: Yahoo Finance

```{r tech-stocks-bad-legend, fig.cap = '(ref:tech-stocks-bad-legend)'}
price_plot_base <- ggplot(tech_stocks, aes(x = date, y = price_indexed, color = ticker)) +
  geom_line(size = 0.66, na.rm = TRUE) +
  scale_color_manual(
    values = c("#000000", "#E69F00", "#56B4E9", "#009E73"),
    name = "",
    breaks = c("GOOG", "AAPL", "FB", "MSFT"),
    labels = c("Alphabet", "Apple", "Facebook", "Microsoft")
  ) +
  scale_x_date(
    name = "year",
    limits = c(ymd("2012-06-01"), ymd("2017-05-31")),
    expand = c(0,0)
  ) + 
  scale_y_continuous(
    name = "stock price, indexed",
    limits = c(0, 560),
    expand = c(0,0)
  )

stamp_bad(
  price_plot_base + 
    theme_dviz_hgrid() + 
    theme(plot.margin = margin(3, 7, 3, 1.5))
)
```


The figure contains four lines representing the stock prices of the four different companies. The lines are color coded using a colorblind-friendly color scale. So it should be relatively straightforward to associate each line with the corresponding company. Yet it is not. The problem here is that the data lines have a clear visual order. The yellow line, representing Facebook, is clearly the highest line, and the black line, representing Apple, is clearly the lowest, with Alphabet and Microsoft in between, in that order. Yet the order of the four companies in the legend is Alphabet, Apple, Facebook, Microsoft (alphabetic order). Thus, the perceived order of the data lines differs from the order of the companies in the legend, and it takes a surprising amount of mental effort to match data lines with company names.

This problem arises commonly with plotting software that autogenerates legends. The plotting software has no concept of the visual order the viewer will perceive. Instead, the software sorts the legend by some other order, most commonly alphabetical. We can fix this problem by manually reordering the entries in the legend so they match the perceived ordering in the data (Figure \@ref(fig:tech-stocks-good-legend)). The result is a figure that makes it much easier to match the legend to the data.

(ref:tech-stocks-good-legend) Stock price over time for four major tech companies. The stock price for each company has been normalized to equal 100 in June 2012. Data source: Yahoo Finance

```{r tech-stocks-good-legend, fig.cap = '(ref:tech-stocks-good-legend)'}
price_plot_base_good <- ggplot(tech_stocks, aes(x = date, y = price_indexed, color = ticker)) +
  scale_color_manual(
    values = c("#000000", "#E69F00", "#56B4E9", "#009E73"),
    name = "",
    breaks = c("FB", "GOOG", "MSFT", "AAPL"),
    labels = c("Facebook", "Alphabet", "Microsoft", "Apple")
  ) +
  scale_x_date(
    name = "year",
    limits = c(ymd("2012-06-01"), ymd("2017-05-31")),
    expand = c(0,0)
  ) + 
  scale_y_continuous(
    name = "stock price, indexed",
    limits = c(0, 560),
    expand = c(0,0)
  )

price_plot_base_good +
  geom_line(size = 0.66, na.rm = TRUE) +
  theme_dviz_hgrid() + 
  theme(plot.margin = margin(3, 7, 3, 1.5))
```

```{block type='rmdtip', echo=TRUE}
If there is a clear visual ordering in your data, make sure to match it in the legend.
```


Matching the legend order to the data order is always helpful, but the benefits are particularly obvious under color-vision deficiency simulation (Figure \@ref(fig:tech-stocks-good-legend-cvd)). For example, it helps in the tritanomaly version of the figure, where the blue and the green become difficult to distinguish (Figure \@ref(fig:tech-stocks-good-legend-cvd), bottom left). It also helps in the grayscale version (Figure \@ref(fig:tech-stocks-good-legend-cvd), bottom right). Even though the two colors for Facebook and Alphabet have virtually the same gray value, we can see that Microsoft and Apple are represented by darker colors and take the bottom two spots. Therefore, we correctly assume that the highest line corresponds to Facebook and the second-highest line to Alphabet.

(ref:tech-stocks-good-legend-cvd) Color-vision-deficiency simulation of Figure \@ref(fig:tech-stocks-good-legend).

```{r tech-stocks-good-legend-cvd, fig.width = 5.5*6/4.2, fig.asp = 0.66, fig.cap = '(ref:tech-stocks-good-legend-cvd)'}
price_plot_good_small <-
  price_plot_base_good + 
  geom_line(
    size = .85*0.66, # make line size a little bigger than mathematically correct
    na.rm = TRUE
  ) +
  theme_dviz_hgrid(
    .655*14,
    line_size = .85*.5 # make line size a little bigger than mathematically correct
  ) + 
  theme(plot.margin = margin(18, 1, 1, 1))

cvd_sim2(price_plot_good_small, label_size = 14, label_y = .98, scale = .95)
```


## Designing figures without legends

Even though legend legibility can be improved by encoding data redundantly, in multiple aesthetics, legends always put an extra mental burden on the reader. In reading a legend, the reader needs to pick up information in one part of the visualization and then transfer it over to a different part. We can typically make our readers' lives easier if we eliminate the legend altogether. Eliminating the legend does not mean, however, that we simply not provide one and instead write sentences such as "The yellow dots represent *Iris versicolor*" in the figure caption. Eliminating the legend means that we design the figure in such a way that it is immediately obvious what the various graphical elements represent, even if no explicit legend is present.

The general strategy we can employ is called *direct labeling*, whereby we place appropriate text labels or other visual elements that serve as guideposts to the rest of the figure. We have previously encountered direct labeling in Chapter \@ref(color-pitfalls) (Figure \@ref(fig:popgrowth-vs-popsize-bw)), as an alternative to drawing a legend with over 50 distinct colors. To apply the direct labeling concept to the stock-price figure, we place the name of each company right next to the end of its respective data line (Figure \@ref(fig:tech-stocks-good-no-legend)). 

(ref:tech-stocks-good-no-legend) Stock price over time for four major tech companies. The stock price for each company has been normalized to equal 100 in June 2012. Data source: Yahoo Finance

```{r tech-stocks-good-no-legend, fig.cap = '(ref:tech-stocks-good-no-legend)'}
price_plot <- price_plot_base_good + 
  geom_line(size = 0.66, na.rm = TRUE) +
  theme_dviz_hgrid()

yann <- axis_canvas(price_plot, axis = "y") +
  geom_text(
    data = filter(tech_stocks, date == "2017-06-02"),
    aes(y = price_indexed, label = paste0(" ", company)),
    family = dviz_font_family,
    x = 0, hjust = 0, size = 12/.pt
  )

price_plot_ann <- insert_yaxis_grob(
  price_plot + theme(legend.position = "none"),
  yann,
  width = grid::unit(0.3, "null")
)
ggdraw(price_plot_ann)
```


```{block type='rmdtip', echo=TRUE}
Whenever possible, design your figures so they don't need a legend.
```

We can also apply the direct labeling concept to the iris data from the beginning of this chapter, specifically Figure \@ref(fig:iris-scatter-three-shapes). Because it is a scatter plot of many points that separate into three different groups, we need to direct label the groups rather than the individual points. One solution is to draw ellipses that enclose the majority of the points and then label the ellipses (Figure \@ref(fig:iris-scatter-with-ellipses)).

(ref:iris-scatter-with-ellipses) Sepal width versus sepal length for three different iris species. I have removed the background grid from this figure because otherwise the figure was becoming too busy.

```{r iris-scatter-with-ellipses, fig.width = 4.6, fig.asp = 0.8, fig.cap = '(ref:iris-scatter-with-ellipses)'}
label_df <- data.frame(
  Species = c("setosa", "virginica", "versicolor"),
  label = c("Iris setosa", "Iris virginica", "Iris versicolor"),
  Sepal.Width = c(4.2, 3.76, 2.08),
  Sepal.Length = c(5.7, 7, 5.1),
  hjust = c(0, 0.5, 0),
  vjust = c(0, 0.5, 1))

iris_scatter3 <- ggplot(iris, 
      aes(
        x = Sepal.Length,
        y = Sepal.Width,
        color = Species
      )
    ) + 
    geom_point(
      aes(shape = Species, fill = Species),
      size = 2.5,
      position = position_jitter(
        width = 0.01 * diff(range(iris$Sepal.Length)),
        height = 0.01 * diff(range(iris$Sepal.Width)),
        seed = 3942)
    ) +
    stat_ellipse(size = 0.5) +
    geom_text(
      data = label_df,
      aes(
        x = Sepal.Length, y = Sepal.Width, label = label, color = Species,
        hjust = hjust, vjust = vjust
      ),
      family = dviz_font_family,
      size = 14/.pt,
      fontface = "italic",
      inherit.aes = FALSE
    ) +
    scale_shape_manual(
      values = c(21, 22, 23),
      breaks = breaks,
      name = NULL
    ) +
    scale_fill_manual(
      values = c("#56B4E980", "#E69F0080", "#009E7380"),
      breaks = breaks,
      name = NULL
    ) +
    scale_color_manual(
      values = darken(c("#56B4E9", "#E69F00", "#009E73"), 0.3),
      breaks = breaks,
      name = NULL
    ) +
    guides(fill = "none", color = "none", shape = "none") +
    scale_x_continuous(
      limits = c(3.95, 8.2), expand = c(0, 0),
      labels = c("4.0", "5.0", "6.0", "7.0", "8.0"),
      name = "sepal length"
    ) +
    scale_y_continuous(
      limits = c(1.9, 4.6), expand = c(0, 0),
      name = "sepal width"
    ) +
    theme_dviz_open()

iris_scatter3
```

For density plots, we can similarly direct-label the curves rather than providing a color-coded legend (Figure \@ref(fig:iris-densities-direct-label)). In both Figures \@ref(fig:iris-scatter-with-ellipses) and \@ref(fig:iris-densities-direct-label), I have colored the text labels in the same colors as the data. Colored labels can greatly enhance the direct labeling effect, but they can also turn out very poorly. If the text labels are printed in a color that is too light, then the labels become difficult to read. And, because text consists of very thin lines, colored text often appears to be lighter than an adjacent filled area of the same color. I generally circumvent these issues by using two different shades of each color, a light one for filled areas and a dark one for lines, outlines, and text. If you carefully inspect Figure \@ref(fig:iris-scatter-with-ellipses) or \@ref(fig:iris-densities-direct-label), you will see how each data point or shaded area is filled with a light color and has an outline drawn in a darker color of the same hue. And the text labels are drawn in the same darker colors. 

(ref:iris-densities-direct-label) Density estimates of the sepal lengths of three different iris species. Each density estimate is directly labeled with the respective species name.

```{r iris-densities-direct-label, fig.cap = '(ref:iris-densities-direct-label)'}
# compute densities for sepal lengths
iris_dens <- group_by(iris, Species) %>%
  do(ggplot2:::compute_density(.$Sepal.Length, NULL)) %>%
  rename(Sepal.Length = x)

# get the maximum values
iris_max <- filter(iris_dens, density == max(density)) %>%
  ungroup() %>%
  mutate(
    hjust = c(0, 0.4, 0),
    vjust = c(1, 0, 1),
    nudge_x = c(0.11, 0, 0.24),
    nudge_y = c(-0.02, 0.02, -0.02),
    label = paste0("Iris ", Species)
  )

iris_p <- ggplot(iris_dens, aes(x = Sepal.Length, y = density, fill = Species, color = Species)) + 
  geom_density_line(stat = "identity") +
  geom_text(
    data = iris_max,
    aes(
      label = label, hjust = hjust, vjust = vjust, color = Species,
      x = Sepal.Length + nudge_x, 
      y = density + nudge_y
    ), 
    family = dviz_font_family,
    size = 14/.pt,
    inherit.aes = FALSE,
    fontface = "italic"
  ) +
  scale_color_manual(
    values = darken(c("#56B4E9", "#E69F00", "#009E73"), 0.3),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_fill_manual(
    values = c("#56B4E950", "#E69F0050", "#009E7350"),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_x_continuous(expand = c(0, 0), name = "sepal length") +
  scale_y_continuous(limits = c(0, 1.5), expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_dviz_hgrid() +
  theme(
    axis.line.x = element_blank(),
    plot.margin = margin(6.5, 1.5, 3.5, 1.5)
  )
  
iris_p
```

We can also use density plots such as the one in Figure \@ref(fig:iris-densities-direct-label) as a legend replacement, by placing the density plots into the margins of a scatter plot (Figure \@ref(fig:iris-scatter-dens)). This allows us to direct-label the marginal density plots rather than the central scatter plot and hence results in a figure that is somewhat less cluttered than Figure \@ref(fig:iris-scatter-with-ellipses) with directly-labeled ellipses.

(ref:iris-scatter-dens) Sepal width versus sepal length for three different iris species, with marginal density estimates of each variable for each species.

```{r iris-scatter-dens, fig.width = 5*6/4.2, fig.asp=0.85, fig.cap = '(ref:iris-scatter-dens)'}
# compute densities for sepal lengths
iris_dens2 <- group_by(iris, Species) %>%
  do(ggplot2:::compute_density(.$Sepal.Width, NULL)) %>%
  rename(Sepal.Width = x)

dens_limit <- max(iris_dens$density, iris_dens2$density) * 1.05 # upper limit of density curves

# we need different hjust and nudge values here
iris_max <- 
  iris_max %>%
  mutate(
    hjust = c(1, 0.4, 0),
    vjust = c(1, 0, 1),
    nudge_x = c(-0.18, 0, 0.47),
    nudge_y = c(-0.01, 0.06, 0.03),
    label = paste0("Iris ", Species)
  )

xdens <- axis_canvas(iris_scatter2, axis = "x") +
  geom_density_line(
    data=iris_dens,
    aes(x = Sepal.Length, y = density, fill = Species, color = Species),
    stat = "identity", size = .2
  ) +
  geom_text(
    data = iris_max,
    aes(
      label = label, hjust = hjust, vjust = vjust, color = Species,
      x = Sepal.Length + nudge_x, 
      y = density + nudge_y
    ),
    family = dviz_font_family, 
    ize = 12/.pt, 
    #color = "black", inherit.aes = FALSE,
    fontface = "italic"
  ) +
  scale_color_manual(
    values = darken(c("#56B4E9", "#E69F00", "#009E73"), 0.3),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_fill_manual(
    values = c("#56B4E950", "#E69F0050", "#009E7350"),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_y_continuous(limits = c(0, dens_limit), expand = c(0, 0))

ydens <- axis_canvas(iris_scatter2, axis = "y", coord_flip = TRUE) +
  geom_density_line(
    data = iris_dens2,
    aes(x = Sepal.Width, y = density, fill = Species, color = Species),
    stat = "identity", size = .2
  )  +
  scale_color_manual(
    values = darken(c("#56B4E9", "#E69F00", "#009E73"), 0.3),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_fill_manual(
    values = c("#56B4E950", "#E69F0050", "#009E7350"),
    breaks = c("virginica", "versicolor", "setosa"),
    guide = "none"
  ) +
  scale_y_continuous(limits = c(0, dens_limit), expand = c(0, 0)) +
  coord_flip()

p1 <- insert_xaxis_grob(
  iris_scatter2 + theme(legend.position = "none"),
  xdens,
  grid::unit(3*14, "pt"), position = "top"
)
p2 <- insert_yaxis_grob(p1, ydens, grid::unit(3*14, "pt"), position = "right")

ggdraw(p2)
```

And finally, whenever we encode a single variable in multiple aesthetics, we don't normally want multiple separate legends for the different aesthetics. Instead, there should be only a single legend-like visual element that conveys all mappings at once. In the case where we map the same variable onto a position along a major axis and onto color, this implies that the reference color bar should run along and be integrated into the same axis. Figure \@ref(fig:temp-ridgeline-colorbar) shows a case where we map temperature to both a position along the *x* axis and onto color, and where we therefore have integrated the color legend into the *x* axis.

(ref:temp-ridgeline-colorbar) Temperatures in Lincoln, Nebraska, in 2016. This figure is a variation of Figure \@ref(fig:temp-ridgeline). Temperature is now shown both by location along the *x* axis and by color, and a color bar along the *x* axis visualizes the scale that converts temperatures into colors.

```{r temp-ridgeline-colorbar, fig.asp = 0.652, fig.cap = '(ref:temp-ridgeline-colorbar)'}
bandwidth <- 3.4

lincoln_base <- ggplot(lincoln_weather, aes(x = `Mean Temperature [F]`, y = `Month`, fill = ..x..)) +
  geom_density_ridges_gradient(
    scale = 3, rel_min_height = 0.01, bandwidth = bandwidth,
    color = "black", size = 0.25
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0), breaks = c(0, 25, 50, 75), labels = NULL
  ) +
  scale_y_discrete(name = NULL, expand = c(0, .2, 0, 2.6)) +
  scale_fill_continuous_sequential(
    palette = "Heat",
    l1 = 20, l2 = 100, c2 = 0,
    rev = FALSE
  ) +
  guides(fill = "none") +
  theme_dviz_grid() +
  theme(
    axis.text.y = element_text(vjust = 0),
    plot.margin = margin(3, 7, 3, 1.5)
  )

# x axis labels
temps <- data.frame(temp = c(0, 25, 50, 75))

# calculate corrected color ranges
# stat_joy uses the +/- 3*bandwidth calculation internally
tmin <- min(lincoln_weather$`Mean Temperature [F]`) - 3*bandwidth
tmax <- max(lincoln_weather$`Mean Temperature [F]`) + 3*bandwidth

xax <- axis_canvas(lincoln_base, axis = "x", ylim = c(0, 2)) +
  geom_ridgeline_gradient(
    data = data.frame(temp = seq(tmin, tmax, length.out = 100)),
    aes(x = temp, y = 1.1, height = .9, fill = temp),
    color = "transparent"
  ) +
  geom_text(
    data = temps, aes(x = temp, label = temp),
    color = "black", family = dviz_font_family,
    y = 0.9, hjust = 0.5, vjust = 1, size = 14/.pt
  ) +
  scale_fill_continuous_sequential(
    palette = "Heat",
    l1 = 20, l2 = 100, c2 = 0,
    rev = FALSE
  )

lincoln_final <- insert_xaxis_grob(lincoln_base, xax, position = "bottom", height = unit(0.1, "null"))

ggdraw(lincoln_final)
```