Comments from @mine-cetinkaya-rundel

This commit is contained in:
hadley 2016-08-18 07:50:27 -05:00
parent 90079aea74
commit 47aa5a8f0e
1 changed files with 5 additions and 5 deletions

View File

@ -2,7 +2,7 @@
## Introduction
In [exploratory data analysis], you learned how to use plots as tools for _exploration_. When making plots for exploration, you knew---even before looking at them---which variables the plot would display. You made each plot for a purpose, and could quickly look at it and move on to the next plot. In the course of most analyses, you'll produce tens of hundreds of plots, most of which are immediately discarded.
In [exploratory data analysis], you learned how to use plots as tools for _exploration_. When making plots for exploration, you knew---even before looking at them---which variables the plot would display. You made each plot for a purpose, could quickly look at it, and then move on to the next plot. In the course of most analyses, you'll produce tens or hundreds of plots, most of which are immediately discarded.
Now you need to _communicate_ the results of your analysis to others. Your audience will likely not share your background knowledge and will not be deeply invested in the data. To help others quickly build up a good mental model of the data, you will need to invest considerable effort in making your plots as self-explanatory as possible. In this chapter, you'll learn some of the tools that ggplot2 provides to do so.
@ -111,7 +111,7 @@ ggplot(mpg, aes(displ, hwy)) +
geom_label(aes(label = model), data = best_in_class, nudge_y = 2, alpha = 0.5)
```
That helps a bit, but if you look closely in the top-left hand corner, you'll notice that there are two labels practically on top of each other. There's no way that we can fix these by applying the same transformation for every label. Instead, we can use the __ggrepel__ package by Kamil Slowikowski. This useful package will automatically adjust labels so that they don't overlap:
That helps a bit, but if you look closely in the top-left hand corner, you'll notice that there are two labels practically on top of each other. This happens because the highway mileage and displacement for the best cars in the compact and subcompact categories are exactly the same.There's no way that we can fix these by applying the same transformation for every label. Instead, we can use the __ggrepel__ package by Kamil Slowikowski. This useful package will automatically adjust labels so that they don't overlap:
```{r}
ggplot(mpg, aes(displ, hwy)) +
@ -337,7 +337,7 @@ ggplot(mpg, aes(displ, hwy)) +
Instead of just tweaking the detail a little, you can also replace the scale altogether. We'll focus on colour scales because there are many options, and they're the scales you're mostly likely to want to change. The same principles apply to the other aesthetics. All colour scales have two variants: `scale_colour_x()` and `scale_fill_x()` for the `colour` and `fill` aesthetics respectively (the colour scales are available in both UK and US spellings).
The default categorical scale picks colours that are evenly spaced around the colour wheel. Useful alternatives are the ColourBrewer scales which have been hand tuned to work better for people with common types of colour blindness. The two plots below look similar, but there is enough difference in the shades of red and green that the dots on the right can be distinguished even by people with red-green colour blindness.
The default categorical scale picks colours that are evenly spaced around the colour wheel. Useful alternatives are the ColorBrewer scales which have been hand tuned to work better for people with common types of colour blindness. The two plots below look similar, but there is enough difference in the shades of red and green that the dots on the right can be distinguished even by people with red-green colour blindness.
```{r, fig.align = "default", out.width = "50%"}
ggplot(mpg, aes(displ, hwy)) +
@ -356,7 +356,7 @@ ggplot(mpg, aes(displ, hwy)) +
scale_colour_brewer(palette = "Set1")
```
Figure \@ref(fig:brewer) shows the complete list of all palettes. The sequential (top) and diverging (bottom) palettes are particularly useful if your categorical values are ordered, or have a "middle". This often arises if you've used `cut()` to make a continuous variable into a categorical variable.
The ColorBrewer scales are documented online at <http://colorbrewer2.org/> and made available in R via the __RColorBrewer__ package, by Erich Neuwirth. Figure \@ref(fig:brewer) shows the complete list of all palettes. The sequential (top) and diverging (bottom) palettes are particularly useful if your categorical values are ordered, or have a "middle". This often arises if you've used `cut()` to make a continuous variable into a categorical variable.
```{r brewer, fig.asp = 2.5, echo = FALSE, fig.cap = "All ColourBrewer scales."}
par(mar = c(0, 3, 0, 0))
@ -376,7 +376,7 @@ presidential %>%
For continuous colour, you can use the built-in `scale_colour_gradient()` or `scale_fill_gradient()`. If you have a diverging scale, you can use `scale_colour_gradient2()`. That allows you to give, for example, positive and negative values different colours. That's sometimes also useful if you want to distinguish points above or below the mean.
Another option is `scale_colour_viridis()` provided by the __viridis__ package. It's a continuous analog of the categorical Brewer scales. The designers, Nathaniel Smith and Stéfan van der Walt, carefully tailored a continuous colour scheme that has good perceptual properties. Here's an example from the viridis vignette.
Another option is `scale_colour_viridis()` provided by the __viridis__ package. It's a continuous analog of the categorical ColorBrewer scales. The designers, Nathaniel Smith and Stéfan van der Walt, carefully tailored a continuous colour scheme that has good perceptual properties. Here's an example from the viridis vignette.
```{r, fig.align = "default", fig.asp = 1, out.width = "50%", fig.width = 4}
df <- tibble(