diff --git a/EDA.Rmd b/EDA.Rmd
index 24abf27..fa3725c 100644
--- a/EDA.Rmd
+++ b/EDA.Rmd
@@ -453,7 +453,7 @@ ggplot(data = diamonds) +
   geom_point(mapping = aes(x = carat, y = price))
 ```
 
-Scatterplots become less useful as the size of your dataset grows, because points begin to overplot, and pile up into areas of uniform black (as above). This problem is similar to showing the distribution of price by color using a scatterplot:
+Scatterplots become less useful as the size of your dataset grows, because points begin to overplot, and pile up into areas of uniform black (as above). This problem is similar to showing the distribution of price by cut using a scatterplot:
 
 ```{r, dev = "png"}
 ggplot(data = diamonds) +
diff --git a/factors.Rmd b/factors.Rmd
index 81bbec6..2e96b58 100644
--- a/factors.Rmd
+++ b/factors.Rmd
@@ -14,9 +14,7 @@ For more historical context on factors, I recommend [_stringsAsFactors: An unaut
 To work with factors, we'll use the __forcats__ package, which provides tools for dealing with **cat**egorical variables (and it's an anagram of factors!). It provides a wide range of helpers for working with factors. We'll also need dplyr for some data manipulation, and ggplot2 for visualisation.
 
 ```{r setup, message = FALSE}
-# devtools::install_github("hadley/forcats")
 library(forcats)
-
 library(ggplot2)
 library(dplyr)
 ```