parent e51ead6c9e
commit 03f1c6c6f4
EDA.qmd | 8
@@ -560,7 +560,7 @@ For larger plots, you might want to try the heatmaply package, which creates int

 You've already seen one great way to visualize the covariation between two numerical variables: draw a scatterplot with `geom_point()`.
 You can see covariation as a pattern in the points.
-For example, you can see an exponential relationship between the carat size and price of a diamond.
+For example, you can see an exponential relationship between the carat size and price of a diamond:

 ```{r}
 #| dev: "png"
@@ -568,10 +568,12 @@ For example, you can see an exponential relationship between the carat size and
 #| A scatterplot of price vs. carat. The relationship is positive, somewhat
 #| strong, and exponential.

-ggplot(diamonds, aes(x = carat, y = price)) +
+ggplot(smaller, aes(x = carat, y = price)) +
   geom_point()
 ```

+(In this section we'll use the `smaller` dataset to stay focused on the bulk of the diamonds that are smaller than 3 carats)
+
 Scatterplots become less useful as the size of your dataset grows, because points begin to overplot, and pile up into areas of uniform black (as above).
 You've already seen one way to fix the problem: using the `alpha` aesthetic to add transparency.

@@ -583,7 +585,7 @@ You've already seen one way to fix the problem: using the `alpha` aesthetic to a
 #| the number of points is higher than other areas, The most obvious clusters
 #| are for diamonds with 1, 1.5, and 2 carats.

-ggplot(diamonds, aes(x = carat, y = price)) +
+ggplot(smaller, aes(x = carat, y = price)) +
   geom_point(alpha = 1 / 100)
 ```