Make syntax for multi-line fig-cap and fig-alt consistent

mine-cetinkaya-rundel 2023-05-26 11:44:08 -04:00
parent cbb5b1b01f
commit 205c9922f4
27 changed files with 227 additions and 227 deletions
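
For readers unfamiliar with the two indicators being swapped: `>` and `|` are YAML block scalar styles. The sketch below is a simplified model of how they differ, not a YAML parser. It assumes every content line is non-empty and equally indented and that no chomping indicator is given, which matches the alt text and captions in this diff. The helper names `literal_scalar` and `folded_scalar` are hypothetical, and how Quarto finally renders either form is outside this sketch.

```python
# Simplified model of YAML block scalars for the simple case in this commit:
# every content line is non-empty and shares the same indentation.
# Assumption: default "clip" chomping (no `-` or `+` indicator), so the
# resulting string ends with exactly one newline.


def literal_scalar(lines):
    """`|` (literal style): line breaks are kept as written."""
    return "\n".join(lines) + "\n"


def folded_scalar(lines):
    """`>` (folded style): a single line break between lines becomes a space."""
    return " ".join(lines) + "\n"


# Alt text lines taken from the first hunk of EDA.qmd, with the `#| ` prefix
# and indentation removed.
alt_lines = [
    "A histogram of carats of diamonds, with the x-axis ranging from 0 to 4.5",
    "and the y-axis ranging from 0 to 30000. The distribution is right skewed",
]

print(repr(literal_scalar(alt_lines)))  # line break between the two lines kept
print(repr(folded_scalar(alt_lines)))   # line break replaced by a space
```

Under this model the two styles differ only in whether interior line breaks survive, which is why the change can be applied mechanically to each `fig-cap` and `fig-alt` header line without touching the text beneath it.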

EDA.qmd (40 changes)

@ -81,7 +81,7 @@ We'll start our exploration by visualizing the distribution of weights (`carat`)
Since `carat` is a numerical variable, we can use a histogram:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A histogram of carats of diamonds, with the x-axis ranging from 0 to 4.5
#| and the y-axis ranging from 0 to 30000. The distribution is right skewed
#| with very few diamonds in the bin centered at 0, almost 30000 diamonds in
@ -117,7 +117,7 @@ To turn this information into useful questions, look for anything unexpected:
Let's take a look at the distribution of `carat` for smaller diamonds.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A histogram of carats of diamonds, with the x-axis ranging from 0 to 3 and
#| the y-axis ranging from 0 to roughly 2500. The binwidth is quite narrow
#| (0.01), resulting in a very large number of skinny bars. The distribution
@ -161,7 +161,7 @@ For example, take the distribution of the `y` variable from the diamonds dataset
The only evidence of outliers is the unusually wide limits on the x-axis.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A histogram of lengths of diamonds. The x-axis ranges from 0 to 60 and
#| the y-axis ranges from 0 to 12000. There is a peak around 5, and the
#| data appear to be completely clustered around the peak.
@ -174,7 +174,7 @@ There are so many observations in the common bins that the rare bins are very sh
To make it easy to see the unusual values, we need to zoom to small values of the y-axis with `coord_cartesian()`:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A histogram of lengths of diamonds. The x-axis ranges from 0 to 60 and the
#| y-axis ranges from 0 to 50. There is a peak around 5, and the data
#| appear to be completely clustered around the peak. Other than those data,
@ -270,7 +270,7 @@ It's not obvious where you should plot missing values, so ggplot2 doesn't includ
```{r}
#| dev: "png"
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of widths vs. lengths of diamonds. There is a strong,
#| linear association between the two variables. All but one of the diamonds
#| has length greater than 3. The one outlier has a length of 0 and a width
@ -297,7 +297,7 @@ You can do this by making a new variable, using `is.na()` to check if `dep_time`
[^eda-1]: Remember that when we need to be explicit about where a function (or dataset) comes from, we'll use the special form `package::function()` or `package::dataset`.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A frequency polygon of scheduled departure times of flights. Two lines
#| represent flights that are cancelled and not cancelled. The x-axis ranges
#| from 0 to 25 minutes and the y-axis ranges from 0 to 10000. The number of
@ -340,7 +340,7 @@ The best way to spot covariation is to visualize the relationship between two or
For example, let's explore how the price of a diamond varies with its quality (measured by `cut`) using `geom_freqpoly()`:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A frequency polygon of prices of diamonds where each cut of carat (Fair,
#| Good, Very Good, Premium, and Ideal) is represented with a different color
#| line. The x-axis ranges from 0 to 30000 and the y-axis ranges from 0 to
@ -361,7 +361,7 @@ To make the comparison easier we need to swap what is displayed on the y-axis.
Instead of displaying count, we'll display the **density**, which is the count standardized so that the area under each frequency polygon is one.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A frequency polygon of densities of prices of diamonds where each cut of
#| carat (Fair, Good, Very Good, Premium, and Ideal) is represented with a
#| different color line. The x-axis ranges from 0 to 20000. The lines overlap
@ -382,7 +382,7 @@ But maybe that's because frequency polygons are a little hard to interpret - the
A visually simpler plot for exploring this relationship is using side-by-side boxplots.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side boxplots of prices of diamonds by cut. The distribution of
#| prices is right skewed for each cut (Fair, Good, Very Good, Premium, and
#| Ideal). The medians are close to each other, with the median for Ideal
@ -404,7 +404,7 @@ For example, take the `class` variable in the `mpg` dataset.
You might be interested to know how highway mileage varies across classes:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the x-axis (2seaters, compact, midsize, minivan, pickup, subcompact,
#| and suv).
@ -416,7 +416,7 @@ ggplot(mpg, aes(x = class, y = hwy)) +
To make the trend easier to see, we can reorder `class` based on the median value of `hwy`:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the x-axis and ordered by increasing median highway mileage (pickup,
#| suv, minivan, 2seater, subcompact, compact, and midsize).
@ -429,7 +429,7 @@ If you have long variable names, `geom_boxplot()` will work better if you flip i
You can do that by exchanging the x and y aesthetic mappings.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the y-axis and ordered by increasing median highway mileage.
@ -468,7 +468,7 @@ To visualize the covariation between categorical variables, you'll need to count
One way to do that is to rely on the built-in `geom_count()`:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of color vs. cut of diamonds. There is one point for each
#| combination of levels of cut (Fair, Good, Very Good, Premium, and Ideal)
#| and color (D, E, F, G, G, I, and J). The sizes of the points represent
@ -492,7 +492,7 @@ diamonds |>
Then visualize with `geom_tile()` and the fill aesthetic:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A tile plot of cut vs. color of diamonds. Each tile represents a
#| cut/color combination and tiles are colored according to the number of
#| observations in each tile. There are more Ideal diamonds than other cuts,
@ -528,7 +528,7 @@ The relationship is exponential.
```{r}
#| dev: "png"
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of price vs. carat. The relationship is positive, somewhat
#| strong, and exponential.
@ -543,7 +543,7 @@ You've already seen one way to fix the problem: using the `alpha` aesthetic to a
```{r}
#| dev: "png"
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of price vs. carat. The relationship is positive, somewhat
#| strong, and exponential. The points are transparent, showing clusters where
#| the number of points is higher than other areas, The most obvious clusters
@ -566,7 +566,7 @@ You will need to install the hexbin package to use `geom_hex()`.
```{r}
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Plot 1: A binned density plot of price vs. carat. Plot 2: A hexagonal bin
#| plot of price vs. carat. Both plots show that the highest density of
#| diamonds have low carats and low prices.
@ -584,7 +584,7 @@ Then you can use one of the techniques for visualizing the combination of a cate
For example, you could bin `carat` and then for each group, display a boxplot:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side box plots of price by carat. Each box plot represents diamonds
#| that are 0.1 carats apart in weight. The box plots show that as carat
#| increases the median price increases as well. Additionally, diamonds with
@ -668,7 +668,7 @@ Then, we exponentiate the residuals to put them back in the scale of raw prices.
```{r}
#| message: false
#| dev: "png"
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of residuals vs. carat of diamonds. The x-axis ranges from 0
#| to 5, the y-axis ranges from 0 to almost 4. Much of the data are clustered
#| around low values of carat and residuals. There is a clear, curved pattern
@ -695,7 +695,7 @@ ggplot(diamonds_aug, aes(x = carat, y = .resid)) +
Once you've removed the strong relationship between carat and price, you can see what you expect in the relationship between cut and price: relative to their size, better quality diamonds are more expensive.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Side-by-side box plots of residuals by cut. The x-axis displays the various
#| cuts (Fair to Ideal), the y-axis ranges from 0 to almost 5. The medians are
#| quite similar, between roughly 0.75 to 1.25. Each of the distributions of


@ -346,11 +346,11 @@ If this pepper shaker is your list `pepper`, then, `pepper[1]` is a pepper shake
#| label: fig-pepper
#| echo: false
#| out-width: "100%"
#| fig-cap: >
#| fig-cap: |
#| (Left) A pepper shaker that Hadley once found in his hotel room.
#| (Middle) `pepper[1]`.
#| (Right) `pepper[[1]]`
#| fig-alt: >
#| fig-alt: |
#| Three photos. On the left is a photo of a glass pepper shaker. Instead of
#| the pepper shaker containing pepper, it contains a single packet of pepper.
#| In the middle is a photo of a single packet of pepper. On the right is a


@ -12,11 +12,11 @@ However, it doesn't matter how great your analysis is unless you can explain it
```{r}
#| label: fig-ds-communicate
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Communication is the final part of the data science process; if you
#| can't communicate your results to other humans, it doesn't matter how
#| great your analysis is.
#| fig-alt: >
#| fig-alt: |
#| A diagram displaying the data science cycle with
#| communicate highlighed in blue.
#| out.width: NULL


@ -48,7 +48,7 @@ You add labels with the `labs()` function.
```{r}
#| message: false
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, where
#| points are colored according to the car class. A smooth curve following
#| the trajectory of the relationship between highway fuel efficiency versus
@ -86,7 +86,7 @@ Just switch `""` out for `quote()` and read about the available options in `?plo
#| fig-asp: 1
#| out-width: "50%"
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Scatterplot with math text on the x and y axis labels. X-axis label
#| says x_i, y-axis label says sum of x_i squared, for i from 1 to n.
@ -112,7 +112,7 @@ ggplot(df, aes(x, y)) +
```{r}
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway versus city fuel efficiency. Shapes and
#| colors of points are determined by type of drive train.
@ -162,7 +162,7 @@ They're larger than the rest of the text on the plot and bolded.
(`theme(legend.position = "none"`) turns all the legends off --- we'll talk about it more shortly.)
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway mileage versus engine size where points are colored
#| by drive type. Smooth curves for each drive type are overlaid.
#| Text labels identify the curves as front-wheel, rear-wheel, and 4-wheel.
@ -185,7 +185,7 @@ We can use the `geom_label_repel()` function from the ggrepel package to address
This useful package will automatically adjust labels so that they don't overlap:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, where
#| points are colored according to the car class. Some points are labelled
#| with the car's name. The labels are box with white, transparent background
@ -206,7 +206,7 @@ You can also use the same idea to highlight certain points on a plot with `geom_
Note another handy technique used here: we added a second layer of large, hollow points to further highlight the labelled points.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars. Points
#| where highway mileage is above 40 as well as above 20 with engine size
#| above 5 are red, with a hollow red circle, and labelled with model name
@ -256,7 +256,7 @@ The `x` and `y` aesthetics in both define where the annotation should start, and
Note also that the segment is styled as an arrow.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars. A red
#| arrow pointing down follows the trend of the points and the annotation
#| placed next to the arrow reads "Larger engine sizes tend to have lower
@ -352,7 +352,7 @@ Labels controls the text label associated with each tick/key.
The most common use of `breaks` is to override the default choice:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars,
#| colored by drive. The y-axis has breaks starting at 15 and ending at 40,
#| increasing by 5.
@ -368,7 +368,7 @@ You can also use `breaks` and `labels` to control the appearance of legends.
For discrete scales for categorical variables, `labels` can be a named list of the existing levels names and the desired labels for them.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, colored
#| by drive. The x and y-axes do not have any labels at the axis ticks.
#| The legend has custom labels: 4-wheel, front, rear.
@ -388,7 +388,7 @@ Note that `breaks` is in the original scale of the data.
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Two side-by-side box plots of price versus cut of diamonds. The outliers
#| are transparent. On both plots the x-axis labels are formatted as dollars.
#| The x-axis labels on the plot start at $0 and go to $15,000, increasing
@ -412,7 +412,7 @@ ggplot(diamonds, aes(x = price, y = cut)) +
Another handy label function is `label_percent()`:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Segmented bar plots of cut, filled with levels of clarity. The y-axis
#| labels start at 0% and go to 100%, increasing by 25%. The y-axis label
#| name is "Percentage".
@ -426,7 +426,7 @@ Another use of `breaks` is when you have relatively few data points and want to
For example, take this plot that shows when each US president started and ended their term.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Line plot of id number of presidents versus the year they started their
#| presidency. Start year is marked with a point and a segment that starts
#| there and ends at the end of the presidency. The x-axis labels are
@ -459,7 +459,7 @@ The theme setting `legend.position` controls where the legend is drawn:
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Four scatterplots of highway fuel efficiency versus engine size of cars
#| where points are colored based on class of car. Clockwise, the legend
#| is placed on the right, left, top, and bottom of the plot.
@ -485,7 +485,7 @@ The following example shows two important settings: controlling the number of ro
This is particularly useful if you have used a low `alpha` to display many points on a plot.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars
#| where points are colored based on class of car. Overlaid on the plot is a
#| smooth curve. The legend is in the bottom and classes are listed
@ -514,7 +514,7 @@ For example, it's easier to see the precise relationship between `carat` and `pr
#| fig-align: default
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Two plots of price versus carat of diamonds. Data binned and the color of
#| the rectangles representing each bin based on the number of points that
#| fall into that bin. In the plot on the right, price and carat values
@ -534,7 +534,7 @@ Instead of doing the transformation in the aesthetic mapping, we can instead do
This is visually identical, except the axes are labelled on the original data scale.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Plot of price versus carat of diamonds. Data binned and the color of
#| the rectangles representing each bin based on the number of points that
#| fall into that bin. The axis labels are on the original data scale.
@ -556,7 +556,7 @@ The two plots below look similar, but there is enough difference in the shades o
#| fig-align: default
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Two scatterplots of highway mileage versus engine size where points are
#| colored by drive type. The plot on the left uses the default
#| ggplot2 color palette and the plot on the right uses a different color
@ -575,7 +575,7 @@ If there are just a few colors, you can add a redundant shape mapping.
This will also help ensure your plot is interpretable in black and white.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Two scatterplots of highway mileage versus engine size where both color
#| and shape of points are based on drive type. The color palette is not
#| the default ggplot2 palette.
@ -595,7 +595,7 @@ This often arises if you've used `cut()` to make a continuous variable into a ca
#| echo: false
#| fig-cap: All colorBrewer scales.
#| fig-asp: 2.5
#| fig-alt: >
#| fig-alt: |
#| All colorBrewer scales. One group goes from light to dark colors.
#| Another group is a set of non ordinal colors. And the last group has
#| diverging scales (from dark to light to dark again). Within each set
@ -610,7 +610,7 @@ For example, if we map presidential party to color, we want to use the standard
One approach for assigning these colors is using hex color codes:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Line plot of id number of presidents versus the year they started their
#| presidency. Start year is marked with a point and a segment that starts
#| there and ends at the end of the presidency. Democratic presidents are
@ -638,7 +638,7 @@ These scales are available as continuous (`c`), discrete (`d`), and binned (`b`)
#| layout-ncol: 2
#| fig-width: 3
#| fig-asp: 0.75
#| fig-alt: >
#| fig-alt: |
#| Three hex plots where the color of the hexes show the number of observations
#| that fall into that hex bin. The first plot uses the default, continuous
#| ggplot2 scale. The second plot uses the viridis, continuous scale, and the
@ -872,7 +872,7 @@ You can also create your own themes, if you are trying to match a particular cor
#| label: fig-themes
#| echo: false
#| fig-cap: The eight themes built-in to ggplot2.
#| fig-alt: >
#| fig-alt: |
#| Eight barplots created with ggplot2, each
#| with one of the eight built-in themes:
#| theme_bw() - White background with grid lines,
@ -898,7 +898,7 @@ In the following plot these are set to `"plot"` to indicate these elements are a
A few other helpful `theme()` components are used to change the placement for format of the title and caption text.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, colored
#| by drive. The plot is titled 'Larger engine sizes tend to have lower fuel
#| economy' with the caption pointing to the source of the data, fueleconomy.gov.
@ -944,7 +944,7 @@ Then, you place them next to each other with `+`.
```{r}
#| fig-width: 6
#| fig-asp: 0.5
#| fig-alt: >
#| fig-alt: |
#| Two plots (a scatterplot of highway mileage versus engine size and a
#| side-by-side boxplots of highway mileage versus drive train) placed next
#| to each other.
@ -967,7 +967,7 @@ In the following, `|` places the `p1` and `p3` next to each other and `/` moves
```{r}
#| fig-width: 6
#| fig-asp: 0.8
#| fig-alt: >
#| fig-alt: |
#| Three plots laid out such that first and third plot are next to each other
#| and the second plot stretched beneath them. The first plot is a
#| scatterplot of highway mileage versus engine size, third plot is a
@ -993,7 +993,7 @@ Patchwork divides up the area you have allotted for your plot using this scale a
```{r}
#| fig-width: 8
#| fig-asp: 1
#| fig-alt: >
#| fig-alt: |
#| Five plots laid out such that first two plots are next to each other. Plots
#| three and four are underneath them. And the fifth plot stretches under them.
#| The patchworked plot is titled "City and highway mileage for cars with
@ -1065,7 +1065,7 @@ If you'd like to learn more about combining and layout out multiple plots with p
#| fig-width: 7
#| fig-asp: 0.8
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| Three plots: Plot 1 is a scatterplot of highway mileage versus engine size.
#| Plot 2 is side-by-side box plots of highway mileage versus drive train.
#| Plot 3 is side-by-side box plots of city mileage versus drive train.


@ -66,10 +66,10 @@ There are three interrelated rules that make a dataset tidy:
```{r}
#| label: fig-tidy-structure
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| The following three rules make a dataset tidy: variables are columns,
#| observations are rows, and values are cells.
#| fig-alt: >
#| fig-alt: |
#| Three panels, each representing a tidy data frame. The first panel
#| shows that each variable is a column. The second panel shows that each
#| observation is a row. The third panel shows that each value is
@ -93,7 +93,7 @@ Here are a few small examples showing how you might work with `table1`.
```{r}
#| fig-width: 5
#| fig-alt: >
#| fig-alt: |
#| This figure shows the number of cases in 1999 and 2000 for
#| Afghanistan, Brazil, and China, with year on the x-axis and number
#| of cases on the y-axis. Each point on the plot represents the number
@ -236,9 +236,9 @@ We can see that very few songs stay in the top 100 for more than 20 weeks.
```{r}
#| label: fig-billboard-ranks
#| fig-cap: >
#| fig-cap: |
#| A line plot showing how the rank of a song changes over time.
#| fig-alt: >
#| fig-alt: |
#| A line plot with week on the x-axis and rank on the y-axis, where
#| each line represents a song. Most songs appear to start at a high rank,
#| rapidly accelerate to a low rank, and then decay again. There are
@ -286,10 +286,10 @@ As shown in @fig-pivot-variables, the values in column that was already a variab
```{r}
#| label: fig-pivot-variables
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Columns that are already variables need to be repeated, once for
#| each column that is pivotted.
#| fig-alt: >
#| fig-alt: |
#| A diagram showing how `pivot_longer()` transforms a simple
#| dataset, using color to highlight how the values in the `id` column
#| ("A", "B", "C") are each repeated twice in the output because there are
@ -304,10 +304,10 @@ They need to be repeated once for each row in the original dataset.
```{r}
#| label: fig-pivot-names
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| The column names of pivoted columns become values in a new column. The
#| values need to be repeated once for each row of the original dataset.
#| fig-alt: >
#| fig-alt: |
#| A diagram showing how `pivot_longer()` transforms a simple
#| data set, using color to highlight how column names ("bp1" and
#| "bp2") become the values in a new `measurement` column. They are repeated
@ -323,10 +323,10 @@ They are unwound row by row.
```{r}
#| label: fig-pivot-values
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| The number of values is preserved (not repeated), but unwound
#| row-by-row.
#| fig-alt: >
#| fig-alt: |
#| A diagram showing how `pivot_longer()` transforms data,
#| using color to highlight how the cell values (blood pressure measurements)
#| become the values in a new `value` column. They are unwound row-by-row,
@ -374,11 +374,11 @@ You can imagine this happening in two steps (first pivoting and then separating)
```{r}
#| label: fig-pivot-multiple-names
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Pivoting columns with multiple pieces of information in the names
#| means that each column name now fills in values in multiple output
#| columns.
#| fig-alt: >
#| fig-alt: |
#| A diagram that uses color to illustrate how supplying `names_sep`
#| and multiple `names_to` creates multiple variables in the output.
#| The input has variable names "x_1" and "y_2" which are split up
@ -421,12 +421,12 @@ When you use `".value"` in `names_to`, the column names in the input contribute
```{r}
#| label: fig-pivot-names-and-values
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Pivoting with `names_to = c(".value", "num")` splits the column names
#| into two components: the first part determines the output column
#| name (`x` or `y`), and the second part determines the value of the
#| `num` column.
#| fig-alt: >
#| fig-alt: |
#| A diagram that uses color to illustrate how the special ".value"
#| sentinel works. The input has names "x_1", "x_2", "y_1", and "y_2",
#| and we want to use the first component ("x", "y") as a variable name


@ -521,9 +521,9 @@ You'll need to make one change to your RStudio options to use `|>` instead of `%
```{r}
#| label: fig-pipe-options
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| To insert `|>`, make sure the "Use native pipe operator" option is checked.
#| fig-alt: >
#| fig-alt: |
#| Screenshot showing the "Use native pipe operator" option which can
#| be found on the "Editing" panel of the "Code" options.
@ -852,7 +852,7 @@ When we plot the skill of the batter (measured by the batting average, `performa
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of number of batting performance vs. batting opportunites
#| overlaid with a smoothed line. Average performance increases sharply
#| from 0.2 at when n is 1 to 0.25 when n is ~1000. Average performance


@ -130,7 +130,7 @@ Our ultimate goal in this chapter is to recreate the following visualization dis
```{r}
#| echo: false
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins, with a
#| best fit line of the relationship between these two variables
#| overlaid. The plot displays a positive, fairly linear, and relatively
@ -163,7 +163,7 @@ The first argument of `ggplot()` is the dataset to use in the graph and so `ggpl
This is not a very exciting plot, but you can think of it like an empty canvas you'll paint the remaining layers of your plot onto.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A blank, gray plot area.
ggplot(data = penguins)
@ -178,7 +178,7 @@ ggplot2 looks for the mapped variables in the `data` argument, in this case, `pe
The following plot shows the result of adding these mappings.
```{r}
#| fig-alt: >
#| fig-alt: |
#| The plot shows flipper length on the x-axis, with values that range from
#| 170 to 230, and body mass on the y-axis, with values that range from 3000
#| to 6000.
@ -203,7 +203,7 @@ ggplot2 comes with many geom functions that each adds a different type of layer
You'll learn a whole bunch of geoms throughout the book, particularly in @sec-layers.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. The plot
#| displays a positive, linear, and relatively strong relationship between
#| these two variables.
@ -242,7 +242,7 @@ Throughout the book you will make many more ggplots and have many more opportuni
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. The plot
#| displays a positive, fairly linear, and relatively strong relationship
#| between these two variables. Species (Adelie, Chinstrap, and Gentoo)
@ -266,7 +266,7 @@ And we will specify that we want to draw the line of best fit based on a `l`inea
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. Overlaid
#| on the scatterplot are three smooth curves displaying the
#| relationship between these variables for each species (Adelie,
@ -289,7 +289,7 @@ Since we want points to be colored based on species but don't want the lines to
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. Overlaid
#| on the scatterplot is a single line of best fit displaying the
#| relationship between these variables for each species (Adelie,
@ -313,7 +313,7 @@ Therefore, in addition to color, we can also map `species` to the `shape` aesthe
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. Overlaid
#| on the scatterplot is a single line of best fit displaying the
#| relationship between these variables for each species (Adelie,
@ -337,7 +337,7 @@ In addition, we can improve the color palette to be colorblind safe with the `sc
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins, with a
#| line of best fit displaying the relationship between these two variables
#| overlaid. The plot displays a positive, fairly linear, and relatively
@ -401,7 +401,7 @@ We finally have a plot that perfectly matches our "ultimate goal"!
```{r}
#| echo: false
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins, colored
#| by bill depth. A smooth curve of the relationship between body mass
#| and flipper length is overlaid. The relationship is positive,
@ -503,7 +503,7 @@ To examine the distribution of a categorical variable, you can use a bar chart.
The height of the bars displays how many observations occurred with each `x` value.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A bar chart of frequencies of species of penguins: Adelie
#| (approximately 150), Chinstrap (approximately 90), Gentoo
#| (approximately 125).
@ -516,7 +516,7 @@ In bar plots of categorical variables with non-ordered levels, like the penguin
Doing so requires transforming the variable to a factor (how R handles categorical data) and then reordering the levels of that factor.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A bar chart of frequencies of species of penguins, where the bars are
#| ordered in decreasing order of their heights (frequencies): Adelie
#| (approximately 150), Gentoo (approximately 125), Chinstrap
@ -537,7 +537,7 @@ One commonly used visualization for distributions of continuous variables is a h
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A histogram of body masses of penguins. The distribution is unimodal
#| and right skewed, ranging between approximately 2500 to 6500 grams.
@ -558,7 +558,7 @@ A binwidth of 200 provides a sensible balance.
#| warning: false
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Two histograms of body masses of penguins, one with binwidth of 20
#| (left) and one with binwidth of 2000 (right). The histogram with binwidth
#| of 20 shows lots of ups and downs in the heights of the bins, creating a
@ -579,7 +579,7 @@ The shape the spaghetti will take draped over blocks can be thought of as the sh
It shows fewer details than a histogram but can make it easier to quickly glean the shape of the distribution, particularly with respect to modes and skewness.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A density plot of body masses of penguins. The distribution is unimodal
#| and right skewed, ranging between approximately 2500 to 6500 grams.
@ -635,9 +635,9 @@ As shown in @fig-eda-boxplot, each boxplot consists of:
```{r}
#| label: fig-eda-boxplot
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Diagram depicting how a boxplot is created.
#| fig-alt: >
#| fig-alt: |
#| A diagram depicting how a boxplot is created following the steps outlined
#| above.
@ -648,7 +648,7 @@ Let's take a look at the distribution of body mass by species using `geom_boxplo
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| Side-by-side box plots of distributions of body masses of Adelie,
#| Chinstrap, and Gentoo penguins. The distribution of Adelie and
#| Chinstrap penguins' body masses appear to be symmetric with
@ -664,7 +664,7 @@ Alternatively, we can make density plots with `geom_density()`.
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A density plot of body masses of penguins by species of penguins. Each
#| species (Adelie, Chinstrap, and Gentoo) is represented with different
#| colored outlines for the density curves.
@ -681,7 +681,7 @@ In the following plot it's *set* to 0.5.
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A density plot of body masses of penguins by species of penguins. Each
#| species (Adelie, Chinstrap, and Gentoo) is represented in different
#| colored outlines for the density curves. The density curves are also
@ -706,7 +706,7 @@ The plot of frequencies show that there are equal numbers of Adelies on each isl
But we don't have a good sense of the percentage balance within each island.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Bar plots of penguin species by island (Biscoe, Dream, and Torgersen)
ggplot(penguins, aes(x = island, fill = species)) +
geom_bar()
@ -716,7 +716,7 @@ The second plot is a relative frequency plot, created by setting `position = "fi
Using this plot we can see that Gentoo penguins all live on Biscoe island and make up roughly 75% of the penguins on that island, Chinstrap all live on Dream island and make up roughly 50% of the penguins on that island, and Adelie live on all three islands and make up all of the penguins on Torgersen.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Bar plots of penguin species by island (Biscoe, Dream, and Torgersen).
#| The bars are scaled to the same height, making it a relative frequency
#| plot
@ -734,7 +734,7 @@ A scatterplot is probably the most commonly used plot for visualizing the relati
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. The plot
#| displays a positive, linear, relatively strong relationship between
#| these two variables.
@ -750,7 +750,7 @@ For example, in the following scatterplot the colors of points represent species
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. The plot
#| displays a positive, linear, relatively strong relationship between
#| these two variables. The points are colored based on the species of the
@ -776,7 +776,7 @@ The variable that you pass to `facet_wrap()` should be categorical.
#| warning: false
#| fig-width: 8
#| fig-asp: 0.33
#| fig-alt: >
#| fig-alt: |
#| A scatterplot of body mass vs. flipper length of penguins. The shapes and
#| colors of points represent species. Penguins from each island are on a
#| separate facet. Within each facet, the relationship between body mass and

@ -333,7 +333,7 @@ wday(datetime, label = TRUE, abbr = FALSE)
We can use `wday()` to see that more flights depart during the week than on the weekend:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A bar chart with days of the week on the x-axis and number of
#| flights on the y-axis. Monday-Friday have roughly the same number of
#| flights, ~48,000, decreasing slightly over the course of the week.
@ -349,7 +349,7 @@ We can also look at the average departure delay by minute within the hour.
There's an interesting pattern: flights leaving in minutes 20-30 and 50-60 have much lower delays than the rest of the hour!
```{r}
#| fig-alt: >
#| fig-alt: |
#| A line chart with minute of actual departure (0-60) on the x-axis and
#| average delay (4-20) on the y-axis. Average delay starts at (0, 12),
#| steadily increases to (18, 20), then sharply drops, hitting at minimum
@ -370,7 +370,7 @@ flights_dt |>
Interestingly, if we look at the *scheduled* departure time we don't see such a strong pattern:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A line chart with minute of scheduled departure (0-60) on the x-axis
#| and average delay (4-16). There is relatively little pattern, just a
#| small suggestion that the average delay decreases from maybe 10 minutes
@ -393,11 +393,11 @@ Always be alert for this sort of pattern whenever you work with data that involv
```{r}
#| label: fig-human-rounding
#| fig-cap: >
#| fig-cap: |
#| A frequency polygon showing the number of flights scheduled to
#| depart each hour. You can see a strong preference for round numbers
#| like 0 and 30 and generally for numbers that are a multiple of five.
#| fig-alt: >
#| fig-alt: |
#| A line plot with departure minute (0-60) on the x-axis and number of
#| flights (0-60000) on the y-axis. Most flights are scheduled to depart
#| on either the hour (~60,000) or the half hour (~35,000). Otherwise,
@ -415,7 +415,7 @@ Each function takes a vector of dates to adjust and then the name of the unit to
This, for example, allows us to plot the number of flights per week:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A line plot with week (Jan-Dec 2013) on the x-axis and number of
#| flights (2,000-7,000) on the y-axis. The pattern is fairly flat from
#| February to November with around 7,000 flights per week. There are
@ -431,7 +431,7 @@ flights_dt |>
You can use rounding to show the distribution of flights across the course of a day by computing the difference between `dep_time` and the earliest instant of that day:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A line plot with departure time on the x-axis. This is in units of seconds
#| since midnight so it's hard to interpret.
flights_dt |>
@ -444,7 +444,7 @@ Computing the difference between a pair of date-times yields a difftime (more on
We can convert that to an `hms` object to get a more useful x-axis:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A line plot with departure time (midnight to midnight) on the x-axis
#| and number of flights on the y-axis (0 to 15,000). There are very few
#| (<100) flights before 5am. The number of flights then rises rapidly

@ -159,7 +159,7 @@ It's often useful to change the order of the factor levels in a visualization.
For example, imagine you want to explore the average number of hours spent watching TV per day across religions:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A scatterplot with tvhours on the x-axis and religion on the y-axis.
#| The y-axis is ordered seemingly arbitrarily making it hard to get
#| any sense of overall pattern.
@ -183,7 +183,7 @@ We can improve it by reordering the levels of `relig` using `fct_reorder()`.
- Optionally, `fun`, a function that's used if there are multiple values of `x` for each value of `f`. The default value is `median`.
```{r}
#| fig-alt: >
#| fig-alt: |
#| The same scatterplot as above, but now the religion is displayed in
#| increasing order of tvhours. "Other eastern" has the fewest tvhours
#| under 2, and "Don't know" has the highest (over 5).
@ -210,7 +210,7 @@ relig_summary |>
What if we create a similar plot looking at how average age varies across reported income level?
```{r}
#| fig-alt: >
#| fig-alt: |
#| A scatterplot with age on the x-axis and income on the y-axis. Income
#| has been reordered in order of average age which doesn't make much
#| sense. One section of the y-axis goes from $6000-6999, then <$1000,
@ -235,7 +235,7 @@ You can use `fct_relevel()`.
It takes a factor, `f`, and then any number of levels that you want to move to the front of the line.
```{r}
#| fig-alt: >
#| fig-alt: |
#| The same scatterplot but now "Not Applicable" is displayed at the
#| bottom of the y-axis. Generally there is a positive association
#| between income and age, and the income band with the highest average
@ -254,7 +254,7 @@ This makes the plot easier to read because the colors of the line at the far rig
```{r}
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| A line plot with age on the x-axis and proportion on the y-axis.
#| There is one line for each category of marital status: no answer,
#| never married, separated, divorced, widowed, and married. It is
@ -289,7 +289,7 @@ Finally, for bar plots, you can use `fct_infreq()` to order levels in decreasing
Combine it with `fct_rev()` if you want them in increasing frequency so that in the bar plot largest values are on the right, not the left.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A bar chart of marital status ordered from least to most common:
#| no answer (~0), separated (~1,000), widowed (~2,000), divorced
#| (~3,000), never married (~5,000), married (~10,000).

@ -13,10 +13,10 @@ But in more complex cases it might require both tidying and transformation in or
```{r}
#| label: fig-ds-import
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Data import is the beginning of the data science process; without
#| data you can't do data science!
#| fig-alt: >
#| fig-alt: |
#| Our data science model with import highlighted in blue.
#| out.width: NULL

@ -19,12 +19,12 @@ Our model of the steps of a typical data science project looks something like @f
```{r}
#| label: fig-ds-diagram
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| In our model of the data science process, you start with data import
#| and tidying. Next, you understand your data with an iterative cycle of
#| transforming, visualizing, and modeling. You finish the process
#| by communicating your results to other humans.
#| fig-alt: >
#| fig-alt: |
#| A diagram displaying the data science cycle: Import -> Tidy -> Understand
#| (which has the phases Transform -> Visualize -> Model in a cycle) ->
#| Communicate. Surrounding all of these is Communicate.
@ -151,10 +151,10 @@ You'll learn more as we go along![^intro-1]
#| label: fig-rstudio-console
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| The RStudio IDE has two key regions: type R code in the console pane
#| on the left, and look for plots in the output pane on the right.
#| fig-alt: >
#| fig-alt: |
#| The RStudio IDE with the panes Console and Output highlighted.
knitr::include_graphics("diagrams/rstudio/console.png", dpi = 270)
```

@ -94,11 +94,11 @@ These relationships are summarized visually in @fig-flights-relationships.
#| label: fig-flights-relationships
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| Connections between all five data frames in the nycflights13 package.
#| Variables making up a primary key are colored grey, and are connected
#| to their corresponding foreign keys with arrows.
#| fig-alt: >
#| fig-alt: |
#| The relationships between airports, planes, flights, weather, and
#| airlines datasets from the nycflights13 package. airports$faa
#| connected to the flights$origin and flights$dest. planes$tailnum
@ -434,11 +434,11 @@ y <- tribble(
#| label: fig-join-setup
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| Graphical representation of two simple tables. The colored `key`
#| columns map background color to key value. The grey columns represent
#| the "value" columns that are carried along for the ride.
#| fig-alt: >
#| fig-alt: |
#| x and y are two data frames with 2 columns and 3 rows, with contents
#| as described in the text. The values of the keys are colored:
#| 1 is green, 2 is purple, 3 is orange, and 4 is yellow.
@ -454,10 +454,10 @@ The rows and columns in the output are primarily determined by `x`, so the `x` t
#| label: fig-join-setup2
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| To understand how joins work, it's useful to think of every possible
#| match. Here we show that with a grid of connecting lines.
#| fig-alt: >
#| fig-alt: |
#| x and y are placed at right-angles, with horizontal lines extending
#| from x and vertical lines extending from y. There are 3 rows in x and
#| 3 rows in y, which leads to nine intersections representing nine
@ -474,10 +474,10 @@ For example, @fig-join-inner shows an inner join, where rows are retained if and
#| label: fig-join-inner
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| An inner join matches each row in `x` to the row in `y` that has the
#| same value of `key`. Each match becomes a row in the output.
#| fig-alt: >
#| fig-alt: |
#| x and y are placed at right-angles with lines forming a grid of
#| potential matches. Keys 1 and 2 appear in both x and y, so we
#| get a match, indicated by a dot. Each dot corresponds to a row
@ -498,10 +498,10 @@ There are three types of outer joins:
#| label: fig-join-left
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| A visual representation of the left join where every row in `x`
#| appears in the output.
#| fig-alt: >
#| fig-alt: |
#| Compared to the previous diagram showing an inner join, the y table
#| gets a new virtual row containing NA that will match any row in x
#| that didn't otherwise match. This means that the output now has
@ -519,10 +519,10 @@ There are three types of outer joins:
#| label: fig-join-right
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| A visual representation of the right join where every row of `y`
#| appears in the output.
#| fig-alt: >
#| fig-alt: |
#| Compared to the previous diagram showing a left join, the x table
#| now gains a virtual row so that every row in y gets a match in x.
#| val_x contains NA for the row in y that didn't match x.
@ -538,10 +538,10 @@ There are three types of outer joins:
#| label: fig-join-full
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| A visual representation of the full join where every row in `x`
#| and `y` appears in the output.
#| fig-alt: >
#| fig-alt: |
#| Now both x and y have a virtual row that always matches.
#| The result has 4 rows: keys 1, 2, 3, and 4 with all values
#| from val_x and val_y, however key 2, val_y and key 4, val_x are NAs
@ -557,10 +557,10 @@ However, this is not a great representation because while it might jog your memo
#| label: fig-join-venn
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| Venn diagrams showing the difference between inner, left, right, and
#| full joins.
#| fig-alt: >
#| fig-alt: |
#| Venn diagrams for inner, full, left, and right joins. Each join
#| represented with two intersecting circles representing data frames x
#| and y, with x on the right and y on the left. Shading indicates the
@ -588,13 +588,13 @@ To understand what's going let's first narrow our focus to the `inner_join()` an
#| label: fig-join-match-types
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| The three ways a row in `x` can match. `x1` matches
#| one row in `y`, `x2` matches two rows in `y`, `x3` matches
#| zero rows in y. Note that while there are three rows in
#| `x` and three rows in the output, there isn't a direct
#| correspondence between the rows.
#| fig-alt: >
#| fig-alt: |
#| A join diagram where x has key values 1, 2, and 3, and y has
#| key values 1, 2, 2. The output has three rows because key 1 matches
#| one row, key 2 matches two rows, and key 3 matches zero rows.
@ -639,10 +639,10 @@ This means that filtering joins never duplicate rows like mutating joins do.
#| label: fig-join-semi
#| echo: false
#| out-width: null
#| fig-cap: >
#| fig-cap: |
#| In a semi-join it only matters that there is a match; otherwise
#| values in `y` don't affect the output.
#| fig-alt: >
#| fig-alt: |
#| A join diagram with old friends x and y. In a semi join, only the
#| presence of a match matters so the output contains the same columns
#| as x.
@ -654,10 +654,10 @@ knitr::include_graphics("diagrams/join/semi.png", dpi = 270)
#| label: fig-join-anti
#| echo: false
#| out-width: null
#| fig-cap: >
#| fig-cap: |
#| An anti-join is the inverse of a semi-join, dropping rows from `x`
#| that have a match in `y`.
#| fig-alt: >
#| fig-alt: |
#| An anti-join is the inverse of a semi-join so matches are drawn with
#| red lines indicating that they will be dropped from the output.
@ -679,9 +679,9 @@ x |> left_join(y, by = "key", keep = TRUE)
```{r}
#| label: fig-inner-both
#| fig-cap: >
#| fig-cap: |
#| An inner join showing both `x` and `y` keys in the output.
#| fig-alt: >
#| fig-alt: |
#| A join diagram showing an inner join between x and y. The result
#| now includes four columns: key.x, val_x, key.y, and val_y. The
#| values of key.x and key.y are identical, which is why we usually
@ -699,10 +699,10 @@ dplyr's join functions understand this distinction equi and non-equi joins so wi
```{r}
#| label: fig-join-gte
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| A non-equi join where the `x` key must be greater than or equal to
#| the `y` key. Many rows generate multiple matches.
#| fig-alt: >
#| fig-alt: |
#| A join diagram illustrating join_by(key >= key). The first row
#| of x matches one row of y and the second and third rows each match
#| two rows. This means the output has five rows containing each of the
@ -729,9 +729,9 @@ This means the output will have `nrow(x) * nrow(y)` rows.
#| label: fig-join-cross
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| A cross join matches each row in `x` with every row in `y`.
#| fig-alt: >
#| fig-alt: |
#| A join diagram showing a dot for every combination of x and y.
knitr::include_graphics("diagrams/join/cross.png", dpi = 270)
```
@ -785,10 +785,10 @@ For example `join_by(closest(x <= y))` matches the smallest `y` that's greater t
#| label: fig-join-closest
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| A rolling join is similar to a greater-than-or-equal inequality join
#| but only matches the first value.
#| fig-alt: >
#| fig-alt: |
#| A rolling join is a subset of an inequality join so some matches are
#| grayed out indicating that they're not used because they're not the
#| "closest".

@ -62,7 +62,7 @@ We can do this with a scatterplot where the numerical variables are mapped to th
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Two scatterplots next to each other, both visualizing highway fuel
#| efficiency versus engine size of cars and showing a negative
#| association. In the plot on the left class is mapped to the color
@ -96,7 +96,7 @@ Similarly, we can map `class` to `size` or `alpha` aesthetics as well, which con
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Two scatterplots next to each other, both visualizing highway fuel
#| efficiency versus engine size of cars and showing a negative
#| association. In the plot on the left class is mapped to the size
@ -130,7 +130,7 @@ You can also set the visual properties of your geom manually as an argument of y
For example, we can make all of the points in our plot blue:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars
#| that shows a negative association. All points are blue.
@ -151,7 +151,7 @@ You'll need to pick a value that makes sense for that aesthetic:
#| warning: false
#| fig.asp: 0.364
#| fig-align: "center"
#| fig-cap: >
#| fig-cap: |
#| R has 25 built-in shapes that are identified by numbers. There are some
#| seeming duplicates: for example, 0, 15, and 22 are all squares. The
#| difference comes from the interaction of the `color` and `fill`
@ -159,7 +159,7 @@ You'll need to pick a value that makes sense for that aesthetic:
#| the solid shapes (15--20) are filled with `color`; the filled shapes
#| (21--24) have a border of `color` and are filled with `fill`. Shapes are
#| arranged to keep similar shapes next to each other.
#| fig-alt: >
#| fig-alt: |
#| Mapping between shapes and the numbers that represent them: 0 - square,
#| 1 - circle, 2 - triangle point up, 3 - plus, 4 - cross, 5 - diamond,
#| 6 - triangle point down, 7 - square cross, 8 - star, 9 - diamond plus,
@ -200,7 +200,7 @@ In the next section we dive deeper into geoms.
```{r}
#| fig-show: hide
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars
#| that shows a negative association. All points are red and
#| the legend shows a red point that is mapped to the word blue.
@ -225,7 +225,7 @@ How are these two plots similar?
#| message: false
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| There are two plots. The plot on the left is a scatterplot of highway
#| fuel efficiency versus engine size of cars and the plot on the right
#| shows a smooth curve that follows the trajectory of the relationship
@ -270,7 +270,7 @@ On the other hand, you *could* set the linetype of a line.
#| message: false
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Two plots of highway fuel efficiency versus engine size of cars.
#| The data are represented with smooth curves. On the left, three
#| smooth curves, all with the same linetype. On the right, three
@ -295,7 +295,7 @@ If this sounds strange, we can make it clearer by overlaying the lines on top of
```{r}
#| message: false
#| fig-alt: >
#| fig-alt: |
#| A plot of highway fuel efficiency versus engine size of cars. The data
#| are represented with points (colored by drive train) as well as smooth
#| curves (where line type is determined based on drive train as well).
@ -319,7 +319,7 @@ It is convenient to rely on this feature because the `group` aesthetic by itself
#| fig-width: 3
#| fig-asp: 1
#| message: false
#| fig-alt: >
#| fig-alt: |
#| Three plots, each with highway fuel efficiency on the y-axis and engine
#| size of cars, where data are represented by a smooth curve. The first plot
#| only has these two variables, the center plot has three separate smooth
@ -348,7 +348,7 @@ This makes it possible to display different aesthetics in different layers.
```{r}
#| message: false
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, where
#| points are colored according to the car class. A smooth curve following
#| the trajectory of the relationship between highway fuel efficiency versus
@ -365,7 +365,7 @@ The local data argument in `geom_point()` overrides the global data argument in
```{r}
#| message: false
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, where
#| points are colored according to the car class. A smooth curve following
#| the trajectory of the relationship between highway fuel efficiency versus
@ -391,7 +391,7 @@ For example, the histogram and density plot below reveal that the distribution o
```{r}
#| layout-ncol: 3
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| Three plots: histogram, density plot, and box plot of highway
#| mileage.
@ -462,7 +462,7 @@ To learn more about any single geom, use the help (e.g., `?geom_smooth`).
#| message: false
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| There are six scatterplots in this figure, arranged in a 3x2 grid.
#| In all plots highway fuel efficiency of cars are on the y-axis and
#| engine size is on the x-axis. The first plot shows all points in black
@ -503,7 +503,7 @@ To learn more about any single geom, use the help (e.g., `?geom_smooth`).
In @sec-data-visualization you learned about faceting with `facet_wrap()`, which splits a plot into subplots that each display one subset of the data based on a categorical variable.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars,
#| faceted by class, with facets spanning two rows.
@ -516,7 +516,7 @@ To facet your plot with the combination of two variables, switch from `facet_wra
The first argument of `facet_grid()` is also a formula, but now it's a double sided formula: `rows ~ cols`.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars, faceted
#| by number of cylinders across rows and by type of drive train across
#| columns. This results in a 4x3 grid of 12 facets. Some of these facets have
@ -533,7 +533,7 @@ This is useful when you want to compare data across facets but it can be limitin
Setting the `scales` argument in a faceting function to `"free"` will allow for different axis scales across both rows and columns, `"free_x"` will allow for different scales across rows, and `"free_y"` will allow for different scales across columns.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars,
#| faceted by number of cylinders across rows and by type of drive train
#| across columns. This results in a 4x3 grid of 12 facets. Some of these
@ -631,7 +631,7 @@ The `diamonds` dataset is in the ggplot2 package and contains information on \~5
The chart shows that more diamonds are available with high quality cuts than with low quality cuts.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Bar chart of number of each cut of diamond. There are roughly 1500
#| Fair, 5000 Good, 12000 Very Good, 14000 Premium, and 22000 Ideal cut
#| diamonds.
@ -659,11 +659,11 @@ The algorithm used to calculate new values for a graph is called a **stat**, sho
#| label: fig-vis-stat-bar
#| echo: false
#| out-width: "100%"
#| fig-cap: >
#| fig-cap: |
#| When creating a bar chart we first start with the raw data, then
#| aggregate it to count the number of observations in each bar,
#| and finally map those computed variables to plot aesthetics.
#| fig-alt: >
#| fig-alt: |
#| A figure demonstrating three steps of creating a bar chart.
#| Step 1. geom_bar() begins with the diamonds data set. Step 2. geom_bar()
#| transforms the data with the count stat, which returns a data set of
@ -688,7 +688,7 @@ However, there are three reasons why you might need to use a stat explicitly:
```{r}
#| warning: false
#| fig-alt: >
#| fig-alt: |
#| Bar chart of number of each cut of diamond. There are roughly 1500
#| Fair, 5000 Good, 12000 Very Good, 14000 Premium, and 22000 Ideal cut
#| diamonds.
@ -703,7 +703,7 @@ However, there are three reasons why you might need to use a stat explicitly:
For example, you might want to display a bar chart of proportions, rather than counts:
```{r}
#| fig-alt: >
#| fig-alt: |
#| Bar chart of proportion of each cut of diamond. Roughly, Fair
#| diamonds make up 0.03, Good 0.09, Very Good 0.22, Premium 0.26, and
#| Ideal 0.40.
@ -718,7 +718,7 @@ However, there are three reasons why you might need to use a stat explicitly:
For example, you might use `stat_summary()`, which summarizes the y values for each unique x value, to draw attention to the summary that you're computing:
```{r}
#| fig-alt: >
#| fig-alt: |
#| A plot with depth on the y-axis and cut on the x-axis (with levels
#| fair, good, very good, premium, and ideal) of diamonds. For each level
#| of cut, vertical lines extend from minimum to maximum depth for diamonds
@ -774,7 +774,7 @@ You can color a bar chart using either the `color` aesthetic, or, more usefully,
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Two bar charts of drive types of cars. In the first plot, the bars have
#| colored borders. In the second plot, they're filled with colors. Heights
#| of the bars correspond to the number of cars in each drive category.
@ -792,7 +792,7 @@ Note what happens if you map the fill aesthetic to another variable, like `class
Each colored rectangle represents a combination of `drv` and `class`.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Segmented bar chart of drive types of cars, where each bar is filled with
#| colors for the classes of cars. Heights of the bars correspond to the
#| number of cars in each drive category, and heights of the colored
@ -813,7 +813,7 @@ If you don't want a stacked bar chart, you can use one of three other options: `
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| Segmented bar chart of drive types of cars, where each bar is filled with
#| colors for the classes of cars. Heights of the bars correspond to the
#| number of cars in each drive category, and heights of the colored
@ -842,7 +842,7 @@ If you don't want a stacked bar chart, you can use one of three other options: `
```{r}
#| layout-ncol: 2
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| On the left, segmented bar chart of drive types of cars, where each bar is
#| filled with colors for the levels of class. Height of each bar is 1 and
#| heights of the colored segments represent the proportions of cars
@ -869,7 +869,7 @@ Did you notice that the plot displays only 126 points, even though there are 234
```{r}
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| Scatterplot of highway fuel efficiency versus engine size of cars that
#| shows a negative association.
@ -887,7 +887,7 @@ You can avoid this gridding by setting the position adjustment to "jitter".
This spreads the points out because no two points are likely to receive the same amount of random noise.
```{r}
#| fig-alt: >
#| fig-alt: |
#| Jittered scatterplot of highway fuel efficiency versus engine size of cars.
#| The plot shows a negative association.
@ -945,7 +945,7 @@ There are two other coordinate systems that are occasionally helpful.
#| layout-ncol: 2
#| fig-width: 3
#| message: false
#| fig-alt: >
#| fig-alt: |
#| Two maps of the boundaries of New Zealand. In the first plot the aspect
#| ratio is incorrect, in the second plot it is correct.
@ -966,7 +966,7 @@ There are two other coordinate systems that are occasionally helpful.
#| layout-ncol: 2
#| fig-width: 3
#| fig-asp: 1
#| fig-alt: >
#| fig-alt: |
#| There are two plots. On the left is a bar chart of clarity of diamonds,
#| on the right is a Coxcomb chart of the same data.
@ -1030,12 +1030,12 @@ You'd then select a coordinate system to place the geoms into, using the locatio
```{r}
#| label: fig-visualization-grammar
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| A figure demonstrating the steps for going from raw data to table of
#| frequencies where each row represents one level of cut and a count column
#| shows how many diamonds are in that cut level. Then, these values are
#| mapped to heights of bars.
#| fig-cap: >
#| fig-cap: |
#| Steps for going from raw data to a table of frequencies to a bar plot where
#| the heights of the bar represent the frequencies.

@ -210,11 +210,11 @@ For example, `df |> filter(!is.na(x))` finds all rows where `x` is not missing a
#| label: fig-bool-ops
#| echo: false
#| out-width: NULL
#| fig-cap: >
#| fig-cap: |
#| The complete set of Boolean operations. `x` is the left-hand
#| circle, `y` is the right-hand circle, and the shaded regions show
#| which parts each operator selects.
#| fig-alt: >
#| fig-alt: |
#| Six Venn diagrams, each explaining a given logical operator. The
#| circles (sets) in each of the Venn diagrams represent x and y. 1. y &
#| !x is y but none of x; x & y is the intersection of x and y; x & !y is

@ -229,7 +229,7 @@ You can force them to display by supplying `drop = FALSE` to the appropriate dis
```{r}
#| layout-ncol: 2
#| fig-width: 3
#| fig-alt: >
#| fig-alt: |
#| A bar chart with a single value on the x-axis, "no".
#|
#| The same bar chart as the last plot, but now with two values on

@ -241,12 +241,12 @@ The results are shown in @fig-prop-cancelled.
```{r}
#| label: fig-prop-cancelled
#| fig-cap: >
#| fig-cap: |
#| A line plot with scheduled departure hour on the x-axis, and proportion
#| of cancelled flights on the y-axis. Cancellations seem to accumulate
#| over the course of the day until 8pm, very late flights are much
#| less likely to be cancelled.
#| fig-alt: >
#| fig-alt: |
#| A line plot showing how proportion of cancelled flights changes over
#| the course of the day. The proportion starts low at around 0.5% at
#| 6am, then steadily increases over the course of the day until peaking
@ -584,10 +584,10 @@ The median delay is always smaller than the mean delay because flights sometimes
```{r}
#| label: fig-mean-vs-median
#| fig-cap: >
#| fig-cap: |
#| A scatterplot showing the differences of summarizing daily departure
#| delay with median instead of mean.
#| fig-alt: >
#| fig-alt: |
#| All points fall below a 45° line, meaning that the median delay is
#| always less than the mean delay. Most points are clustered in a
#| dense region of mean [0, 20] and median [0, 5]. As the mean delay
@ -665,12 +665,12 @@ This suggests that the mean is unlikely to be a good summary and we might prefer
```{r}
#| echo: false
#| label: fig-flights-dist
#| fig-cap: >
#| fig-cap: |
#| (Left) The histogram of the full data is extremely skewed making it
#| hard to get any details. (Right) Zooming into delays of less than two
#| hours makes it possible to see what's happening with the bulk of the
#| observations.
#| fig-alt: >
#| fig-alt: |
#| Two histograms of `dep_delay`. On the left, it's very hard to see
#| any pattern except that there's a very large spike around zero, the
#| bars rapidly decay in height, and for most of the plot, you can't
@ -700,7 +700,7 @@ In the following plot 365 frequency polygons of `dep_delay`, one for each day, a
The distributions seem to follow a common pattern, suggesting it's fine to use the same summary for each day.
```{r}
#| fig-alt: >
#| fig-alt: |
#| The distribution of `dep_delay` is highly right skewed with a strong
#| peak slightly less than 0. The 365 frequency polygons are mostly
#| overlapping forming a thick black band.

@ -13,9 +13,9 @@ Programming is a cross-cutting skill needed for all data science work: you must
#| label: fig-ds-program
#| echo: false
#| out.width: ~
#| fig-cap: >
#| fig-cap: |
#| Programming is the water in which all the other components swim.
#| fig-alt: >
#| fig-alt: |
#| Our model of the data science process with program (import, tidy,
#| transform, visualize, model, and communicate, i.e. everything)
#| highlighted in blue.

@ -127,10 +127,10 @@ The advantage of this two step workflow is that you can create a very wide range
#| label: fig-quarto-flow
#| echo: false
#| out-width: "75%"
#| fig-alt: >
#| fig-alt: |
#| Workflow diagram starting with a qmd file, then knitr, then md,
#| then pandoc, then PDF, MS Word, or HTML.
#| fig-cap: >
#| fig-cap: |
#| Diagram of Quarto workflow from qmd, to knitr, to md, to pandoc,
#| to output in PDF, MS Word, or HTML formats.

@ -136,7 +136,7 @@ It looks like they've radically increased in popularity lately!
[^regexps-4]: This gives us the proportion of **names** that contain an "x"; if you wanted the proportion of babies with a name containing an x, you'd need to perform a weighted mean.
```{r}
#| fig-alt: >
#| fig-alt: |
#| A time series showing the proportion of baby names that contain the letter x.
#| The proportion declines gradually from 8 per 1000 in 1880 to 4 per 1000 in
#| 1980, then increases rapidly to 16 per 1000 in 2019.

@ -54,9 +54,9 @@ For the rest of the chapter we will focus on using `read_excel()`.
#| label: fig-students-excel
#| echo: false
#| fig-width: 5
#| fig-cap: >
#| fig-cap: |
#| Spreadsheet called students.xlsx in Excel.
#| fig-alt: >
#| fig-alt: |
#| A look at the students spreadsheet in Excel. The spreadsheet contains
#| information on 6 students, their ID, full name, favourite food, meal plan,
#| and age.
@ -189,9 +189,9 @@ Each worksheet contains information on penguins from a different island where da
```{r}
#| label: fig-penguins-islands
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Spreadsheet called penguins.xlsx in Excel containing three worksheets.
#| fig-alt: >
#| fig-alt: |
#| A look at the penguins spreadsheet in Excel. The spreadsheet has
#| three worksheets: Torgersen Island, Biscoe Island, and Dream Island.
@ -252,9 +252,9 @@ Since many use Excel spreadsheets for presentation as well as for data storage,
```{r}
#| label: fig-deaths-excel
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Spreadsheet called deaths.xlsx in Excel.
#| fig-alt: >
#| fig-alt: |
#| A look at the deaths spreadsheet in Excel. The spreadsheet has four rows
#| on top that contain non-data information; the text 'For the sake of
#| consistency in the data layout, which is really a beautiful thing, I will
@ -360,9 +360,9 @@ These can be turned off by setting `col_names` and `format_headers` arguments to
#| label: fig-bake-sale-excel
#| echo: false
#| fig-width: 5
#| fig-cap: >
#| fig-cap: |
#| Spreadsheet called bake_sale.xlsx in Excel.
#| fig-alt: >
#| fig-alt: |
#| Bake sale data frame created earlier in Excel.
knitr::include_graphics("screenshots/import-spreadsheets-bake-sale.png")
@ -395,7 +395,7 @@ A good way of familiarizing yourself with the coding style used in a new package
```{r}
#| echo: false
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| A spreadsheet with 3 columns (group, subgroup, and id) and 12 rows.
#| The group column has two values: 1 (spanning 7 merged rows) and 2
#| (spanning 5 merged rows). The subgroup column has four values: A
@ -428,7 +428,7 @@ A good way of familiarizing yourself with the coding style used in a new package
```{r}
#| echo: false
#| fig-width: 4
#| fig-alt: >
#| fig-alt: |
#| A spreadsheet with 3 columns (group, subgroup, and id) and 12 rows. The
#| group column has two values: 1 (spanning 7 merged rows) and 2 (spanning
#| 5 merged rows). The subgroup column has four values: A (spanning 3 merged
@ -456,7 +456,7 @@ A good way of familiarizing yourself with the coding style used in a new package
```{r}
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| A spreadsheet with 2 columns and 13 rows. The first two rows have text
#| containing information about the sheet. Row 1 says "This file contains
#| information on sales". Row 2 says "Data are organized by brand name, and
@ -543,9 +543,9 @@ This is the same dataset as in @fig-students-excel, except it's stored in a Goog
```{r}
#| label: fig-students-googlesheets
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Google Sheet called students in a browser window.
#| fig-alt: >
#| fig-alt: |
#| A look at the students spreadsheet in Google Sheets. The spreadsheet contains
#| information on 6 students, their ID, full name, favourite food, meal plan,
#| and age.

@ -12,10 +12,10 @@ In this part of the book, you'll learn about the most important types of variabl
```{r}
#| label: fig-ds-transform
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| The options for data transformation depend heavily on the type of
#| data involved, the subject of this part of the book.
#| fig-alt: >
#| fig-alt: |
#| Our data science model, with transform highlighted in blue.
#| out.width: NULL

@ -14,9 +14,9 @@ In this part of the book, you'll learn about visualizing data in further depth.
```{r}
#| label: fig-ds-visualize
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Data visualization is often the first step in data exploration.
#| fig-alt: >
#| fig-alt: |
#| Our data science model, with visualize highlighted in blue.
#| out.width: NULL

@ -450,9 +450,9 @@ At the time we wrote this chapter, the page looked like @fig-scraping-imdb.
```{r}
#| label: fig-scraping-imdb
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Screenshot of the IMDb top movies web page taken on 2022-12-05.
#| fig-alt: >
#| fig-alt: |
#| The screenshot shows a table with columns "Rank and Title",
#| "IMDb Rating", and "Your Rating". 9 movies out of the top 250
#| are shown. The top 5 are the Shawshank Redemption, The Godfather,

@ -14,10 +14,10 @@ The later parts of the book will hit each of these topics in more depth, increas
#| label: fig-ds-whole-game
#| echo: false
#| out.width: NULL
#| fig-cap: >
#| fig-cap: |
#| In this section of the book, you'll learn how to import,
#| tidy, transform, and visualize data.
#| fig-alt: >
#| fig-alt: |
#| A diagram displaying the data science cycle: Import -> Tidy ->
#| Understand (which has the phases Transform -> Visualize -> Model in a
#| cycle) -> Communicate. Surrounding all of these is Program

@ -206,7 +206,7 @@ Note that the environment tab in the upper right pane displays all of the object
```{r}
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| Environment tab of RStudio which shows r_rocks, this_is_a_really_long_name,
#| x, and y in the Global Environment.

@ -24,10 +24,10 @@ And once you have written code that works and does what you want, you can save i
#| label: fig-rstudio-script
#| echo: false
#| out-width: ~
#| fig-cap: >
#| fig-cap: |
#| Opening the script editor adds a new pane at the top-left of the
#| IDE.
#| fig-alt: >
#| fig-alt: |
#| RStudio IDE with Editor, Console, and Output highlighted.
knitr::include_graphics("diagrams/rstudio/script.png", dpi = 270)
```
@ -75,7 +75,7 @@ In the script editor, RStudio will highlight syntax errors with a red squiggly l
```{r}
#| echo: false
#| out-width: ~
#| fig-alt: >
#| fig-alt: |
#| Script editor with the script x y <- 10. A red X indicates that there is
#| a syntax error. The syntax error is also highlighted with a red squiggly line.
@ -87,7 +87,7 @@ Hover over the cross to see what the problem is:
```{r}
#| echo: false
#| out-width: ~
#| fig-alt: >
#| fig-alt: |
#| Script editor with the script x y <- 10. A red X indicates that there is
#| a syntax error. The syntax error is also highlighted with a red squiggly line.
#| Hovering over the X shows a text box with the text unexpected token y and
@ -101,7 +101,7 @@ RStudio will also let you know about potential problems:
```{r}
#| echo: false
#| out-width: ~
#| fig-alt: >
#| fig-alt: |
#| Script editor with the script 3 == NA. A yellow exclamation mark
#| indicates that there may be a potential problem. Hovering over the
#| exclamation mark shows a text box with the text use is.na to check
@ -184,10 +184,10 @@ There's nothing worse than discovering three months after the fact that you've o
```{r}
#| label: fig-blank-slate
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| Copy these options in your RStudio options to always start your
#| RStudio session with a clean slate.
#| fig-alt: >
#| fig-alt: |
#| RStudio Global Options window where the option Restore .RData into workspace
#| at startup is not checked. Also, the option Save workspace to .RData
#| on exit is set to Never.
@ -222,7 +222,7 @@ RStudio shows your current working directory at the top of the console:
```{r}
#| echo: false
#| fig-alt: >
#| fig-alt: |
#| The Console tab shows the current working directory as
#| ~/Documents/r4ds.
#| out-width: ~
@ -263,11 +263,11 @@ Click File \> New Project, then follow the steps shown in @fig-new-project.
```{r}
#| label: fig-new-project
#| echo: false
#| fig-cap: >
#| fig-cap: |
#| To create a new project: (top) first click New Directory, then (middle)
#| click New Project, then (bottom) fill in the directory (project) name,
#| choose a good subdirectory for its home and click Create Project.
#| fig-alt: >
#| fig-alt: |
#| Three screenshots of the New Project menu. In the first screenshot,
#| the Create Project window is shown and New Directory is selected.
#| In the second screenshot, the Project Type window is shown and

@ -24,10 +24,10 @@ Open the palette by pressing Cmd/Ctrl + Shift + P, then type "styler" to see all
#| label: fig-styler
#| echo: false
#| out-width: null
#| fig-cap: >
#| fig-cap: |
#| RStudio's command palette makes it easy to access every RStudio command
#| using only the keyboard.
#| fig-alt: >
#| fig-alt: |
#| A screenshot showing the command palette after typing "styler", showing
#| the four styling tools provided by the package.
@ -267,7 +267,7 @@ RStudio provides a keyboard shortcut to create these headers (Cmd/Ctrl + Shift +
#| label: fig-rstudio-sections
#| echo: false
#| out-width: null
#| fig-cap: >
#| fig-cap: |
#| After adding sectioning comments to your script, you can
#| easily navigate to them using the code navigation tool in the
#| bottom-left of the script editor.