Fix vs. e.g. i.e. punctuation (#1157)
Co-authored-by: Mine Cetinkaya-Rundel <cetinkaya.mine@gmail.com>
EDA.qmd
							| @@ -261,7 +261,7 @@ You'll need to figure out what caused them (e.g. a data entry error) and disclos | ||||
|     How many are 1 carat? | ||||
|     What do you think is the cause of the difference? | ||||
|  | ||||
| 4.  Compare and contrast `coord_cartesian()` vs `xlim()` or `ylim()` when zooming in on a histogram. | ||||
| 4.  Compare and contrast `coord_cartesian()` vs. `xlim()` or `ylim()` when zooming in on a histogram. | ||||
|     What happens if you leave `binwidth` unset? | ||||
|     What happens if you try and zoom so only half a bar shows? | ||||
|  | ||||
| @@ -488,7 +488,7 @@ ggplot(mpg, | ||||
|  | ||||
| 4.  One problem with boxplots is that they were developed in an era of much smaller datasets and tend to display a prohibitively large number of "outlying values". | ||||
|     One approach to remedy this problem is the letter value plot. | ||||
|     Install the lvplot package, and try using `geom_lv()` to display the distribution of price vs cut. | ||||
|     Install the lvplot package, and try using `geom_lv()` to display the distribution of price vs. cut. | ||||
|     What do you learn? | ||||
|     How do you interpret the plots? | ||||
|  | ||||
| @@ -654,7 +654,7 @@ ggplot(smaller, aes(x = carat, y = price)) + | ||||
| #### Exercises | ||||
|  | ||||
| 1.  Instead of summarizing the conditional distribution with a boxplot, you could use a frequency polygon. | ||||
|     What do you need to consider when using `cut_width()` vs `cut_number()`? | ||||
|     What do you need to consider when using `cut_width()` vs. `cut_number()`? | ||||
|     How does that impact a visualization of the 2d distribution of `carat` and `price`? | ||||
|  | ||||
| 2.  Visualize the distribution of `carat`, partitioned by `price`. | ||||
|   | ||||
| @@ -243,7 +243,7 @@ Once you've mastered `read_csv()`, using readr's other functions is straightforw | ||||
| 6.  Practice referring to non-syntactic names in the following data frame by: | ||||
|  | ||||
|     a.  Extracting the variable called `1`. | ||||
|     b.  Plotting a scatterplot of `1` vs `2`. | ||||
|     b.  Plotting a scatterplot of `1` vs. `2`. | ||||
|     c.  Creating a new column called `3` which is `2` divided by `1`. | ||||
|     d.  Renaming the columns to `one`, `two` and `three`. | ||||
|  | ||||
|   | ||||
| @@ -668,7 +668,7 @@ When we plot the skill of the batter (measured by the batting average, `ba`) aga | ||||
| ```{r} | ||||
| #| warning: false | ||||
| #| fig-alt: > | ||||
| #|   A scatterplot of number of batting opportunites vs batting performance | ||||
| #|   A scatterplot of number of batting opportunites vs. batting performance | ||||
| #|   overlaid with a smoothed line. Average performance increases sharply | ||||
| #|   from 0.2 at when n is 1 to 0.25 when n is ~1000. Average performance | ||||
| #|   continues to increase linearly at a much shallower slope reaching | ||||
|   | ||||
| @@ -542,7 +542,7 @@ flights_sub <- function(rows, cols) { | ||||
| flights_sub(dest == "IAH", contains("time")) | ||||
| ``` | ||||
|  | ||||
| ### Data-masking vs tidy-selection | ||||
| ### Data-masking vs. tidy-selection | ||||
|  | ||||
| Sometimes you want to select variables inside a function that uses data-masking. | ||||
| For example, imagine you want to write a `count_missing()` that counts the number of missing observations in rows. | ||||
|   | ||||
| @@ -627,7 +627,7 @@ An alternative is to use the `median()`, which finds a value that lies in the "m | ||||
| Depending on the shape of the distribution of the variable you're interested in, mean or median might be a better measure of center. | ||||
| For example, for symmetric distributions we generally report the mean while for skewed distributions we usually report the median. | ||||
|  | ||||
| @fig-mean-vs-median compares the mean vs the median when looking at the hourly vs median departure delay. | ||||
| @fig-mean-vs-median compares the mean vs. the median when looking at the hourly vs. median departure delay. | ||||
| The median delay is always smaller than the mean delay because flights sometimes leave multiple hours late, but never leave multiple hours early. | ||||
|  | ||||
| ```{r} | ||||
| @@ -761,7 +761,7 @@ flights |> | ||||
| ``` | ||||
|  | ||||
| Don't be afraid to explore your own custom summaries specifically tailored for the data that you're working with. | ||||
| In this case, that might mean separately summarizing the flights that left early vs the flights that left late, or given that the values are so heavily skewed, you might try a log-transformation. | ||||
| In this case, that might mean separately summarizing the flights that left early vs. the flights that left late, or given that the values are so heavily skewed, you might try a log-transformation. | ||||
| Finally, don't forget what you learned in @sec-sample-size: whenever creating numerical summaries, it's a good idea to include the number of observations in each group. | ||||
|  | ||||
| ### Positions | ||||
|   | ||||
| @@ -34,7 +34,7 @@ Four most common are Latin, Chinese, Arabic, and Devangari, which represent thre | ||||
|  | ||||
| -   Chinese. | ||||
|     Logograms. | ||||
|     Half width vs full width. | ||||
|     Half width vs. full width. | ||||
|     English letters are roughly twice as high as they are wide. | ||||
|     Chinese characters are roughly square. | ||||
|  | ||||
|   | ||||
| @@ -41,7 +41,7 @@ There are two ways to set the output of a document: | ||||
|  | ||||
| Quarto offers a wide range of output formats. | ||||
| You can find the complete list at <https://quarto.org/docs/output-formats/all-formats.html>. | ||||
| Many formats share some output options (e.g., `toc: true` for including a table of contents), but others have options that are format specific (e.g., `code-fold: true` collapses code chunks into a `<details>` tag for HTML output so the user can display it on demand, it's not applicable in a PDF or Word document). | ||||
| Many formats share some output options (e.g. `toc: true` for including a table of contents), but others have options that are format specific (e.g. `code-fold: true` collapses code chunks into a `<details>` tag for HTML output so the user can display it on demand, it's not applicable in a PDF or Word document). | ||||
|  | ||||
| To override the default voptions, you need to use an expanded `format` field. | ||||
| For example, if you wanted to render an `html` with a floating table of contents, you'd use: | ||||
|   | ||||
							
								
								
									
quarto.qmd
							| @@ -314,7 +314,7 @@ The most important set of options controls if your code block is executed and wh | ||||
|     It's also useful if you're teaching R and want to deliberately include an error. | ||||
|     The default, `error: false` causes rendering to fail if there is a single error in the document. | ||||
|  | ||||
| Each of these chunk options get added to the header of the chunk, following `#|`, e.g., in the following chunk the result is not printed since `eval` is set to false. | ||||
| Each of these chunk options get added to the header of the chunk, following `#|`, e.g. in the following chunk the result is not printed since `eval` is set to false. | ||||
|  | ||||
| ```{r} | ||||
| #| echo: fenced | ||||
| @@ -336,6 +336,35 @@ The following table summarizes which types of output each option suppresses: | ||||
| | `message: false` |          |           |        |       |    \-    |          | | ||||
| | `warning: false` |          |           |        |       |          |    \-    | | ||||
|  | ||||
| ### Global options | ||||
|  | ||||
| As you work more with knitr, you will discover that some of the default chunk options don't fit your needs and you want to change them. | ||||
|  | ||||
| You can do this by adding the preferred options in the document YAML, under `execute`. | ||||
| For example, if you are preparing a report for an audience who does not need to see your code but only your results and narrative, you might set `echo: false` at the document level. | ||||
| That will hide the code by default, so only showing the chunks you deliberately choose to show (with `echo: true`). | ||||
| You might consider setting `message: false` and `warning: false`, but that would make it harder to debug problems because you wouldn't see any messages in the final document. | ||||
|  | ||||
| ``` yaml | ||||
| title: "My report" | ||||
| execute: | ||||
|   echo: false | ||||
| ``` | ||||
|  | ||||
| Since Quarto is designed to be multi-lingual (works with R as well as other languages like Python, Julia, etc.), all of the knitr options are not available at the document execution level since some of them only work with knitr and not other engines Quarto uses for running code in other languages (e.g. Jupyter). | ||||
| You can, however, still set these as global options for your document under the `knitr` field, under `opts_chunk`. | ||||
| For example, when writing books and tutorials we set: | ||||
|  | ||||
| ``` yaml | ||||
| title: "Tutorial" | ||||
| knitr: | ||||
|   opts_chunk: | ||||
|     comment: "#>" | ||||
|     collapse: true | ||||
| ``` | ||||
|  | ||||
| This uses our preferred comment formatting and ensures that the code and output are kept closely entwined. | ||||
|  | ||||
| ### Inline code | ||||
|  | ||||
| There is one other way to embed R code into a Quarto document: directly into the text, with: `r inline()`. | ||||
| @@ -375,19 +404,19 @@ comma(.12358124331) | ||||
|  | ||||
| ## Figures {#sec-figures} | ||||
|  | ||||
| The figures in a Quarto document can be embedded (e.g., a PNG or JPEG file) or generated as a result of a code chunk. | ||||
| The figures in a Quarto document can be embedded (e.g. a PNG or JPEG file) or generated as a result of a code chunk. | ||||
|  | ||||
| To embed an image from an external file, you can use the Insert menu in RStudio and select Figure / Image. | ||||
| This will pop open a menu where you can browse to the image you want to insert as well as add alternative text or caption to it and adjust its size. | ||||
| In the visual editor you can also simply paste an image from your clipboard into your document and RStudio will place a copy of that image in your project folder. | ||||
|  | ||||
| If you include a code chunk that generates a figure (e.g., includes a `ggplot()` call), the resulting figure will be automatically included in your Quarto document. | ||||
| If you include a code chunk that generates a figure (e.g. includes a `ggplot()` call), the resulting figure will be automatically included in your Quarto document. | ||||
|  | ||||
| ### Figure sizing | ||||
|  | ||||
| The biggest challenge of graphics in Quarto is getting your figures the right size and shape. | ||||
| There are five main options that control figure sizing: `fig-width`, `fig-height`, `fig-asp`, `out-width` and `out-height`. | ||||
| Image sizing is challenging because there are two sizes (the size of the figure created by R and the size at which it is inserted in the output document), and multiple ways of specifying the size (i.e., height, width, and aspect ratio: pick two of three). | ||||
| Image sizing is challenging because there are two sizes (the size of the figure created by R and the size at which it is inserted in the output document), and multiple ways of specifying the size (i.e. height, width, and aspect ratio: pick two of three). | ||||
|  | ||||
| <!-- TODO: https://www.tidyverse.org/blog/2020/08/taking-control-of-plot-scaling/ --> | ||||
|  | ||||
| @@ -609,7 +638,7 @@ Here we'll discuss three: self-contained documents, document parameters, and bib | ||||
| ### Self-contained | ||||
|  | ||||
| HTML documents typically have a number of external dependencies (e.g. images, CSS style sheets, JavaScript, etc.) and, by default, Quarto places these dependencies in a `_files` folder in the same directory as your `.qmd` file. | ||||
| If you publish the HTML file on a hosting platform (e.g., QuartoPub, <https://quartopub.com/>), the dependencies in this directory are published with your document and hence are available in the published report. | ||||
| If you publish the HTML file on a hosting platform (e.g. QuartoPub, <https://quartopub.com/>), the dependencies in this directory are published with your document and hence are available in the published report. | ||||
| However, if you want to email the report to a colleague, you might prefer to have a single, self-contained, HTML document that embeds all of its dependencies. | ||||
| You can do this by specifying the `embed-resources` option: | ||||
|  | ||||
|   | ||||
| @@ -46,7 +46,7 @@ Through this chapter we'll use a mix of very simple inline examples so you can g | ||||
| ## Pattern basics {#sec-reg-basics} | ||||
|  | ||||
| We'll use `str_view()` to learn how regex patterns work. | ||||
| We used `str_view()` in the last chapter to better understand a string vs its printed representation, and now we'll use it with its second argument, a regular expression. | ||||
| We used `str_view()` in the last chapter to better understand a string vs. its printed representation, and now we'll use it with its second argument, a regular expression. | ||||
| When this is supplied, `str_view()` will show only the elements of the string vector that match, surrounding each match with `<>`, and, where possible, highlighting the match in blue. | ||||
|  | ||||
| The simplest patterns consist of letters and numbers which match those characters exactly: | ||||
|   | ||||
| @@ -30,7 +30,7 @@ A good reprex makes it easier for other people to help you, and often you'll fig | ||||
| There are two parts to creating a reprex: | ||||
|  | ||||
| -   First, you need to make your code reproducible. | ||||
|     This means that you need to capture everything, i.e., include any `library()` calls and create all necessary objects. | ||||
|     This means that you need to capture everything, i.e. include any `library()` calls and create all necessary objects. | ||||
|     The easiest way to make sure you've done this is to use the reprex package. | ||||
|  | ||||
| -   Second, you need to make it minimal. | ||||
|   | ||||
| @@ -100,7 +100,7 @@ Firstly, because it's part of base R, it's always available for you to use, even | ||||
| Secondly, `|>` is quite a bit simpler than `%>%`: in the time between the invention of `%>%` in 2014 and the inclusion of `|>` in R 4.1.0 in 2021, we gained a better understanding of the pipe. | ||||
| This allowed the base implementation to jettison infrequently used and less important features. | ||||
|  | ||||
| ## `|>` vs `%>%` | ||||
| ## `|>` vs. `%>%` | ||||
|  | ||||
| While `|>` and `%>%` behave identically for simple cases, there are a few important differences. | ||||
| These are most likely to affect you if you're a long-term user of `%>%` who has taken advantage of some of the more advanced features. | ||||
|   | ||||
| @@ -66,7 +66,7 @@ In general, if you have a bunch of variables that are a variation on a theme you | ||||
|  | ||||
| ## Spaces | ||||
|  | ||||
| Put spaces on either side of mathematical operators apart from `^` (i.e., `+`, `-`, `==`, `<`, ...), and around the assignment operator (`<-`). | ||||
| Put spaces on either side of mathematical operators apart from `^` (i.e. `+`, `-`, `==`, `<`, ...), and around the assignment operator (`<-`). | ||||
|  | ||||
| ```{r} | ||||
| #| eval: false | ||||
|   | ||||