More work on O'Reilly book

* Make width narrower
* Convert deps to table
* Strip chapter status
Hadley Wickham
2022-11-18 11:05:00 -06:00
parent 5895db09cd
commit 69b4597f3b
33 changed files with 784 additions and 1048 deletions

@@ -1,13 +1,5 @@
<section data-type="chapter" id="chp-iteration">
<h1><span id="sec-iteration" class="quarto-section-identifier d-none d-lg-block"><span class="chapter-title">Iteration</span></span></h1><div data-type="note"><div class="callout-body d-flex">
<div class="callout-icon-container">
<i class="callout-icon"/>
</div>
</div>
<p>You are reading the work-in-progress second edition of R for Data Science. This chapter should be readable but is currently undergoing final polishing. You can find the complete first edition at <a href="https://r4ds.had.co.nz" class="uri">https://r4ds.had.co.nz</a>.</p></div>
<h1><span id="sec-iteration" class="quarto-section-identifier d-none d-lg-block"><span class="chapter-title">Iteration</span></span></h1><p>::: status callout-note You are reading the work-in-progress second edition of R for Data Science. This chapter should be readable but is currently undergoing final polishing. You can find the complete first edition at <a href="https://r4ds.had.co.nz" class="uri">https://r4ds.had.co.nz</a>. :::</p>
<section id="introduction" data-type="sect1">
<h1>
Introduction</h1>
@@ -226,9 +218,10 @@ df_miss |&gt;
n = n()
)
#&gt; # A tibble: 1 × 9
#&gt; a_median a_n_miss b_median b_n_miss c_median c_n_miss d_median d_n_miss n
#&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;int&gt;
#&gt; 1 0.429 1 -0.721 1 -0.796 2 0.704 0 5</pre>
#&gt; a_median a_n_miss b_median b_n_miss c_median c_n_miss d_med…¹ d_n_m…² n
#&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;int&gt;
#&gt; 1 0.429 1 -0.721 1 -0.796 2 0.704 0 5
#&gt; # … with abbreviated variable names ¹d_median, ²d_n_miss</pre>
</div>
<p>If you look carefully, you might intuit that the columns are named using a glue specification (<a href="#sec-glue" data-type="xref">#sec-glue</a>) like <code>{.col}_{.fn}</code> where <code>.col</code> is the name of the original column and <code>.fn</code> is the name of the function. That's not a coincidence! As you'll learn in the next section, you can use the <code>.names</code> argument to supply your own glue spec.</p>
</section>
@@ -251,9 +244,10 @@ Column names</h2>
n = n(),
)
#&gt; # A tibble: 1 × 9
#&gt; median_a n_miss_a median_b n_miss_b median_c n_miss_c median_d n_miss_d n
#&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;int&gt;
#&gt; 1 0.429 1 -0.721 1 -0.796 2 0.704 0 5</pre>
#&gt; median_a n_miss_a median_b n_miss_b median_c n_miss_c media…¹ n_mis…² n
#&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;dbl&gt; &lt;int&gt; &lt;int&gt;
#&gt; 1 0.429 1 -0.721 1 -0.796 2 0.704 0 5
#&gt; # … with abbreviated variable names ¹median_d, ²n_miss_d</pre>
</div>
<p>The <code>.names</code> argument is particularly important when you use <code><a href="https://dplyr.tidyverse.org/reference/across.html">across()</a></code> with <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code>. By default the output of <code><a href="https://dplyr.tidyverse.org/reference/across.html">across()</a></code> is given the same names as the inputs. This means that <code><a href="https://dplyr.tidyverse.org/reference/across.html">across()</a></code> inside of <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> will replace existing columns. For example, here we use <code><a href="https://dplyr.tidyverse.org/reference/coalesce.html">coalesce()</a></code> to replace <code>NA</code>s with <code>0</code>:</p>
<div class="cell">
@@ -930,8 +924,8 @@ DBI::dbCreateTable(con, "gapminder", template)</pre>
<pre data-type="programlisting" data-code-language="downlit">con |&gt; tbl("gapminder")
#&gt; # Source: table&lt;gapminder&gt; [0 x 6]
#&gt; # Database: DuckDB 0.5.1 [root@Darwin 22.1.0:R 4.2.1/:memory:]
#&gt; # … with 6 variables: country &lt;chr&gt;, continent &lt;chr&gt;, lifeExp &lt;dbl&gt;, pop &lt;dbl&gt;,
#&gt; # gdpPercap &lt;dbl&gt;, year &lt;dbl&gt;</pre>
#&gt; # … with 6 variables: country &lt;chr&gt;, continent &lt;chr&gt;, lifeExp &lt;dbl&gt;,
#&gt; # pop &lt;dbl&gt;, gdpPercap &lt;dbl&gt;, year &lt;dbl&gt;</pre>
</div>
<p>Next, we need a function that takes a single file path, reads it into R, and adds the result to the <code>gapminder</code> table. We can do that by combining <code>read_excel()</code> with <code><a href="https://dbi.r-dbi.org/reference/dbAppendTable.html">DBI::dbAppendTable()</a></code>:</p>
<div class="cell">