Don't transform non-crossref links

Hadley Wickham
2022-11-18 10:30:32 -06:00
parent 4caea5281b
commit 78a1c12fe7
32 changed files with 693 additions and 693 deletions

@@ -79,7 +79,7 @@ Reading data from a file</h1>
</tr></tbody></table></div>
</div>
</div>
<p>We can read this file into R using <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code>. The first argument is the most important: it’s the path to the file.</p>
<p>We can read this file into R using <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code>. The first argument is the most important: it’s the path to the file.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">students &lt;- read_csv("data/students.csv")
#&gt; Rows: 6 Columns: 5
@@ -91,7 +91,7 @@ Reading data from a file</h1>
#&gt; Use `spec()` to retrieve the full column specification for this data.
#&gt; Specify the column types or set `show_col_types = FALSE` to quiet this message.</pre>
</div>
<p>When you run <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> it prints out a message telling you the number of rows and columns of data, the delimiter that was used, and the column specifications (names of columns organized by the type of data the column contains). It also prints out some information about how to retrieve the full column specification as well as how to quiet this message. This message is an important part of readr and we’ll come back to it in <a href="#sec-col-types" data-type="xref">#sec-col-types</a>.</p>
<p>When you run <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> it prints out a message telling you the number of rows and columns of data, the delimiter that was used, and the column specifications (names of columns organized by the type of data the column contains). It also prints out some information about how to retrieve the full column specification as well as how to quiet this message. This message is an important part of readr and we’ll come back to it in <a href="#sec-col-types" data-type="xref">#sec-col-types</a>.</p>
<section id="practical-advice" data-type="sect2">
<h2>
@@ -129,7 +129,7 @@ students
#&gt; 5 5 Chidiegwu Dunkel Pizza Breakfast and lunch five
#&gt; 6 6 Güvenç Attila Ice cream Lunch only 6</pre>
</div>
<p>An alternative approach is to use <code><a href="#chp-https://rdrr.io/pkg/janitor/man/clean_names" data-type="xref">#chp-https://rdrr.io/pkg/janitor/man/clean_names</a></code>, which uses some heuristics to turn them all into snake case at once<span data-type="footnote">The <a href="#chp-http://sfirke.github.io/janitor/" data-type="xref">#chp-http://sfirke.github.io/janitor/</a> package is not part of the tidyverse, but it offers handy functions for data cleaning and works well within data pipelines that use <code>|&gt;</code>.</span>.</p>
<p>An alternative approach is to use <code><a href="https://rdrr.io/pkg/janitor/man/clean_names.html">janitor::clean_names()</a></code>, which uses some heuristics to turn them all into snake case at once<span data-type="footnote">The <a href="http://sfirke.github.io/janitor/">janitor</a> package is not part of the tidyverse, but it offers handy functions for data cleaning and works well within data pipelines that use <code>|&gt;</code>.</span>.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">students |&gt; janitor::clean_names()
#&gt; # A tibble: 6 × 5
@@ -185,7 +185,7 @@ students
<section id="other-arguments" data-type="sect2">
<h2>
Other arguments</h2>
<p>There are a couple of other important arguments that we need to mention, and they’ll be easier to demonstrate if we first show you a handy trick: <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> can read csv files that you’ve created in a string:</p>
<p>There are a couple of other important arguments that we need to mention, and they’ll be easier to demonstrate if we first show you a handy trick: <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> can read csv files that you’ve created in a string:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">read_csv(
"a,b,c
@@ -198,7 +198,7 @@ Other arguments</h2>
#&gt; 1 1 2 3
#&gt; 2 4 5 6</pre>
</div>
<p>Usually <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> uses the first line of the data for the column names, which is a very common convention. But sometimes there are a few lines of metadata at the top of the file. You can use <code>skip = n</code> to skip the first <code>n</code> lines or use <code>comment = "#"</code> to drop all lines that start with (e.g.) <code>#</code>:</p>
<p>Usually <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> uses the first line of the data for the column names, which is a very common convention. But sometimes there are a few lines of metadata at the top of the file. You can use <code>skip = n</code> to skip the first <code>n</code> lines or use <code>comment = "#"</code> to drop all lines that start with (e.g.) <code>#</code>:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">read_csv(
"The first line of metadata
@@ -223,7 +223,7 @@ read_csv(
#&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
#&gt; 1 1 2 3</pre>
</div>
<p>In other cases, the data might not have column names. You can use <code>col_names = FALSE</code> to tell <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> not to treat the first row as headings, and instead label them sequentially from <code>X1</code> to <code>Xn</code>:</p>
<p>In other cases, the data might not have column names. You can use <code>col_names = FALSE</code> to tell <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> not to treat the first row as headings, and instead label them sequentially from <code>X1</code> to <code>Xn</code>:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">read_csv(
"1,2,3
@@ -249,29 +249,29 @@ read_csv(
#&gt; 1 1 2 3
#&gt; 2 4 5 6</pre>
</div>
<p>These arguments are all you need to know to read the majority of CSV files that you’ll encounter in practice. (For the rest, you’ll need to carefully inspect your <code>.csv</code> file and carefully read the documentation for <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code>’s many other arguments.)</p>
<p>These arguments are all you need to know to read the majority of CSV files that you’ll encounter in practice. (For the rest, you’ll need to carefully inspect your <code>.csv</code> file and carefully read the documentation for <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code>’s many other arguments.)</p>
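<p>As a minimal sketch of how these arguments combine (the inline data and the column names <code>x</code>, <code>y</code>, and <code>z</code> are made up for illustration), the call below drops a comment line and supplies headings for data that has none:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">library(readr)

# Hypothetical inline data: one comment line, no header row
csv_text &lt;- "# exported by a hypothetical tool
1,2,3
4,5,6"

df &lt;- read_csv(
  csv_text,
  comment = "#",                 # drop lines that start with #
  col_names = c("x", "y", "z"),  # supply names instead of using the first row
  show_col_types = FALSE         # quiet the column specification message
)
df</pre>
</div>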
</section>
<section id="other-file-types" data-type="sect2">
<h2>
Other file types</h2>
<p>Once you’ve mastered <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code>, using readr’s other functions is straightforward; it’s just a matter of knowing which function to reach for:</p>
<ul><li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> reads semicolon separated files. These use <code>;</code> instead of <code>,</code> to separate fields, and are common in countries that use <code>,</code> as the decimal marker.</p></li>
<li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> reads tab delimited files.</p></li>
<li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> reads in files with any delimiter, attempting to automatically guess the delimiter if you don’t specify it.</p></li>
<li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_fwf" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_fwf</a></code> reads fixed width files. You can specify fields either by their widths with <code><a href="#chp-https://readr.tidyverse.org/reference/read_fwf" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_fwf</a></code> or their position with <code><a href="#chp-https://readr.tidyverse.org/reference/read_fwf" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_fwf</a></code>.</p></li>
<li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_table" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_table</a></code> reads a common variation of fixed width files where columns are separated by white space.</p></li>
<li><p><code><a href="#chp-https://readr.tidyverse.org/reference/read_log" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_log</a></code> reads Apache style log files.</p></li>
<p>Once you’ve mastered <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code>, using readr’s other functions is straightforward; it’s just a matter of knowing which function to reach for:</p>
<ul><li><p><code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv2()</a></code> reads semicolon separated files. These use <code>;</code> instead of <code>,</code> to separate fields, and are common in countries that use <code>,</code> as the decimal marker.</p></li>
<li><p><code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_tsv()</a></code> reads tab delimited files.</p></li>
<li><p><code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_delim()</a></code> reads in files with any delimiter, attempting to automatically guess the delimiter if you don’t specify it.</p></li>
<li><p><code><a href="https://readr.tidyverse.org/reference/read_fwf.html">read_fwf()</a></code> reads fixed width files. You can specify fields either by their widths with <code><a href="https://readr.tidyverse.org/reference/read_fwf.html">fwf_widths()</a></code> or their position with <code><a href="https://readr.tidyverse.org/reference/read_fwf.html">fwf_positions()</a></code>.</p></li>
<li><p><code><a href="https://readr.tidyverse.org/reference/read_table.html">read_table()</a></code> reads a common variation of fixed width files where columns are separated by white space.</p></li>
<li><p><code><a href="https://readr.tidyverse.org/reference/read_log.html">read_log()</a></code> reads Apache style log files.</p></li>
</ul></section>
<section id="exercises" data-type="sect2">
<h2>
Exercises</h2>
<ol type="1"><li><p>What function would you use to read a file where fields were separated with “|”?</p></li>
<li><p>Apart from <code>file</code>, <code>skip</code>, and <code>comment</code>, what other arguments do <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> and <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> have in common?</p></li>
<li><p>What are the most important arguments to <code><a href="#chp-https://readr.tidyverse.org/reference/read_fwf" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_fwf</a></code>?</p></li>
<li><p>Apart from <code>file</code>, <code>skip</code>, and <code>comment</code>, what other arguments do <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> and <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_tsv()</a></code> have in common?</p></li>
<li><p>What are the most important arguments to <code><a href="https://readr.tidyverse.org/reference/read_fwf.html">read_fwf()</a></code>?</p></li>
<li>
<p>Sometimes strings in a CSV file contain commas. To prevent them from causing problems they need to be surrounded by a quoting character, like <code>"</code> or <code>'</code>. By default, <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> assumes that the quoting character will be <code>"</code>. What argument to <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> do you need to specify to read the following text into a data frame?</p>
<p>Sometimes strings in a CSV file contain commas. To prevent them from causing problems they need to be surrounded by a quoting character, like <code>"</code> or <code>'</code>. By default, <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> assumes that the quoting character will be <code>"</code>. What argument to <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> do you need to specify to read the following text into a data frame?</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">"x,y\n1,'a,b'"</pre>
</div>
@@ -375,7 +375,7 @@ Missing values, column types, and problems</h2>
#&gt; dat &lt;- vroom(...)
#&gt; problems(dat)</pre>
</div>
<p>Now <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> reports that there was a problem, and tells us we can find out more with <code><a href="#chp-https://readr.tidyverse.org/reference/problems" data-type="xref">#chp-https://readr.tidyverse.org/reference/problems</a></code>:</p>
<p>Now <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> reports that there was a problem, and tells us we can find out more with <code><a href="https://readr.tidyverse.org/reference/problems.html">problems()</a></code>:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">problems(df)
#&gt; # A tibble: 1 × 5
@@ -401,18 +401,18 @@ Missing values, column types, and problems</h2>
Column types</h2>
<p>readr provides a total of nine column types for you to use:</p>
<ul><li>
<code><a href="#chp-https://readr.tidyverse.org/reference/parse_atomic" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_atomic</a></code> and <code><a href="#chp-https://readr.tidyverse.org/reference/parse_atomic" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_atomic</a></code> read logicals and real numbers. They’re relatively rarely needed (except as above), since readr will usually guess them for you.</li>
<code><a href="https://readr.tidyverse.org/reference/parse_atomic.html">col_logical()</a></code> and <code><a href="https://readr.tidyverse.org/reference/parse_atomic.html">col_double()</a></code> read logicals and real numbers. They’re relatively rarely needed (except as above), since readr will usually guess them for you.</li>
<li>
<code><a href="#chp-https://readr.tidyverse.org/reference/parse_atomic" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_atomic</a></code> reads integers. We rarely distinguish integers and doubles in this book because they’re functionally equivalent, but reading integers explicitly can occasionally be useful because they occupy half the memory of doubles.</li>
<code><a href="https://readr.tidyverse.org/reference/parse_atomic.html">col_integer()</a></code> reads integers. We rarely distinguish integers and doubles in this book because they’re functionally equivalent, but reading integers explicitly can occasionally be useful because they occupy half the memory of doubles.</li>
<li>
<code><a href="#chp-https://readr.tidyverse.org/reference/parse_atomic" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_atomic</a></code> reads strings. This is sometimes useful to specify explicitly when you have a column that is a numeric identifier, i.e. a long series of digits that identifies some object, but it doesn’t make sense to (e.g.) divide it in half.</li>
<code><a href="https://readr.tidyverse.org/reference/parse_atomic.html">col_character()</a></code> reads strings. This is sometimes useful to specify explicitly when you have a column that is a numeric identifier, i.e. a long series of digits that identifies some object, but it doesn’t make sense to (e.g.) divide it in half.</li>
<li>
<code><a href="#chp-https://readr.tidyverse.org/reference/parse_factor" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_factor</a></code>, <code><a href="#chp-https://readr.tidyverse.org/reference/parse_datetime" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_datetime</a></code> and <code><a href="#chp-https://readr.tidyverse.org/reference/parse_datetime" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_datetime</a></code> create factors, dates, and date-times, respectively; you’ll learn more about those when we get to those data types in <a href="#chp-factors" data-type="xref">#chp-factors</a> and <a href="#chp-datetimes" data-type="xref">#chp-datetimes</a>.</li>
<code><a href="https://readr.tidyverse.org/reference/parse_factor.html">col_factor()</a></code>, <code><a href="https://readr.tidyverse.org/reference/parse_datetime.html">col_date()</a></code> and <code><a href="https://readr.tidyverse.org/reference/parse_datetime.html">col_datetime()</a></code> create factors, dates, and date-times, respectively; you’ll learn more about those when we get to those data types in <a href="#chp-factors" data-type="xref">#chp-factors</a> and <a href="#chp-datetimes" data-type="xref">#chp-datetimes</a>.</li>
<li>
<code><a href="#chp-https://readr.tidyverse.org/reference/parse_number" data-type="xref">#chp-https://readr.tidyverse.org/reference/parse_number</a></code> is a permissive numeric parser that will ignore non-numeric components, and is particularly useful for currencies. You’ll learn more about it in <a href="#chp-numbers" data-type="xref">#chp-numbers</a>.</li>
<code><a href="https://readr.tidyverse.org/reference/parse_number.html">col_number()</a></code> is a permissive numeric parser that will ignore non-numeric components, and is particularly useful for currencies. You’ll learn more about it in <a href="#chp-numbers" data-type="xref">#chp-numbers</a>.</li>
<li>
<code><a href="#chp-https://readr.tidyverse.org/reference/col_skip" data-type="xref">#chp-https://readr.tidyverse.org/reference/col_skip</a></code> skips a column so it’s not included in the result.</li>
</ul><p>It’s also possible to override the default column type by switching from <code><a href="#chp-https://rdrr.io/r/base/list" data-type="xref">#chp-https://rdrr.io/r/base/list</a></code> to <code><a href="#chp-https://readr.tidyverse.org/reference/cols" data-type="xref">#chp-https://readr.tidyverse.org/reference/cols</a></code>:</p>
<code><a href="https://readr.tidyverse.org/reference/col_skip.html">col_skip()</a></code> skips a column so it’s not included in the result.</li>
</ul><p>It’s also possible to override the default column type by switching from <code><a href="https://rdrr.io/r/base/list.html">list()</a></code> to <code><a href="https://readr.tidyverse.org/reference/cols.html">cols()</a></code>:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">csv &lt;- "
x,y,z
@@ -424,7 +424,7 @@ read_csv(csv, col_types = cols(.default = col_character()))
#&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;
#&gt; 1 1 2 3</pre>
</div>
<p>Another useful helper is <code><a href="#chp-https://readr.tidyverse.org/reference/cols" data-type="xref">#chp-https://readr.tidyverse.org/reference/cols</a></code> which will read in only the columns you specify:</p>
<p>Another useful helper is <code><a href="https://readr.tidyverse.org/reference/cols.html">cols_only()</a></code> which will read in only the columns you specify:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">read_csv(
"x,y,z
@@ -442,7 +442,7 @@ read_csv(csv, col_types = cols(.default = col_character()))
<section id="sec-readr-directory" data-type="sect1">
<h1>
Reading data from multiple files</h1>
<p>Sometimes your data is split across multiple files instead of being contained in a single file. For example, you might have sales data for multiple months, with each month’s data in a separate file: <code>01-sales.csv</code> for January, <code>02-sales.csv</code> for February, and <code>03-sales.csv</code> for March. With <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> you can read these data in at once and stack them on top of each other in a single data frame.</p>
<p>Sometimes your data is split across multiple files instead of being contained in a single file. For example, you might have sales data for multiple months, with each month’s data in a separate file: <code>01-sales.csv</code> for January, <code>02-sales.csv</code> for February, and <code>03-sales.csv</code> for March. With <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> you can read these data in at once and stack them on top of each other in a single data frame.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">sales_files &lt;- c("data/01-sales.csv", "data/02-sales.csv", "data/03-sales.csv")
read_csv(sales_files, id = "file")
@@ -466,7 +466,7 @@ read_csv(sales_files, id = "file")
#&gt; # … with 13 more rows</pre>
</div>
<p>With the additional <code>id</code> parameter we have added a new column called <code>file</code> to the resulting data frame that identifies the file the data come from. This is especially helpful in circumstances where the files you’re reading in do not have an identifying column that can help you trace the observations back to their original sources.</p>
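<p>To try the <code>id</code> argument without the book’s data files, here is a small self-contained sketch. The two files, their contents, and the <code>tmp_dir</code> variable are hypothetical and are written to a temporary directory purely for illustration:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">library(readr)

# Hypothetical monthly files written to a temporary directory
tmp_dir &lt;- tempfile("sales-")
dir.create(tmp_dir)
jan &lt;- file.path(tmp_dir, "01-sales.csv")
feb &lt;- file.path(tmp_dir, "02-sales.csv")
writeLines(c("month,amount", "January,10"), jan)
writeLines(c("month,amount", "February,20"), feb)

# Read both files at once; the file column records each row's source path
sales &lt;- read_csv(c(jan, feb), id = "file", show_col_types = FALSE)
basename(sales$file)</pre>
</div>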
<p>If you have many files you want to read in, it can get cumbersome to write out their names as a list. Instead, you can use the base <code><a href="#chp-https://rdrr.io/r/base/list.files" data-type="xref">#chp-https://rdrr.io/r/base/list.files</a></code> function to find the files for you by matching a pattern in the file names. You’ll learn more about these patterns in <a href="#chp-regexps" data-type="xref">#chp-regexps</a>.</p>
<p>If you have many files you want to read in, it can get cumbersome to write out their names as a list. Instead, you can use the base <code><a href="https://rdrr.io/r/base/list.files.html">list.files()</a></code> function to find the files for you by matching a pattern in the file names. You’ll learn more about these patterns in <a href="#chp-regexps" data-type="xref">#chp-regexps</a>.</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">sales_files &lt;- list.files("data", pattern = "sales\\.csv$", full.names = TRUE)
sales_files
@@ -477,7 +477,7 @@ sales_files
<section id="sec-writing-to-a-file" data-type="sect1">
<h1>
Writing to a file</h1>
<p>readr also comes with two useful functions for writing data back to disk: <code><a href="#chp-https://readr.tidyverse.org/reference/write_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/write_delim</a></code> and <code><a href="#chp-https://readr.tidyverse.org/reference/write_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/write_delim</a></code>. Both functions increase the chances of the output file being read back in correctly by using the standard UTF-8 encoding for strings and ISO8601 format for date-times.</p>
<p>readr also comes with two useful functions for writing data back to disk: <code><a href="https://readr.tidyverse.org/reference/write_delim.html">write_csv()</a></code> and <code><a href="https://readr.tidyverse.org/reference/write_delim.html">write_tsv()</a></code>. Both functions increase the chances of the output file being read back in correctly by using the standard UTF-8 encoding for strings and ISO8601 format for date-times.</p>
<p>The most important arguments are <code>x</code> (the data frame to save), and <code>file</code> (the location to save it). You can also specify how missing values are written with <code>na</code>, and if you want to <code>append</code> to an existing file.</p>
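<p>As a rough sketch of the <code>na</code> and <code>append</code> arguments (the <code>scores</code> tibble and the temporary file path are hypothetical, chosen only for illustration):</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">library(readr)
library(tibble)

# Hypothetical data with one missing value
scores &lt;- tibble(name = c("a", "b"), score = c(1, NA))
path &lt;- tempfile(fileext = ".csv")

# Write missing values as empty fields instead of the default "NA"
write_csv(scores, path, na = "")

# Add the same rows again; when appending, no second header line is written
write_csv(scores, path, na = "", append = TRUE)

round_trip &lt;- read_csv(path, show_col_types = FALSE)
round_trip</pre>
</div>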
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">write_csv(students, "students.csv")</pre>
@@ -508,7 +508,7 @@ read_csv("students-2.csv")
</div>
<p>This makes CSVs a little unreliable for caching interim results—you need to recreate the column specification every time you load in. There are two main options:</p>
<ol type="1"><li>
<p><code><a href="#chp-https://readr.tidyverse.org/reference/read_rds" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_rds</a></code> and <code><a href="#chp-https://readr.tidyverse.org/reference/read_rds" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_rds</a></code> are uniform wrappers around the base functions <code><a href="#chp-https://rdrr.io/r/base/readRDS" data-type="xref">#chp-https://rdrr.io/r/base/readRDS</a></code> and <code><a href="#chp-https://rdrr.io/r/base/readRDS" data-type="xref">#chp-https://rdrr.io/r/base/readRDS</a></code>. These store data in R’s custom binary format called RDS:</p>
<p><code><a href="https://readr.tidyverse.org/reference/read_rds.html">write_rds()</a></code> and <code><a href="https://readr.tidyverse.org/reference/read_rds.html">read_rds()</a></code> are uniform wrappers around the base functions <code><a href="https://rdrr.io/r/base/readRDS.html">readRDS()</a></code> and <code><a href="https://rdrr.io/r/base/readRDS.html">saveRDS()</a></code>. These store data in R’s custom binary format called RDS:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">write_rds(students, "students.rds")
read_rds("students.rds")
@@ -546,7 +546,7 @@ read_parquet("students.parquet")
<section id="data-entry" data-type="sect1">
<h1>
Data entry</h1>
<p>Sometimes you’ll need to assemble a tibble “by hand” doing a little data entry in your R script. There are two useful functions to help you do this which differ in whether you lay out the tibble by columns or by rows. <code><a href="#chp-https://tibble.tidyverse.org/reference/tibble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tibble</a></code> works by column:</p>
<p>Sometimes you’ll need to assemble a tibble “by hand” doing a little data entry in your R script. There are two useful functions to help you do this which differ in whether you lay out the tibble by columns or by rows. <code><a href="https://tibble.tidyverse.org/reference/tibble.html">tibble()</a></code> works by column:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">tibble(
x = c(1, 2, 5),
@@ -573,7 +573,7 @@ Data entry</h1>
#&gt; • Size 3: Column `y`.
#&gt; Only values of size one are recycled.</pre>
</div>
<p>Laying out the data by column can make it hard to see how the rows are related, so an alternative is <code><a href="#chp-https://tibble.tidyverse.org/reference/tribble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tribble</a></code>, short for <strong>tr</strong>ansposed t<strong>ibble</strong>, which lets you lay out your data row by row. <code><a href="#chp-https://tibble.tidyverse.org/reference/tribble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tribble</a></code> is customized for data entry in code: column headings start with <code>~</code> and entries are separated by commas. This makes it possible to lay out small amounts of data in an easy to read form:</p>
<p>Laying out the data by column can make it hard to see how the rows are related, so an alternative is <code><a href="https://tibble.tidyverse.org/reference/tribble.html">tribble()</a></code>, short for <strong>tr</strong>ansposed t<strong>ibble</strong>, which lets you lay out your data row by row. <code><a href="https://tibble.tidyverse.org/reference/tribble.html">tribble()</a></code> is customized for data entry in code: column headings start with <code>~</code> and entries are separated by commas. This makes it possible to lay out small amounts of data in an easy to read form:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">tribble(
~x, ~y, ~z,
@@ -588,13 +588,13 @@ Data entry</h1>
#&gt; 2 m 2 0.83
#&gt; 3 g 5 0.6</pre>
</div>
<p>We’ll use <code><a href="#chp-https://tibble.tidyverse.org/reference/tibble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tibble</a></code> and <code><a href="#chp-https://tibble.tidyverse.org/reference/tribble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tribble</a></code> later in the book to construct small examples to demonstrate how various functions work.</p>
<p>We’ll use <code><a href="https://tibble.tidyverse.org/reference/tibble.html">tibble()</a></code> and <code><a href="https://tibble.tidyverse.org/reference/tribble.html">tribble()</a></code> later in the book to construct small examples to demonstrate how various functions work.</p>
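<p>To make the relationship between the two layouts concrete, the sketch below builds the same small table both ways and compares the results. The values and the names <code>by_column</code> and <code>by_row</code> are made up for illustration:</p>
<div class="cell">
<pre data-type="programlisting" data-code-language="downlit">library(tibble)

# Column-wise: one vector per column
by_column &lt;- tibble(
  x = c(1, 2),
  y = c("a", "b")
)

# Row-wise: headings start with ~, then one line per row
by_row &lt;- tribble(
  ~x, ~y,
  1, "a",
  2, "b"
)

# Check whether the two constructions describe the same data
all.equal(by_column, by_row)</pre>
</div>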
</section>
<section id="summary" data-type="sect1">
<h1>
Summary</h1>
<p>In this chapter, you’ve learned how to load CSV files with <code><a href="#chp-https://readr.tidyverse.org/reference/read_delim" data-type="xref">#chp-https://readr.tidyverse.org/reference/read_delim</a></code> and to do your own data entry with <code><a href="#chp-https://tibble.tidyverse.org/reference/tibble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tibble</a></code> and <code><a href="#chp-https://tibble.tidyverse.org/reference/tribble" data-type="xref">#chp-https://tibble.tidyverse.org/reference/tribble</a></code>. You’ve learned how csv files work, some of the problems you might encounter, and how to overcome them. We’ll come back to data import a few times in this book: <a href="#chp-databases" data-type="xref">#chp-databases</a> will show you how to load data from databases, <a href="#chp-spreadsheets" data-type="xref">#chp-spreadsheets</a> from Excel and Google Sheets, <a href="#chp-rectangling" data-type="xref">#chp-rectangling</a> from JSON, and <a href="#chp-webscraping" data-type="xref">#chp-webscraping</a> from websites.</p>
<p>In this chapter, you’ve learned how to load CSV files with <code><a href="https://readr.tidyverse.org/reference/read_delim.html">read_csv()</a></code> and to do your own data entry with <code><a href="https://tibble.tidyverse.org/reference/tibble.html">tibble()</a></code> and <code><a href="https://tibble.tidyverse.org/reference/tribble.html">tribble()</a></code>. You’ve learned how csv files work, some of the problems you might encounter, and how to overcome them. We’ll come back to data import a few times in this book: <a href="#chp-databases" data-type="xref">#chp-databases</a> will show you how to load data from databases, <a href="#chp-spreadsheets" data-type="xref">#chp-spreadsheets</a> from Excel and Google Sheets, <a href="#chp-rectangling" data-type="xref">#chp-rectangling</a> from JSON, and <a href="#chp-webscraping" data-type="xref">#chp-webscraping</a> from websites.</p>
<p>Now that you’re writing a substantial amount of R code, it’s time to learn more about organizing your code into files and directories. In the next chapter, you’ll learn all about the advantages of scripts and projects, and some of the many tools that they provide to make your life easier.</p>