More minor page count tweaks & fixes
And re-convert with latest htmlbook
@@ -1,6 +1,6 @@
|
||||
<section data-type="chapter" id="chp-datetimes">
|
||||
<h1><span id="sec-dates-and-times" class="quarto-section-identifier d-none d-lg-block"><span class="chapter-title">Dates and times</span></span></h1>
|
||||
<section id="introduction" data-type="sect1">
|
||||
<section id="datetimes-introduction" data-type="sect1">
|
||||
<h1>
|
||||
Introduction</h1>
|
||||
<p>This chapter will show you how to work with dates and times in R. At first glance, dates and times seem simple. You use them all the time in your regular life, and they don’t seem to cause much confusion. However, the more you learn about dates and times, the more complicated they seem to get!</p>
|
||||
@@ -8,14 +8,12 @@ Introduction</h1>
|
||||
<p>Dates and times are hard because they have to reconcile two physical phenomena (the rotation of the Earth and its orbit around the sun) with a whole raft of geopolitical phenomena including months, time zones, and DST. This chapter won’t teach you every last detail about dates and times, but it will give you a solid grounding of practical skills that will help you with common data analysis challenges.</p>
|
||||
<p>We’ll begin by showing you how to create date-times from various inputs, and then once you’ve got a date-time, how you can extract components like year, month, and day. We’ll then dive into the tricky topic of working with time spans, which come in a variety of flavors depending on what you’re trying to do. We’ll conclude with a brief discussion of the additional challenges posed by time zones.</p>
|
||||
|
||||
<section id="prerequisites" data-type="sect2">
|
||||
<section id="datetimes-prerequisites" data-type="sect2">
|
||||
<h2>
|
||||
Prerequisites</h2>
|
||||
<p>This chapter will focus on the <strong>lubridate</strong> package, which makes it easier to work with dates and times in R. lubridate is not part of core tidyverse because you only need it when you’re working with dates/times. We will also need nycflights13 for practice data.</p>
|
||||
<p>This chapter will focus on the <strong>lubridate</strong> package, which makes it easier to work with dates and times in R. As of the latest tidyverse release, lubridate is part of the core tidyverse. We will also need nycflights13 for practice data.</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">library(tidyverse)
|
||||
|
||||
library(lubridate)
|
||||
library(nycflights13)</pre>
|
||||
</div>
|
||||
</section>
|
||||
@@ -33,9 +31,9 @@ Creating date/times</h1>
|
||||
<p>To get the current date or date-time you can use <code><a href="https://lubridate.tidyverse.org/reference/now.html">today()</a></code> or <code><a href="https://lubridate.tidyverse.org/reference/now.html">now()</a></code>:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">today()
|
||||
#> [1] "2023-01-12"
|
||||
#> [1] "2023-01-26"
|
||||
now()
|
||||
#> [1] "2023-01-12 17:04:08 CST"</pre>
|
||||
#> [1] "2023-01-26 10:32:54 CST"</pre>
|
||||
</div>
|
||||
<p>Otherwise, the following sections describe the four ways you’re likely to create a date/time:</p>
|
||||
<ul><li>While reading a file with readr.</li>
|
||||
@@ -281,9 +279,9 @@ From other types</h2>
|
||||
<p>You may want to switch between a date-time and a date. That’s the job of <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime()</a></code> and <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_date()</a></code>:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">as_datetime(today())
|
||||
#> [1] "2023-01-12 UTC"
|
||||
#> [1] "2023-01-26 UTC"
|
||||
as_date(now())
|
||||
#> [1] "2023-01-12"</pre>
|
||||
#> [1] "2023-01-26"</pre>
|
||||
</div>
|
||||
<p>Sometimes you’ll get date/times as numeric offsets from the “Unix Epoch”, 1970-01-01. If the offset is in seconds, use <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_datetime()</a></code>; if it’s in days, use <code><a href="https://lubridate.tidyverse.org/reference/as_date.html">as_date()</a></code>.</p>
|
||||
<div class="cell">
|
||||
@@ -294,7 +292,7 @@ as_date(365 * 10 + 2)
|
||||
</div>
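<p>As a minimal sketch of the epoch arithmetic described above (the specific values are illustrative, not taken from the chapter), an offset of zero maps back to the epoch itself, and <code>as.numeric()</code> recovers the underlying offset: seconds for a date-time, days for a date.</p>

```r
library(lubridate)

# Zero seconds and zero days after the Unix epoch both land on 1970-01-01
epoch_dttm <- as_datetime(0)
epoch_date <- as_date(0)

# Going the other way: as.numeric() exposes the stored offset
secs_in_one_day <- as.numeric(ymd_hms("1970-01-02 00:00:00"))  # offset in seconds
days_in_one_day <- as.numeric(ymd("1970-01-02"))               # offset in days
```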
|
||||
</section>
|
||||
|
||||
<section id="exercises" data-type="sect2">
|
||||
<section id="datetimes-exercises" data-type="sect2">
|
||||
<h2>
|
||||
Exercises</h2>
|
||||
<ol type="1"><li>
|
||||
@@ -474,7 +472,7 @@ update(ymd("2023-02-01"), hour = 400)
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section id="exercises-1" data-type="sect2">
|
||||
<section id="datetimes-exercises-1" data-type="sect2">
|
||||
<h2>
|
||||
Exercises</h2>
|
||||
<ol type="1"><li><p>How does the distribution of flight times within a day change over the course of the year?</p></li>
|
||||
@@ -507,12 +505,12 @@ Durations</h2>
|
||||
<pre data-type="programlisting" data-code-language="r"># How old is Hadley?
|
||||
h_age <- today() - ymd("1979-10-14")
|
||||
h_age
|
||||
#> Time difference of 15796 days</pre>
|
||||
#> Time difference of 15810 days</pre>
|
||||
</div>
|
||||
<p>A difftime class object records a time span of seconds, minutes, hours, days, or weeks. This ambiguity can make difftimes a little painful to work with, so lubridate provides an alternative which always uses seconds: the <strong>duration</strong>.</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">as.duration(h_age)
|
||||
#> [1] "1364774400s (~43.25 years)"</pre>
|
||||
#> [1] "1365984000s (~43.29 years)"</pre>
|
||||
</div>
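<p>To make the ambiguity concrete, here is a small sketch with two made-up time stamps (the variable names <code>short</code> and <code>long</code> are illustrative). Subtracting date-times lets R pick the difftime unit automatically, so the same operation can report seconds in one case and days in another, while the duration is always stored in seconds.</p>

```r
library(lubridate)

# Two spans computed the same way, but R chooses different units for each
short <- ymd_hms("2023-01-26 10:00:30") - ymd_hms("2023-01-26 10:00:00")
long  <- ymd_hms("2023-01-28 10:00:00") - ymd_hms("2023-01-26 10:00:00")

units(short)  # unit chosen for a 30-second span
units(long)   # unit chosen for a 2-day span

# Converting to durations removes the ambiguity: both are now in seconds
short_secs <- as.numeric(as.duration(short))
long_secs  <- as.numeric(as.duration(long))
```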
|
||||
<p>Durations come with a bunch of convenient constructors:</p>
|
||||
<div class="cell">
|
||||
@@ -530,7 +528,7 @@ dweeks(3)
|
||||
dyears(1)
|
||||
#> [1] "31557600s (~1 years)"</pre>
|
||||
</div>
|
||||
<p>Durations always record the time span in seconds. Larger units are created by converting minutes, hours, days, weeks, and years to seconds: 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, and 7 days in a week. Larger time units are more problematic. A year is uses the “average” number of days in a year, i.e. 365.25. There’s no way to convert a month to a duration, because there’s just too much variation.</p>
|
||||
<p>Durations always record the time span in seconds. Larger units are created by converting minutes, hours, days, weeks, and years to seconds: 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, and 7 days in a week. Larger time units are more problematic. A year uses the “average” number of days in a year, i.e. 365.25. There’s no way to convert a month to a duration, because there’s just too much variation.</p>
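<p>The conversion factors above can be checked directly. This is only a sketch that compares the number of seconds stored in each duration; the variable names are illustrative.</p>

```r
library(lubridate)

# Each constructor stores its span as a plain number of seconds
minute_secs <- as.numeric(dminutes(1))  # 60 seconds
hour_secs   <- as.numeric(dhours(1))    # 60 minutes
day_secs    <- as.numeric(ddays(1))     # 24 hours
week_secs   <- as.numeric(dweeks(1))    # 7 days
year_secs   <- as.numeric(dyears(1))    # 365.25 "average" days

# The year is the only unit that relies on an averaged length
year_secs == 365.25 * day_secs
```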
|
||||
<p>You can add and multiply durations:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">2 * dyears(1)
|
||||
@@ -545,14 +543,14 @@ last_year <- today() - dyears(1)</pre>
|
||||
</div>
|
||||
<p>However, because durations represent an exact number of seconds, sometimes you might get an unexpected result:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">one_pm <- ymd_hms("2026-03-12 13:00:00", tz = "America/New_York")
|
||||
<pre data-type="programlisting" data-code-language="r">one_am <- ymd_hms("2026-03-08 01:00:00", tz = "America/New_York")
|
||||
|
||||
one_pm
|
||||
#> [1] "2026-03-12 13:00:00 EDT"
|
||||
one_pm + ddays(1)
|
||||
#> [1] "2026-03-13 13:00:00 EDT"</pre>
|
||||
one_am
|
||||
#> [1] "2026-03-08 01:00:00 EST"
|
||||
one_am + ddays(1)
|
||||
#> [1] "2026-03-09 02:00:00 EDT"</pre>
|
||||
</div>
|
||||
<p>Why is one day after 1pm March 12, 2pm March 13? If you look carefully at the date you might also notice that the time zones have changed. March 12 only has 23 hours because it’s when DST starts, so if we add a full days worth of seconds we end up with a different time.</p>
|
||||
<p>Why is one day after 1am March 8, 2am March 9? If you look carefully at the date you might also notice that the time zones have changed. March 8 only has 23 hours because it’s when DST starts, so if we add a full day’s worth of seconds we end up with a different time.</p>
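<p>One way to confirm the 23-hour day is to measure the span between the two local midnights around it. This is a sketch, assuming your system’s time zone database includes the 2026 US DST rules; the variable names are illustrative.</p>

```r
library(lubridate)

tz <- "America/New_York"

# Local midnight at the start and at the end of March 8, 2026
midnight_before <- ymd_hms("2026-03-08 00:00:00", tz = tz)
midnight_after  <- ymd_hms("2026-03-09 00:00:00", tz = tz)

# Elapsed time between the two midnights, stored in seconds
day_length <- as.duration(midnight_after - midnight_before)
day_length

# Convert to hours: one hour short of the usual 24
hours_in_day <- as.numeric(day_length) / 3600
```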
|
||||
</section>
|
||||
|
||||
<section id="periods" data-type="sect2">
|
||||
@@ -560,10 +558,10 @@ one_pm + ddays(1)
|
||||
Periods</h2>
|
||||
<p>To solve this problem, lubridate provides <strong>periods</strong>. Periods are time spans, but they don’t have a fixed length in seconds; instead they work with “human” times, like days and months. That allows them to work in a more intuitive way:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">one_pm
|
||||
#> [1] "2026-03-12 13:00:00 EDT"
|
||||
one_pm + days(1)
|
||||
#> [1] "2026-03-13 13:00:00 EDT"</pre>
|
||||
<pre data-type="programlisting" data-code-language="r">one_am
|
||||
#> [1] "2026-03-08 01:00:00 EST"
|
||||
one_am + days(1)
|
||||
#> [1] "2026-03-09 01:00:00 EDT"</pre>
|
||||
</div>
|
||||
<p>Like durations, periods can be created with a number of friendly constructor functions.</p>
|
||||
<div class="cell">
|
||||
@@ -591,10 +589,10 @@ ymd("2024-01-01") + years(1)
|
||||
#> [1] "2025-01-01"
|
||||
|
||||
# Daylight saving time
|
||||
one_pm + ddays(1)
|
||||
#> [1] "2026-03-13 13:00:00 EDT"
|
||||
one_pm + days(1)
|
||||
#> [1] "2026-03-13 13:00:00 EDT"</pre>
|
||||
one_am + ddays(1)
|
||||
#> [1] "2026-03-09 02:00:00 EDT"
|
||||
one_am + days(1)
|
||||
#> [1] "2026-03-09 01:00:00 EDT"</pre>
|
||||
</div>
|
||||
<p>Let’s use periods to fix an oddity related to our flight dates. Some planes appear to have arrived at their destination <em>before</em> they departed from New York City.</p>
|
||||
<div class="cell">
|
||||
@@ -668,7 +666,7 @@ y2024 / days(1)
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section id="exercises-2" data-type="sect2">
|
||||
<section id="datetimes-exercises-2" data-type="sect2">
|
||||
<h2>
|
||||
Exercises</h2>
|
||||
<ol type="1"><li><p>Explain <code>days(overnight * 1)</code> to someone who has just started learning R. How does it work?</p></li>
|
||||
@@ -694,7 +692,7 @@ Time zones</h1>
|
||||
<p>And see the complete list of all time zone names with <code><a href="https://rdrr.io/r/base/timezones.html">OlsonNames()</a></code>:</p>
|
||||
<div class="cell">
|
||||
<pre data-type="programlisting" data-code-language="r">length(OlsonNames())
|
||||
#> [1] 596
|
||||
#> [1] 597
|
||||
head(OlsonNames())
|
||||
#> [1] "Africa/Abidjan" "Africa/Accra" "Africa/Addis_Ababa"
|
||||
#> [4] "Africa/Algiers" "Africa/Asmara" "Africa/Asmera"</pre>
|
||||
@@ -755,7 +753,7 @@ x4b - x4
|
||||
</li>
|
||||
</ul></section>
|
||||
|
||||
<section id="summary" data-type="sect1">
|
||||
<section id="datetimes-summary" data-type="sect1">
|
||||
<h1>
|
||||
Summary</h1>
|
||||
<p>This chapter has introduced you to the tools that lubridate provides to help you work with date-time data. Working with dates and times can seem harder than necessary, but hopefully this chapter has helped you see why: date-times are more complex than they seem at first glance, and handling every possible situation adds complexity. Even if your data never crosses a daylight saving time boundary or involves a leap year, the functions need to be able to handle it.</p>
|
||||
|
||||