Merge branch 'master' of github.com:hadley/r4ds
commit 061e233740
@@ -1,6 +1,6 @@
 # Dates and times
 
-This chapter will show you how to work with dates and times in R. Dates and times follow their own rules, which can make working with them difficult. For example dates and times are ordered, like numbers; but the timeline is not as orderly as the numberline. The timeline repeats itself, and has noticeable gaps due to Daylight Savings Time, leap years, and leap seconds. Datetimes also rely on ambiguous units: How long is a month? How long is a year? Time zones give you another headache when you work with dates and times. The same instant of time will have different "names" in different time zones.
+This chapter will show you how to work with dates and times in R. Dates and times follow their own rules, which can make working with them difficult. For example dates and times are ordered, like numbers; but the timeline is not as orderly as the number line. The timeline repeats itself, and has noticeable gaps due to Daylight Savings Time, leap years, and leap seconds. Datetimes also rely on ambiguous units: How long is a month? How long is a year? Time zones give you another headache when you work with dates and times. The same instant of time will have different "names" in different time zones.
 
 This chapter will focus on R's __lubridate__ package, which makes it much easier to work with dates and times in R. You'll learn the basic date time structures in R and the lubridate functions that make working with them easy. We will also rely on some of the packages that you already know how to use, so load this entire set of packages to begin:
 
@@ -58,7 +58,7 @@ mdy_hm("01/31/2017 08:01")
 
 Lubridate's parsing functions handle a wide variety of formats and separators, which simplifies the parsing process.
 
-For both `make_difftime()` and the y,m,d,h,m,s parsing functions, you can set the time zone of a date when you create it with a tz argument. As a general rule, I recommend that you do not use time zones unless you have to. I'll cover time zones and the idiosyncracies that come with them later in the chapter. If you do not set a time zone, lubridate will supply the Coordinated Universal Time zone, a very easy time zone to work in.
+For both `make_datetime()` and the y,m,d,h,m,s parsing functions, you can set the time zone of a date when you create it with a tz argument. As a general rule, I recommend that you do not use time zones unless you have to. I'll cover time zones and the idiosyncrasies that come with them later in the chapter. If you do not set a time zone, lubridate will supply the Coordinated Universal Time zone, a very easy time zone to work in.
 
 ```{r}
 ymd_hms("2017-01-31 20:11:59", tz = "America/New_York")
@@ -153,7 +153,7 @@ You can avoid these rough edges by using lubridate's version of difftimes, known
 
 ### Durations
 
-Durations behave like difftimes, but are a little more user friendly. To make a duration, choose a units of time, make it plural, and then place a "d" in front of it. This is the name of the funtion in lubridate that will make your duration, i.e.
+Durations behave like difftimes, but are a little more user friendly. To make a duration, choose a unit of time, make it plural, and then place a "d" in front of it. This is the name of the function in lubridate that will make your duration, i.e.
 
 ```{r}
 dseconds(1)
@@ -278,7 +278,7 @@ The length of months and years change so often that doing arithmetic with them c
 
 1. `February 31st` (which doesn't exist)
 2. `March 4th` (31 days after January 31), or
-3. `February 28th` (assuming its not a leap year)
+3. `February 28th` (assuming it's not a leap year)
 
 A basic property of arithmetic is that `a + b - b = a`. Only solution 1 obeys this property, but it is an invalid date. Lubridate tries to make arithmetic as consistent as possible by invoking the following rule *if adding or subtracting a month or a year creates an invalid date, lubridate will return an NA*.
 
@@ -398,7 +398,7 @@ datetimes %>%
 
 ### Setting dates
 
-You can also use each accessor funtion to set the components of a date or datetime.
+You can also use each accessor function to set the components of a date or datetime.
 
 ```{r}
 datetime
@@ -495,7 +495,7 @@ When should you use `with_tz()` and when should you use `force_tz()`? Use `with_
 
 ### Daylight Savings Time
 
-In computing, time zones do double duty. They record where on the planet a time occurs as well as whether or not that location follows Daylight Savings Time. Different areas within the same "time zone" make different decisions about whether or not ot follow Daylight Savings Time. As a result, places like Phoenix, AZ and Denver, CO have the same times for part of the year, but different times for the rest of the year.
+In computing, time zones do double duty. They record where on the planet a time occurs as well as whether or not that location follows Daylight Savings Time. Different areas within the same "time zone" make different decisions about whether or not to follow Daylight Savings Time. As a result, places like Phoenix, AZ and Denver, CO have the same times for part of the year, but different times for the rest of the year.
 
 ```{r}
 with_tz(c(jan_day, july_day), tz = "America/Denver")
@@ -511,7 +511,7 @@ dst(with_tz(c(jan_day, july_day), tz = "America/Denver"))
 dst(with_tz(c(jan_day, july_day), tz = "America/Phoenix"))
 ```
 
-R will display times that are adjusted for Daylight Savings Time with a "D" in the time zone. Hence, MDT stands for Mountain Daylight Savings Time. MST stands for Mountain Standard Time. Notice that R displays an abbreviation for each time zone that does not directly map to the full name of the time zone. Many time zones share the same abbreviations. For example, America/Phoeniz and America/Denver both appear as MST.
+R will display times that are adjusted for Daylight Savings Time with a "D" in the time zone. Hence, MDT stands for Mountain Daylight Savings Time. MST stands for Mountain Standard Time. Notice that R displays an abbreviation for each time zone that does not directly map to the full name of the time zone. Many time zones share the same abbreviations. For example, America/Phoenix and America/Denver both appear as MST.
 
 ```{r include = FALSE}
 # TIME ZONES and DAYLIGHT SAVINGS
@@ -578,8 +578,8 @@ An interval of time is a specific period of time, such as midnight April 13, 201
 
 ```{r}
 
-apr13 <- mdy("4/13/2013", tz = "America/New_york")
-apr23 <- mdy("4/23/2013", tz = "America/New_york")
+apr13 <- mdy("4/13/2013", tz = "America/New_York")
+apr23 <- mdy("4/23/2013", tz = "America/New_York")
 interval(apr13, apr23)
 ```
 
@@ -622,7 +622,7 @@ int_start(spring_break)
 int_end(spring_break)
 ```
 
-You can chnge the direction of an interval with `int_flip()`. Use `int_shift()` to shift an interval forwards or backwards along the timeline. Give `int_shift()` a period or duration object to shift the interval by.
+You can change the direction of an interval with `int_flip()`. Use `int_shift()` to shift an interval forwards or backwards along the timeline. Give `int_shift()` a period or duration object to shift the interval by.
 
 ```{r}
 int_flip(spring_break)

@@ -226,15 +226,15 @@ filter(df, is.na(x) | x > 1)
 
 ### Exercises
 
-1. Find all the flights that:
+1. Find all flights that
 
-1. That were delayed by more two hours.
-1. That flew to Houston (`IAH` or `HOU`).
-1. There were operated by United, American, or Delta.
-1. Departed in summer (July, August, and September).
-1. That arrived more than two hours late, but didn't leave late.
-1. Were delayed by at least an hour, but made up over 30 minutes in flight.
-1. Departed between midnight and 6am (inclusive).
+1. Were delayed by more two hours
+1. Flew to Houston (`IAH` or `HOU`)
+1. Were operated by United, American, or Delta
+1. Departed in summer (July, August, and September)
+1. Arrived more than two hours late, but didn't leave late
+1. Were delayed by at least an hour, but made up over 30 minutes in flight
+1. Departed between midnight and 6am (inclusive)
 
 1. Another useful dplyr filtering helper is `between()`. What does it do?
 Can you use it to simplify the code needed to answer the previous
@@ -861,7 +861,7 @@ daily %>%
 
 Which is more important: arrival delay or departure delay?
 
-1. Look at the number of cancelled flights per day. Is there are pattern?
+1. Look at the number of cancelled flights per day. Is there a pattern?
 Is the proportion of cancelled flights related to the average delay?
 
 1. Which carrier has the worst delays? Challenge: can you disentangle the