commit f09058420e
parent 707b332c3c
DESCRIPTION
@@ -12,6 +12,7 @@ Depends:
 Imports:
     arrow,
     babynames,
+    curl (>= 5.0.0),
     dplyr,
     duckdb,
     gapminder,
arrow.qmd
@@ -53,16 +53,18 @@ We begin by getting a dataset worthy of these tools: a data set of item checkout
 This dataset contains 41,389,465 rows that tell you how many times each book was checked out each month from April 2005 to October 2022.
 
 The following code will get you a cached copy of the data.
-The data is a 9GB CSV file, so it will take some time to download: simply getting the data is often the first challenge!
+The data is a 9GB CSV file, so it will take some time to download.
+I highly recommend using `curl::multi_download()` to get very large files as it's built for exactly this purpose: it gives you a progress bar and it can resume the download if it's interrupted.
 
 ```{r}
 #| eval: false
 dir.create("data", showWarnings = FALSE)
-url <- "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv"
 
-# Default timeout is 60s; bump it up to an hour
-options(timeout = 60 * 60)
-download.file(url, "data/seattle-library-checkouts.csv")
+curl::multi_download(
+  "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv",
+  "data/seattle-library-checkouts.csv",
+  resume = TRUE
+)
 ```
 
 ## Opening a dataset
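One reason `curl::multi_download()` suits this job is that it returns a data frame describing each transfer, so a script can confirm the 9GB file actually arrived before trying to parse it. A minimal sketch, assuming the `success`, `status_code`, and `error` result columns documented for curl 5.x:

```r
# Sketch only: the column names used below (success, status_code, error)
# are assumed from the curl 5.x documentation; verify with str(res).
res <- curl::multi_download(
  "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv",
  "data/seattle-library-checkouts.csv",
  resume = TRUE
)

# One row per file: a fresh download reports HTTP 200, while a resumed
# one typically reports 206 (Partial Content) for the Range request.
if (!isTRUE(res$success)) {
  stop("Download failed (status ", res$status_code, "): ", res$error)
}
```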