Mild import/wrangling reorg
@@ -11,8 +11,7 @@ status("polishing")
Working with data provided by R packages is a great way to learn the tools of data science, but at some point you want to stop learning and start working with your own data.
In this chapter, you'll learn how to read plain-text rectangular files into R.
Here, we'll only scratch the surface of data import, but many of the principles will translate to other forms of data.
We'll finish with a few pointers to packages that are useful for other types of data.
Here, we'll only scratch the surface of data import, but many of the principles will translate to other forms of data, which we'll come back to in @sec-wrangle.
### Prerequisites
@@ -320,33 +319,10 @@ There are two alternatives:
```
Feather tends to be faster than RDS and is usable outside of R.
RDS supports list-columns (which you'll learn about in [Chapter -@sec-list-columns]); feather currently does not.
RDS supports list-columns (which you'll learn about in @sec-rectangling); feather currently does not.
```{r}
#| include: false
file.remove("students-2.csv")
file.remove("students.rds")
```
## Other types of data
To get other types of data into R, we recommend starting with the tidyverse packages listed below.
They're certainly not perfect, but they are a good place to start.
For rectangular data:
- **readxl** reads Excel files (both `.xls` and `.xlsx`).
See [Chapter -@sec-import-spreadsheets] for more on working with data stored in Excel spreadsheets.
- **googlesheets4** reads Google Sheets.
Also see [Chapter -@sec-import-spreadsheets] for more on working with data stored in Google Sheets.
- **DBI**, along with a database-specific backend (e.g., **RMySQL**, **RSQLite**, or **RPostgreSQL**), allows you to run SQL queries against a database and return a data frame.
See [Chapter -@sec-import-databases] for more on working with databases.
- **haven** reads SPSS, Stata, and SAS files.
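
As a rough sketch of the DBI workflow mentioned in the list above: this assumes the **RSQLite** backend is installed and uses a throwaway in-memory database. The table name and the query are illustrative choices, not something taken from the chapter.

```r
library(DBI)

# Connect to a temporary in-memory SQLite database
# (assumption: the RSQLite package is installed).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a built-in data frame into the database as a table named "mtcars".
dbWriteTable(con, "mtcars", mtcars)

# Run a SQL query; dbGetQuery() returns the result as a data frame.
heavy <- dbGetQuery(con, "SELECT mpg, cyl, wt FROM mtcars WHERE wt > 5")

# Close the connection when done.
dbDisconnect(con)
```

Swapping in a different backend should mostly be a matter of changing the `dbConnect()` call; the query functions stay the same.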
For hierarchical data: use **jsonlite** (by Jeroen Ooms) for JSON, and **xml2** for XML.
Jenny Bryan has some excellent worked examples at <https://jennybc.github.io/purrr-tutorial/>.
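
For a sense of what reading hierarchical data looks like, here is a minimal sketch using `jsonlite::fromJSON()`. The JSON string and its field names are made up for illustration, and the example assumes **jsonlite** is installed.

```r
library(jsonlite)

# A small, hypothetical JSON document: an array of records,
# where each record holds a nested array of varying length.
json <- '[
  {"name": "Ann", "langs": ["R", "SQL"]},
  {"name": "Bo",  "langs": ["Python"]}
]'

# fromJSON() simplifies an array of records into a data frame by default;
# the nested arrays end up in a list-column.
people <- fromJSON(json)

str(people)
```

Because the nested arrays differ in length, `langs` comes back as a list-column rather than a plain vector, which is one reason hierarchical data usually needs an extra rectangling step.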
For other file types, try the [R data import/export manual](https://cran.r-project.org/doc/manuals/r-release/R-data.html) and the [**rio**](https://github.com/leeper/rio) package.