Rough notes for import & transform
@@ -1,14 +1,15 @@
 <li><a href="intro.html">Introduction</a></li>
-<li><a href="tidy.html">Tidy data</a></li>
 <!--
-<li class="dropdown-header">R for Data Science</li>
-<li><a href="intro.html">Introduction to data science</a></li>
-<li><a href="visualize.html">Visualize data</a></li>
-<li><a href="transform.html">Transform data</a></li>
-<li><a href="tidy.html">Tidy data</a></li>
-<li><a href="import.html">Import data</a></li>
-<li><a href="dates.html">Dates and times</a></li>
-<li><a href="strings.html">Regular expresssions</a></li>
-<li><a href="models.html">Model data</a></li>
-<li><a href="reports.html">Reporting and reproducible research</a></li>
+<li><a href="visualize.html">Visualize</a></li>
+-->
+<li><a href="transform.html">Transform</a></li>
+<!--
+<li><a href="strings.html">Regular expresssions</a></li>
+<li><a href="dates.html">Dates and times</a></li>
+-->
+<li><a href="tidy.html">Tidy</a></li>
+<li><a href="import.html">Import</a></li>
+<!--
+<li><a href="models.html">Model</a></li>
+<li><a href="communicate.html">Communicate</a></li>
 -->

import.Rmd (Normal file, 24 lines added)
@@ -0,0 +1,24 @@
+---
+layout: default
+title: Data import
+output: bookdown::html_chapter
+---
+
+## Overview
+
+You can't apply any of the tools you've applied so far to your own work, unless you can get your own data into R. In this chapter, you'll learn how to import:
+
+* Flat files (like csv) with readr.
+* Database queries with DBI.
+* Data from web APIs with httr.
+* Binary file formats (like excel or sas), with haven and readxl.
+
+## Flat files
+
+## Databases
+
+## Web APIs
+
+## Binary files
+
+Needs to discuss how data types in different languages are converted to R. Similarly for missing values.

transform.Rmd (Normal file, 27 lines added)
@@ -0,0 +1,27 @@
+---
+layout: default
+title: Data transformation
+output: bookdown::html_chapter
+---
+
+
+## Missing values
+
+* Why `NA == NA` is not `TRUE`
+* Why default is `na.rm = FALSE`.
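A minimal base-R sketch (not part of this commit) of the two points in the list above. The vector `x` and the variable names are made up for illustration. The usual explanation is that `NA` stands for an unknown value, so comparing two unknowns gives an unknown answer, and that `na.rm = FALSE` is the default so that missing data is never dropped silently.

```r
# NA means "value unknown", so asking whether two unknowns are equal
# yields another unknown rather than TRUE.
na_cmp <- NA == NA        # NA
is_missing <- is.na(NA)   # TRUE: is.na() is the reliable test for missingness

x <- c(1, 2, NA, 4)       # hypothetical data with one missing value

# By default the missing value propagates, which makes the problem visible.
total_default <- sum(x)   # NA

# Dropping missing values has to be requested explicitly.
total_clean <- sum(x, na.rm = TRUE)   # 7
```

Keeping the removal explicit means a reader of the code can see exactly where missing values were discarded.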
+
+## Data types
+
+Overview of different data types and useful summary functions for working with them. Strings and dates covered in more detail in future chapters.
+
+Need to mention `typeof()` vs. `class()` mostly in context of how date/times and factors are built on top of simpler structures.
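A short base-R sketch (not part of this commit) of the `typeof()` vs. `class()` distinction noted above. The example values are arbitrary. `typeof()` reports the underlying storage, while `class()` reports the behaviour layered on top of it.

```r
# A Date is a double (days since 1970-01-01) with a class attribute.
d <- as.Date("2016-01-01")
date_type  <- typeof(d)              # "double"
date_class <- class(d)               # "Date"
date_raw   <- unclass(d)             # 16801

# A date-time (POSIXct) is a double counting seconds since 1970-01-01 UTC.
dt <- as.POSIXct("2016-01-01 12:00:00", tz = "UTC")
dt_type  <- typeof(dt)               # "double"
dt_class <- class(dt)                # "POSIXct" "POSIXt"

# A factor is an integer vector of codes plus a "levels" attribute.
f <- factor(c("a", "b", "a"))
factor_type   <- typeof(f)           # "integer"
factor_class  <- class(f)            # "factor"
factor_codes  <- as.integer(f)       # 1 2 1
factor_levels <- levels(f)           # "a" "b"
```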
+
+### Logical
+
+When used with numeric functions, `TRUE` is converted to 1 and `FALSE` to 0. This makes `sum()` and `mean()` particularly useful: `sum(x)` gives the number of `TRUE`s in `x`, and `mean(x)` gives the proportion.
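A small sketch (not part of this commit) of the counting idiom described above, using a made-up numeric vector and a hypothetical threshold of 4.

```r
values <- c(3, 8, 1, 9, 5)     # hypothetical measurements
is_big <- values > 4           # FALSE TRUE FALSE TRUE TRUE

n_big    <- sum(is_big)        # 3: how many values exceed 4
prop_big <- mean(is_big)       # 0.6: what proportion exceed 4
```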
+
+### Numeric (integer and double)
+
+### Strings (and factors)
+
+### Date/times