Reorders outline. Model section becomes do science section.
parent 95c96704c1
commit 9137dd91b5
@@ -1,29 +1,30 @@
<li><a href="intro.html">Introduction</a></li>

<li class="dropdown-header"><a href="essentials.html">Understand data</a></li>
<li class="dropdown-header"><a href="understand.html">Understand your data</a></li>
<li><a href="visualize.html">Visualize</a></li>
<li><a href="transform.html">Transform</a></li>
<li><a href="model.html">Model</a></li>
<li><a href="eda.html">Explore</a></li>
<li><a href="variation.html">Variation</a></li>

<li class="dropdown-header">Work with your data</li>
<li class="dropdown-header"><a href="work.html">Work with your data</a></li>
<li><a href="import.html">Import</a></li>
<li><a href="tidy.html">Tidy</a></li>
<li><a href="relational-data.html">Relational data</a></li>
<li><a href="strings.html">Strings</a></li>
<li><a href="datetimes.html">Dates and times</a></li>

<li class="dropdown-header">Save time with programming</li>
<li class="dropdown-header"><a href="program.html">Save time with programming</a></li>
<li><a href="functions.html">Express yourself with code</a></li>
<li><a href="data-structures.html">Data structures</a></li>
<li><a href="lists.html">Lists and functional programming</a></li>
<li><a href="robust-code.html">Robust code</a></li>

<li class="dropdown-header">Model data to get deeper insights</li>
<li><a href="model-vis.html">Models and visualisation</a></li>
<li><a href="model-assess.html">Model assessment</a></li>
<li class="dropdown-header"><a href="science.html">Do science with data</a></li>
<li><a href="model-vis.html">Discover</a></li>
<li><a href="model-assess.html">Predict</a></li>
<li>Test</li>

<li class="dropdown-header">Communicate your results</li>
<li class="dropdown-header"><a href="communicate.html">Communicate your work</a></li>
<li><a href="rmarkdown.html">R Markdown</a></li>
<li><a href="shiny.html">Shiny</a></li>

@@ -0,0 +1,6 @@
---
layout: default
title: Communicate your work
---

Reproducible, literate code is the data science equivalent of the scientific report (i.e., Introduction, Methods and materials, Results, Discussion).
@@ -1,28 +0,0 @@
---
layout: default
title: Essentials
---

If you measure any quantity twice, and precisely enough, you will get two different results. This is true even for quantities that should be constant, like the speed of light (below).

This phenomenon, called _variation_, is the beginning of data science. To understand anything you must decipher patterns of variation. But variation does more than just obscure; it is also an incredibly useful tool. Patterns of variation provide evidence of causal relationships.

The best way to study variation is to collect data, particularly rectangular data: data that is made up of variables, observations, and values.

* A _variable_ is a quantity, quality, or property that you can measure.

* A _value_ is the state of a variable when you measure it. The value of a
variable may change from measurement to measurement.

* An _observation_ is a set of measurements you make under similar conditions
(usually all at the same time or on the same object). Observations contain
values that you measure on different variables.

Rectangular data provides a clear record of variation, but that doesn't mean it is easy to understand. The human mind isn't built to process tables of data. This section will show you the best ways to comprehend your own data, which is the most important challenge of data science.

```{r, echo = FALSE}

mat <- as.data.frame(matrix(morley$Speed + 299000, ncol = 10))

knitr::kable(mat, caption = "*The speed of light is* the *universal constant, but variation obscures its value, here demonstrated by Albert Michelson in 1879. Michelson measured the speed of light 100 times and observed 30 different values (in km/sec).*", col.names = c("\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s"))
```
@@ -0,0 +1,6 @@
---
layout: default
title: Save time by programming
---

Computer-human communication matters.
@@ -0,0 +1,6 @@
---
layout: default
title: Do science with data
---

The scientific method guides data science. Data science solves known problems with the scientific method.
@@ -0,0 +1,6 @@
---
layout: default
title: Understand your data
---

Data poses a cognitive problem; data comprehension is a skill.
@@ -1,18 +1,45 @@
---
layout: default
title: Exploratory data analysis
title: Variation
---

# Exploratory data analysis
# Variation

```{r, include = FALSE}
library(ggplot2)
knitr::opts_chunk$set(
  cache = TRUE,
  fig.path = "figures/eda/"
  fig.path = "figures/variation/"
)
```


If you measure any quantity twice, and precisely enough, you will get two different results. This is true even for quantities that should be constant, like the speed of light (below).

This phenomenon, called _variation_, is the beginning of data science. To understand anything you must decipher patterns of variation. But variation does more than just obscure; it is also an incredibly useful tool. Patterns of variation provide evidence of causal relationships.

The best way to study variation is to collect data, particularly rectangular data: data that is made up of variables, observations, and values.

* A _variable_ is a quantity, quality, or property that you can measure.

* A _value_ is the state of a variable when you measure it. The value of a
variable may change from measurement to measurement.

* An _observation_ is a set of measurements you make under similar conditions
(usually all at the same time or on the same object). Observations contain
values that you measure on different variables.

Rectangular data provides a clear record of variation, but that doesn't mean it is easy to understand. The human mind isn't built to process tables of data. This section will show you the best ways to comprehend your own data, which is the most important challenge of data science.

```{r, echo = FALSE}

mat <- as.data.frame(matrix(morley$Speed + 299000, ncol = 10))

knitr::kable(mat, caption = "*The speed of light is* the *universal constant, but variation obscures its value, here demonstrated by Albert Michelson in 1879. Michelson measured the speed of light 100 times and observed 30 different values (in km/sec).*", col.names = c("\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s", "\\s"))
```



***

*Tip*: Throughout this section, we will rely on a distinction between two types of variables: