Whole game feedback from O'Reilly (#1057)

* Redraw data science process diagrams

* Polishing the whole game

* Add reference to TMWR

* Respond to visualization feedback

* Minor changes

* Better integrate workflow-scripts chapter

* Minor getting help polishing

* Update investing in yourself links

* Redraw RStudio screenshots

* More scripts/projects polishing

Co-authored-by: Mine Cetinkaya-Rundel <cetinkaya.mine@gmail.com>
Hadley Wickham
2022-08-29 06:24:32 -04:00
committed by GitHub
parent 7d4f86ca66
commit 21e31429a5
45 changed files with 298 additions and 207 deletions

@@ -14,17 +14,23 @@ After reading this book, you'll have the tools to tackle a wide variety of data
Data science is a huge field, and there's no way you can master it all by reading a single book.
The goal of this book is to give you a solid foundation in the most important tools, and enough knowledge to find the resources to learn more when necessary.
Our model of the tools needed in a typical data science project looks something like this:
Our model of the tools needed in a typical data science project looks something like @fig-ds-diagram.
```{r}
#| label: fig-ds-diagram
#| echo: false
#| fig-align: "center"
#| fig-cap: >
#| In our model of the data science process you start with data import
#| and tidying. Next you understand your data with an iterative cycle of
#| transforming, visualizing, and modeling. You finish the process
#| by communicating your results to other humans.
#| fig-alt: >
#| A diagram displaying the data science cycle: Import -> Tidy -> Understand
#| (which has the phases Transform -> Visualize -> Model in a cycle) ->
#| Communicate. Surrounding all of these is Communicate.
#| out.width: NULL
knitr::include_graphics("diagrams/data-science.png")
knitr::include_graphics("diagrams/data-science/base.png", dpi = 270)
```
First you must **import** your data into R.
@@ -77,10 +83,13 @@ There are a number of important topics that this book doesn't cover.
We believe it's important to stay ruthlessly focused on the essentials so you can get up and running as quickly as possible.
That means this book can't cover every important topic.
### Modelling
### Modeling
<!--# TO DO: Say a few sentences about modelling. -->
To learn more about modeling, we highly recommend [Tidy Modeling with R](https://www.tmwr.org), by our colleagues Max Kuhn and Julia Silge.
This book will teach you the tidymodels family of packages, which, as you might guess from the name, share many conventions with the tidyverse packages we use in this book.
### Big data
This book proudly focuses on small, in-memory datasets.
@@ -150,20 +159,22 @@ When a new version is available, RStudio will let you know.
It's a good idea to upgrade regularly so you can take advantage of the latest and greatest features.
For this book, make sure you have at least RStudio 2022.02.0.
When you start RStudio, you'll see two key regions in the interface: the console pane, and the output pane.
```{r}
#| echo: false
#| fig-align: "center"
#| fig-alt: >
#| The RStudio IDE with the panes Console and Output highlighted.
knitr::include_graphics("diagrams/rstudio-console.png")
```
When you start RStudio, @fig-rstudio-console, you'll see two key regions in the interface: the console pane, and the output pane.
For now, all you need to know is that you type R code in the console pane, and press enter to run it.
You'll learn more as we go along!
```{r}
#| label: fig-rstudio-console
#| echo: false
#| out-width: ~
#| fig-cap: >
#| The RStudio IDE has two key regions: type R code in the console pane
#| on the left, and look for plots in the output pane on the right.
#| fig-alt: >
#| The RStudio IDE with the panes Console and Output highlighted.
knitr::include_graphics("diagrams/rstudio/console.png", dpi = 270)
```
### The tidyverse
You'll also need to install some R packages.