R Markdown documents are fully reproducible and support dozens of output formats, like PDFs, Word files, slideshows, and more. They also provide a useful notebook interface for R and are easy to track with version control software like git.
This chapter will show you how to use R Markdown to organize your work as a data scientist.
### Installation
Like the rest of R, R Markdown is free and open source. You can install the rmarkdown package from CRAN with:
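A minimal sketch of the install call, assuming internet access. The `repos` argument names one CRAN mirror and is only needed when no default mirror is configured:

```r
# Install the rmarkdown package and its dependencies (knitr among them) from CRAN
install.packages("rmarkdown", repos = "https://cloud.r-project.org")
```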
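[This](https://github.com/hadley/r4ds/tree/master/rmarkdown-demos/1-example.Rmd) is an R Markdown file, a plain text file that has the extension `.Rmd`. Notice that it contains three types of content: an optional YAML header surrounded by `---` lines, chunks of R code surrounded by ` ``` ` fences, and text mixed with simple text formatting.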
When you open the file in the RStudio IDE, it becomes a [notebook interface for R](http://rmarkdown.rstudio.com/r_notebooks.html). You can run each code chunk by clicking the Run icon (it looks like a play button at the top of the chunk). RStudio executes the code and displays the results inline with your file.
Better still, use the "Knit" button in the RStudio IDE to render the file and preview the output with a single click or keyboard shortcut (⇧⌘K on a Mac, Ctrl+Shift+K on Windows and Linux).
R Markdown generates a new file that contains selected text, code, and results from the .Rmd file. The new file can be a finished web page, PDF, MS Word document, slide show, notebook, handout, book, dashboard, package vignette, or interactive document.
When you run `render()`, R Markdown feeds the .Rmd file to [knitr](http://yihui.name/knitr/), which executes all of the code chunks and creates a new markdown (.md) document that includes the code and its output. The markdown file generated by knitr is then processed by [pandoc](http://pandoc.org/), which is responsible for creating the finished format.
This may sound complicated, but R Markdown makes it extremely simple by encapsulating all of the above processing into a single `render()` function.
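The two stages can be seen by running them separately. The sketch below is illustrative only: it writes a hypothetical throwaway document (`demo.Rmd`, not one of the chapter's demo files) to a temporary directory, calls `knitr::knit()` to produce the intermediate markdown file, and then calls `rmarkdown::render()` to run the whole pipeline. It assumes that the knitr and rmarkdown packages and pandoc are installed.

```r
library(knitr)
library(rmarkdown)

# Build a hypothetical throwaway .Rmd file in a temporary directory.
# The chunk fence is assembled with strrep() so that this example nests
# safely inside a Markdown document.
fence <- strrep("`", 3)
demo_dir <- tempfile("rmd-demo-")
dir.create(demo_dir)
rmd <- file.path(demo_dir, "demo.Rmd")
writeLines(c(
  "---",
  "title: \"Demo\"",
  "output: html_document",
  "---",
  "",
  "Stopping distances from the built-in cars data:",
  "",
  paste0(fence, "{r}"),
  "summary(cars$dist)",
  fence
), rmd)

# Stage 1 on its own: knitr runs the chunks and writes a markdown (.md) file
md <- knit(rmd, output = file.path(demo_dir, "demo.md"), quiet = TRUE)

# Both stages together: render() runs knitr, then hands the result to pandoc
html <- render(rmd, quiet = TRUE)

# File names of the intermediate markdown file and the finished HTML file
basename(c(md, html))
```

Opening `demo.md` in a text editor shows the code and its printed output as ordinary markdown, which is what pandoc converts in the second stage.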
## Get Started
To open a new .Rmd file, open the RStudio IDE and select File > New File > R Markdown... in the menubar. RStudio will launch a wizard that you can use to pre-populate your file with useful content. You can erase or add to the content as you wish.
## Code Chunks
You can quickly insert code chunks into your file with
* the keyboard shortcut **Ctrl + Alt + I** (OS X: **Cmd + Option + I**)
* the Add Chunk icon in the editor toolbar (it looks like a green box with a C in it)
* or by typing the chunk delimiters ` ```{r} ` and ` ``` `.
Test chunks as you write by clicking the "Run Current Chunk" and "Run All Chunks Above" icons at the top of each chunk. R Markdown will run the code in the chunks in your current environment and display the results in your file editor. To turn off this behavior, click the gear icon at the top of the .Rmd file and select "Chunk Output in the Console." RStudio will then run code chunks at the command line as if your .Rmd file were an R Script.
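When you render your .Rmd file with the Knit button, R Markdown runs the code chunks in a fresh R session, so objects in your current environment are not available to the document. (Calling `render()` yourself evaluates the chunks in the calling environment unless you pass a new one through its `envir` argument.) Either way, it runs each code chunk in order and embeds the results beneath the chunk in your final report.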
### Chunk Options
Chunk output can be customized with [knitr options](http://yihui.name/knitr/options/), arguments set in the `{}` of a chunk header. For example, the above file uses five arguments:
* `include = FALSE` prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks.
* `echo = FALSE` prevents code, but not the results, from appearing in the finished file. This is a useful way to embed figures.
* `message = FALSE` prevents messages that are generated by code from appearing in the finished file.
* `warning = FALSE` prevents warnings that are generated by code from appearing in the finished file.
* `fig.cap = "..."` adds a caption to graphical results.
See the [R Markdown Reference Guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf) for a complete list of knitr chunk options.
Knitr provides almost 60 options that you can use to customize your code chunks. Since the options are not associated with an R function, it can be difficult to figure out where to learn about them. The best place is the knitr options web page at <http://yihui.name/knitr/options/>.
You can also find a list of knitr options with concise descriptions in the *R Markdown Reference Guide*, which is available in the RStudio IDE under *Help > Cheatsheets > R Markdown Reference Guide*.
Here are some of the most useful options not listed above.
* `cache = TRUE` caches code results to minimize future computation (caching is explained further below).
* `child = "file.Rmd"` renders a file and inserts the results into the main document at the chunk location.
* `comment = "##"` changes the prefix to put before each line of output.
* `error = FALSE` causes the render to stop if the code returns an error.
* `eval = FALSE` prevents code from being evaluated. Useful for displaying example code.
* `out.width = "50%"` adjusts the width of the plot in the final output file.
* `results = 'hide'` prevents the results, but not the code, from appearing in the final document. Knitr still runs the code.
Knitr will treat each option that you pass to `knitr::opts_chunk$set` as a global default that applies to all chunks, but can be overridden in individual chunk headers.
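A minimal sketch of setting global defaults. The particular options and values are chosen only for illustration:

```r
library(knitr)

# Set global defaults. In an .Rmd file this call usually sits in a first
# "setup" chunk so that it applies to every chunk after it.
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Inspect the current default for one option
opts_chunk$get("echo")
```

A chunk that should still show its code can then set `echo = TRUE` in its own header, which takes precedence over the global default.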
### Caching
If document rendering becomes time-consuming because of long computations, you can use knitr caching to improve performance.
TO DO
## Inline code
Code results can be inserted directly into the *text* of a .Rmd file by enclosing the code with `` `r ` ``. The [file below](rmarkdown-demos/3-inline.Rmd) uses `` `r ` `` twice to call `colorFunc`, which returns "heat.colors." This makes it easy to update the report to refer to another function.
Inline expressions do not take knitr options. When processing inline code, R Markdown will always
* display the results of inline code, but not the code
* apply relevant text formatting to the results
As a result, inline output is indistinguishable from the surrounding text.
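The behavior can be tried outside a full document by passing a one-line string to `knitr::knit()`. This is only a sketch: the sentence and the use of the built-in `cars` data are illustrative, and the backticks are assembled with `paste0()` so that the example is not itself evaluated if it is placed inside an .Rmd file.

```r
library(knitr)

# Assemble the inline expression so the backticks are explicit
tick <- "`"
src <- paste0("The mean speed is ", tick, "r mean(cars$speed)", tick, " mph.")

# knit(text = ...) processes the string as a tiny document and returns the result
out <- knit(text = src, quiet = TRUE)
cat(out)
```

The returned string contains the computed value in place of the expression, with no trace of the code that produced it.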
## Code Languages
[knitr](http://yihui.name/knitr/) can execute code in many languages besides R, which makes it easy to write R Markdown files that use multiple languages. Some of the available language engines include:
Note that chunk options like `echo` and `results` are all valid when using a language engine like Python.
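The set of engines depends on the installed version of knitr. One way to see what your installation supports is to query the engine registry:

```r
library(knitr)

# Names of the language engines registered in the installed version of knitr
engines <- names(knit_engines$get())
sort(engines)
```

To use an engine, put its name in the chunk header in place of `r`, for example `{python}` or `{bash}`.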
## Parameters
R Markdown documents can include one or more parameters whose values can be set when you render the report. For example, the [file below](rmarkdown-demos/5-parameters.Rmd) uses a `data` parameter that determines which data set to plot.
To declare one or more parameters for your file, use the `params` field within the YAML header of the document. Here we create the parameter `data` and assign it the default value `"hawaii"`. R Markdown recognizes the atomic data types: numerics, character strings, logicals, etc. You can also pass an R expression as a parameter by prefacing the parameter value with `!r`, e.g. `!r Sys.Date()`.
Parameters are available within the knit environment as a read-only list named `params`. To access a parameter in code, call `params$<parameter name>`.
Add a `params` argument to `render()` to create a report that uses a different set of parameter values. For example, passing `params = list(data = "aleutians")` makes our report use the `aleutians` data set.
Better yet, click the "Knit with Parameters" option in the dropdown menu next to the RStudio IDE knit button to set parameters, render, and preview the report in a single user-friendly step.
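A minimal sketch of such a call. To stay self-contained it renders a hypothetical temporary file that stands in for `5-parameters.Rmd`: the file declares `data` with the default `"hawaii"` and simply prints the value it receives. It assumes rmarkdown and pandoc are installed.

```r
library(rmarkdown)

# Hypothetical stand-in for 5-parameters.Rmd, written to a temporary file
fence <- strrep("`", 3)
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "output: html_document",
  "params:",
  "  data: \"hawaii\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "print(params$data)",
  fence
), rmd)

# Override the default value of the data parameter for this render only
out <- render(rmd, params = list(data = "aleutians"), quiet = TRUE)
```

With the chapter's own file, the equivalent call would be `render("5-parameters.Rmd", params = list(data = "aleutians"))`.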
By default, R Markdown displays data frames and matrices as they would be in the R terminal (in a monospaced font). If you prefer that data be displayed with additional formatting, you can use the `knitr::kable` function, as in the [.Rmd file below](rmarkdown-demos/6-tables.Rmd).
If you'd like to customize your tables at a deeper level, consider the xtable, stargazer, pander, tables, and ascii packages. Each provides an extensive set of tools for returning formatted tables from R code.
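A short sketch of the difference, using the built-in `mtcars` data. The choice of data set and caption is illustrative and is not taken from `6-tables.Rmd`:

```r
library(knitr)

# Default printing: console-style output in a monospaced font
head(mtcars[, 1:4])

# kable() returns the same data as a Markdown table, which pandoc then
# formats for the final output
tbl <- kable(head(mtcars[, 1:4]), caption = "The first rows of mtcars")
tbl
```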
Format the text in your R Markdown files with Markdown, a set of markup annotations for plain text files. When you render your file, Pandoc transforms the marked up text into formatted text in your final file format:
Markdown is designed to be easy to read and easy to write. It is also very easy to learn. The guide below shows how to use Pandoc's Markdown, a slightly extended version of Markdown that R Markdown understands.
Set the `output_format` argument of `render()` to render your .Rmd file into any of R Markdown's supported formats. For example, the code below renders [1-example.Rmd](rmarkdown-demos/1-example.Rmd) to a Microsoft Word document:
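A minimal sketch of this call. To stay self-contained it renders a hypothetical temporary file in place of `1-example.Rmd`, and it assumes rmarkdown and pandoc are installed:

```r
library(rmarkdown)

# Hypothetical stand-in for 1-example.Rmd, written to a temporary file
fence <- strrep("`", 3)
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Demo\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd)

# With the chapter's file this would be:
# render("1-example.Rmd", output_format = "word_document")
docx <- render(rmd, output_format = "word_document", quiet = TRUE)
basename(docx)
```

The `output_format` argument overrides the `html_document` default declared in the file's header, so the result is a .docx file.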
If you do not select a format, R Markdown renders the file to its default format, which you can set in the `output` field of a .Rmd file's header. The header of [1-example.Rmd](rmarkdown-demos/1-example.Rmd) shows that it renders to an HTML file by default:
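One way to check a file's default format without opening it is to read its header from R. The sketch below uses a hypothetical temporary file whose header mirrors the kind described above. `yaml_front_matter()` is part of rmarkdown.

```r
library(rmarkdown)

# Hypothetical file with an html_document default in its YAML header
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Demo\"",
  "output: html_document",
  "---",
  "",
  "Some text."
), rmd)

# Parse the YAML header into a list and look at the output field
header <- yaml_front_matter(rmd)
header$output
```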
RStudio's knit button renders a file to the first format listed in its `output` field. You can render to additional formats by clicking the dropdown menu beside the knit button:
* [ioslides_presentation](http://rmarkdown.rstudio.com/ioslides_presentation_format.html) - HTML presentation with ioslides
* [revealjs::revealjs_presentation](http://rmarkdown.rstudio.com/revealjs_presentation_format.html) - HTML presentation with reveal.js. Requires the revealjs package.
* [slidy_presentation](http://rmarkdown.rstudio.com/slidy_presentation_format.html) - HTML presentation with W3C Slidy
* [beamer_presentation](http://rmarkdown.rstudio.com/beamer_presentation_format.html) - PDF presentation with LaTeX Beamer
* [flexdashboard::flex_dashboard](http://rmarkdown.rstudio.com/flexdashboard/) - Administrative dashboards. Requires the flexdashboard package.
* [tufte::tufte_handout](http://rmarkdown.rstudio.com/tufte_handout_format.html) - PDF handouts in the style of Edward Tufte. Requires the tufte package.
* [tufte::tufte_html](http://rmarkdown.rstudio.com/tufte_handout_format.html) - HTML handouts in the style of Edward Tufte. Requires the tufte package.
* [tufte::tufte_book](http://rmarkdown.rstudio.com/tufte_handout_format.html) - PDF books in the style of Edward Tufte. Requires the tufte package.
* [html_vignette](http://rmarkdown.rstudio.com/package_vignette_format.html) - R package vignette (HTML)
You can also build [books](https://bookdown.org/), [websites](http://rmarkdown.rstudio.com/rmarkdown_websites.html), and [interactive documents](http://rmarkdown.rstudio.com/authoring_shiny.html) with R Markdown, as described in the sections below.
### Output Options
Each output format is implemented as a function in R. You can customize the output by passing arguments to the function as sub-values of the `output` field. For example, setting `toc: true` and `toc_float: true` beneath `html_document` renders [1-example.Rmd](rmarkdown-demos/1-example.Rmd) with a floating table of contents.
To learn which arguments a format takes, read the format's help page in R, e.g. `?html_document`.
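Because each format is an ordinary R function, the same options can also be supplied from R code. A sketch, assuming the `toc` and `toc_float` arguments of `html_document()`:

```r
library(rmarkdown)

# Build the format object with the arguments that YAML sub-values would set
fmt <- html_document(toc = TRUE, toc_float = TRUE)
class(fmt)

# Pass the object to render() in place of a format name, e.g. (not run here):
# render("1-example.Rmd", output_format = fmt)
```

Setting options in the YAML header keeps them with the document. Setting them in the `render()` call is convenient when a script renders the same file in several configurations.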
## HTML Notebooks
In How It Works, you learned that R Markdown files provide a notebook interface that makes it easy to test and iterate when writing code.
To share this experience with colleagues, simply share your .Rmd file for them to open in their RStudio IDE. If your colleagues do not use R, you can recreate the notebook interface by rendering your file to an HTML notebook with `output: html_notebook`.
R Markdown will create a `.nb.html` version of your file: a self-contained HTML file that contains all current chunk outputs (suitable for display on a website). You can view the .nb.html file in any ordinary web browser, or open it in RStudio. In the latter case, RStudio will extract and open the .Rmd file that underlies the .nb.html file.
### Version Control
One of the major advantages of R Markdown notebooks compared to other notebook systems is that they are plain-text files and therefore work well with version control. I recommend checking both the .Rmd and .nb.html files into version control so that both your source code and output are available to collaborators. However, you can choose to include only the .Rmd file (with a .gitignore that excludes the .nb.html) if you want each collaborator to work with their own private copy of the output.
## Slide Presentations
R Markdown renders to four presentation formats:
* beamer_presentation - PDF presentations with beamer
* ioslides_presentation - HTML presentations with ioslides
* slidy_presentation - HTML presentations with slidy
* revealjs::revealjs_presentation - HTML presentations with reveal.js (requires the revealjs package)
Each format will divide your content into slides automatically, with a new slide beginning at each first- or second-level header.
Insert a horizontal rule (`***`) into your document to create a manual slide break. Create bullet points that display incrementally with `>-`. Here is a version of 1-example.Rmd displayed as a reveal.js slide presentation.
TO DO - IMAGE
## Dashboards
Dashboards are a useful way to communicate large amounts of information visually and quickly. Create one with the `flex_dashboard` output format of the flexdashboard package, as in the [.Rmd file below](rmarkdown-demos/11-dashboard.Rmd):
Flexdashboard makes it easy to organize your content into a visual layout:
* Each Level 1 Header (`#`) begins a new page in the dashboard.
* Each Level 2 Header (`##`) begins a new column.
* Each Level 3 Header (`###`) begins a new box.
You can further modify elements with attributes, as in the `{.sidebar}` above.
Flexdashboard also provides simple tools for creating tabsets, value boxes, and gauges. To learn more about flexdashboard visit <http://rmarkdown.rstudio.com/flexdashboard/>.
## Websites
Use `rmarkdown::render_site()` to render collections of R Markdown documents into a website. Each website requires, in a single directory,
* a YAML file named `_site.yml`, which provides the navigation for the site
* a .Rmd file named `index.Rmd`, which provides the content for the home page of your website
* other .Rmd files to include in the site. Each .Rmd file becomes a page in the website
* any support material
Execute `rmarkdown::render_site("<path to directory>")` to build `_site`, a directory of files ready to deploy as a standalone static website. [This collection of files](rmarkdown-demos/12-website.zip) creates the simple site below
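The requirements above can be sketched end to end. The site below is hypothetical (it is not the contents of `12-website.zip`): two pages and a `_site.yml` with a simple navigation bar, all written to a temporary directory. It assumes rmarkdown and pandoc are installed.

```r
library(rmarkdown)

# A hypothetical two-page site in a temporary directory
site <- tempfile("site-demo-")
dir.create(site)

# _site.yml: the site name and the navigation bar
writeLines(c(
  "name: \"demo-site\"",
  "navbar:",
  "  title: \"Demo Site\"",
  "  left:",
  "    - text: \"Home\"",
  "      href: index.html",
  "    - text: \"About\"",
  "      href: about.html"
), file.path(site, "_site.yml"))

# index.Rmd is the home page; every other .Rmd file becomes its own page
writeLines(c("---", "title: \"Home\"", "---", "", "Welcome to the demo site."),
           file.path(site, "index.Rmd"))
writeLines(c("---", "title: \"About\"", "---", "", "About this site."),
           file.path(site, "about.Rmd"))

# Build the site; the rendered pages land in the _site subdirectory
render_site(site, quiet = TRUE)
list.files(file.path(site, "_site"), pattern = "\\.html$")
```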
Better yet, create an [RStudio Project](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects) for your website directory. RStudio will add a Build tab to the IDE that you can use to build and preview your site.
## Interactive Documents
R Markdown documents are a useful platform for interactive content. You can make your documents interactive in two ways. Add:
1. Interactive JavaScript visualizations based on [htmlwidgets](http://www.htmlwidgets.org/), or
2. Reactive components made with [Shiny](http://shiny.rstudio.com/)
### htmlwidgets
[Htmlwidgets](http://www.htmlwidgets.org/) are R functions that return JavaScript visualizations. You do not need to know any JavaScript to use htmlwidgets. The R functions take care of all of the coding for you. The [document below](rmarkdown-demos/13-htmlwidget.Rmd) uses a [leaflet](http://rstudio.github.io/leaflet/) htmlwidget to create an interactive map.
Htmlwidgets create *client side* interactions. Since htmlwidgets are exported in JavaScript, any common web browser can execute the interactions.
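A short sketch of a leaflet widget, assuming the leaflet package is installed. The coordinates and popup label are illustrative and may differ from those in `13-htmlwidget.Rmd`:

```r
library(leaflet)

# Build an interactive map centered on an illustrative location
# (Maungawhau / Mount Eden in Auckland, New Zealand)
map <- leaflet() %>%
  setView(lng = 174.764, lat = -36.877, zoom = 16) %>%
  addTiles() %>%
  addMarkers(lng = 174.764, lat = -36.877, popup = "Maungawhau")

# Printing the object in a code chunk embeds the map in an HTML output document
map
```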
Learn more about packages that build htmlwidgets at [www.htmlwidgets.org](http://www.htmlwidgets.org/showcase_leaflet.html).
### Shiny
The [Shiny](http://shiny.rstudio.com/) package helps developers build interactive web apps powered by R. You can use components from the Shiny package to turn your R Markdown into such an app. To call Shiny code from an R Markdown document, add `runtime: shiny` to the header, like in [this document](rmarkdown-demos/14-shiny.Rmd).
Shiny does introduce a logistical issue, however: Shiny apps require a special server, known as Shiny Server, when hosted online. You can also run Shiny-powered documents on your local computer by rendering them in your local R session.
R Markdown documents rely on several technologies that go beyond R functions. To help you navigate these, the developers of RStudio have placed three R Markdown references in the RStudio IDE.
* Go to *Help > Cheatsheets > R Markdown Cheat Sheet* to open the main *[R Markdown Cheat Sheet](http://www.rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf)*.
* Go to *Help > Cheatsheets > R Markdown Reference Guide* to open the *[R Markdown Reference Guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)*.
More than this, R Markdown files provide the data science equivalent of [literate programming](https://en.wikipedia.org/wiki/Literate_programming), i.e. literate data science. A literate program mixes code with text that explains what the code does in the context of the program. A .Rmd file mixes code with text that explains what the code does in the context of a data analysis, as well as what the results of the code mean. The result is a reproducible, human-readable record of data science.