# Workflow: code style {#workflow-style}
```{r, results = "asis", echo = FALSE}
status("drafting")
```
Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread.
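For the full set of conventions, see the tidyverse style guide at <https://style.tidyverse.org>, in particular its chapter on pipes at <https://style.tidyverse.org/pipes.html>.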
- Variable names should use only lowercase letters, numbers, and `_`.
Use underscores (`_`) (so-called snake case) to separate words within a name.
It's better to use an overly long name than an overly short one.
Autocomplete means that long names cost you little extra typing.
- Do not put spaces inside or outside parentheses for regular function calls.
Always put a space after a comma, never before, just like in regular English.
- Most operators ([`==`](https://rdrr.io/r/base/Comparison.html), [`+`](https://rdrr.io/r/base/Arithmetic.html), [`-`](https://rdrr.io/r/base/Arithmetic.html), [`<-`](https://rdrr.io/r/base/assignOps.html), etc.) should be surrounded by spaces; the chief exception is `^`.
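Here's a minimal sketch of the naming and spacing rules so far. The data and the names (`flight_delays`, `mean_delay_hours`, and so on) are made up purely for illustration.

```{r, eval = FALSE}
# Hypothetical delays in minutes, invented for this example.
flight_delays <- c(12, 45, 3, 60)

# Strive for: snake_case names, spaces around most operators,
# a space after each comma, and no spaces around `^`.
mean_delay_hours <- mean(flight_delays, na.rm = TRUE) / 60
squared_delays <- flight_delays^2

# Avoid: a cryptic name and cramped or misplaced spaces.
# This line computes the same value, but it is much harder to scan.
MDH<-mean ( flight_delays ,na.rm=TRUE )/60
```

Both versions run and give the same answer, which is exactly why style is easy to neglect: R doesn't care, but your readers do.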
- `%>%` should always have a space before it, and should usually be followed by a new line.
After the first step, each line should be indented by two spaces.
This structure makes it easier to add new steps (or rearrange existing steps) and harder to overlook a step.
- In a pipeline, put each function on its own line.
If the function has named arguments (`=`), put each argument on its own line.
```{r, eval = FALSE}
df |> mutate(y = x + 1)
# vs
df |>
  mutate(
    y = x + 1
  )
```
Put the closing parenthesis on its own line, lined up with the start of the function name.
- It's ok to add extra spaces if it improves alignment of [`=`](https://rdrr.io/r/base/assignOps.html).
- For very short snippets that fit on one line, it's ok to write (e.g.) `mutate(df, y = x + 1)` instead of `df %>% mutate(y = x + 1)`.
But in my experience, short snippets often grow longer, so you'll save time in the long run by starting out how you wish to continue.
- Use empty lines to organize your code into "paragraphs" of related thoughts.
- The same rules apply when a pipeline ends in a ggplot2 plot:
```{r, eval = FALSE}
df |>
  ggplot(aes())
```
Don't forget to switch from the pipe to `+` once you start adding layers!
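To make the switch to plus concrete, here's a sketch using the built-in `mtcars` data. The filter and the aesthetics are arbitrary choices for illustration, and the name `mpg_plot` is made up. It assumes dplyr and ggplot2 are installed.

```{r, eval = FALSE}
library(dplyr)
library(ggplot2)

# Data steps are joined with the pipe; plot layers are joined with `+`.
# Every line after the first is indented by two spaces either way.
mpg_plot <- mtcars |>
  filter(cyl != 6) |>
  ggplot(aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

The pipe binds more tightly than `+`, so the data steps finish before the layers are added to the plot.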
- How long should your pipes be?
There's a balance between too long and too short.
If a pipe is longer than (say) ten steps, break it up by creating intermediate objects with meaningful names.
That makes debugging easier, because you can more easily check the intermediate results, and it makes your code easier to understand, because the variable names help communicate intent.
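As a sketch of what an intermediate object looks like, here's one pipe split in two, again with the built-in `mtcars` data and assuming dplyr is installed. The pipe is far shorter than ten steps to save space, and the names `heavy_cars` and `heavy_cars_by_cyl`, the weight cutoff, and the unit conversion are invented for the example.

```{r, eval = FALSE}
library(dplyr)

# Step 1: keep the heavier cars and add fuel economy in km per litre
# (1 mpg is roughly 0.425 km/l).
heavy_cars <- mtcars |>
  filter(wt > 3.5) |>
  mutate(kpl = mpg * 0.425)

# Step 2: summarize the intermediate result by number of cylinders.
heavy_cars_by_cyl <- heavy_cars |>
  group_by(cyl) |>
  summarize(mean_kpl = mean(kpl), n = n())
```

If the summary looks wrong, you can inspect `heavy_cars` on its own before looking at the second step.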
- Whenever you can give something an informative name, you should give it an informative name.
Don't expect to get it right the first time!
It's highly recommended to regularly spend some time just working on the clarity of your code.
The results might be exactly the same but it's not wasted effort: when you come back to the code in the future, you'll find it easier to remember what you did and easier to adapt to new demands.
- Strive to limit your code to 80 characters per line.
This fits comfortably on a printed page with a reasonably sized font.
If you find yourself running out of room, this is a good indication that you should encapsulate some of the work in a separate function.
- You can restyle existing code automatically with the styler package.
- In data analysis code, use comments to record important findings and analysis decisions.
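For the styler item above, here's a small sketch of restyling a single line of code passed in as text. It assumes the styler package is installed; the messy line and the names `messy_code` and `tidy_code` are made up for illustration.

```{r, eval = FALSE}
library(styler)

# A deliberately messy line, passed to styler as a string.
messy_code <- "mean_delay<-mean( c(12,45,3),na.rm=TRUE )"

# style_text() returns the restyled code without touching any file.
tidy_code <- style_text(messy_code)
tidy_code
```

`style_text()` is handy for experimenting; `style_file()` applies the same restyling to a file on disk, so make sure your work is saved or under version control first.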