Rmd workflow proofing
parent 891ab1d04d
commit 8ef3120b97
@@ -1,24 +1,24 @@
 # R Markdown workflow
 
-Earlier, we discussed a basic workflow for capturing your R code where you work interactively in the _console_, then capture what works in the _script editor_. R Markdown effectively puts the console and the script editor in same place, blurring the lines between interactive exploration and long-term code capture. You can rapidly iterate within a chunk, editing and re-executing with Cmd/Ctrl + Shift + Enter. When you're happy, you move on and start a new chunk.
+Earlier, we discussed a basic workflow for capturing your R code where you work interactively in the _console_, then capture what works in the _script editor_. R Markdown brings together the console and the script editor, blurring the lines between interactive exploration and long-term code capture. You can rapidly iterate within a chunk, editing and re-executing with Cmd/Ctrl + Shift + Enter. When you're happy, you move on and start a new chunk.
 
-R Markdown is also important because it so tightly integrates prose and code. This makes it a great __analysis notebook__ because it lets you develop code and record your thoughts. An analysis notebook shares many of the same goals as a classic lab notebook in the physical sciences:
+R Markdown is also important because it so tightly integrates prose and code. This makes it a great __analysis notebook__ because it lets you develop code and record your thoughts. An analysis notebook shares many of the same goals as a classic lab notebook in the physical sciences. It:
 
-* Record what you did and why you did it. Regardless of how great your
+* Records what you did and why you did it. Regardless of how great your
 memory is, if you don't record what you do, there will come a time when
 you have forgotten important details. Write them down so you don't forget!
 
-* To support rigorous thinking. You are more likely to come up with a strong
+* Supports rigorous thinking. You are more likely to come up with a strong
 analysis if you record your thoughts as you go, and continue to reflect
 on them. This also saves you time when you eventually write up your
 analysis to share with others.
 
-* To help others understand your work. It is rare to do data analysis by
+* Helps others understand your work. It is rare to do data analysis by
 yourself, and you'll often be working as part of a team. A lab notebook
 helps you share not only what you've done, but why you did it with your
 colleagues or lab mates.
 
-Much of the good advice about using lab notebooks effectively can also be translated to analysis notebooks. I've drawn on my own experiences and Colin Purrington's advice on lab notebooks <http://colinpurrington.com/tips/lab-notebooks> to come up with the following list of tips:
+Much of the good advice about using lab notebooks effectively can also be translated to analysis notebooks. I've drawn on my own experiences and Colin Purrington's advice on lab notebooks (<http://colinpurrington.com/tips/lab-notebooks>) to come up with the following tips:
 
 * Ensure each notebook has a descriptive title, an evocative filename, and a
 first paragraph that briefly describes the aims of the analysis.
@@ -38,23 +38,26 @@ Much of the good advice about using lab notebooks effectively can also be transl
 leave it in the notebook. That will help you avoid going down the same
 dead end when you come back to the analysis in the future.
 
-* Generally, you're better off doing data entry outside of R. If you need
-a small snippet of data, clearly lay it out using `tibble::tribble()`.
+* Generally, you're better off doing data entry outside of R. But if you
+do need to record a small snippet of data, clearly lay it out using
+`tibble::tribble()`.
 
 * If you discover an error in a data file, never modify it directly, but
 instead write code to correct the value. Explain why you made the fix.
 
-* Before you finish for the day, make sure you can compile the notebook
-using knitr (aftering clearing caches if you're using them). That will
+* Before you finish for the day, make sure you can knit the notebook
+(if you're using caching, make sure to clear the caches). That will
 let you fix any problems while the code is still fresh in your mind.
 
 * If you want your code to be reproducible in the long-run (i.e. so you can
 come back to run it next month or next year), you'll need to track the
 versions of the packages that your code uses. A rigorous approach is to use
-__packrat__, <http://rstudio.github.io/packrat/>, which stores packages in
-your project directory. A quick and dirty hack is to include a chunk that
-runs `sessionInfo()` --- that won't let easily recreate your packages as
-they are today, but at least you know what they were.
+__packrat__, <http://rstudio.github.io/packrat/>, which stores packages
+in your project directory, or __checkpoint__,
+<https://github.com/RevolutionAnalytics/checkpoint>, which will reinstall
+packages available on a specified date. A quick and dirty hack is to include
+a chunk that runs `sessionInfo()` --- that won't let you easily recreate
+your packages as they are today, but at least you'll know what they were.
 
 * You are going to create many, many, many analysis notebooks over the course
 of your career. How are you going to organise them so you can find them
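Three of the tips in the changed text (laying out a small data snippet with `tibble::tribble()`, correcting a data error in code instead of editing the file, and recording package versions with `sessionInfo()`) can be sketched as a single notebook chunk. This is only an illustrative sketch: the `measurements` table, its column names, its values, and the reason given for the correction are all made up for the example and are not part of the commit.

```r
library(tibble)

# A small, hand-entered data set laid out with tribble().
# All names and values here are hypothetical.
measurements <- tribble(
  ~sample, ~site,   ~weight_g,
  "a1",    "north",      12.4,
  "a2",    "north",     131.0,
  "b1",    "south",      11.9
)

# Sample a2 was entered as 131.0 g. Suppose the paper form says 13.1 g:
# fix the value in code, with the reason recorded next to the fix,
# rather than editing the source data by hand.
measurements$weight_g[measurements$sample == "a2"] <- 13.1

# Quick and dirty record of the R version and package versions in use.
sessionInfo()
```

Keeping the correction in the notebook means the original data stays untouched and every knit reapplies (and documents) the fix.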