From 492007afcaebd0a2863fb8f4a4d9765b16112414 Mon Sep 17 00:00:00 2001
From: hadley
Date: Mon, 22 Aug 2016 15:36:28 -0500
Subject: [PATCH] More on caching

---
 rmarkdown.Rmd | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/rmarkdown.Rmd b/rmarkdown.Rmd
index 60615f6..085becf 100644
--- a/rmarkdown.Rmd
+++ b/rmarkdown.Rmd
@@ -291,9 +291,16 @@ Caching the `processed_data` chunk means that it will get re-run if the dplyr pi

 `dependson` should contain a character vector of *every* chunk that the cached chunk depends on. Knitr will update the results for the cached chunk whenever it detects that one of its dependencies have changed.

-Note that the chunks won't update if `a_very_large_file.csv` changes, because knitr caching only tracks changes within the `.Rmd` file. That means it's a good idea to periodically clear out all your caches and start from scratch by running `knitr::clean_cache()`.
+Note that the chunks won't update if `a_very_large_file.csv` changes, because knitr caching only tracks changes within the `.Rmd` file. If you want to also track changes to that file you can use the `cache.extra` option. This is an arbitrary R expression that will invalidate the cache whenever it changes. A good function to use is `file.info()`: it returns a bunch of information about the file including when it was last modified. Then you can write:

-Note that I've used the advice of [David Robinson](https://twitter.com/drob/status/738786604731490304) for naming these chunks. It's a good idea to name chunks that create objects after the primary object that they create. That makes it easier to understand the `dependson` specification.
+    # chunk 1
+    `r chunk`{r raw_data, cache.extra = file.info("a_very_large_file.csv")}
+    rawdata <- readr::read_csv("a_very_large_file.csv")
+    `r chunk`
+
+As your caching strategies get progressively more complicated, it's a good idea to regularly clear out all your caches with `knitr::clean_cache()`.
+
+Note that I've used the advice of [David Robinson](https://twitter.com/drob/status/738786604731490304) for naming these chunks. It's a good idea to name chunks that create objects after the primary object that they create because that makes it easier to understand the `dependson` specification.

 ### Global options
