[R] Cacheing of functions from libraries other than the base in Rmarkdown

Jeff Newmiller jdnewmil using dcn.davis.ca.us
Sun Sep 19 20:38:21 CEST 2021


You should Google "r cache" yourself, but I have used memoise, R.cache, drake, and targets, and I rate targets as #1 and R.cache as #2.
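
If you want a feel for what "doing it yourself" looks like before picking a package, here is a minimal sketch in base R only. The helper name cached(), the cache folder, and the key "summary_v1" are all made up for illustration and are not from any of the packages above; it simply saves a result with saveRDS() and reads it back with readRDS() on later runs.

```r
# Hypothetical helper (not part of any package): evaluate `expr` once and
# reuse the saved result on later calls that use the same `key`.
cache_dir <- file.path(tempdir(), "my_cache")  # assumed location; use a project folder in practice
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

cached <- function(key, expr) {
  path <- file.path(cache_dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))  # cache hit: `expr` is never evaluated (lazy evaluation)
  }
  value <- expr            # cache miss: force the promise, i.e. run the computation
  saveRDS(value, path)
  value
}

slow_summary <- function(x) {
  Sys.sleep(0.2)           # stand-in for an expensive computation
  c(n = length(x), mean = mean(x))
}

first  <- cached("summary_v1", slow_summary(1:10))  # computed and saved
second <- cached("summary_v1", slow_summary(1:10))  # read back from disk
```

Note that nothing here notices when the inputs or the code change; you have to change the key (or delete the file) by hand. Tracking that for you is most of what a package like targets adds.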

If you try to retrieve old cache objects (more than a few weeks old, say) you are likely to run into package/class changes that can cause the kind of issues you are having. As a separate task from caching, archive results in an interchange format like csv, parquet, or feather to future-proof your work.
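
As a rough illustration of the archiving step, the sketch below writes a small made-up data frame to csv with base R and reads it back. The data, the file name, and the use of tempdir() are assumptions for the example only. For parquet or feather you would need an extra package such as arrow (write_parquet() and write_feather()), which I have left out so this runs with base R alone.

```r
# Made-up results standing in for whatever your analysis produces.
results <- data.frame(
  id    = c("a", "b", "c"),
  score = c(1.5, 2.25, 3),
  stringsAsFactors = FALSE
)

# Assumed path; in practice write into an archive folder in your project.
archive_file <- file.path(tempdir(), "results.csv")

# Plain csv does not depend on any package's class definitions, so it can
# still be read after packages have been upgraded.
write.csv(results, archive_file, row.names = FALSE)

# Read it back in a later session.
restored <- read.csv(archive_file, stringsAsFactors = FALSE)
```

csv only keeps plain columns, so factors, dates, and attributes come back as character or numeric and have to be rebuilt after reading.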

On September 19, 2021 10:49:50 AM PDT, Chris Evans <chrishold using psyctc.org> wrote:
>Can you point me to an example of this?  I definitely need cacheing for this work but I don't know
>about data cacheing packages.  Might be one of those things where my learning time might outweigh
>time saved but I lost a fair bit of time by being stupid with this so perhaps not.
>
>----- Original Message -----
>> From: "Jeff Newmiller" <jdnewmil using dcn.davis.ca.us>
>> To: r-help using r-project.org, "Charles Berry" <ccberry using health.ucsd.edu>, "Chris Evans" <chrishold using psyctc.org>
>> Cc: "R-help" <R-help using r-project.org>
>> Sent: Sunday, 19 September, 2021 19:45:03
>> Subject: Re: [R] Cacheing of functions from libraries other than the base in Rmarkdown
>
>> I avoid knitr (Rmarkdown uses knitr) caching like the plague. If I want caching,
>> I do it myself (with or without the aid of a data caching package).
>> 
>> On September 19, 2021 10:28:49 AM PDT, "Berry, Charles"
>> <ccberry using health.ucsd.edu> wrote:
>>>Chris,
>>>
>>>
>>>> On Sep 18, 2021, at 12:26 PM, Chris Evans <chrishold using psyctc.org> wrote:
>>>> 
>>>> This question may belong somewhere else, if so, please signpost me and accept
>>>> apologies.
>>>> 
>>>> What is happening is that I have a large (for me, > 3k lines) Rmarkdown file
>>>> with many R code blocks (no other code or
>>>> engine is used) working on some large datasets.  I have some inline r like
>>>> 
>>>>   There are `r n_distinct(tibDat$ID)` participants and `r nrow(tibDat)` rows of
>>>>   data.
>>>> 
>>>> What I am finding is that even if one knit has worked fine and I change
>>>> something somewhere and knit again, the second
>>>> knit is often failing with an error like
>>>> 
>>>>   n_distinct(tibDat$ID) : could not find function "n_distinct"
>>>> 
>>>> This is not happening for functions like nrow() from base R and it mostly seems
>>>> to happen to functions from the tidyverse.
>>>> 
>>>> I think what is happening is some sort of cache corruption presumably caused by
>>>> the memory demands.  I am pretty sure I've
>>>> seen this before but a long time ago and dealt with it by deleting the files and
>>>> cache folders created by the knit.
>>>
>>>Caching things that depend on libraries is known to be tricky.
>>>
>>>Specifically, it is advised that "loading packages via library() in a cached
>>>chunk and these packages will be used by uncached chunks" is something you
>>>should not do.  I suspect that this is the problem with your inline chunk.
>>>
>>>I have to reread things like:
>>>
>>>	https://yihui.org/knitr/demo/cache/
>>>
>>>and relevant parts of the manual to be sure I didn't mess something up and maybe
>>>you should look at that and the manual yet another time.
>>>
>>>HTH,
>>>
>>>Chuck
>>>
>>>______________________________________________
>>>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>> 
>> --
>> Sent from my phone. Please excuse my brevity.
>

-- 
Sent from my phone. Please excuse my brevity.


