[R] "found 4 marked UTF-8 strings" during check of package... but where !

Duncan Murdoch murdoch.duncan at gmail.com
Fri Mar 10 20:06:41 CET 2017

On 10/03/2017 10:44 AM, Marc Girondot via R-help wrote:
> Thanks Duncan and Michael,
> Indeed, I have a data file containing UTF-8 characters. In the
> DESCRIPTION, I have the line Encoding: UTF-8
> but that does not seem to be sufficient.

That line describes how text files in your package are to be interpreted.

> In each R page for these data, I also have:
> #' @docType data
> #' @encoding UTF-8

R ignores those lines, but presumably you're running Roxygen2, which 
will use them when it produces the .Rd files for the help topics.  They 
have nothing to do with the data itself.

> But I still get the notes during the check when I submit the package
> to CRAN (not in a local --as-cran check).
> How can I "say" that these data contain UTF-8 characters?

If those are intentional, just say so when you submit to CRAN in the 
submission comments, but do read the comments about portability in 
section 1.6.3 of the Writing R Extensions manual.
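
If guessing which object is responsible proves hard, a short script can 
list the candidates.  What follows is only a minimal sketch, not the code 
that R CMD check runs: find_marked_utf8 is a hypothetical helper name, and 
the loop assumes the data sets are .rda or .RData files under data/ in the 
package source directory.  It inspects character vectors, factor levels 
and list elements (which covers data frame columns), so strings stored in 
names or other attributes would be missed.

```r
# Hypothetical helper (not part of R or the tools package): recursively
# collect the strings in an object that are marked as UTF-8.
find_marked_utf8 <- function(x) {
  found <- character(0)
  if (is.character(x)) {
    xx <- unname(unclass(x))
    found <- xx[Encoding(xx) == "UTF-8"]
  } else if (is.factor(x)) {
    found <- find_marked_utf8(levels(x))
  } else if (is.list(x)) {
    # Data frames are lists, so this branch visits every column.
    found <- as.character(unlist(lapply(unname(x), find_marked_utf8)))
  }
  found
}

# Assumption: the working directory is the package source directory and
# the data sets are stored as .rda/.RData files in data/.
files <- list.files("data", pattern = "\\.(rda|RData)$", full.names = TRUE)
for (f in files) {
  env <- new.env()
  load(f, envir = env)
  for (nm in ls(env, all.names = TRUE)) {
    hits <- unique(find_marked_utf8(get(nm, envir = env)))
    if (length(hits)) {
      cat(f, ":", nm, ":", paste(hits, collapse = ", "), "\n")
    }
  }
}
```

Any object it prints is a candidate for the strings the NOTE is counting.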

Duncan Murdoch

P.S. This question doesn't belong in R-help, it belongs in 
R-package-devel.  If you have any followup questions, please post them there.

> Thanks
> Marc
> On 10/03/2017 at 15:24, Duncan Murdoch wrote:
>> On 10/03/2017 2:52 AM, Marc Girondot via R-help wrote:
>>> Dear members,
>>> I want to submit to CRAN a new version of a package that I maintain.
>>> When I check locally with "--as-cran", no notes or errors are reported,
>>> but the link after submission reports several notes and one warning:
>>> For example:
>>> using R Under development (unstable) (2017-03-05 r72309)
>>> using platform: x86_64-apple-darwin16.4.0 (64-bit)
>>> using session charset: UTF-8
>>> ...
>>> checking extension type ... Package
>>> this is package ‘embryogrowth’ version ‘6.4’
>>> package encoding: UTF-8
>>> ...
>>> checking data for non-ASCII characters ... NOTE
>>>    Note: found 4 marked UTF-8 strings
>>> I have the same with
>>> using R version 3.3.0 (2016-05-03)
>>> using platform: x86_64-apple-darwin13.4.0 (64-bit)
>>> but not with some others such as r-devel-linux-x86_64-debian-gcc
>>> Based on the message, "Note: found 4 marked UTF-8 strings", it seems
>>> that "4 marked UTF-8 strings" are present in the package and that this
>>> is a problem...
>>> Is there any way to find out which file they are in?
>> It's one containing an object coming from your data directory.
>> R won't give more detail than that, but if you still can't guess, you
>> could get some idea by debugging the check code:
>> debug(tools:::.check_package_datasets)
>> tools:::.check_package_datasets(pkg)
>> where pkg contains the path to the package source code.  That function
>> does the checking one variable at a time.
>> Duncan Murdoch
