[R] input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?

John Kane jrkr|de@u @end|ng |rom gm@||@com
Sun Apr 25 19:46:55 CEST 2021


The tab format seems to read in with no problem.
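
For anyone following along, here is a minimal sketch of reading a tab-delimited file so that its strings come in marked as UTF-8. The temporary file, its header row, and the column names `id` and `name` are made-up stand-ins, not the NAVCO data. Substitute the path to the downloaded "NAVCO 1.3 List.tab" and check whether it actually has a header row.

```r
# Stand-in for the downloaded .tab file: two columns, one non-ASCII value.
# (Hypothetical data; the real file's columns are not shown in this thread.)
tab_path <- tempfile(fileext = ".tab")
writeLines(c("id\tname", "1\tcaf\u00e9"), tab_path, useBytes = TRUE)

# encoding = "UTF-8" marks the strings as UTF-8; it does not re-encode them.
navco_tab <- read.delim(tab_path, encoding = "UTF-8", stringsAsFactors = FALSE)

Encoding(navco_tab$name)   # reports the declared encoding of each string
unlink(tab_path)
```

If the file were in some other encoding, `fileEncoding` (which converts on input) would be the argument to look at instead of `encoding`.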

On Thu, 22 Apr 2021 at 23:08, Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>
> On 22/04/2021 9:25 p.m., Spencer Graves wrote:
> > Hello:
> >
> >
> >         What if anything should I do regarding notes from either "load" or
> > "attach" that, "input string ... cannot be translated to UTF-8, is it
> > valid in 'ANSI_X3.4-1968'?"?
>
> First, ANSI_X3.4-1968 is an official name for a version of ASCII.
> It appears in the file near the start, where I believe it records the
> native encoding in place when the file was written, so readers using a
> different encoding can translate.
>
> Your actual file appears to have been encoded in UTF-8, but not marked
> as such.  You're lucky you read it on macOS, where UTF-8 is the native
> encoding, since the reader probably recognized the bytes weren't ASCII
> bytes (and warned you about that), then just left them alone.  If you
> read that file on Windows you'd likely get junk for those entries.
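
One way to act on this, offered as a sketch rather than a confirmed fix for this particular file: if the bytes really are UTF-8 and only the mark is missing, declaring the encoding with `Encoding<-` should be enough. The data frame and the helper name `mark_utf8` below are made up for illustration, and factor columns (whose labels live in `levels()`) are not handled.

```r
# Hypothetical helper: declare every character column of a data frame to be
# UTF-8. This changes only the encoding mark, never the underlying bytes.
mark_utf8 <- function(df) {
  for (nm in names(df)) {
    if (is.character(df[[nm]])) {
      Encoding(df[[nm]]) <- "UTF-8"
    }
  }
  df
}

# Made-up example: "cafe" with an accented e, given as raw UTF-8 bytes,
# which rawToChar() leaves unmarked.
unmarked <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xc3, 0xa9)))
df <- data.frame(name = unmarked, n = 1L, stringsAsFactors = FALSE)

stopifnot(all(validUTF8(df$name)))  # check before declaring anything
fixed <- mark_utf8(df)
Encoding(fixed$name)
```

Running `validUTF8()` first matters: marking bytes that are not valid UTF-8 would only hide the problem.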
>
> For your interest, here's a dump of the start of your file, after
> gunzipping it:
>
> 00000000  52 44 58 33 0a 58 0a 00  00 00 03 00 03 06 00 00  |RDX3.X..........|
> 00000010  03 05 00 00 00 00 0e 41  4e 53 49 5f 58 33 2e 34  |.......ANSI_X3.4|
> 00000020  2d 31 39 36 38 00 00 04  02 00 00 00 01 00 04 00  |-1968...........|
> 00000030  09 00 00 00 01 78 00 00  03 13 00 00 00 10 00 00  |.....x..........|
> 00000040  02 0e 00 00 02 6e 40 90  0c 00 00 00 00 00 40 90  |.....n@.......@.|
> 00000050  44 00 00 00 00 00 40 10  00 00 00 00 00 00 40 7c  |D.....@.......@||
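
The header fields in this dump can be decoded by hand. The sketch below assumes the layout that the dump suggests and that R's serialization format version 3 uses: a 5-byte magic string, a 2-byte format marker, three big-endian 32-bit integers (format version, writing R version, minimal reading R version), then a length-prefixed native encoding name. The function names `parse_rdata_header` and `unpack_version` are made up here.

```r
# R packs a version a.b.c into one integer as a * 65536 + b * 256 + c.
unpack_version <- function(v) {
  paste(v %/% 65536L, (v %/% 256L) %% 256L, v %% 256L, sep = ".")
}

# Read the header fields from an open binary connection (hypothetical helper).
parse_rdata_header <- function(con) {
  read_int <- function() {
    readBin(con, "integer", n = 1L, size = 4L, endian = "big")
  }
  read_str <- function(n) rawToChar(readBin(con, "raw", n = n))

  magic      <- read_str(5L)   # "RDX3\n" for a version-3 binary save file
  format     <- read_str(2L)   # "X\n" for XDR
  version    <- read_int()
  writer     <- read_int()
  min_reader <- read_int()
  enc_len    <- read_int()
  encoding   <- read_str(enc_len)

  list(magic = magic, format = format, version = version,
       writer = unpack_version(writer),
       min_reader = unpack_version(min_reader),
       native_encoding = encoding)
}

# The first 37 bytes of the dump above, typed in by hand.
dump_bytes <- as.raw(c(
  0x52, 0x44, 0x58, 0x33, 0x0a, 0x58, 0x0a, 0x00,
  0x00, 0x00, 0x03, 0x00, 0x03, 0x06, 0x00, 0x00,
  0x03, 0x05, 0x00, 0x00, 0x00, 0x00, 0x0e, 0x41,
  0x4e, 0x53, 0x49, 0x5f, 0x58, 0x33, 0x2e, 0x34,
  0x2d, 0x31, 0x39, 0x36, 0x38))

con <- rawConnection(dump_bytes)
hdr <- parse_rdata_header(con)
close(con)
str(hdr)
```

For a file on disk, something like `gzfile("NAVCO 1.3 List.RData", "rb")` could be passed in place of the raw connection, since the file is gzip-compressed. On the bytes above, the sketch yields the encoding name ANSI_X3.4-1968 and a writing R version of 3.6.0.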
>
> Duncan Murdoch
>
> >
> >
> >         I'm running R 4.0.5 under macOS 11.2.3;  see "sessionInfo()" and
> > detailed instructions below on the precise file I downloaded from the web
> > and tried to read.
> >
> >
> >         I may be able to get what I want just ignoring this.  However, I'd
> > like to know how to fix this.
> >
> >
> >         Thanks,
> >         Spencer Graves
> >
> >
> > sessionInfo()
> > R version 4.0.5 (2021-03-31)
> > Platform: x86_64-apple-darwin17.0 (64-bit)
> > Running under: macOS Big Sur 10.16
> >
> > Matrix products: default
> > LAPACK:
> > /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> >    [1] compiler_4.0.5    htmltools_0.5.1.1 tools_4.0.5       yaml_2.2.1
> >    [5] tinytex_0.31      rmarkdown_2.7     knitr_1.31        digest_0.6.27
> >    [9] xfun_0.22         rlang_0.4.10      evaluate_0.14
> >   > search()
> >    [1] ".GlobalEnv"                "file:NAVCO 1.3 List.RData"
> >    [3] "file:NAVCO 1.3 List.RData" "tools:rstudio"
> >    [5] "package:stats"             "package:graphics"
> >    [7] "package:grDevices"         "package:utils"
> >    [9] "package:datasets"          "package:methods"
> > [11] "Autoloads"                 "package:base"
> >
> >
> > *** To get the file I used for this, I went to
> > "https://www.ericachenoweth.com/research".  From there I clicked
> > "Version 1.3".  This took me to
> >
> >
> > https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ON9XND
> >
> >
> > I then clicked the "Download" icon to the right of "NAVCO 1.3 List.tab".
> >    This gave me 5 "Download Options", one of which was "RData Format";  I
> > selected that.  This downloaded "NAVCO 1.3 List.RData", which I moved to
> > getwd().  Then I did 'load("NAVCO 1.3 List.RData")' and 'attach("NAVCO
> > 1.3 List.RData")'.  Both of those gave me 8 repetitions of a message
> > like "input string ... cannot be translated to UTF-8, is it valid in
> > 'ANSI_X3.4-1968'?" with different values substituted for "...".
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >



-- 
John Kane
Kingston ON Canada


