[Rd] iconv: embedded nulls when converting to UTF-16

Braun, Michael br@unm @end|ng |rom m@||@@mu@edu
Sun Aug 4 05:59:52 CEST 2019


R-devel community:

I have encountered some unexpected behavior when using iconv, which may be the source of errors I am getting when connecting to a UTF-16-encoded SQL Server database. A simple example is below.

When researching this problem, I found r-devel reports of the same problem in threads from June 2010 and February 2016, and that bug #16738 was posted to Bugzilla as a result. However, I have not been able to determine whether the error is mine, whether there is a known workaround, or whether it truly is a bug in R’s iconv implementation. Any additional help is appreciated.

Thanks,

Michael

——

sessionInfo()
#> R version 3.6.1 (2019-07-05)   ## also replicated on R 3.4.1 on a cluster running CentOS Linux 7
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Mojave 10.14.6
# <snip>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     

#> loaded via a namespace (and not attached):
#> [1] compiler_3.6.1 

s <- "test"
iconv(s, to="UTF-8")
#> [1] "test"

iconv(s, to="UTF-16")
#> Error in iconv(s, to = "UTF-16"): embedded nul in string: '\xfe\xff\0t\0e\0s\0t'

iconv(s, to="UTF-16BE")
#> Error in iconv(s, to = "UTF-16BE"): embedded nul in string: '\0t\0e\0s\0t'

iconv(s, to="UTF-16LE")
#> Error in iconv(s, to = "UTF-16LE"): embedded nul in string: 't\0e\0s\0t\0'
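
One avenue that may be relevant here is the toRaw argument documented in ?iconv. The sketch below assumes that toRaw = TRUE returns a list of raw vectors rather than a character vector, so the nul bytes that UTF-16 produces for ASCII text never have to be stored in an R string. The variable names (r, bytes, back) are only illustrative, and I have not confirmed that this solves the SQL Server case.

```r
s <- "test"

# Request raw bytes instead of a character vector; R strings cannot hold
# embedded nuls, but raw vectors can.
r <- iconv(s, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)
bytes <- r[[1]]   # one raw vector per input element

# Each ASCII character becomes two bytes in UTF-16LE: the code unit, then 0x00.
print(bytes)

# iconv() also accepts a list of raw vectors as input, so the bytes can be
# converted back to an ordinary UTF-8 string.
back <- iconv(list(bytes), from = "UTF-16LE", to = "UTF-8")
print(back)
```

If this works as assumed, the raw vector (rather than a character string) is what would have to be passed on to anything expecting UTF-16 data.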




--------------------------
Michael Braun, Ph.D.
Associate Professor of Marketing, and
  Corrigan Research Professor
Cox School of Business
Southern Methodist University
Dallas, TX 75275






