[R] NAs produced by integer overflow, but only some time ...

Stefan Th. Gries
Wed May 9 04:54:26 CEST 2018


I have a problem with integer overflow that I cannot understand.

I have a character vector curr.lemmas with the following properties:

length(curr.lemmas) # 61224
length(unique(curr.lemmas)) # 2652

That vector is the input to the following function:

yules.k1 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1*m1))
}

When I run this, I get the following output:

[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow

But when I change the function by just replacing m1*m1 with m1^2, it works:

yules.k2 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1^2))
}

yules.k2(curr.lemmas) # -> 157.261
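
A third variant also seems to avoid the NA. This is only a sketch: the name yules.k3 and the sample vector toy are made up for illustration, and it rests on the assumption that the overflow is tied to m1 being stored as an integer, so the only change from yules.k1 is that the length is converted with as.numeric() before the multiplication.

```r
# Hypothetical variant of yules.k1: the length is converted to a double,
# so m1*m1 is computed in floating point rather than integer arithmetic.
yules.k3 <- function(input) {
   m1 <- as.numeric(length(input)); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1*m1))
}

# Made-up sample data (not curr.lemmas), long enough that the squared
# length exceeds .Machine$integer.max.
set.seed(1)
toy <- sample(letters, 50000, replace = TRUE)
yules.k3(toy)   # a finite number instead of NA
```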

I am using RStudio 1.1.447 and here's my sessionInfo
######################
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.3

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] compiler_3.4.4  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2 htmltools_0.3.6 tools_3.4.4     yaml_2.1.19     Rcpp_0.12.16    stringi_1.2.2
[10] rmarkdown_1.9   knitr_1.20      stringr_1.3.0   digest_0.6.15   evaluate_0.10.1
######################

What is even more puzzling is that one time I ran R in the console of
Geany and this happened:

> m1
[1] 61224
> 61224*61224
[1] 3748378176
> 61224^2
[1] 3748378176
> m1*m1
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
> m1^2
[1] 3748378176

That is, the multiplication worked with the literal numbers but not
with the variable m1 holding the same value; the above is copied
verbatim from the console. Why is that happening?
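
One thing that can be checked without my data is the storage type of each operand. The sketch below assumes that the difference lies in how the values are stored: length() returns an integer for ordinary vectors, whereas a number typed at the prompt, such as 61224, is a double. The stand-in vector is made up and only matches curr.lemmas in length.

```r
# Stand-in for curr.lemmas: an empty-string vector of the same length.
m1 <- length(character(61224))

typeof(m1)              # "integer"
typeof(61224)           # "double"
typeof(m1^2)            # "double": ^ returns a double even for integer input
.Machine$integer.max    # 2147483647, less than 61224 squared

# Integer times integer stays integer, so the result cannot be represented
# and becomes NA with a warning (suppressed here to keep the output short).
suppressWarnings(m1 * m1)

# Converting one operand to double moves the product to floating point.
as.numeric(m1) * m1
```

If that assumption is right, it would account for both observations: 61224*61224 and m1^2 are evaluated as doubles, while m1*m1 is evaluated as an integer product.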

Any help would be much appreciated!
STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries


