[R] Potential bug/unexpected behaviour in model matrix

Leonidas Lundell leo.lundell at sund.ku.dk
Thu Aug 26 09:46:02 CEST 2021


Dear R-project,
 
Apologies if I am sending this to the wrong list, and thank you for your enormous contribution.

I discovered a subtle interaction between the data.table package and the model.matrix function that changes the row order of the output, which can lead to completely erroneous results downstream:

df  <- data.frame(basespaceID = 8:1, group = paste0(rep(c("a", "b"), 4), "_", sort(rep(c("1", "2"), 4))))
designDF <- model.matrix(~0 + group, data = df)

dt <- data.table::as.data.table(df)
designDT <- model.matrix(~0 + group, data = dt)

all(designDF == designDT)
#TRUE

data.table::setkey(dt, "basespaceID")
designDTkeyed <- model.matrix(~0 + group, data = dt)

all(designDF == designDTkeyed)
#FALSE

# It seems that keying the data.table reorders the rows of the design matrix.
# Here they come out sorted by basespaceID, i.e. in the reverse of the original row order:

designDFreordered <- model.matrix(~0 + group, data = df[8:1,])
all(designDFreordered == designDTkeyed)
#TRUE
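
A possible explanation, offered as a sketch rather than a confirmed diagnosis: data.table::setkey() sorts the table in place by the key column, so model.matrix() sees the rows in a new order and the values themselves are unchanged. The snippet below checks this on the same toy data and realigns the keyed design matrix to the original row order by matching on basespaceID. The names orderBefore, orderAfter, realigned and dtIndexed are introduced here only for illustration.

```r
library(data.table)

# Same toy data as in the example above
df <- data.frame(basespaceID = 8:1,
                 group = paste0(rep(c("a", "b"), 4), "_", sort(rep(c("1", "2"), 4))))
dt <- as.data.table(df)

orderBefore <- dt$basespaceID      # row order before keying
setkey(dt, "basespaceID")          # sorts dt by reference, ascending by key
orderAfter <- dt$basespaceID       # row order after keying

designDF      <- model.matrix(~0 + group, data = df)
designDTkeyed <- model.matrix(~0 + group, data = dt)

# Realign the keyed design matrix to the original row order via the ID column
realigned <- designDTkeyed[match(df$basespaceID, dt$basespaceID), ]
all(designDF == realigned)

# Assumption for illustration: if only fast lookups are needed, a secondary
# index (setindex) leaves the physical row order untouched, unlike setkey.
dtIndexed <- as.data.table(df)
setindex(dtIndexed, basespaceID)
all(designDF == model.matrix(~0 + group, data = dtIndexed))
```

If this reading is right, any vector or matrix built from the original data.frame (for example a response or a contrast specification) has to be reordered the same way before it is combined with a design matrix built from the keyed table.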

And my sessionInfo if that’s of any help:

sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.5.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.0

loaded via a namespace (and not attached):
 [1] umap_0.2.7.0      Rcpp_1.0.7        knitr_1.33        magrittr_2.0.1   
 [5] maps_3.3.0        lattice_0.20-44   rlang_0.4.11      stringr_1.4.0    
 [9] tools_4.1.0       grid_4.1.0        xfun_0.25         png_0.1-7        
[13] audio_0.1-7       RSpectra_0.16-0   htmltools_0.5.1.1 shapefiles_0.7   
[17] askpass_1.1       openssl_1.4.4     yaml_2.2.1        digest_0.6.27    
[21] zip_2.2.0         Matrix_1.3-4      beepr_1.3         evaluate_0.14    
[25] rmarkdown_2.10    openxlsx_4.2.4    sp_1.4-5          stringi_1.7.3    
[29] compiler_4.1.0    fossil_0.4.0      jsonlite_1.7.2    reticulate_1.20  
[33] foreign_0.8-81   

Best regards

Leonidas Lundell
Postdoc
Barres & Zierath group
 
University of Copenhagen
Novo Nordisk Foundation
Center for Basic Metabolic Research
 
mailto:leo.lundell using sund.ku.dk
 
 





