[R] bug(?) in str() with strict.width = "cut" when applied to dataframe with numeric component AND factor or character component with longer levels/strings
Gerrit Eichner
Gerrit.Eichner at math.uni-giessen.de
Tue Oct 15 13:53:12 CEST 2013
Dear list subscribers,
here is a small artificial example to demonstrate the problem that I
encountered when looking at the structure of a (larger) data frame that
comprised (among other components)
- a numeric component with elements of the order of 10000 or larger, and
- a factor or character component with longer levels/strings:
k <- 43 # length of levels or character strings
n <- 11 # number of rows of data frame
M <- 10000 # order of magnitude of numerical values
set.seed( 47) # to reproduce the following artificial character string
longer.char.string <- paste( sample( letters, k, replace = TRUE),
                             collapse = "")
X <- data.frame( A = 1:n * M,
                 B = rep( longer.char.string, n))
The following call to str() apparently gives a wrong result (the line for
component B merely repeats the one for A):
str( X, strict.width = "cut")
'data.frame': 11 obs. of 2 variables:
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
$ A: num 1e+04 2e+04 3e+04 4e+04 5e+04 6e+04 7e+04 8e+04 9e+04 1e+..
whereas the correct result appears for str( X) or if you decrease k to 42
(isn't that "the answer"? ;-) ) or n to 10 or M to 1000 (or smaller,
respectively).
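The variations just mentioned can be compared side by side with a small
sketch like the following. Note that show.str() is a hypothetical helper
name introduced only for this illustration (it is not part of R), and that
the lines it returns will depend on the R version in use:

```r
# Hypothetical helper: rebuild the example data frame for given k, n and M
# and return the lines that str( ..., strict.width = "cut") prints.
show.str <- function( k = 43, n = 11, M = 10000) {
  set.seed( 47)   # same seed as in the example above
  s <- paste( sample( letters, k, replace = TRUE), collapse = "")
  X <- data.frame( A = 1:n * M, B = rep( s, n))
  capture.output( str( X, strict.width = "cut"))
}

show.str()            # the problematic combination described above
show.str( k = 42)     # shorter levels/strings
show.str( n = 10)     # fewer rows
show.str( M = 1000)   # smaller numerical values
```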
I tried to dig into the entrails of str.default(), where the cause may
lie, but got lost pretty soon. So, I am hoping that someone may already
have a work-around or patch (or dares to dig further)? Thank you for any
feedback!
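As a crude stopgap in the meantime, the default str() output can be cut by
hand. This is only a rough sketch: cutStr() is a hypothetical helper name
(not part of R), and it simply truncates lines at a given width rather than
reproducing whatever strict.width = "cut" does internally:

```r
# Hypothetical helper (not part of R): capture the default str() output and
# cut every line that exceeds 'width', marking the cut with "..".
cutStr <- function( object, width = getOption( "width"), ...) {
  lines <- capture.output( str( object, ...))
  too.long <- nchar( lines) > width
  lines[ too.long] <- paste0( substr( lines[ too.long], 1, width - 2), "..")
  cat( lines, sep = "\n")
  invisible( lines)
}

# Usage with a data frame similar to the one above (artificial string of
# 43 identical letters instead of the sampled one):
X <- data.frame( A = 1:11 * 10000,
                 B = rep( paste( rep( "a", 43), collapse = ""), 11))
cutStr( X, width = 60)
```

Since it works on the captured output only, it leaves str() itself
untouched and can be dropped again once the cause is found.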
Best regards -- Gerrit
PS:
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] splines stats graphics grDevices utils datasets
[7] methods base
other attached packages:
[1] nparcomp_2.0 multcomp_1.2-21 mvtnorm_0.9-9996
[4] car_2.0-19 Hmisc_3.12-2 Formula_1.1-1
[7] survival_2.37-4 fortunes_1.5-0
loaded via a namespace (and not attached):
[1] cluster_1.14.4 grid_3.0.2 lattice_0.20-23 MASS_7.3-29
[5] nnet_7.3-7 rpart_4.1-3 stats4_3.0.2 tools_3.0.2
---------------------------------------------------------------------
Dr. Gerrit Eichner Mathematical Institute, Room 212
gerrit.eichner at math.uni-giessen.de Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109 http://www.uni-giessen.de/cms/eichner
