[R] coerce data to numeric
Charles R Parker
cddis at att.net
Wed Dec 3 19:29:12 CET 2014
I am trying to create groups of barplots from data that have different numbers of records in each group, in such a way that every plot has the same number and size of bars, even when some groups need bars of zero height. The goal is then to display multiple plots on a single page using split.screen or something similar. lattice does not seem suitable because of the data structure it operates on. A simple example of the data structure I am working with is given here:
> dput(stplot)
structure(list(GId = structure(1:11, .Label = c("A1", "B1", "B2",
"B3", "B4", "B5", "C1", "C2", "D1", "D2", "D3"), class = "factor"),
Grp = structure(c(1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 4L, 4L,
4L), .Label = c("A", "B", "C", "D"), class = "factor"), S = c(12.3,
23.8, 0, 7.6, 14.32, 1.9, 5.1, 0, 14.6, 10.1, 8.7), T = c(5L,
12L, 2L, 1L, 4L, 1L, 1L, 9L, 5L, 6L, 3L)), .Names = c("GId",
"Grp", "S", "T"), class = "data.frame", row.names = c(NA, -11L
))
My code, which doesn't quite work, is:
> nbars <-
function(x){
sG = summary(x$Grp)
mG = max(sG)
for(n in 1:length(sG)){
tX = subset(x,x$Grp==names(sG[n]))
if(nrow(tX) < mG){
fm = as.numeric(rep(length = mG - nrow(tX), 0))
tX = rbind(tX, as.data.frame(cbind(GId = " ",Grp = names(sG[n]),
S = fm, T = fm)))
}
#print(tX)
#dput(t(as.matrix(tX[,3:4])))
barplot(t(as.matrix(tX[,3:4])),beside=TRUE, names.arg=tX$GId,
col = c("navy","gray"))
}
}
The function nbars first gets the group values with their counts, 'summary(x$Grp)'.
It then determines the number of bar pairs in the largest group, 'max(sG)', and uses this inside the for loop to work out how many rows each smaller group must be padded with so that every barplot ends up with the same number of bars. If you uncomment the #print(tX) you can see that this works...sort of. The problem becomes apparent if you uncomment the #dput. This shows that S and T in tX are treated as character rather than numeric values, which prevents the barplots from working. By changing the for loop to begin 'for(n in 2:length(sG))' the second plot will display correctly, but the third plot will fail.
I have tried various options to force the S and T variables to be numeric inside the 'if(nrow(tX) < mG)' block (as.numeric(fm), as.matrix(fm), as.vector(fm)), but none of these have worked.
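The character coercion described above most likely comes from cbind(): given a mix of character and numeric arguments it returns a character matrix, so converting fm beforehand cannot help. A minimal sketch of one way around it, assuming the padding rows are built with data.frame() instead of cbind(). The helper name pad_group and its arguments are hypothetical and not part of the code above, and the data are the dput() values retyped by hand:

```r
# The example data, rebuilt by hand so this block is self-contained.
stplot <- data.frame(
  GId = factor(c("A1", "B1", "B2", "B3", "B4", "B5",
                 "C1", "C2", "D1", "D2", "D3")),
  Grp = factor(c("A", rep("B", 5), "C", "C", "D", "D", "D")),
  S = c(12.3, 23.8, 0, 7.6, 14.32, 1.9, 5.1, 0, 14.6, 10.1, 8.7),
  T = c(5L, 12L, 2L, 1L, 4L, 1L, 1L, 9L, 5L, 6L, 3L)
)

# cbind() on mixed character/numeric input yields a character matrix,
# so S and T are already strings before rbind() is ever called.
pad_bad <- cbind(GId = " ", Grp = "A", S = 0, T = 0)
is.character(pad_bad)

# Hypothetical helper: pad one group's rows up to n_target rows,
# building the padding with data.frame() so each column keeps its type.
pad_group <- function(tX, n_target) {
  # Work with plain character labels to avoid factor-level surprises.
  tX$GId <- as.character(tX$GId)
  tX$Grp <- as.character(tX$Grp)
  n_pad <- n_target - nrow(tX)
  if (n_pad <= 0) return(tX)
  pad <- data.frame(
    GId = rep(" ", n_pad),
    Grp = rep(tX$Grp[1], n_pad),
    S = rep(0, n_pad),
    T = rep(0L, n_pad),
    stringsAsFactors = FALSE
  )
  rbind(tX, pad)
}

# Pad group "A" (1 row) up to the size of the largest group.
mG <- max(table(stplot$Grp))
padded <- pad_group(subset(stplot, Grp == "A"), mG)
heights <- t(as.matrix(padded[, c("S", "T")]))
# barplot(heights, beside = TRUE, names.arg = padded$GId,
#         col = c("navy", "gray"))
```

With the padding built this way, as.matrix() on the S and T columns should give a numeric matrix, which is what barplot() expects for its height argument. The barplot() call is left commented out so the block can be run without opening a graphics device.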
If there is a sure-fire way to solve the problem I would be grateful.
Thanks.