[R] Need help with table() and apply()
jim holtman
jholtman at gmail.com
Sun Nov 20 23:29:12 CET 2011
You had to convert back to factors because the 'cbind' "undid" them
before the data frame was created: as.data.frame() only ever saw an
integer matrix. Here is what you should have done:
> df <- data.frame(rating.1 , rating.2 , rating.3 , rating.4 ,
+ rating.5 , rating.6 , rating.7 , rating.8 ,
+ rating.9 , rating.10)
>
> str(df)
'data.frame': 10 obs. of 10 variables:
$ rating.1 : Factor w/ 4 levels "1","2","3","4": 4 1 2 4 3 2 4 1 2 1
$ rating.2 : Factor w/ 4 levels "1","2","3","4": 2 3 2 3 2 2 1 3 3 3
$ rating.3 : Factor w/ 4 levels "1","2","3","4": 3 1 1 3 2 1 3 3 1 3
$ rating.4 : Factor w/ 4 levels "1","2","3","4": 4 2 2 2 2 4 3 3 3 4
$ rating.5 : Factor w/ 4 levels "1","2","3","4": 1 2 2 2 1 2 3 3 4 4
$ rating.6 : Factor w/ 4 levels "1","2","3","4": 3 2 2 1 2 2 3 3 3 2
$ rating.7 : Factor w/ 4 levels "1","2","3","4": 3 4 2 2 4 3 4 4 4 4
$ rating.8 : Factor w/ 4 levels "1","2","3","4": 4 1 3 1 3 1 4 4 3 3
$ rating.9 : Factor w/ 4 levels "1","2","3","4": 4 4 2 3 2 4 3 2 3 2
$ rating.10: Factor w/ 4 levels "1","2","3","4": 1 2 1 3 2 2 3 1 1 1
Notice that the factors are maintained.
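
On the aside about avoiding the for loop: if you already have a data
frame whose columns are plain integers, something like the following
should convert every column in one step. This is only a sketch; the
small data frame 'x' and its columns 'a' and 'b' are made up for
illustration, and levels = 1:4 is taken from your example.

```r
# Hypothetical data frame with integer columns, built the same way as
# in the original post (cbind first, then as.data.frame).
x <- as.data.frame(cbind(a = c(1L, 2L, 4L), b = c(3L, 3L, 1L)))

# Assigning into x[] replaces the columns but keeps the data frame;
# lapply() passes levels = 1:4 on to factor() for each column.
x[] <- lapply(x, factor, levels = 1:4)

str(x)
```

Using 'x[] <-' rather than 'x <-' matters here: without the empty
brackets the result would be a plain list instead of a data frame.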
When having problems, break up the steps and see what happens at each
one. Here is the output of your 'cbind':
> x <- cbind(rating.1 , rating.2 , rating.3 , rating.4 ,
+ rating.5 , rating.6 , rating.7 , rating.8 ,
+ rating.9 , rating.10)
> str(x)
int [1:10, 1:10] 4 1 2 4 3 2 4 1 2 1 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:10] "rating.1" "rating.2" "rating.3" "rating.4" ...
>
Notice that it is just an integer matrix; the factor levels are gone.
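
The effect is easier to see with a factor whose labels are not the
same as its internal codes. The following is a small sketch; the
factor 'f' and its labels "low" and "high" are invented for the
example and are not part of your data.

```r
# Hypothetical factor: labels "low"/"high", internal codes 1/2
f <- factor(c("low", "high", "low"), levels = c("low", "high"))

# cbind() discards the class and keeps only the integer codes
m <- cbind(f, f)
str(m)

# data.frame() keeps each column as a factor
d <- data.frame(a = f, b = f)
sapply(d, is.factor)
```

In your data the labels happen to be "1" to "4", the same as the
codes, which is why the integers looked right even though the factor
information had been dropped.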
Also, if you had looked at the help page for 'cbind' (?cbind), you
would have seen:
In the default method, all the vectors/matrices must be atomic (see
vector) or lists. Expressions are not allowed. Language objects (such
as formulae and calls) and pairlists will be coerced to lists: other
objects (such as names and external pointers) will be included as
elements in a list result. Any classes the inputs might have are
discarded (in particular, factors are replaced by their internal
codes).
Notice the last sentence.
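
As for the tables of the rows: 'apply' converts the data frame to a
matrix before it calls your function, so the factor levels are lost
there as well, and 'table' only sees the values that actually occur.
One possible approach, sketched below, is to turn each row back into
a factor inside the function. The seed and the way the sample data
are generated are assumptions made so the example is self-contained;
levels = 1:4 comes from your post.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Ten factor columns with levels 1:4, similar to the original post
ratings <- replicate(10,
                     factor(sample(1:4, 10, replace = TRUE), levels = 1:4),
                     simplify = FALSE)
names(ratings) <- paste0("rating.", 1:10)
df <- data.frame(ratings)

# For each row, rebuild a factor with all four levels and tabulate it.
# apply() returns one column per row of df, so transpose the result.
row.counts <- t(apply(df, 1, function(r) table(factor(r, levels = 1:4))))

row.counts
```

Each row of 'row.counts' then corresponds to one row of 'df', with
one column per level, including levels with a count of 0.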
2011/11/20 Stuart Luppescu <slu at ccsr.uchicago.edu>:
> Hello, I am having trouble getting counts of values in rows of a data
> frame. I'm trying to use apply, but it's not working.
>
> This gives a sample of the kind of data I'm working with:
>
> rating.1 <- factor(sample(1:4, size=10, replace=T), levels=1:4)
> rating.2 <- factor(sample(1:4, size=10, replace=T), levels=1:4)
> rating.3 <- factor(sample(1:3, size=10, replace=T), levels=1:4)
> rating.4 <- factor(sample(2:4, size=10, replace=T), levels=1:4)
> rating.5 <- factor(sample(1:4, size=10, replace=T), levels=1:4)
> rating.6 <- factor(sample(1:3, size=10, replace=T), levels=1:4)
> rating.7 <- factor(sample(2:4, size=10, replace=T), levels=1:4)
> rating.8 <- factor(sample(1:4, size=10, replace=T), levels=1:4)
> rating.9 <- factor(sample(2:4, size=10, replace=T), levels=1:4)
> rating.10 <- factor(sample(1:3, size=10, replace=T), levels=1:4)
>
> df <- as.data.frame(cbind(rating.1 , rating.2 , rating.3 , rating.4 ,
> rating.5 , rating.6 , rating.7 , rating.8 ,
> rating.9 , rating.10))
>
> for(i in 1:10) {
> df[,i] <- factor(df[,i], levels=1:4)
> }
>
> [Aside: why does the original df have columns of class "integer" when
> the original data are factors? Why is it necessary to reconvert them
> into factors? Also, is it possible to do this without a for loop?]
>
> If I do this:
>
> apply(df[,1:10], 1, table)
>
> I get a 4x10 array, the contents of which I do not understand.
>
> apply(df[,1:10], 2, table)
>
> gives 10 tables for the columns, but it leaves out factor levels which
> do not occur. For example,
>
> rating.6 : 'table' int [1:3(1d)] 7 1 2
> ..- attr(*, "dimnames")=List of 1
> .. ..$ : chr [1:3] "1" "2" "3"
>
> lapply(df[, 1:10], table)
>
> gives tables of the columns keeping the levels with 0 counts:
>
> $ rating.6 : 'table' int [1:4(1d)] 7 1 2 0
> ..- attr(*, "dimnames")=List of 1
> .. ..$ : chr [1:4] "1" "2" "3" "4"
>
> But I really want tables of the rows. Do I have to write my own function
> to count the numbers of values?
>
> Thanks in advance.
>
> --
> Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
> University of Chicago -=- CCSR
> 才文と智奈美の父 -=- Kernel 3.0.6-gentoo
> You say yourself it wasn't reproducible. So it could have been anything
> that "crashed" your R, cosmic radiation, a bolt of lightning reversing a
> bit in your computer memory, ... :-) -- Martin Maechler (replying to a
> bug report) R-devel (July 2005)
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.