[R] omit empty cells in crosstab?

Phil Spector spector at stat.berkeley.edu
Sat Apr 25 00:51:23 CEST 2009


I think the easiest way to deal with this problem is
to paste together the values in each row, use table() on the
pasted strings, and then split them back apart with strsplit().
Using the example of 10 columns with 10 levels each:

> set.seed(25)
> x = as.data.frame(replicate(10,sample(1:10,12,replace=TRUE)))
> res = apply(x,1,paste,collapse=':')
> tt1 = as.data.frame(table(res))
> vals = strsplit(as.character(tt1$res),":")
> answer = data.frame(do.call(rbind,vals),Freq=tt1$Freq)
> answer
     X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 Freq
1   10  8  3  3  5  9  9  6  5   2    1
2    1  1  3  6  2  7  7  7  1  10    1
3    2  6  6  8  9  7  2  8  6   3    1
4    2  7  1  2  2  4  3  4  3   2    1
5    3  5  2  3  6  3  4  7  6   7    1
6    4  1  9 10  2  9  6  1  4   1    1
7    4  2  5  8  2  2  1  4  6   3    1
8    4  8  4  8  2  9  2  3  4   1    1
9    5 10  3 10  1  2  1  9  7  10    1
10   7  5  9  6  6  5  2  5  7   2    1
11   7  6  4  4  8  3  8  8 10   6    1
12   9  2  6  2  8  7  5  4  2   1    1
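
The same idea can be wrapped in a small function. This is only a
sketch: the name sparse_crosstab and its sep argument are made up
for illustration, and it assumes that no value is empty or contains
the separator. It pastes with do.call(paste, ...) rather than apply(),
since apply() converts the data frame to a matrix first, which can
pad numbers with spaces when the columns are of mixed types.

```r
# Hypothetical helper, not part of base R: count only the combinations
# of values that actually occur in the rows of a data frame.
sparse_crosstab <- function(df, sep = ":") {
  cols <- lapply(df, as.character)
  # Assumption: the separator never occurs inside a value; otherwise
  # strsplit() would cut one value into several pieces.
  stopifnot(!any(grepl(sep, unlist(cols), fixed = TRUE)))
  # One pasted key per row
  keys <- do.call(paste, c(unname(cols), sep = sep))
  # table() on a single vector only stores the keys that are present
  tt <- as.data.frame(table(key = keys), stringsAsFactors = FALSE)
  # Split the keys back into one column per original variable
  parts <- do.call(rbind, strsplit(tt$key, sep, fixed = TRUE))
  out <- as.data.frame(parts, stringsAsFactors = FALSE)
  names(out) <- names(df)
  out$Freq <- tt$Freq
  out
}

set.seed(25)
x <- as.data.frame(replicate(10, sample(1:10, 12, replace = TRUE)))
res <- sparse_crosstab(x)
# The split values are character strings; convert them back if needed
res[names(x)] <- lapply(res[names(x)], type.convert, as.is = TRUE)
sum(res$Freq) == nrow(x)   # the counts always add up to the number of rows
```

The result has one row per combination that occurs in the data, so its
size grows with the number of rows, not with the product of the numbers
of levels.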

                                        - Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu


On Fri, 24 Apr 2009, sjaffe wrote:

>
> small example:
>
> a<-c(1.1, 2.1, 9.1)
> b<-cut(a,0:10)
> c<-data.frame(b,b)
> d<-table(c)
> dim(d)
> ##result: c(10, 10)
>
> But only 3 of the 100 cells are non-zero.
> If there were 10 columns, the table would have 10 dimensions, each of length 10,
> and so 10^10 elements, too many even to fit in memory.
>
>
> Dieter Menne wrote:
>>
>> sjaffe <sjaffe <at> riskspan.com> writes:
>>
>>>
>>> I have data with many factors, each taking many values. However, only
>>> relatively few combinations appear in the data, i.e. have nonzero counts;
>>> in other words, the resulting table is sparse. Say we have 10 factors, each
>>> with 10 levels. The result of table() would exceed the memory space (on a
>>> 32-bit machine). Is there any way to produce a table with empty cells omitted
>>> (without first producing the whole table and then removing rows)?
>>
>> It would be easier if you had a reproducible base example, but I
>> suggest creating ONE new factor from the pasted levels, using unique(),
>> and then creating a table of these.
>>
>> Dieter
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>



