[R] Crosstabbing multiple response data

John Kane jrkrideau at yahoo.ca
Tue Feb 27 13:58:55 CET 2007


Thanks to everyone for this.  I was looking at the
same problem last night and was just about to write a
post to R-help when I saw this thread.


--- Michael Wexler <wexler at yahoo.com> wrote:

> 
> Thanks to Charles and Gabor, and to Frank E Harrell for a private
> message with some good ideas and help.  This crossprod approach was
> very clever; I would never have thought of it.
> 
> Best, Michael
> 
> 
> ----- Original Message ----
> From: Charles C. Berry <cberry at tajo.ucsd.edu>
> To: Michael Wexler <wexler at yahoo.com>
> Cc: r-help at stat.math.ethz.ch
> Sent: Thursday, February 22, 2007 1:17:44 PM
> Subject: Re: [R] Crosstabbing multiple response data
> 
> 
> > res <- crossprod( as.matrix( ratings[ , -1] ) )
> > diag(res) <- ""
> > print(res, quote=FALSE)
>       att1 att2 att3
> att1      2    1
> att2 2         2
> att3 1    2
> > 
> > res2 <- crossprod( as.matrix( ratings[ , -1] ) ) * 100 / nrow( ratings )
> > res2[] <- paste( res2, "%", sep="" )
> > diag(res2) <- ""
> > print(res2, quote=FALSE)
>       att1 att2 att3
> att1      50%  25%
> att2 50%       50%
> att3 25%  50%
> >
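
For anyone wondering why crossprod yields co-occurrence counts: for a 0/1
matrix x, crossprod(x) is t(x) %*% x, so entry [i, j] is the sum over rows of
x[, i] * x[, j], which is the number of respondents with both attribute i and
attribute j. Below is a minimal sketch using the four rows shown in the table
in the original question. The helper name count_pairs is made up for
illustration and is not part of any package.

```r
# Data as shown in the table in the original question.
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

x <- as.matrix(ratings[, -1])   # drop the id column, keep the 0/1 indicators

# crossprod(x) is t(x) %*% x: entry [i, j] sums x[, i] * x[, j] over rows,
# which counts the rows where both attributes are 1.
co <- crossprod(x)

# The same counts computed the slow, explicit way (hypothetical helper).
count_pairs <- function(m) {
  k <- ncol(m)
  out <- matrix(0, k, k, dimnames = list(colnames(m), colnames(m)))
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      out[i, j] <- sum(m[, i] == 1 & m[, j] == 1)
    }
  }
  out
}

print(co)
print(count_pairs(x))
```

The diagonal holds the per-attribute totals (how many respondents have each
attribute at all), which is presumably why the reply blanks it out before
printing.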
> 
> Be sure to bone up on format and sprintf before taking this into
> production.
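
As one possible reading of the sprintf advice: paste() converts the raw
numbers to text as they are, so a percentage such as 100/3 comes out as a long
run of digits. A format string fixes the number of decimals. The sketch below
assumes whole-number percentages are wanted and reuses the four rows from the
table in the original question.

```r
# Data as shown in the table in the original question.
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

# Co-occurrence as a percentage of all respondents.
pct <- crossprod(as.matrix(ratings[, -1])) * 100 / nrow(ratings)

# "%.0f" rounds to whole numbers; "%%" is a literal percent sign.
# Assigning with `[] <-` keeps the matrix shape and the row/column labels.
res <- pct
res[] <- sprintf("%.0f%%", pct)
diag(res) <- ""

print(res, quote = FALSE, right = TRUE)
```

Changing the format to "%.1f%%" would keep one decimal place, which may be
preferable when the number of respondents does not divide 100 evenly.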
> 
> On Thu, 22 Feb 2007, Michael Wexler wrote:
> 
> > Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset
> > which resembles this:
> >
> > id    att1    att2    att3
> > 1    1        1        0
> > 2    1        0        0
> > 3    0        1        1
> > 4    1        1        1
> >
> > ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1),
> >                       att2 = c(1,0,1,1), att3 = c(0,0,1,1))
> >
> > I would like to get a cross tab of counts of co-occurrence, which
> > might resemble this:
> >
> >         att1    att2    att3
> > att1            2       1
> > att2    2               2
> > att3    1       2
> >
> > with the hope of understanding, at least pairwise, what things
> > "hang together".  (Yes, there are much, much better ways to do this
> > statistically, including clustering and binary corrected
> > correlation, but the audience I am working with asked for this
> > version for a specific reason.)
> >
> > (Later on, I would also like to convert to percentages of the total
> > unique population, so the final version of the table would be
> >
> >         att1    att2    att3
> > att1            50%     25%
> > att2    50%             50%
> > att3    25%     50%
> >
> > But I can do this in Excel if I can get the first table out.)
> >
> > I have tried the reshape library, but could not get anything
> > resembling this (both on its own and feeding into table()).  (I
> > have also played with transposing and with some comments from this
> > list from 2002 and 2004, but the questioners appear to assume more
> > knowledge of R than I have; the example in the posting guide was
> > also more complex than I was ready for, I'm afraid.)
> >
> > Sample of some of my efforts:
> > library(reshape)
> > melt(ratings,id=c("id"))
> >
> > ds1 <- melt(ratings,id=c("id"))
> > # returns only row counts, 3 along diagonal
> > table(ds1$variable, ds1$variable)
> > # returns only a single row of collapsed counts; appears not to
> > # allow one variable in multiple uses
> > xtabs(formula = value ~ ds1$variable + ds1$variable , data=ds1)
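
A note on the long-format attempts above: table(ds1$variable, ds1$variable)
pairs each row with itself, so only the diagonal can fill in. The respondent
id has to be the linking dimension. One way to get there from melted data,
sketched here under the assumption that the long data has columns id, variable
and value, is to rebuild the id-by-attribute matrix with xtabs and then take
the cross product. The long data frame is built with base R so the example
does not depend on the reshape package.

```r
# Data as shown in the table in the original question.
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

# Long format with one row per id/attribute pair, standing in for the
# melted data (columns id, variable, value are an assumption).
ds1 <- data.frame(
  id       = rep(ratings$id, times = 3),
  variable = factor(rep(c("att1", "att2", "att3"), each = nrow(ratings))),
  value    = c(ratings$att1, ratings$att2, ratings$att3)
)

# Summing value within each id x variable cell rebuilds the 0/1 matrix.
inc <- xtabs(value ~ id + variable, data = ds1)

# Co-occurrence counts from the rebuilt matrix.
co <- crossprod(unclass(inc))
print(co)
```

If the data already sit in wide format, as ratings does here, the detour
through long format is unnecessary and crossprod can be applied directly.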
> >
> > I suspect I am close, so any nudges in the right direction would be
> > helpful.
> >
> > Thanks much, Michael
> >
> > PS: www.rseek.org is very impressive; I heartily encourage its use.
> >
> >
> >     [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> Charles C. Berry                        (858) 534-2098
>                                         Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu         UC San Diego
> http://biostat.ucsd.edu/~cberry/        La Jolla, San Diego 92093-0901