[R] Making a 'joint distribution'?

Thu Oct 7 07:17:55 CEST 2004

Ajay Shah <ajayshah <at> mayin.org> writes:

: 
: Thanks to everyone who helped me solve this question. My cleanest
: solution is:
: 
: joint.and.marginals <- function(x,y) {
:   t <- addmargins(table(x, y))
:   rownames(t)[nrow(t)] <- deparse(substitute(y))
:   colnames(t)[ncol(t)] <- deparse(substitute(x))
:   return(t)
: }
: 
: There are many other valid solutions, but this one struck me as being
: the simplest.
: 
: As a demo of its use:
: 
: > D <- data.frame(f1=sample(1:5,10000,replace=T),
: +                 f2=sample(1:5,10000,replace=T))
: > system.time(print(joint.and.marginals(D$f1, D$f2)))
:       y
: x         1    2    3    4    5  D$f1
:   1     420  427  385  376  423  2031
:   2     425  432  429  375  347  2008
:   3     405  419  434  401  352  2011
:   4     374  374  370  417  403  1938
:   5     403  381  409  388  431  2012
:   D$f2 2027 2033 2027 1957 1956 10000
: [1] 0.05 0.00 0.07 0.00 0.00
: 
: Hmm, how would one get rid of the 'x' and 'y' that are occurring in the
: table? 

Just add this line to your function:

	names(dimnames(t)) <- NULL

or you might consider the following which replaces x and y
with D$f1 and D$f2 and leaves the added rows and columns
with the Sum heading:

jm2 <- function(x,y) {
  t <- addmargins(table(x,y))
  names(dimnames(t)) <- list(deparse(substitute(x)), deparse(substitute(y)))
  t
}
jm2(D$f1, D$f2)
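
For anyone who wants to check both suggestions side by side, here is a
minimal sketch. The name jm1 and the five-row data frame are only
illustrative assumptions (the original demo used a random sample of
10000), and the local variable is called tab rather than t so that it
does not mask base::t().

```r
# Small fixed data set so the counts can be verified by hand
# (hypothetical values, not the random sample from the original post).
D <- data.frame(f1 = c(1, 1, 2, 2, 2), f2 = c(1, 2, 1, 2, 2))

# Variant 1 (hypothetical name jm1): drop the 'x' and 'y' headings.
jm1 <- function(x, y) {
  tab <- addmargins(table(x, y))
  names(dimnames(tab)) <- NULL
  tab
}

# Variant 2: replace 'x' and 'y' with the expressions that were passed in;
# the margins added by addmargins() keep their default "Sum" label.
jm2 <- function(x, y) {
  tab <- addmargins(table(x, y))
  names(dimnames(tab)) <- c(deparse(substitute(x)), deparse(substitute(y)))
  tab
}

t1 <- jm1(D$f1, D$f2)
t2 <- jm2(D$f1, D$f2)
print(t1)
print(t2)
```

In both variants the marginal totals stay in the row and column
labelled Sum, so they can be picked out by name, e.g. t2["Sum", "Sum"]
for the grand total.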