[R] contingency tables in R

David L Carlson dcarlson at tamu.edu
Mon Jul 23 17:30:36 CEST 2012


Your first example creates a four-way table (sex, familyhist, numrisk,
hypertension) and your second example adds the other four risk factors as
dimensions, giving an eight-way table. From the second table, you can use
margin.table() to collapse to any subset of dimensions (it sums over the
ones you leave out) and ftable() to present the resulting slices. You
should be able to accomplish what you want without making separate
data.frames.
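
As a rough sketch of that approach: mydata below is simulated, since the
original data were not posted, and the names risks and by_risk are made up
for illustration. Each element of by_risk keeps the three grouping
dimensions plus one risk factor and sums over the other four.

```r
# Hypothetical stand-in for the poster's data (the real mydata was not posted).
set.seed(42)
n <- 200
yn <- function() factor(sample(c("no", "yes"), n, replace = TRUE),
                        levels = c("no", "yes"))
mydata <- data.frame(
  sex          = factor(sample(c("F", "M"), n, replace = TRUE)),
  familyhist   = yn(),
  hypertension = yn(),
  diabetes     = yn(),
  obesity      = yn(),
  smoking      = yn(),
  dyslipidemia = yn()
)
risks <- c("hypertension", "diabetes", "obesity", "smoking", "dyslipidemia")
# Number of risk factors present for each patient (0 to 5)
mydata$numrisk <- factor(rowSums(mydata[risks] == "yes"), levels = 0:5)

# The eight-way table from the second example
mytable <- xtabs(~ sex + familyhist + numrisk + hypertension + diabetes +
                   obesity + smoking + dyslipidemia, data = mydata)

# For each risk factor, keep sex, familyhist, numrisk and that factor;
# margin.table() sums over the dimensions that are not listed.
dims <- names(dimnames(mytable))
by_risk <- lapply(risks, function(r) {
  margin.table(mytable, match(c("sex", "familyhist", "numrisk", r), dims))
})
names(by_risk) <- risks

ftable(by_risk$diabetes)                 # counts per subgroup
prop.table(ftable(by_risk$diabetes), 1)  # row proportions; empty subgroups give NaN
```

The "yes" column of the prop.table() output is the prevalence of that risk
factor within each sex/familyhist/numrisk subgroup, so no separate
data.frames are needed.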

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jin Choi
> Sent: Sunday, July 22, 2012 11:32 AM
> To: r-help at r-project.org
> Subject: [R] contingency tables in R
> 
> Hello!
> 
> I am interested in creating contingency tables, namely one that would
> let me find the frequency and proportion of patients with specific
> risk factors (dyslipidemia, diabetes, obesity, smoking, hypertension).
> There are 3 dimensions I would like to divide the population into:
> sex, family history, and number of risk factors.
> In R, I used the following code:
> 
> mytable<-xtabs(~sex+familyhist+numrisk+hypertension,data=mydata)
> ftable(mytable)
> a<-ftable(mytable)
> prop.table(a,1)
> 
> However, when I run the following code:
> 
> mytable<-xtabs(~sex+familyhist+numrisk+hypertension+diabetes+obesity+
>     smoking+dyslipidemia,data=mydata)
> 
> Here the table simply considers the additional risk factors as new
> dimensions, which I do not want. I would like to find a way where the
> dimensions are sex, family history, and number of risk factors and I
> am finding the frequency and prevalence for each risk factor
> (dyslipidemia, diabetes, obesity, smoking, hypertension) in each of
> these subgroups.
> 
> The only way to get around this problem I could think of is to create
> a new data frame for each number-of-risk-factors subgroup: numrisk1,
> numrisk2, numrisk3, where numrisk1 indicates the population with 1 risk
> factor. Then I could calculate the prevalence of each risk factor
> separately. This approach will take a very long time so I was hoping
> to ask if anyone knew of a solution to this issue I am having with
> contingency tables...perhaps a useful R package?
> 
> Thank you for your help!
> 
> Jin Choi
> Masters Student (Epidemiology)
> McGill University
> 


