[R] how to make a table of summary statistics

Duncan Murdoch murdoch.duncan at gmail.com
Thu Dec 20 12:58:57 CET 2012


On 12-12-20 6:45 AM, Francesco Sarracino wrote:
> Dear R-listers,
>
> I am a newbie with R and I am struggling with something I consider very
> basic. I wish to produce a table (to import in a latex file) of summary
> statistics, but for as much as I've been looking around and trying various
> alternatives (plyr, reporttools, pastecs and Hmisc) I haven't found what I
> am looking for. Probably I am doing something wrong, but I can't figure out
> what.
> Let's make up three simple variables:
>
> var1 <- runif(1000)
> var2 <- runif(1000)
> var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
>
> and let's create a dataset out of them:
> data <- data.frame(var1, var2, var3)
>
> what I'd like to get is a table such as the following one:
>
> variable mean sd min max obs missing
> var1
> var2
> var3
>
> where for each variable, I can read in line the mean, the standard
> deviation, the min and the max values, the number of observations and the
> percentage of missing data.
> Can you advise any way to achieve it?
> Thanks a lot in advance for your kind help,

I'm not sure what you want for var3: it doesn't make sense to calculate
the mean or sd for a factor. But for the other variables, using the
tables package, you can do

library(tables)
latex( tabular( Heading("variable")*(var1 + var2) ~
    (mean + sd + min + max + (obs=length) + (missing=function(x) sum(is.na(x)))),
    data=data) )

You might want a breakdown of the summaries by var3; you'd get that this 
way:

latex( tabular( Heading("variable")*(var1 + var2)*var3 ~
    (mean + sd + min + max + (obs=length) + (missing=function(x) sum(is.na(x)))),
    data=data) )
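
Note that the "missing" column above is a count of NAs, while the question asked for a percentage. If a percentage is wanted, or if the table has to be built without any add-on package, a rough base R sketch along the following lines might do. The helper name summarise_one and the set.seed() call are illustrative assumptions, not anything from the tables package, and the made-up data contain no NAs, so the last column is all zeros here.

```r
# Made-up data, as in the original question (seed added only for repeatability)
set.seed(1)
var1 <- runif(1000)
var2 <- runif(1000)
var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
data <- data.frame(var1, var2, var3)

# Hypothetical helper: one row of summaries for a numeric vector.
# NAs are dropped before computing mean, sd, min and max.
summarise_one <- function(x) {
  c(mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    obs     = sum(!is.na(x)),          # number of non-missing values
    missing = 100 * mean(is.na(x)))    # percentage of missing values
}

# Keep only the numeric columns; the factor var3 is skipped
num_vars <- names(data)[sapply(data, is.numeric)]

# sapply() returns statistics in rows, so transpose to get one row per variable
summary_table <- t(sapply(data[num_vars], summarise_one))
print(round(summary_table, 3))
```

The result is an ordinary numeric matrix with one row per variable, so it can be passed on to whatever LaTeX export function you prefer.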

Duncan Murdoch



