[R] Better way to create tables of mean & standard deviations

Benjamin Dickgiesser dickgiesser at gmail.com
Tue Nov 7 11:18:09 CET 2006


Hi

I'm trying to create tables of means, standard deviations and numbers
of observations (i) for each laboratory, (ii) for each batch number and
(iii) for each batch at each laboratory, for the attached data.

I created these functions:
summary.aggregate <- function(y, label, ...)
{
	# Aggregate y three times; the grouping (by=...) is passed through ...
	temp.mean   <- aggregate(y, FUN=mean, ...)
	temp.sd     <- aggregate(y, FUN=sd, ...)
	temp.length <- aggregate(y, FUN=length, ...)
	txtlabs <- makeLabel(label, length(temp.mean$x))

	# Return the table visibly, one row per group
	data.frame(mean=temp.mean$x, stdev=temp.sd$x, n=temp.length$x, row.names=txtlabs)
}
makeLabel <- function(label, llength, increaseLag=FALSE)
{
	# increaseLag is not used yet: the branch that tested it was empty
	# paste() is vectorised, so this yields "label 1", ..., "label llength"
	paste(label, seq_len(llength))
}

and can use the following commands to create tables of means etc.

print(summary.aggregate(data.ceramic$Y,"Lab",by=list(data.ceramic$Lab)))

to create output like this:

          mean    stdev  n
Lab 1 645.6125 65.94129 60
Lab 2 655.2121 70.64094 60
Lab 3 633.3161 80.48620 60
Lab 4 650.3897 77.59191 60
Lab 5 630.4955 84.98888 60
Lab 6 656.2608 66.16100 60
Lab 7 666.1775 74.39796 60
Lab 8 663.1543 71.10769 60


The purpose of the first function is to calculate the mean, stdev etc.
and the second is simply to create a labelling vector, e.g.
c("Lab 1", "Lab 2", ..., "Lab 8").
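
As a minimal sketch of the labelling step on its own: paste() is
vectorised, so a label vector like the one described can be built in a
single call. The name labs is only illustrative.

```r
# Build "Lab 1" ... "Lab 8" without an explicit loop
labs <- paste("Lab", 1:8)
print(labs)
```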



This seems rather complex to me for what I am trying to achieve. Is
there a better way to do this?
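
One possibly simpler route, sketched here under assumptions: split the
response by the grouping variable once and compute all three statistics
from the pieces. The data frame toy is a made-up stand-in for the
attached data (same column names Y, Lab, Batch, invented values), and
group_summary is a hypothetical helper name, not an existing R function.

```r
# Toy stand-in for the attached data; the values are invented
toy <- data.frame(
  Lab   = rep(1:2, each = 4),
  Batch = rep(rep(1:2, each = 2), times = 2),
  Y     = c(10, 12, 20, 24, 30, 34, 40, 48)
)

# Hypothetical helper: split y by the grouping list once,
# then compute mean, sd and count for each group
group_summary <- function(y, by) {
  groups <- split(y, by, drop = TRUE)
  data.frame(
    mean  = sapply(groups, mean),
    stdev = sapply(groups, sd),
    n     = sapply(groups, length)
  )
}

by_lab <- group_summary(toy$Y, list(Lab = toy$Lab))
print(by_lab)
```

The row names here are the group values themselves, so they stay tied
to the data rather than to a running counter.
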
Also I am having some trouble getting the labelling right for iii
since it should look like:

       Batch    mean    stdev  n
Lab 1     1 686.7179 53.37582 30
Lab 1     2 695.8710 62.08583 30
Lab 2     1 654.5317 94.19746 30
Lab 2     2 702.9095 51.44984 30
Lab 3     1 676.2975 69.13784 30
Lab 3     2 692.1952 57.27212 30
Lab 4     1 700.8995 56.91608 30
Lab 4     2 702.5668 62.36488 30
Lab 5     1 604.5070 50.01621 30
Lab 5     2 614.5532 53.64149 30
Lab 6     1 612.1006 58.09503 30
Lab 6     2 597.8699 62.40710 30
Lab 7     1 584.6934 74.66537 30
Lab 7     2 620.3263 54.34871 30
Lab 8     1 631.4555 74.34480 30
Lab 8     2 623.7419 56.42492 30

Currently I'm using:
temp <- summary.aggregate(data.ceramic$Y,"Lab",by=list(data.ceramic$Lab,data.ceramic$Batch))
batchcnt <- c(1,2)
print(data.frame(Batc=batchcnt,temp))

But that produces this output:
       Batc     mean    stdev  n
Lab 1     1 686.7179 53.37582 30
Lab 2     2 695.8710 62.08583 30
Lab 3     1 654.5317 94.19746 30
Lab 4     2 702.9095 51.44984 30
Lab 5     1 676.2975 69.13784 30
Lab 6     2 692.1952 57.27212 30
Lab 7     1 700.8995 56.91608 30
Lab 8     2 702.5668 62.36488 30
Lab 9     1 604.5070 50.01621 30
Lab 10    2 614.5532 53.64149 30
Lab 11    1 612.1006 58.09503 30
Lab 12    2 597.8699 62.40710 30
Lab 13    1 584.6934 74.66537 30
Lab 14    2 620.3263 54.34871 30
Lab 15    1 631.4555 74.34480 30
Lab 16    2 623.7419 56.42492 30

I can only think of rather complex ways to solve the labelling issue...

I would appreciate it if someone could point out if there are
better/cleaner/easier ways of achieving what I'm trying to do.
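
A sketch of one way around the labelling problem, under assumptions:
aggregate() returns the grouping values as columns of its result, so
the Lab and Batch labels can be read from there instead of being
generated from a row counter or a recycled c(1,2). As far as I know,
aggregate() lets the first grouping variable vary fastest, so the rows
may not come back in Lab-then-Batch order; the code sorts explicitly
rather than relying on that. The data frame toy is invented stand-in
data with the column names Y, Lab and Batch.

```r
# Toy stand-in for the attached data; the values are invented
toy <- data.frame(
  Lab   = rep(1:2, each = 4),
  Batch = rep(rep(1:2, each = 2), times = 2),
  Y     = c(10, 12, 20, 24, 30, 34, 40, 48)
)

# Named 'by' entries become columns of each aggregate() result
grp <- list(Lab = toy$Lab, Batch = toy$Batch)
m <- aggregate(toy$Y, by = grp, FUN = mean)
s <- aggregate(toy$Y, by = grp, FUN = sd)
n <- aggregate(toy$Y, by = grp, FUN = length)

# Take the labels from the grouping columns, not from the row position
tab <- data.frame(Lab = paste("Lab", m$Lab), Batch = m$Batch,
                  mean = m$x, stdev = s$x, n = n$x)

# Sort by Lab, then Batch, so each lab's batches sit together
tab <- tab[order(m$Lab, m$Batch), ]
print(tab, row.names = FALSE)
```

Because all three aggregate() calls use the same grouping list, their
rows line up with each other, and the sort is applied once to the
combined table.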

Benjamin

