[R] getting means by group within time point for data on multiple lines (long rather than wide file)

Ivan Calandra ivan.calandra at univ-reims.fr
Thu Sep 17 13:44:26 CEST 2015


Hi John,

This will not be the complete answer, but it should point you in the 
right direction.

First, I would subset your data.frame to include only subjects with at 
least one observation at each time point (and I'm not sure how to do 
that easily).
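One possible way to do that subsetting in base R is sketched below. It 
assumes the columns are named subject, group, time and value as in the 
question, and that a row with an NA value does not count as an 
observation. The name df_all and the extra subject 3 (seen only at times 
0 and 4) are made up for illustration.

```r
# Example data in the layout from the question: subject group time value.
# Subject 3 is hypothetical and lacks a time-8 observation.
df_all <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  group   = c(rep("control", 5), rep("exp", 5), rep("exp", 2)),
  time    = c(0, 0, 0, 4, 8, 0, 0, 0, 4, 8, 0, 4),
  value   = c(100, NA, 55, 100, 100, 99, 67, 66, 110, 200, 80, 90)
)

# Assumption: a missing value is not an observation, so drop NA rows first
obs <- df_all[!is.na(df_all$value), ]

# For each subject, check whether all three time points (0, 4, 8) occur
complete <- tapply(obs$time, obs$subject,
                   function(t) all(c(0, 4, 8) %in% t))
complete_ids <- names(complete)[complete]

# Keep only the subjects observed at every time point
df <- obs[obs$subject %in% complete_ids, ]
```

Here df keeps subjects 1 and 2 and drops subject 3.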

But then, the aggregate() function is what you need. Let's say your 
subset data.frame is called df:
aggregate(value ~ group + time, data = df, FUN = function(x) c(n = length(x), mean = mean(x)))

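By defining your own function in aggregate() you can compute both the 
length(), i.e. the number of non-missing observations used in the 
computation, and the mean() per group and per time point. Note that 
length() equals the number of subjects only if each subject contributes 
a single value per group and time point.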
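As a rough worked example on the data from the question, the sketch 
below first reduces each subject to one value per time point and then 
aggregates across subjects, so that the count is a count of subjects. 
Averaging a subject's repeated time-0 measurements is an assumption; the 
question does not say how they should be combined. The names per_subject 
and res are only illustrative.

```r
# The ten example rows from the question
df <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  group   = rep(c("control", "exp"), each = 5),
  time    = rep(c(0, 0, 0, 4, 8), times = 2),
  value   = c(100, NA, 55, 100, 100, 99, 67, 66, 110, 200)
)

# Step 1: one value per subject and time point (mean of the repeated
# time-0 measurements); the formula interface drops NA rows by default
per_subject <- aggregate(value ~ subject + group + time,
                         data = df, FUN = mean)

# Step 2: per group and time point, count subjects and average their values
res <- aggregate(value ~ group + time, data = per_subject,
                 FUN = function(x) c(n = length(x), mean = mean(x)))

# aggregate() returns the two statistics as a matrix column;
# flatten it into ordinary columns value.n and value.mean
res <- do.call(data.frame, res)
print(res)
```

With only one subject per group in this small example, value.n is 1 in 
every row; on the real data it gives the number of subjects contributing 
to each mean.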

HTH,
Ivan

--
Ivan Calandra, PhD
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calandra at univ-reims.fr
https://www.researchgate.net/profile/Ivan_Calandra

On 17/09/15 13:06, John Sorkin wrote:
> I have a long (rather than wide) file, i.e. the data for each subject is on multiple lines rather than one line. Each line has the following layout:
> subject group time value
> I have two groups and multiple subjects; each subject can be seen up to three times at time 0, and at most once at times 4 and 8.
> An example of the data follows:
>
> 1 control 0 100
> 1 control 0 NA
> 1 control 0 55
> 1 control 4 100
> 1 control 8 100
>
> 2 exp 0 99
> 2 exp 0 67
> 2 exp 0 66
> 2 exp 4 110
> 2 exp 8 200
>
> I need to get means by group (control vs. exp) within time (0, 4, 8). The means should include only those subjects who have at least one observation at each time point (0, 4, 8). I also need to determine the number of subjects who contribute data at each time point by group. Any suggestion on how to get the means would be appreciated. Sad to say, I worked on this for four hours last night without coming to any understanding of how this can be done. UGH!
>
> Thank you,
> John
>
>
>
>
>> John David Sorkin M.D., Ph.D.
>> Professor of Medicine
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)


