[R] how to "apply" functions to unbalanced data in long format by factors......cant get "by" or "aggregate" to work

ALAN SMITH alansmith2 at gmail.com
Thu Mar 8 18:34:52 CET 2007


Hello R-users
The help I received from Petr helped me created this solution to my problems.

t1<-with(mydata ,aggregate(mydata$Y,
list(mydata$time,mydata$treatment, mydata$expREP, mydata$techREP) ,
median, na.rm=T)) ### find median by factors ####

colnames(t1)<-c("time","treatment","expREP","techREP","Y50") ### column name ##

newdata<-merge(mydata, t1, by.x= names(mydata)[2:5],
by.y=names(t1)[1:4], all=T)

Thank you,
Alan







###############################################################
Message: 97
Date: Thu, 08 Mar 2007 08:00:53 +0100
From: "Petr Pikal" <petr.pikal at precheza.cz>
Subject: Re: [R] how to "apply" functions to unbalanced data in long
       format  by      factors......cant get "by" or "aggregate" to work
To: "ALAN SMITH" <alansmith2 at gmail.com>, r-help at stat.math.ethz.ch
Message-ID: <45EFC2B5.29775.2FD750 at localhost>
Content-Type: text/plain; charset=US-ASCII

Hi

you can use aggregate to create table of medians

with(mydata, aggregate(Y, list(time, tratment, expRep,....), median)

repeats of unique factors
either by rle or aggregate with length function

Then you can do replication by

norep <- rep(your.median, each = your replicates)

Regards
Petr



submitted question abrigded
> Hello R users,



> #####Example data frame######
> mydata<-as.data.frame(structure(list(cpdID = c(7, 7, 7, 7, 7, 7, 8, 8,
> 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10, 10,
> 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
> 10, 10, 10, 10, 10, 10, 19, 19, 19, 19, 19, 19, 23, 23, 23, 23, 23,
> 23, 23, 23, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33,
> 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33,
> 33, 33, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
> 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
> 40, 40, 40, 40, 40, 40, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
> 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42,
> 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 47, 47,
> 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47,
> 47, 47), time = structure(as.integer(c(1, 1, 1, 1, 2, 2, 2, 1, 1, 1,
> 1, 1, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 2, 1, 2, 2, 2,
> 1, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2,
> 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 1,
> 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1, 1, 1, 2, 1, 1, 1,
> 1, 1, 1, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 1, 2, 2, 2, 2,
> 2, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1,
> 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2,
> 2, 1, 2, 2, 1, 2)), .Label = c("120hr", "24hr"), class = "factor"),
>     treatment = structure(as.integer(c(1, 1, 1, 2, 2, 1, 1, 2,
>     1, 1, 2, 2, 1, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 1,
>     2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2, 2, 2,
>     1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 2, 1, 1,
>     1, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 2, 1, 2,
>     2, 1, 1, 2, 1, 1, 2, 2, 1, 2, 2, 1, 1, 1, 2, 1, 2, 2, 2,
>     1, 1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 2, 2, 2,
>     2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 1,
>     1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2,
>     2, 2, 1, 2, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1,
>     2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1,
>     2, 1)), .Label = c("control", "trt"), class = "factor"),
>     expREP = structure(as.integer(c(1, 1, 1, 3, 1, 1, 1, 1, 2,
>     2, 1, 1, 2, 2, 1, 3, 1, 3, 3, 3, 1, 2, 1, 2, 2, 2, 2, 3,
>     3, 2, 2, 1, 2, 3, 3, 1, 1, 2, 3, 1, 3, 3, 3, 3, 1, 3, 1,
>     1, 2, 1, 1, 2, 3, 2, 2, 1, 3, 2, 2, 2, 3, 2, 1, 2, 2, 2,
>     2, 1, 1, 1, 3, 2, 2, 3, 3, 2, 2, 2, 3, 2, 3, 2, 3, 1, 2,
>     3, 3, 1, 1, 1, 3, 3, 1, 1, 3, 1, 1, 1, 1, 1, 3, 3, 3, 1,
>     1, 1, 2, 3, 2, 2, 3, 2, 2, 2, 1, 1, 1, 3, 3, 2, 2, 2, 1,
>     3, 1, 2, 3, 1, 3, 3, 1, 2, 3, 1, 2, 1, 3, 1, 3, 3, 2, 2,
>     2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 1, 3, 1, 1, 1,
>     1, 3, 3, 1, 1, 1, 1, 3, 3, 3, 3, 3, 1, 3, 1, 1, 1, 1, 3,
>     3, 1, 1, 1, 3, 2, 1, 1, 2, 1, 3, 2, 1, 2, 1, 3, 1, 1, 2,
>     3)), .Label = c("expREP1", "expREP2", "expREP3"), class =
>     "factor"), techREP = structure(as.integer(c(3, 2, 1, 1, 1, 3, 1,
>     3, 3, 2, 2, 1, 1, 3, 2, 3, 3, 1, 2, 1, 2, 1, 3, 1, 3, 2, 2, 3, 1,
>     1, 3, 3, 2, 3, 3, 3, 2, 2, 2, 2, 1, 1, 2, 3, 1, 2, 3, 1, 3, 2, 1,
>     1, 2, 2, 3, 3, 3, 2, 1, 2, 1, 2, 3, 2, 3, 2, 3, 1, 3, 2, 2, 1, 2,
>     1, 3, 2, 2, 1, 1, 3, 3, 3, 1, 3, 1, 3, 2, 2, 2, 1, 2, 1, 3, 1, 3,
>     2, 1, 3, 1, 3, 2, 3, 1, 1, 2, 3, 1, 1, 3, 3, 2, 2, 1, 2, 3, 2, 2,
>     3, 2, 2, 1, 2, 2, 3, 3, 3, 1, 1, 1, 3, 1, 1, 2, 3, 2, 3, 3, 1, 1,
>     3, 2, 3, 3, 3, 3, 1, 3, 2, 1, 3, 3, 1, 3, 2, 1, 2, 2, 1, 2, 1, 2,
>     1, 1, 1, 2, 2, 3, 3, 1, 2, 3, 2, 3, 3, 3, 1, 2, 2, 1, 3, 2, 3, 3,
>     2, 2, 2, 3, 2, 1, 3, 1, 3, 1, 3, 1, 1, 1, 2, 2, 3)), .Label =
>     c("techREP1", "techREP2", "techREP3"), class = "factor"), log2Abun
>     = c(14.4233129144089, 14.052822741429, 14.2281422686467,
>     13.8492096005693, 14.076481601207, 14.2139395740777,
>     14.3399195756207, 14.3625602954496, 14.0141948668145,
>     14.0980320829605, 14.3152203363759, 14.4528846974866,
>     13.9591869268449, 14.4064043323413, 14.0403753485321,
>     14.2285932517829, 14.1259784261721, 13.5925738310379,
>     13.5830827675029, 13.0280787227049, 15.0198078807043,
>     12.8423503434138, 12.645883554519, 13.4644181177386,
>     12.8399910705399, 12.7879025593914, 12.4978518369511,
>     14.3949985145017, 12.8670856466168, 12.9749522735341,
>     13.3456824481868, 13.4557125040673, 12.8989792046225,
>     16.0609491915918, 13.6795900568273, 16.456466720182,
>     13.6145948287653, 13.2604785448039, 14.8573006848798,
>     13.1382718001722, 13.690761908446, 14.0557060971613,
>     13.7495552174335, 13.6336764098923, 13.7844303674846,
>     15.9518993688317, 13.2452555803066, 13.1930632791304,
>     12.1919845133603, 13.8710388986595, 13.6375305515253,
>     12.5919897676151, 17.4797250127015, 17.4014712120155,
>     17.5948202702163, 12.6031626795344, 17.8287811089804,
>     11.3613955331659, 15.8064741020529, 15.1007855146758,
>     16.0553036215393, 15.7553570530353, 15.9747058600332,
>     15.776715745005, 15.8588066550904, 16.2935434944118,
>     16.271207673964, 16.3660489506706, 16.3273070282017,
>     15.7632383068689, 14.6030467398838, 14.7118820283521,
>     14.7577545959238, 14.7315311764619, 14.8250084466403,
>     15.6652803936783, 15.8249587405285, 15.6558660906456,
>     15.5387042614836, 14.8487696278309, 15.5477380355109,
>     15.9451465974129, 16.196755792715, 15.9999119421954,
>     15.8660714836595, 15.9406577104549, 15.8754613979164,
>     16.0358944927638, 16.1785092456522, 16.1992122284106,
>     15.8087128474547, 15.9373968104322, 16.1432636222427,
>     16.2412011305004, 15.9488234774507, 15.7820255767261,
>     15.7730361533934, 15.7459893802453, 20.7777738189812,
>     21.7489122647969, 21.0374490930058, 20.9765158780184,
>     21.0464959041766, 21.6790715518273, 21.8021013715842,
>     20.7652083875471, 20.6663696521617, 20.3963413756589,
>     20.7983642126234, 20.1864915044977, 20.4422216681915,
>     20.59064186918, 20.6964531077756, 20.6822196619653,
>     20.4532414913665, 20.8126113450884, 20.4397608946311,
>     21.4603719009067, 21.5318145314919, 21.0400816517662,
>     21.0466431076593, 20.7459819969019, 20.6723053403015,
>     20.4793421418014, 20.6432035537608, 20.6831942471622,
>     21.6913537667357, 20.6562913013787, 21.0940693071186,
>     20.9473294479256, 20.5087271424267, 16.0871520250047,
>     16.3816612332698, 16.998645516939, 15.7912392142223,
>     14.5058735666446, 13.6035104425928, 14.4369066987207,
>     14.6998435295626, 14.6818972267862, 14.1086877961546,
>     14.3539049235617, 15.40862828087, 15.0657947671893,
>     14.8615716011254, 14.5538692431961, 14.2397476835569,
>     13.4381420777437, 13.4499224158638, 13.6887966810545,
>     14.6550275257018, 13.500966330283, 14.9271297886953,
>     14.7405186421119, 15.0047910398043, 14.7051463678038,
>     14.8325933769599, 12.9854861991046, 13.4203550220891,
>     15.399010832952, 15.4064707685293, 15.0953970227926,
>     15.0712109416537, 15.7587957644032, 15.0013202225009,
>     15.7608498673217, 14.7604080920677, 14.2478533598602,
>     14.4140245098782, 14.7936541075062, 14.7684428120549,
>     14.595607155062, 16.1507389488284, 16.4915712924337,
>     14.490161446684, 14.721633263063, 14.4341721012904,
>     15.8747652729112, 14.543333961671, 14.8633635585377,
>     14.6696601802386, 13.3020676725265, 14.0190694293311,
>     15.2168973938334, 12.6304946615056, 12.1972166931101,
>     12.7960396088298, 14.4285564621952, 14.5308330346953,
>     14.1496677436943, 14.0823985634278, 12.8407779235951,
>     14.6543003749437, 14.3202364452416, 15.1723493709662,
>     14.0744760007345, 14.8132801684508, 12.9183042336999,
>     14.5225202325766, 13.742309436084)), .Names = c("cpdID", "time",
>     "treatment",
> "expREP", "techREP", "Y")))



More information about the R-help mailing list