[R] ddply

arun smartpink111 at yahoo.com
Wed Feb 26 02:03:48 CET 2014



Hi Felipe,

Pasting the code from your second email, which uses which.max() (see ?which.max):
#changed 'test' to 'hw'

hw2 <- ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
                                       Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
                                       WaterTemp=mean(WaterTemp), TrapWeather=TrapWeather[which.max(Recaps)])


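Now, change which.max to max. TrapWeather then shows the summed Recaps instead of the largest single value, because summarise() evaluates its arguments in order and Recaps has already been replaced by sum(Recaps):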

ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
                                       Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
                                       WaterTemp=mean(WaterTemp), TrapWeather=max(Recaps))

# id subSiteName nReleased Recaps   MeanFL TrapTurbidity WaterTemp TrapWeather
#1  1       north       686      5 36.05128         2.450  8.395417           5
#2  2       north       540     11 35.47000         2.770  8.824167          11
#3  3       north      1995     51 38.32692         1.700  9.220000          51
#4  4       north      1309     35 37.17000         1.615  9.277917          35
#5  5       north       995     47 38.84152         1.815  8.660625          47

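A minimal sketch of why the output above repeats the summed Recaps: plyr's summarise() evaluates its arguments one after another, so a later expression sees any column that an earlier one has already overwritten. The data frame 'toy' and the column 'MaxRecaps' are made up for illustration and are not part of your data.

```r
library(plyr)

# Hypothetical toy data: id 1 has two rows, id 2 has one
toy <- data.frame(id = c(1L, 1L, 2L),
                  Recaps = c(8L, 3L, 51L),
                  stringsAsFactors = FALSE)

# Name clash: Recaps is overwritten by its sum before max() is evaluated,
# so max(Recaps) sees the single summed value
clash <- ddply(toy, "id", summarise,
               Recaps = sum(Recaps),
               MaxRecaps = max(Recaps))

# No clash: the sum goes into Recaps1, so max(Recaps) still sees the rows
safe <- ddply(toy, "id", summarise,
              Recaps1 = sum(Recaps),
              MaxRecaps = max(Recaps))

clash
safe
```

For id 1, 'clash' reports the sum in both columns, while 'safe' reports the sum in Recaps1 and the largest single row value in MaxRecaps.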

So, it is better to give the summed Recaps column a different name (Recaps1 here), so that max(Recaps) still refers to the original values:
ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
                                       Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
                                       WaterTemp=mean(WaterTemp), TrapWeather=max(Recaps))

##Check the difference
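
The same name clash also affects which.max(). In the data you posted, the largest Recaps happens to be the first row of every id, so the problem does not show there. A small sketch with made-up data ('toy' is hypothetical) where the maximum is in the second row:

```r
library(plyr)

# Hypothetical toy data: for id 1 the largest Recaps is in the second row
toy <- data.frame(id = c(1L, 1L, 2L),
                  Recaps = c(3L, 8L, 51L),
                  TrapWeather = c("Clear", "Cloudy", "Foggy"),
                  stringsAsFactors = FALSE)

# Name clash: Recaps is already a single summed value, so which.max() is 1
# and the first TrapWeather of each id is returned
clash <- ddply(toy, "id", summarise,
               Recaps = sum(Recaps),
               TW = TrapWeather[which.max(Recaps)])

# Renamed sum: which.max() works on the original rows
safe <- ddply(toy, "id", summarise,
              Recaps1 = sum(Recaps),
              TW = TrapWeather[which.max(Recaps)])

clash$TW
safe$TW
```

Only the renamed version picks the weather from the row with the largest Recaps for id 1.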

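If several rows within an id share the maximum Recaps, which.max() returns only the first of them. For example, add a second row with Recaps=46 for id 5:
hw1 <- data.frame(id=5, subSiteName="north", nReleased= 995, Recaps=46, MeanFL=38.42, TrapTurbidity=2.23, WaterTemp=8.6234, TrapWeather= "Clear")
hw2 <- rbind(hw,hw1)
#either create a list column that keeps every tied TrapWeather value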

res <- ddply(hw2,.(id),summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
             Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
             WaterTemp=mean(WaterTemp),
             TW=list(as.character(TrapWeather[Recaps %in% max(Recaps)])))

#or use paste()

res1 <- ddply(hw2,.(id),summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
              Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
              WaterTemp=mean(WaterTemp),
              TW=paste(TrapWeather[Recaps %in% max(Recaps)],collapse=","))
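
As a quick base R illustration of the difference between the two selectors on ties (the vectors below are made up to mirror the three id 5 rows in hw2):

```r
# Hypothetical vectors: two rows tie for the maximum Recaps
recaps  <- c(46L, 1L, 46L)
weather <- c("Cloudy", "Rainy day", "Clear")

# which.max() returns the index of the first maximum only
first_only <- weather[which.max(recaps)]

# %in% max() keeps every row that equals the maximum
all_ties <- weather[recaps %in% max(recaps)]

# paste() collapses the tied values into one string, as in res1
joined <- paste(all_ties, collapse = ",")

first_only
all_ties
joined
```

Here first_only holds one value, while all_ties holds both tied values and joined is a single comma-separated string.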


A.K.

On Tuesday, February 25, 2014 5:31 PM, Felipe Carrillo <mazatlanmexico at yahoo.com> wrote:

Hi Arun,
Could you help me with what appears to be a simple question?
I want to create a column called TW whose value is taken from TrapWeather,
selected by the max value of Recaps within each id.
For example, for id=4 the max Recaps value is 34, so I want the TrapWeather value to be 'Foggy',
and so on. Thanks, Arun.
library(plyr)
hw <- structure(list(id = c(1L, 2L, 2L, 3L, 4L, 4L, 5L, 5L), subSiteName = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "north", class = "factor"),
    nReleased = c(686L, 540L, 540L, 1995L, 1309L, 1309L, 995L,
    995L), Recaps = c(5L, 8L, 3L, 51L, 34L, 1L, 46L, 1L), MeanFL = c(36.05128205,
    35.38, 35.56, 38.32692308, 36.48, 37.86, 38.44230769, 39.24074074
    ), TrapTurbidity = c(2.450000048, 2.710000038, 2.829999924,
    1.700000048, 2.130000114, 1.100000024, 2, 1.629999995), WaterTemp = c(8.395416667,
    8.55625, 9.092083333, 9.22, 9.180833333, 9.375, 8.63875,
    8.6825), TrapWeather = structure(c(2L, 1L, 1L, 3L, 3L, 1L,
    2L, 4L), .Label = c("Clear", "Cloudy", "Foggy", "Rainy day"
    ), class = "factor")), .Names = c("id", "subSiteName", "nReleased",
"Recaps", "MeanFL", "TrapTurbidity", "WaterTemp", "TrapWeather"
), class = "data.frame", row.names = c(NA, -8L))
hw2 <- ddply(test,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
                                       Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
                                       WaterTemp=mean(WaterTemp), TW=TrapWeather Where Recaps==max(Recaps))

hw2


