[R] weighted average grouped by variables

Massimo Bressan massimo.bressan at arpa.veneto.it
Thu Nov 9 12:20:52 CET 2017


hi all 

I have this dataframe (created as a reproducible example) 

mydf<-structure(list(date_time = structure(c(1508238000, 1508238000, 1508238000, 1508238000, 1508238000, 1508238000, 1508238000), class = c("POSIXct", "POSIXt"), tzone = ""), 
direction = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
type = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L), .Label = c("car", "light_duty", "heavy_duty", "motorcycle"), class = "factor"), 
avg_speed = c(41.1029082774049, 40.3333333333333, 40.3157894736842, 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5), 
n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)), 
.Names = c("date_time", "direction", "type", "speed", "n_vehicles"), 
row.names = c(NA, -7L), 
class = "data.frame") 

mydf 

and I need to get to this final result 

mydf_final<-structure(list(date_time = structure(c(1508238000, 1508238000, 1508238000, 1508238000), class = c("POSIXct", "POSIXt"), tzone = ""), 
type = structure(c(1L, 2L, 3L, 4L), .Label = c("car", "light_duty", "heavy_duty", "motorcycle"), class = "factor"), 
weighted_avg_speed = c(36.39029, 38.56522, 37.53333, 36.08696), 
n_vehicles = c(1153L,69L,45L,23L)), 
.Names = c("date_time", "type", "weighted_avg_speed", "n_vehicles"), 
row.names = c(NA, -4L), 
class = "data.frame") 

mydf_final 


my question: 
how can I compute a weighted mean, i.e. "weighted_avg_speed", 
from "speed" (the values to be averaged) and "n_vehicles" (the weights), 
grouped by "date_time" and "type"? 

note one complication: "motorcycle" is present in only one direction (A), not in both 

any help for that? 
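
One possible approach, offered only as a minimal base-R sketch: a weighted mean is sum(speed * n_vehicles) / sum(n_vehicles), so both sums can be aggregated per group and then divided. The data frame is rebuilt here in compact form so the snippet runs on its own (the "UTC" time zone is an assumption made for reproducibility); the helper column "speed_x_n" and the result name "res" are made-up names, not part of the original data.

```r
# Compact copy of the example data (same values as mydf; tz = "UTC" is an assumption)
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  direction  = factor(c("A", "A", "A", "A", "B", "B", "B")),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
                 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

# Hypothetical helper column: each speed multiplied by its weight
mydf$speed_x_n <- mydf$speed * mydf$n_vehicles

# Sum the weighted speeds and the weights within each date_time/type group
res <- aggregate(cbind(speed_x_n, n_vehicles) ~ date_time + type,
                 data = mydf, FUN = sum)

# Weighted mean per group = summed weighted speeds / summed weights
res$weighted_avg_speed <- res$speed_x_n / res$n_vehicles

# Keep the columns in the same order as the desired mydf_final
res <- res[, c("date_time", "type", "weighted_avg_speed", "n_vehicles")]
res
```

With this approach "motorcycle" should need no special handling: its group contains a single row, so the weighted mean reduces to that row's own speed. Note that n_vehicles comes back as numeric rather than integer, because cbind() builds a numeric matrix.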

thank you 

max 


