[R] conditional mean between two data frames with different levels

Joshua Wiley jwiley.psych at gmail.com
Fri Nov 19 03:45:56 CET 2010


Hi Alberto,

This should do it.  'x' is equivalent to your alfa vector.

Cheers,

Josh


## your example data, in a form R can read
dat1 <- data.frame(score = c(1, 2, 0, 2, 1, 1, 3, 2, 1, 0),
  teams = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e"))
dat2 <- data.frame(score = c(2, 3, 1, 0, 0, 0, 4, 2, 1, 2),
  teams = c("b", "c", "d", "e", "f", "b", "c", "d", "e", "f"))

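## calculate the mean score by team from data frame 1, then repeat those
## means in the order of the teams in data frame 2;
## c() turns the by() result into a plain named vector that can be indexed
## by team name, and levels from dat2 not present in dat1 come out as NA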
x <- with(dat1, c(by(score, teams, mean))[as.character(dat2$teams)])
## replace missing values with the overall mean of dat1 scores
x[is.na(x)] <- mean(dat1$score)
## print final results
x
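
As a cross-check, here is a minimal sketch of the same lookup written with tapply() and match() instead of by(). It is only an alternative that I assume is equivalent for this example. The names team_means and alfa are illustrative, and the example data are repeated so the block runs on its own.

```r
## same example data as above, repeated so this block is self-contained
dat1 <- data.frame(score = c(1, 2, 0, 2, 1, 1, 3, 2, 1, 0),
  teams = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e"))
dat2 <- data.frame(score = c(2, 3, 1, 0, 0, 0, 4, 2, 1, 2),
  teams = c("b", "c", "d", "e", "f", "b", "c", "d", "e", "f"))

## mean score per team in dat1, named by team
team_means <- tapply(dat1$score, dat1$teams, mean)

## position of each dat2 team among the dat1 teams;
## match() returns NA for teams that never appear in dat1 (here "f")
idx <- match(as.character(dat2$teams), names(team_means))

## pull the means in dat2 order and drop the names
alfa <- as.numeric(team_means[idx])

## fall back to the overall dat1 mean for the unmatched teams
alfa[is.na(alfa)] <- mean(dat1$score)

## should agree with the alfa vector given in the original question:
## 2.5 1.0 1.5 0.5 1.3 2.5 1.0 1.5 0.5 1.3
alfa
```

Using match() keeps the NA handling explicit: any team in dat2 without a counterpart in dat1 gets an NA index, which is then filled with the overall mean in a separate step.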


On Thu, Nov 18, 2010 at 9:17 AM, albechan <alberto.casetta at satt.biz> wrote:
>
> Thank you very much Josh, I guess you're right.
> So this is an example:
> data frame 1 has 2 columns and 10 rows. The first column is "score" a
> variable indicating the number of goals scored by a football team
> score<-c(1,2,0,2,1,1,3,2,1,0), column 2 contains the football "teams " where
> teams<-c(a,b,c,d,e,a,b,c,d,e).
> Data frame 2 has the following variables: score<-c(2,3,1,0,0,0,4,2,1,2) and
> "teams"<-c(b,c,d,e,f,b,c,d,e,f).
> What I need is to create a vector "alfa"<-numeric(10) where the first
> element contains the mean of the number of goals scored by team b in the
> previous season, the second element contains the mean of the number of goals
> scored by team c in the previous season and so on. For team f, the element
> should be the average of the whole score vector of the previous season.
> alfa should be (2.5, 1, 1.5, 0.5, 1.3, 2.5, 1, 1.5, 0.5, 1.3)
> The problem arises because "f" doesn't appear in the first data frame, as it
> replaced "a".
> Hope the issue is more understandable now.
> Thanks a lot!
> alberto



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


