[R] Looking for easy way to normalize data by groups

hadley wickham h.wickham at gmail.com
Mon Jun 8 17:44:47 CEST 2009


On Mon, Jun 8, 2009 at 10:29 AM, Herbert Jägle
<herbert.jaegle at uni-tuebingen.de> wrote:
> Hi,
>
> I have a data frame representing data from a repeated experiment. PID is a
> subject identifier, and Time gives the timepoints in an experiment which was
> repeated twice. For each subject and each of the three timepoints there are
> 2 sets of four values.
>
> df <- data.frame(PID = c(rep("A", 12), rep("B", 12), rep("C", 12)),
>                Time = rep(c(0, 0, 0, 0, 30, 30, 30, 30, 60, 60, 60, 60), 3),
>                Dset = rep(c(1, 2),18),
>                Val1 = rnorm(36),
>                Val2 = rnorm(36),
>                Val3 = rnorm(36),
>                Val4 = rnorm(36))
>
> You can plot the data nicely with x=Time and y=Val1 by grouping PID and
> facetting for Dset.
>
> p <- ggplot(df) +
>       geom_line(aes(x=Time,y=Val1,group=PID)) +
>       geom_point(aes(x=Time,y=Val1,colour=PID)) +
>       facet_grid(. ~ Dset)
> theme_set(theme_bw())
> p
>
> I would now like to normalize these data to the mean of the two values at
> Time = 0 for each subject (so having plots in % of the mean Time=0 value
> rather than absolute values).

Maybe like this?

library(ggplot2)
library(plyr)

ggplot(df, aes(Time, Val1, colour = PID)) +
  geom_line(stat="summary", fun.y = mean) +
  geom_point() +
  facet_grid(. ~ Dset)

std <- ddply(df, c("PID", "Dset"), transform,
  Val1 = Val1 / mean(Val1[Time == min(Time)]))

last_plot() %+% std

I modified the plot so it's a bit more informative.
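
For anyone who wants to see what the ddply() call is doing, here is a rough base-R sketch of the same split/apply/combine step. It is only an illustration, not a replacement for the plyr version. The set.seed() call, the Val1_pct column and the name std_base are my own assumptions, added to make the example reproducible and to express the result in percent as the question asked.

```r
# Rebuild a reduced version of the example data (Val1 only).
set.seed(1)  # assumption: seed added only so the example is reproducible
df <- data.frame(PID  = rep(c("A", "B", "C"), each = 12),
                 Time = rep(rep(c(0, 30, 60), each = 4), 3),
                 Dset = rep(c(1, 2), 18),
                 Val1 = rnorm(36))

# Split into one piece per PID x Dset combination, divide each piece by the
# mean of its values at the earliest time point, then bind the pieces again.
pieces <- split(df, list(df$PID, df$Dset))
std_base <- do.call(rbind, lapply(pieces, function(g) {
  g$Val1 <- g$Val1 / mean(g$Val1[g$Time == min(g$Time)])
  g
}))

# Hypothetical extra column: the same ratio expressed in percent of baseline.
std_base$Val1_pct <- 100 * std_base$Val1

# Sanity check: within every group the baseline values now average to 1.
aggregate(Val1 ~ PID + Dset, data = std_base[std_base$Time == 0, ], FUN = mean)
```

One caveat: the example data come from rnorm() centred on zero, so a baseline mean can be close to zero or negative, and the ratios may then look extreme or change sign. With real measurements that stay well away from zero this should not be a problem.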

Hadley

-- 
http://had.co.nz/



