[R] help with handling replicates before reshaping data

hadley wickham h.wickham at gmail.com
Fri Jul 13 20:25:54 CEST 2007


Hi Tom,

>   I have a dataset that consists of duplicated sequences within each day for each patient (see the data below), and I want to reshape the data with patient as the time variable. However, the reshape function takes only the first sequence of the replicates and ignores the second. How can I 1) average the duplicates and 2) give the duplicated sequences unique names before reshaping the data?
>
>   > data
>      patient day  seq           y
>   1       10   1 acdf -0.52416066
>   2       10   1 cdsv  0.62551539
>   3       10   1 dlfg -1.54668047
>   4       10   1 acdf  0.82404978
>   5       10   1 cdsv -1.17459914
>   6       10   2 acdf  0.47238216

You might find that the functions in the reshape package give you a bit
more flexibility.

# The reshape package expects the value variable
# to be named "value"
d2 <- rename(data, c("y" = "value"))

# I think this is the format you want, which will average over the reps
cast(d2, day + seq ~ patient, mean)
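
If you would rather stay in base R, here is a rough sketch of both parts of the question: averaging the replicates, and numbering them so each sequence gets a unique name. It rebuilds only the six rows shown above; the column names rep and seq_id are made up for illustration, and this is one possible approach rather than what the reshape package does internally.

```r
# The six rows quoted in the question
data <- data.frame(
  patient = rep(10, 6),
  day     = c(1, 1, 1, 1, 1, 2),
  seq     = c("acdf", "cdsv", "dlfg", "acdf", "cdsv", "acdf"),
  y       = c(-0.52416066, 0.62551539, -1.54668047,
              0.82404978, -1.17459914, 0.47238216),
  stringsAsFactors = FALSE
)

# 1) Average the replicates within each patient/day/seq combination
avg <- aggregate(y ~ patient + day + seq, data = data, FUN = mean)

# Spread the averages out with one column per patient
wide <- reshape(avg, idvar = c("day", "seq"), timevar = "patient",
                direction = "wide")

# 2) Or keep every replicate and number it within its group, so that
#    each sequence name becomes unique (rep and seq_id are made-up names)
data$rep    <- ave(seq_along(data$y), data$patient, data$day, data$seq,
                   FUN = seq_along)
data$seq_id <- paste(data$seq, data$rep, sep = ".")
```

With the averaged version you get one row per day and seq; with the numbered version you can use seq_id in place of seq when reshaping, so no replicate is dropped.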


Hadley


