# [R] Newbie question: Statistical functions (e.g., mean, sd) in a "transform" statement?

Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Jan 19 19:53:31 CET 2007

```
On Fri, 2007-01-19 at 11:54 -0600, Ben Fairbank wrote:
> Greetings listeRs -

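Here are two solutions, depending on whether you want NAs to propagate
or be dropped; I assume you wanted the row means. Your
mean(c(time1, ...)) call pools every column into one vector and returns
a single number, which transform() then recycles down the whole column;
rowMeans() works row by row instead: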

> times3 <- transform(times, meantime = rowMeans(times))
> times3
      time1    time2     time3    time4 meantime
1 70.408543 48.92378  7.399605 95.93050 55.66561
2 17.231940 27.48530 82.962916 10.20619 34.47159
3 20.279220 10.33575 66.209290 30.71846 31.88568
4        NA 53.31993 12.398237 35.65782       NA
5  9.295965       NA 48.929201       NA       NA
6 63.966518 42.16304  1.777342       NA       NA
> times4 <- transform(times, meantime = rowMeans(times, na.rm = TRUE))
> times4
      time1    time2     time3    time4 meantime
1 70.408543 48.92378  7.399605 95.93050 55.66561
2 17.231940 27.48530 82.962916 10.20619 34.47159
3 20.279220 10.33575 66.209290 30.71846 31.88568
4        NA 53.31993 12.398237 35.65782 33.79200
5  9.295965       NA 48.929201       NA 29.11258
6 63.966518 42.16304  1.777342       NA 35.96897

HTH

G

> Given a data frame such as
>
> times
>        time1    time2     time3    time4
> 1  70.408543 48.92378  7.399605 95.93050
> 2  17.231940 27.48530 82.962916 10.20619
> 3  20.279220 10.33575 66.209290 30.71846
> 4         NA 53.31993 12.398237 35.65782
> 5   9.295965       NA 48.929201       NA
> 6  63.966518 42.16304  1.777342       NA
>
> one can use "transform" to total all or some columns, thus,
>
> times2 <- transform(times,totaltime=time1+time2+time3+time4)
>
> > times2
>        time1    time2     time3    time4 totaltime
> 1  70.408543 48.92378  7.399605 95.93050  222.6624
> 2  17.231940 27.48530 82.962916 10.20619  137.8863
> 3  20.279220 10.33575 66.209290 30.71846  127.5427
> 4         NA 53.31993 12.398237 35.65782        NA
> 5   9.295965       NA 48.929201       NA        NA
> 6  63.966518 42.16304  1.777342       NA        NA
>
> I cannot, however, find a way, other than "for" looping,
> to use statistical functions, such as mean or sd, to
> compute the new column.  For example,
>
> > times2<-transform(times,meantime=(mean(c(time1,time2,time3,time4),na.rm=TRUE)))
>
> > times2
>        time1    time2     time3    time4 meantime
> 1  70.408543 48.92378  7.399605 95.93050 45.54178
> 2  17.231940 27.48530 82.962916 10.20619 45.54178
> 3  20.279220 10.33575 66.209290 30.71846 45.54178
> 4         NA 53.31993 12.398237 35.65782 45.54178
> 5   9.295965       NA 48.929201       NA 45.54178
> 6  63.966518 42.16304  1.777342       NA 45.54178
>
> How can this be done?  And, generally, what is the recommended method
> for creating computed new columns in data frames when "for" loops take
> too long?
>
> With thanks for any suggestions,
>
> Ben Fairbank
>
> Using version 2.4.1 on a Windows XP professional operating system.
>
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

```
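The exchange above can be reproduced with a short, self-contained sketch. It rebuilds `times` from the values in the post, shows that `mean(c(...))` yields a single pooled value, and then adds row-wise columns. The names `cols`, `pooled`, `times5` and `sdtime` are illustrative and do not appear in the thread. The row-wise `sd` via `apply()` is one possible base-R approach to the `sd` part of the question, not something the reply itself proposes.

```r
# Rebuild the example data frame (values copied from the post).
times <- data.frame(
  time1 = c(70.408543, 17.231940, 20.279220, NA, 9.295965, 63.966518),
  time2 = c(48.92378, 27.48530, 10.33575, 53.31993, NA, 42.16304),
  time3 = c(7.399605, 82.962916, 66.209290, 12.398237, 48.929201, 1.777342),
  time4 = c(95.93050, 10.20619, 30.71846, 35.65782, NA, NA)
)

# c(time1, ..., time4) concatenates the columns into one long vector,
# so mean() returns a single value that transform() would recycle.
pooled <- with(times, mean(c(time1, time2, time3, time4), na.rm = TRUE))

# Hypothetical helper vector: the columns to summarise per row.
cols <- c("time1", "time2", "time3", "time4")

# rowMeans() is vectorised over rows; apply(..., 1, sd) calls sd() once
# per row, which is slower than rowMeans() but still avoids an explicit loop.
times5 <- transform(
  times,
  meantime = rowMeans(times[cols], na.rm = TRUE),
  sdtime   = apply(times[cols], 1, sd, na.rm = TRUE)
)

print(times5)
```

Selecting columns through `cols` also covers the "all or some columns" case: shortening the vector restricts the summary to a subset without changing the rest of the call.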