[R] Is this sapply behaviour normal?

Rolf Turner r.turner at auckland.ac.nz
Wed Jun 25 22:36:17 CEST 2008


The answer to your question is ``yeah, sort of''.  The reason for the
difference is that mean() is generic and has a method for data frames,
according to which the mean of each column of the data frame is found
in some ``appropriate'' manner.  (Essentially the columns of the data
frame must be either numeric or have some sort of date persuasion, else
you get a warning and an NA for the column in question.)  The function
min() has no such column-wise method for data frames, so if you hit a
data frame with min() it treats that data frame as if it were a single
atomic vector of data and finds the minimum of that vector.  Given, of
course, that doing so makes sense (e.g. all of the columns are numeric).
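
The pooled versus column-wise distinction can be seen on a small scale.
The following is only a minimal sketch: the data frame `df` and its
values are made up for illustration and are not from the original post.
It also assumes a recent version of R, in which mean() no longer has a
data frame method, so colMeans() stands in for the column-wise means.

```r
# Hypothetical toy data frame; values are invented for illustration.
df <- data.frame(a = c(3, 1, 2), b = c(-5, 4, 0))

# min() on the whole data frame pools all values: a single number.
overall_min <- min(df)

# Applying min() column by column gives one value per column.
col_mins <- sapply(df, min)

# Column-wise means; colMeans() is used here on the assumption that the
# running R version has no mean() method for data frames.
col_means <- colMeans(df)

print(overall_min)
print(col_mins)
print(col_means)
```

Here overall_min is a single number, while col_mins and col_means are
named vectors with one entry per column.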

It would seem that you want min() to mimic the behaviour of mean().  To
achieve this you can, in this instance at least (I think!) simply do

	sapply(dats,function(x){sapply(x,min)})
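
As a rough check of what that call returns, here is a small sketch on a
made-up list.  The object `dats_demo` and its values are hypothetical
stand-ins for `dats`; only the nested sapply() call is taken from the
suggestion above.

```r
# Hypothetical stand-in for `dats`: a list of data frames with the same
# columns but different numbers of rows (values invented).
dats_demo <- list(
  log20 = data.frame(logrho = c(1.16, -1.30, -1.30),
                     w2     = c(1.01,  1.27,  1.24)),
  log30 = data.frame(logrho = c(1.16, -1.00),
                     w2     = c(1.01,  1.27))
)

# The outer sapply() walks the list; the inner sapply() takes min() of
# each column.  Every inner result has the same length and names, so
# sapply() simplifies the whole thing to a matrix.
mins <- sapply(dats_demo, function(x) sapply(x, min))

print(mins)
print(dim(mins))
```

The result has one row per variable and one column per list element,
which is the same layout that sapply(dats,mean) gave in the question.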

HTH.

		cheers,

			Rolf Turner

On 26/06/2008, at 3:00 AM, Victor Homar wrote:

>
>    Hi, I'm trying to use sapply to compute the min of several variables,
>    each of them stored in data.frames, grouped as a list.
>    Is it normal that mean() and min() produce objects of different
>    dimensions?
>> str(dats)
>    List of 5
>     $ log20:'data.frame':  83 obs. of  5 variables:
>      ..$ DATE  : int [1:83] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
>      ..$ logrho: num [1:83]  1.16 -1.30 -1.30 -1.30 -1.30 ...
>      ..$ w2    : num [1:83] 1.01 1.27 1.24 1.31 1.28 ...
>      ..$ rms   : num [1:83] 5.001 0.630 0.616 0.685 0.655 ...
>      ..$ maxi  : num [1:83] 8.66 3.39 3.83 3.35 3.23 ...
>     $ log30:'data.frame':  71 obs. of  5 variables:
>      ..$ DATE  : int [1:71] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
>      ..$ logrho: num [1:71]  1.16 -1.00 -1.00 -1.00 -1.00 ...
>      ..$ w2    : num [1:71] 1.01 1.27 1.21 1.29 1.27 ...
>      ..$ rms   : num [1:71] 5.001 0.851 0.802 0.877 0.864 ...
>      ..$ maxi  : num [1:71] 8.66 4.57 4.62 4.27 4.47 ...
>     $ log40:'data.frame':  14 obs. of  5 variables:
>      ..$ DATE  : int [1:14] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
>      ..$ logrho: num [1:14]  1.16 -0.50 -0.50 -0.50 -0.50 ...
>      ..$ w2    : num [1:14] 1.01 1.27 1.18 1.23 1.17 ...
>      ..$ rms   : num [1:14] 5.00 1.40 1.26 1.36 1.25 ...
>      ..$ maxi  : num [1:14] 8.66 7.54 6.39 6.68 5.83 ...
>     $ log50:'data.frame':  69 obs. of  5 variables:
>      ..$ DATE  : int [1:69] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
>      ..$ logrho: num [1:69]  1.16 -1.50 -1.50 -1.50 -1.50 ...
>      ..$ w2    : num [1:69] 1.01 1.27 1.25 1.33 1.31 ...
>      ..$ rms   : num [1:69] 5.001 0.516 0.516 0.577 0.554 ...
>      ..$ maxi  : num [1:69] 8.66 2.77 3.44 2.92 3.25 ...
>     $ log60:'data.frame':  66 obs. of  5 variables:
>      ..$ DATE  : int [1:66] 2001081500 2001081512 2001081600 2001081612 2001081700 2001081712 2001081800 2001081812 2001081900 2001081912 ...
>      ..$ logrho: num [1:66]  1.16 -2.00 -2.00 -2.00 -2.00 ...
>      ..$ w2    : num [1:66] 1.01 1.27 1.27 1.34 1.34 ...
>      ..$ rms   : num [1:66] 5.001 0.313 0.326 0.371 0.366 ...
>      ..$ maxi  : num [1:66] 8.66 1.68 2.43 2.23 3.85 ...
>> sapply(dats,mean)
>                   log20         log30         log40         log50         log60
>    DATE    2.001088e+09  2.001087e+09  2.001082e+09  2.001087e+09  2.001086e+09
>    logrho -1.270326e+00 -9.695748e-01 -3.816967e-01 -1.461383e+00 -1.951941e+00
>    w2      1.324907e+00  1.283293e+00  1.217808e+00  1.345297e+00  1.398435e+00
>    rms     7.752963e-01  9.623266e-01  1.636949e+00  6.788993e-01  4.810029e-01
>    maxi    4.016900e+00  4.466865e+00  6.205573e+00  3.672405e+00  2.898055e+00
>> sapply(dats,min)
>         log20      log30      log40      log50      log60
>    -1.3000708 -1.0000960 -0.5000176 -1.5001134 -2.0001350
