[R] Variance Computing- - HELP!!!!!!!!!!!!!!!!!!

Liaw, Andy andy_liaw at merck.com
Tue Aug 19 20:15:07 CEST 2003


First of all, your subscripting is wrong.  The first index is for the row, and
the second for the column.  Thus large[i,] refers to the i-th row of large,
rather than the i-th column.  Also, the code as you provided contains a syntax
error (an unmatched closing brace at the end of the second for loop).
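
To make the row/column convention concrete, here is a minimal sketch on a
small matrix (the name m is only for illustration):

```r
## matrix() fills column by column, so the columns are (1,2), (3,4), (5,6)
m <- matrix(1:6, nrow = 2, ncol = 3)

m[1, ]   ## first ROW:    1 3 5
m[, 1]   ## first COLUMN: 1 2
```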

Try:

set.seed(311)  ## Always a good idea to set seed for simulation!
large <- matrix(rnorm(1000*1000), 1000, 1000)
small <- matrix(rnorm(100*1000), 100, 1000)
var.large <- apply(large, 2, var)  ## Apply the var function to each column
var.small <- apply(small, 2, var)

The result looks like:
> summary(var.large); summary(var.small)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.8617  0.9705  1.0010  1.0020  1.0320  1.1520 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.5846  0.9021  0.9948  0.9990  1.0850  1.5360 

As expected, the mean is about the same, but the spread is much smaller for
the larger sample size.

This sort of thing can be computed exactly using basic mathematical
statistics, BTW.
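
As a rough sketch of that calculation: for an i.i.d. normal sample of size n
with variance sigma^2, (n-1)*S^2/sigma^2 follows a chi-squared distribution
with n-1 degrees of freedom, so E[S^2] = sigma^2 and
Var(S^2) = 2*sigma^4/(n-1).  The helper name sd.of.var below is made up for
illustration, and sigma = 1 matches the rnorm() default used in the
simulation above.

```r
## Theoretical standard deviation of the sample variance S^2 for a normal
## sample of size n:  sqrt(2 * sigma^4 / (n - 1)).
## (sd.of.var is a hypothetical helper, not a base R function.)
sd.of.var <- function(n, sigma = 1) sqrt(2 * sigma^4 / (n - 1))

sd.large <- sd.of.var(1000)  ## each column of 'large' has 1000 observations
sd.small <- sd.of.var(100)   ## each column of 'small' has 100 observations

## Compare with the spread of the simulated column variances
set.seed(311)
large <- matrix(rnorm(1000*1000), 1000, 1000)
small <- matrix(rnorm(100*1000), 100, 1000)
emp.large <- sd(apply(large, 2, var))
emp.small <- sd(apply(small, 2, var))

round(c(theory = sd.large, simulated = emp.large), 3)
round(c(theory = sd.small, simulated = emp.small), 3)
```

The theoretical values are about 0.045 for n = 1000 and 0.142 for n = 100,
so the sample variance is roughly three times more variable at the smaller
sample size, while its expected value is sigma^2 in both cases.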

Andy


> -----Original Message-----
> From: Padmanabhan, Sudharsha [mailto:sudAR_80 at neo.tamu.edu] 
> Sent: Tuesday, August 19, 2003 1:43 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Variance Computing- - HELP!!!!!!!!!!!!!!!!!!
> 
> 
> 
> Hello,
> 
> I am running a few simulations for clinical trial analysis. I want some help
> regarding the following.
> 
> We know that as the sample size increases, the variance should decrease, but
> I am getting some unexpected results. So I ran some code (shown below) to
> check the validity of this.
> 
> large<-array(1,c(1000,1000))
> small<-array(1,c(100,1000))
> for(i in 1:1000){large[i,]<-rnorm(1000,0,3)}
> for(i in 1:1000){small[i,]<-rnorm(100,0,3)}}
> yy<-array(1,100)
> for(i in 1:100){yy[i]<-var(small[i,])}
> y1y<-array(1,1000)
> for(i in 1:1000){y1y[i]<-var(large[i,])}
> mean(yy);mean(y1y);
> [1] 8.944
> [1] 9.098
> 
> 
> This shows that, on average, for 1000 such samples of 1000 Normal numbers,
> the variance is higher than that of 100 samples of 1000 random numbers.
> 
> Why is this so?
> 
> 
> Can someone please help me out????
> 
> Thanks.
> 
> Regards
> 
> ~S.
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 




