[R] Doubt about Student t distribution simulation

Sundar Dorai-Raj sundar.dorai-raj at pdf.com
Fri Aug 4 22:27:12 CEST 2006


Hi, Jose/John,

Here's an example to help Jose and highlights John's advice. Also 
includes set.seed which should be included in all simulations posted to 
R-help.

set.seed(42)
mu <- 10
sigma <-  5
n <- 3
nsim <- 10000
m <- matrix(rnorm(n * nsim, mu, sigma), nsim, n)
t <- apply(m, 1, function(x) (mean(x) - mu)/(sd(x)/sqrt(n)))

library(lattice)
qqmath(t, distribution = function(x) qt(x, n - 1),
        panel = function(x, ...) {
          panel.qqmath(x, col = "darkblue", ...)
          panel.qqmathline(x, col = "darkred", ...)
        })


With n = 3, expect a few outliers.

--sundar


John Fox wrote:
> Dear Jose,
> 
> The problem is that you're using the population standard deviation (sigma)
> rather than the sample SD of each sample [i.e., t[i]  = (mean(amo.i) - mu) /
> (sd(amo.i) / sqrt(n)) ], so your values should be normally distributed, as
> they appear to be.
> 
> A couple of smaller points: (1) Even after this correction, you're sampling
> from a discrete population (albeit with replacement) and so the values won't
> be exactly t-distributed. You could draw the samples directly from N(mu,
> sigma) instead. (2) It would be preferable to make a quantile-comparison
> plot against the t-distribution, since you'd get a better picture of what's
> going on in the tails.
> 
> I hope this helps,
>  John 
> 
> --------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox 
> -------------------------------- 
> 
> 
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch 
>>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Jose 
>>Claudio Faria
>>Sent: Friday, August 04, 2006 3:09 PM
>>To: R-help at stat.math.ethz.ch
>>Subject: [R] Doubt about Student t distribution simulation
>>
>>Dear R list,
>>
>>I would like to illustrate the origin of the Student t 
>>distribution using R.
>>
>>So, if (sample.mean - pop.mean) / standard.error(sample.mean) 
>>has t distribution with (sample.size - 1) degree free, what 
>>is wrong with the simulation below? I think that the 
>>theoretical curve should agree with the relative frequencies 
>>of the t values calculated:
>>
>>#== begin options=====
>># parameters
>>   mu    = 10
>>   sigma =  5
>>
>># size of sample
>>   n = 3
>>
>># repetitions
>>   nsim = 10000
>>
>># histogram parameter
>>   nchist = 150
>>#== end options=======
>>
>>t   = numeric()
>>pop = rnorm(10000, mean = mu, sd = sigma)
>>
>>for (i in 1:nsim) {
>>   amo.i = sample(pop, n, replace = TRUE)
>>   t[i]  = (mean(amo.i) - mu) / (sigma / sqrt(n)) }
>>
>>win.graph(w = 5, h = 7)
>>split.screen(c(2,1))
>>screen(1)
>>hist(t,
>>      main     = "histogram",
>>      breaks   = nchist,
>>      col      = "lightgray",
>>      xlab     = '', ylab = "Fi",
>>      font.lab = 2, font = 2)
>>
>>screen(2)
>>hist(t,
>>      probability = T,
>>      main        = 'f.d.p and histogram',
>>      breaks      = nchist,
>>      col         = 'lightgray',
>>      xlab        = 't', ylab = 'f(t)',
>>      font.lab    = 2, font = 2)
>>
>>x = t
>>curve(dt(x, df = n-1), add = T, col = "red", lwd = 2)
>>
>>Many thanks for any help,
>>___
>>Jose Claudio Faria
>>Brasil/Bahia/Ilheus/UESC/DCET
>>Estatística Experimental/Prof. Adjunto
>>mails: joseclaudio.faria at terra.com.br
>>        jc_faria at uesc.br
>>        jc_faria at uol.com.br
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide 
>>http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
> 
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list