[R] Sample Poisson Distribution

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Wed Feb 7 19:51:55 CET 2007


On 07-Feb-07 Thor wrote:
> Hi,
>  I'm completely new to R, I am all at sea with the interface
> and the confusing help files, so would appreciate some help
> to do a simple task.
> 
> Need to present the mean and variance of 100 different samples
> of poisson distributions (N=1000, with fixed lambda) in a file
> in two columns, and then produce histograms.
> 
> So far I have figured out:
> 
>> N <- 1000
>>  x <- rpois(N, 3,1) ,

Comment: The Poisson distribution has only one parameter, lambda,
so it should be rpois(N, lambda), e.g. rpois(N, 3). You will get
an error with your second parameter "1".

> and 
>> var(x) 
> and 
>> mean(x)
> , and I've seen the hist command, just need to tie it all together.
> I read that loops aren't really used in R, so what do i need to do?

Since you're completely new, there are features of how R handles
things in its data structures which are very useful for this kind
of thing.

In this case, the trick is that if you construct a matrix out of
a single vector with many elements in it, R will fill in the
columns from the vector working down each column and then from
left to right. For example:

> matrix(c(1,2,3,4,5,6),ncol=2)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

So you can get all 100 samples into 100 columns of a matrix A
with N rows as

N<-1000; Nsamp<-100
A <- matrix(rpois(N*Nsamp, 3),ncol=Nsamp)

See ?matrix for a summary of the above.
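
If you want to convince yourself that the columns really are filled
one after another, here is a minimal sketch. The set.seed() call and
the intermediate vector v are my own additions for illustration; they
are not needed in the real calculation.

```r
# Sketch: check that matrix() fills the columns one after another.
set.seed(1)                      # arbitrary seed, only for reproducibility
N <- 1000; Nsamp <- 100
v <- rpois(N * Nsamp, 3)         # one long vector of N*Nsamp Poisson draws
A <- matrix(v, ncol = Nsamp)     # reshaped into N rows and Nsamp columns
dim(A)                           # 1000 rows, 100 columns
identical(A[, 1], v[1:N])               # TRUE: column 1 is the first N draws
identical(A[, 2], v[(N + 1):(2 * N)])   # TRUE: column 2 is the next N draws
```

So each column of A is a separate sample of size N.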

Then (though here it's not quite clear what you really want) you
can put the mean of each of the 100 columns into one column of
your results, and the variance of each column into the next column
of results, obtaining a matrix with 100 rows and 2 columns.

To do that you first need the mean and variance of each column of A.
If you just try mean(A) you will get one number, because R will
simply calculate the mean of all the numbers in A. The function
to use here is apply():

means<-apply(A,2,mean)
vars <-apply(A,2,var)

since this works along the "2nd dimension" of A (i.e. the columns)
and calculates the mean for each one, and the variance for each one.
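
As a rough sketch of the difference between mean(A) and apply(), you
could try the following. The seed is arbitrary, and colMeans() is an
alternative I am adding here for comparison; it is a base R shortcut
for column means (there is no corresponding base shortcut for the
variances, so apply() is still needed for those).

```r
# Sketch: overall mean versus per-column summaries.
set.seed(1)                          # arbitrary seed, for reproducibility
A <- matrix(rpois(1000 * 100, 3), ncol = 100)
length(mean(A))                      # 1: a single mean over all of A
means <- apply(A, 2, mean)           # 2 = work over columns
vars  <- apply(A, 2, var)
length(means)                        # 100: one mean per column
length(vars)                         # 100: one variance per column
all.equal(means, colMeans(A))        # TRUE: same values as colMeans()
# apply(A, 1, mean) would instead give the 1000 row means
```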

You can tie it all together in one operation by using cbind(),
which assembles a collection of vectors (all the same length)
into columns side by side and makes a matrix of them:

Result <- cbind(means, vars)

or, without the intermediate calculation,

Result <- cbind(apply(A,2,mean), apply(A,2,var))

(However, it will be useful later to have the separate intermediate
results).
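
Since you also asked for the two columns in a file, here is a minimal
sketch using write.table(). The file name poisson_summary.txt and the
use of tempdir() are only assumptions for illustration; substitute
whatever path you actually want.

```r
# Sketch: write the 100 x 2 matrix of means and variances to a text file.
set.seed(1)                          # arbitrary seed, for reproducibility
A <- matrix(rpois(1000 * 100, 3), ncol = 100)
Result <- cbind(means = apply(A, 2, mean), vars = apply(A, 2, var))
# Hypothetical output location, chosen only for this example
outfile <- file.path(tempdir(), "poisson_summary.txt")
write.table(Result, file = outfile, row.names = FALSE, quote = FALSE)
# Read it back to check: 100 rows, columns "means" and "vars"
check <- read.table(outfile, header = TRUE)
dim(check)
```

See ?write.table for the options controlling separators and headers.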

At this stage I'm really not sure what exactly you want, since you
don't say what you want the histograms of. But I'm going to guess
that you want the histograms of the 100 means, and the 100 variances.

You can do this either with

hist(means)
hist(vars)

or equivalently with

hist(Result[,1])
hist(Result[,2])
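
If you would like both histograms on one page, something along these
lines should work. This is only a sketch: the titles, axis labels and
seed are my own choices. Since the mean and the variance of a Poisson
distribution both equal lambda, both histograms should be centred
near 3 here.

```r
# Sketch: the two histograms side by side in one plotting window.
set.seed(1)                          # arbitrary seed, for reproducibility
A <- matrix(rpois(1000 * 100, 3), ncol = 100)
means <- apply(A, 2, mean)
vars  <- apply(A, 2, var)
op <- par(mfrow = c(1, 2))           # 1 row, 2 columns of plots
hist(means, main = "Sample means", xlab = "mean")
hist(vars, main = "Sample variances", xlab = "variance")
par(op)                              # restore the previous layout
```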

In R there are many possibilities for neat manoeuvres of this kind,
and I tend to agree that they are not always easily found by people
new to R. It's well worth reading the introductory documentation
for R, under "Documentation" on the CRAN website, especially
"An Introduction to R" and (under "Contributed Documentation")
"Using R for Data Analysis and Graphics - Introduction, Examples and
Commentary", "Simple R", "Practical Regression and Anova using R"
and "R for Beginners". You will find several examples of data
manipulation techniques in these. Once you get used to R you will
be using them all the time.

Best wishes, and good luck with R!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Feb-07                                       Time: 18:51:52
------------------------------ XFMail ------------------------------


