[R] QQplots of probability vector data

bioinformatics_guy wwwhitener at gmail.com
Mon Jun 22 13:04:08 CEST 2009


I'm trying to determine from a QQ plot whether a set of data is normally
distributed, but I seem to be having a bit of difficulty.

I have a file of the following form

      9 36
      3 37
      6 38
      7 39
      .....

where the left column is the frequency of the number in the right column.
I've found the probability of each number and put them in a file of the form

     36 .0009
     37 .0003
     38 .0006
     39 .0007

where the first column is the number and the second is the probability of
that number.

Now what I've done so far is as follows: 

# probfile.txt is the 2nd file, where $V1 is the number and $V2 is the probability
null=read.table("probfile.txt",header=FALSE)

and I want to do 
x=qnorm(null$V2, ... ) but I don't know how to get the mean and sd from
those 2 files.  When I looked up mean() and sd(), I found that both need a
vector of numbers.  As I only have the number of occurrences of each value and
its probability, I can't really get a vector of this data unless I make
another file that has every number written out as many times as it occurs,
which is impractical given I have 10000000 data points.

I mean I could write a quick program to get the mean (which I did in perl)
but I'd rather not do that for the sd as I am sure there is an easier way to
do this.
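
For reference, a minimal sketch of how the mean and sd could be computed
directly from the frequency table without writing out every data point. The
data frame `freq` is a hypothetical stand-in for the first file (count in V1,
number in V2), using only the four rows shown above; the n - 1 denominator is
an assumption chosen to match what sd() returns on the expanded vector.

```r
# Hypothetical stand-in for the frequency file: V1 = count, V2 = number.
freq <- data.frame(V1 = c(9, 3, 6, 7), V2 = c(36, 37, 38, 39))

n  <- sum(freq$V1)                    # total number of observations
mu <- sum(freq$V1 * freq$V2) / n      # frequency-weighted mean

# Sample standard deviation (n - 1 denominator), as sd() would compute it
# on the fully written-out data.
s <- sqrt(sum(freq$V1 * (freq$V2 - mu)^2) / (n - 1))

# Sanity check against the expanded vector; only feasible for small data.
expanded <- rep(freq$V2, times = freq$V1)
stopifnot(isTRUE(all.equal(mu, mean(expanded))))
stopifnot(isTRUE(all.equal(s, sd(expanded))))
```

With the real file, `freq` would presumably come from read.table() on the
frequency file instead of being typed in.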

Either way, once I have the qnorm() result stored in x, all I have to do is
call qqnorm(x), right?
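
As a point of comparison, here is a rough sketch of one way a normal QQ plot
might be drawn straight from a frequency table. It is not necessarily what
qqnorm() does internally. It uses cumulative proportions rather than the
per-value probabilities, because qnorm() expects cumulative (lower-tail)
probabilities. The data frame `freq` and the midpoint plotting positions are
assumptions made for illustration.

```r
# Hypothetical stand-in for the frequency file: V1 = count, V2 = number.
freq <- data.frame(V1 = c(9, 3, 6, 7), V2 = c(36, 37, 38, 39))
freq <- freq[order(freq$V2), ]        # quantiles need the values sorted

n <- sum(freq$V1)                     # total number of observations

# One plotting position per distinct value: the midpoint of the range of
# cumulative proportions that the value occupies (strictly between 0 and 1).
p <- (cumsum(freq$V1) - freq$V1 / 2) / n

theo <- qnorm(p)                      # standard normal quantiles

# Frequency-weighted mean and sd, used only for the reference line.
mu <- sum(freq$V1 * freq$V2) / n
s  <- sqrt(sum(freq$V1 * (freq$V2 - mu)^2) / (n - 1))

plot(theo, freq$V2,
     xlab = "Theoretical quantiles", ylab = "Observed values")
abline(a = mu, b = s)                 # normal data should lie near this line
```

Each distinct value appears once in the plot, so the result is a condensed
version of what a QQ plot of all the individual data points would show.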
  



