[R] Histograms, density, and relative frequencies

Bret Collier bacolli at uark.edu
Wed Jul 7 19:29:40 CEST 2004

I have been using R for about a year, and I have run across a couple of
graphics problems that I am not quite sure how to address.  I have read
the email threads on the R list regarding the differences between density
and relative frequencies (count/sum(count)), and I am hoping that someone
could provide me with some advice or comments on my approach.  I will
admit that some of the underlying mathematics of the density discussion is
beyond my current understanding, but I am looking into it.

I have a data set (600,000 obs) used to parameterize a probabilistic causal
model, where each obs is a population response for one of 2 classes (either
regs1 or regs2).  I have been attempting to create 1 marginal probability
plot with 2 lines (one for each class).  Using my rather rough code, I
created a plot that seems to follow the commonly used (although, from what
I understand, incorrect) relative frequency histogram approach.

My rough code looks like this:

# bin edges; the last bin (0.35-1) is much wider than the others
bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
par(mfrow = c(1, 1))
# bin each class without drawing the histogram
fawn1 <- hist(MFAWNRESID[regs1], plot = FALSE, breaks = bk)
fawn2 <- hist(MFAWNRESID[regs2], plot = FALSE, breaks = bk)
# relative frequencies: count / sum(count)
count1 <- fawn1$counts / sum(fawn1$counts)
count2 <- fawn2$counts / sum(fawn2$counts)
# left edge of each bin, used as the x position of each point
b <- c(0, .05, .1, .15, .2, .25, .3, .35)
plot(count1 ~ b, xaxt = "n", xlim = c(0, .5), ylim = c(0, .40), pch = ".", bty = "l")
lines(spline(count1 ~ b), lty = 1, lwd = 2, col = "black")
lines(spline(count2 ~ b), lty = 2, lwd = 2, col = "black")
axis(side = 1, at = b)

Using the above, I get relative frequency values for regs1 that look like
this (they match the output of my probabilistic model):
 > count1
[1] 1.213378e-01 3.454324e-01 3.365343e-01 1.580839e-01 3.342101e-02
[6] 4.698426e-03 4.488942e-04 4.322685e-05
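
For reference, the relation between the relative frequencies above and the density reported by hist() can be checked directly. The following is a minimal sketch, not part of the original code: the vector x is simulated with rbeta() as a hypothetical stand-in for MFAWNRESID[regs1], which is not available here. The density is the relative frequency divided by the bin width, so the two differ most for the wide 0.35-1 bin.

```r
# Simulated stand-in for MFAWNRESID[regs1] (assumption: values lie in [0, 1])
set.seed(1)
x <- rbeta(10000, 2, 15)

bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h  <- hist(x, breaks = bk, plot = FALSE)

relfreq <- h$counts / sum(h$counts)  # proportion of observations per bin
width   <- diff(h$breaks)            # bin widths; the last bin is 0.65 wide

# hist() reports density as relative frequency divided by bin width,
# so the bar areas sum to 1, while the relative frequencies sum to 1 directly.
all.equal(h$density, relfreq / width)
sum(relfreq)
sum(h$density * width)
```

With equal-width bins the two scales differ only by a constant factor; with unequal bins, as here, the shapes differ as well.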

First, the first element of count1 is the relative frequency for the range
0-0.05, but when plotted it appears at b=0, so the point does not really
represent the range.  Are there any suggestions on how to approach this?
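
One possible approach, sketched here under assumptions, is to plot each relative frequency at the midpoint of its bin, which hist() returns in the mids component. The data x1 and x2 are simulated with rbeta() as hypothetical stand-ins for MFAWNRESID[regs1] and MFAWNRESID[regs2]; the choice to drop the wide 0.35-1 bin (midpoint 0.675) is also an assumption, made only to keep the plot readable.

```r
# Simulated stand-ins for the two classes (assumption: values lie in [0, 1])
set.seed(1)
x1 <- rbeta(10000, 2, 15)
x2 <- rbeta(10000, 3, 15)

bk <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h1 <- hist(x1, breaks = bk, plot = FALSE)
h2 <- hist(x2, breaks = bk, plot = FALSE)

rel1 <- h1$counts / sum(h1$counts)
rel2 <- h2$counts / sum(h2$counts)

# Keep only the equal-width bins; the 0.35-1 bin has its midpoint at 0.675
keep <- h1$mids < 0.35
mids <- h1$mids[keep]

# Each point now sits at the centre of the range it summarises
plot(mids, rel1[keep], type = "b", lty = 1, bty = "l",
     ylim = c(0, max(rel1, rel2)),
     xlab = "MFAWNRESID", ylab = "Relative frequency")
lines(mids, rel2[keep], type = "b", lty = 2)
```

An alternative that shows the whole range rather than a single point is a step plot, for example with type = "s" against the left bin edges.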

Next:  Using the above code, the x-axis values end at 0.35, but the axis
continues (because bk ends at 1)?  While there is a chance of occurrence
out past .35, it is low, and I want to extend the lines to about .35 and
clip the x-axis there.  However, I have been unable to figure out how to
clip it.  Could someone point me in the correct direction?
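
In the code above, the horizontal extent of the plot is set by xlim=c(0, .5) in the plot() call (together with the bty="l" frame), not by bk. A minimal sketch of one way to stop the plot near 0.35 is to set xlim to end there; anything drawn outside the plot region is clipped by default. The data x are simulated as a hypothetical stand-in for MFAWNRESID[regs1].

```r
# Simulated stand-in for MFAWNRESID[regs1] (assumption: values lie in [0, 1])
set.seed(1)
x <- rbeta(10000, 2, 15)

bk  <- c(0, .05, .1, .15, .2, .25, .3, .35, 1)
h   <- hist(x, breaks = bk, plot = FALSE)
rel <- h$counts / sum(h$counts)
b   <- bk[-length(bk)]   # left bin edges, 0 to 0.35, as in the original code

# xlim ends at 0.35, so the plot region (and the "l" frame) stops there
plot(b, rel, type = "n", xaxt = "n", bty = "l",
     xlim = c(0, 0.35), ylim = c(0, 0.40),
     xlab = "MFAWNRESID", ylab = "Relative frequency")
lines(spline(b, rel), lty = 1, lwd = 2, col = "black")
axis(side = 1, at = b)
```

By default R pads the axis range by 4% at each end; adding xaxs = "i" to the plot() call would remove that padding so the frame ends exactly at 0.35.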


Bret A. Collier
Arkansas Cooperative Fish and Wildlife Research Unit
Department of Biological Sciences University of Arkansas
