[R] Statistically detecting thresholds...

Tue Jun 16 23:23:31 CEST 2009

R-ers:

I have some ecological data (stream velocity vs. % cover of submerged 
weeds) that shows strong evidence of a threshold (step-function) 
response: below some velocity, % cover ranges from 0% to 100% (with no 
apparent relationship to velocity within this range), but above a 
certain "threshold" velocity, % cover does not appear to exceed, say, 
10%.  There are good mechanistic reasons for believing there is a step 
function there, so I'm trying to determine where along the velocity 
axis this threshold sits and how significant it is.  I came up with the 
following approach, but I was hoping to find out if there is a more 
"standardized" way of doing this:

1) For each candidate threshold, in small velocity steps from 0 to the 
max velocity, classify all samples (% cover vs. velocity) into "above 
threshold" and "below threshold".
2) Perform a t-test on the % cover of these two groups and store the 
p-value in an array, building up a table of p-value vs. candidate 
velocity threshold.
3) Find the minimum p-value from the previous step; the velocity at 
which it occurs should be the threshold, and the p-value there is the 
significance of the threshold.
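
A minimal sketch of the three steps above, in Python with synthetic 
data only. Everything named here is hypothetical and chosen for 
illustration: the functions welch_t and scan_thresholds, the parameters 
n_steps and min_group, and the simulated step at a velocity of 0.8. To 
stay dependency-free, the sketch ranks candidate thresholds by the 
absolute Welch t statistic instead of the p-value itself; with a 
statistics library the p-value would come from the corresponding t-test 
(for example scipy.stats.ttest_ind). It makes no adjustment for having 
scanned many candidate thresholds.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def scan_thresholds(velocity, cover, n_steps=50, min_group=3):
    """Return (threshold, |t|) pairs for evenly spaced candidate thresholds.

    n_steps and min_group are illustrative assumptions, not values
    taken from the original post.
    """
    v_max = max(velocity)
    results = []
    for i in range(1, n_steps):
        thr = v_max * i / n_steps
        # Step 1: split % cover into "below" and "above" the candidate.
        below = [c for v, c in zip(velocity, cover) if v <= thr]
        above = [c for v, c in zip(velocity, cover) if v > thr]
        # Skip candidates that leave too few samples for a variance estimate.
        if len(below) < min_group or len(above) < min_group:
            continue
        # Step 2: compare the two groups and store the result.
        results.append((thr, abs(welch_t(below, above))))
    return results


# Synthetic data (assumption): cover is unconstrained below a velocity
# of 0.8 and capped at 10% above it.
random.seed(1)
velocity = [random.uniform(0.0, 2.0) for _ in range(200)]
cover = [random.uniform(0, 100) if v < 0.8 else random.uniform(0, 10)
         for v in velocity]

# Step 3: the candidate with the largest |t| stands in for the one
# with the smallest p-value.
results = scan_thresholds(velocity, cover)
best_threshold, best_t = max(results, key=lambda r: r[1])
print(f"estimated threshold: {best_threshold:.2f} (|t| = {best_t:.1f})")
```

The largest |t| is only a proxy for the smallest p-value, because the 
degrees of freedom change as the group sizes change from one candidate 
to the next.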

Does this make sense?  Is there a better way of doing this?  When I ran 
this on the data, the p-value vs. threshold relationship looks nice and 
nearly parabolic around what appears to be the threshold.

--j

-- 

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307