[R] Incidence Function Model in R help

David Winsemius dwinsemius at comcast.net
Mon Jul 6 16:54:08 CEST 2009



It appears to me that you really have no idea what the original author  
was doing. I honestly don't either. I never used the sweep operator  
and I don't know what his "S"'s represent. I do understand what  
offsets represent in the context of glm(), and this set of operations  
does not fit my understanding. It makes little statistical sense that  
an offset would be something that was constructed through a search for  
an optimum parameter driven by data. Maybe someone on the list can  
offer me a counter-example.

Furthermore the range of A in your data seems extreme and the degree  
of variablility remains low in terms of your "p" variable . To this  
point your problems have been of a mathematical nature and are  
amenable to critique purely on that basis, but it appears that you  
need more fundamental statistical support and you should either seek a  
real statistical consult or contact Dr Oksanen for guidance.

Attempting further debugging, I was surprised that I actually had a  
"d" object lying around in my workspace and realized it was from your  
question yesterday. It should not have existed from the code in you  
most recent posting since all those operations were inside a  
function.  Messing around with the earlier  code and setting alpha=0.1  
in tha code  and recalculating S with the new  d matrix, I got a non- 
zero S vector. Staring from that point I still got the error you got.  
When I limited the range for the "alphascan" to( 0.1, 10) I got an  
error, but when I limited it on the other end to (0.01, 0.1) I got  
these results:

 > (sol<-optimize(alphascan,c(0.1,10),d=d,A=A,p=p))
Error: NA/NaN/Inf in foreign function call (arg 4)
 >
 > (sol<-optimize(alphascan,c(0.01,.1),d=d,A=A,p=p))
$minimum
[1] 0.0312918

$objective
[1] 144.1746

As suggested above this could be mathematical gibberish. I cannot  
recommend thinking of this particular alpha as scientifically  
meaningful without careful review by someone who can apply a critical  
view using  domain knowledge to this outcome.

-- 
David Winsemius

On Jul 5, 2009, at 5:15 PM, sjkimble wrote:

>
> R Help:
>
>>
>> When I look at the object S, I see about half of them are
>> zeroes.
>
> I've address the object S being zero so often by changing the value of
> alpha, which I'm allowed to do according to the author. All values  
> of S are
> non zero and should not give Inf.
>
>> The "expectation" of a binomial model is for the LHS of the
>> formulat to be either a 1/0 vector or something of the form
>> cbind(events, non_events). You have satisfied that expectation with p
>> but there are only 2 such cases. It seems unreasonable to my thinking
>> to expect that a logistic regression model can deliver sensible  
>> output
>> from only 2 events. And that holds doubly (or perhaps infinitely?)
>> true when you are starting out with half of your covariates equal to
>> log(0) = -Inf.
>
> Object p is presence (1) or absence (0) of a species on the 12  
> islands, and
> some of them are rare so they are mostly zeros. Nevertheless I tried  
> with a
> more common species whose data are:
>
>    x.crd   y.crd           A p
> 1  361763 1034071      94.169 1
> 2  370325 1027277     127.642 1
> 3  370416 1027166     127.961 1
> 4  370471 1027148    1804.846 1
> 5  369050 1031312    1790.493 1
> 6  370908 1026354     199.103 1
> 7  361562 1034311    2047.637 1
> 8  365437 1022188 1622678.961 1
> 9  347047 1025334      21.169 1
> 10 349186 1024441  408556.801 1
> 11 361762 1034052       3.414 0
> 12 370799 1026557     103.994 0
>
> The code, plus error message:
>
>> haliclona_reniera_manglaris<-read.table(file.choose(),header=T)
>> attach(haliclona_reniera_manglaris)
>> alphascan<-function(alpha,d,A,p){
> + edis<-as.matrix(exp(-alpha*d))
> + edis<-sweep(edis,2,A,"*")
> + S<-rowSums(edis[,p>0])
> + mod<-glm(p~offset(2*log(S))+log(A),family=binomial)
> + deviance(mod)
> + }
>> (sol<-optimize(alphascan,c(0.001,10),d=d,A=A,p=p))
> Error: NA/NaN/Inf in foreign function call (arg 4)
>
>
> Details:
> Reference: Incidence Function Model in R, J Oksanen, 2004, available
> at:
> cc.oulu.fi/~jarioksa/opetus/openmeta/metafit.pdf
> OS: Mac OS 10.5.7
> R version: 2.9.0
> GUI: 1.28 Tiger build 32-bit
>
> Thanks,
> Steve Kimble
> Department of Forestry and Natural Resources
> Purdue UniversityWest Lafayette, IN 47907
>
> -- 
> View this message in context: http://www.nabble.com/Incidence-Function-Model-in-R-help-tp24138251p24346992.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Jul 6, 2009, at 9:52 AM, David Winsemius wrote:

>
> On Jul 5, 2009, at 5:15 PM, sjkimble wrote:
>
>>
>> R Help:
>>
>>>
>>> When I look at the object S, I see about half of them are
>>> zeroes.
>>
>> I've address the object S being zero so often by changing the value  
>> of
>> alpha, which I'm allowed to do according to the author. All values  
>> of S are
>> non zero and should not give Inf.
>>
>>> The "expectation" of a binomial model is for the LHS of the
>>> formulat to be either a 1/0 vector or something of the form
>>> cbind(events, non_events). You have satisfied that expectation  
>>> with p
>>> but there are only 2 such cases. It seems unreasonable to my  
>>> thinking
>>> to expect that a logistic regression model can deliver sensible  
>>> output
>>> from only 2 events. And that holds doubly (or perhaps infinitely?)
>>> true when you are starting out with half of your covariates equal to
>>> log(0) = -Inf.
>>
>> Object p is presence (1) or absence (0) of a species on the 12  
>> islands, and
>> some of them are rare so they are mostly zeros. Nevertheless I  
>> tried with a
>> more common species whose data are:
>>
>>   x.crd   y.crd           A p
>> 1  361763 1034071      94.169 1
>> 2  370325 1027277     127.642 1
>> 3  370416 1027166     127.961 1
>> 4  370471 1027148    1804.846 1
>> 5  369050 1031312    1790.493 1
>> 6  370908 1026354     199.103 1
>> 7  361562 1034311    2047.637 1
>> 8  365437 1022188 1622678.961 1
>> 9  347047 1025334      21.169 1
>> 10 349186 1024441  408556.801 1
>> 11 361762 1034052       3.414 0
>> 12 370799 1026557     103.994 0
>
> So now you have gone from a situation with only two events to one  
> with only two non-events. That is not a situation that is really any  
> better from a statistical point of view.
>>
>> The code, plus error message:
>>
>>> haliclona_reniera_manglaris<-read.table(file.choose(),header=T)
>>> attach(haliclona_reniera_manglaris)
>>> alphascan<-function(alpha,d,A,p){
>> + edis<-as.matrix(exp(-alpha*d))
>> + edis<-sweep(edis,2,A,"*")
>> + S<-rowSums(edis[,p>0])
>
> Hey,....  you told me you did <something> to get rid of the zeros in  
> S. When I take this code out of the functional wrapper I get:
>
> > S
>            1             2             3             4              
> 5             6             7             8
> 6.965905e-07  5.139157e-07  8.325874e-60  1.318603e-22   
> 1.329174e-22  0.000000e+00  1.390849e-94  1.755094e-97
>            9            10            11            12
> 1.411310e-134  0.000000e+00  0.000000e+00  0.000000e+00
>
> So now instead of 3/4 of them being zero only 1/3 of them are still  
> zero. That does not seem to be satisfactory.
>
>> + mod<-glm(p~offset(2*log(S))+log(A),family=binomial)
>> + deviance(mod)
>> + }
>>> (sol<-optimize(alphascan,c(0.001,10),d=d,A=A,p=p))
>
>
> So maybe the problem remain the same as before and supplying the  
> missing d will not be enough. You still (I think) need a dataset  
> that supplies enough information for a statistical approach.
>
>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list