[R] problems with errors in randomization tests

CR Bleay, School Biological Sciences Colin.Bleay at bristol.ac.uk
Fri Jul 2 18:53:11 CEST 2004

I have been having problems with a randomization test.

Essentially, the goal is to take an original dataset and create a new 
dataset with a pre-specified number of data points removed at random. 
A glm.nb model is then fitted to the new dataset, and the coefficients and 
the statistics from an anova table of the model are stored in a number of 
arrays. This process is repeated a number of times (say 1000) so that I 
can compute descriptive statistics and so look at the power of the original 
model as a function of sample size.
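
A minimal sketch of the procedure described above, for illustration only. The simulated data frame `full`, the formula `y ~ x`, and the names `n_remove`, `repetitions` and `coefs` are assumptions standing in for the real dataset and model, which are not shown in this post. Only coefficients are stored here; the anova statistics would be stored the same way.

```r
library(MASS)  # provides glm.nb

set.seed(1)
# Simulated stand-in for the original dataset (assumption: one count
# response y and one predictor x)
n <- 200
full <- data.frame(x = runif(n))
full$y <- rnbinom(n, size = 2, mu = exp(0.5 + 1.0 * full$x))

n_remove    <- 20   # data points dropped per replicate
repetitions <- 25   # kept small here; the post mentions about 1000

# One row per replicate, one column per coefficient
coefs <- matrix(NA_real_, nrow = repetitions, ncol = 2,
                dimnames = list(NULL, c("(Intercept)", "x")))

for (i in seq_len(repetitions)) {
  # sample() draws without replacement by default, so indices are unique
  drop    <- sample(nrow(full), n_remove)
  reduced <- full[-drop, ]
  fit     <- glm.nb(y ~ x, data = reduced)
  coefs[i, ] <- coef(fit)
}

# Descriptive statistics over the replicates
colMeans(coefs)
```

Preallocating `coefs` with NA means a replicate that is never filled in stays visibly missing rather than silently shifting later results.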

The section of code I am having problems with is:

while (countn<repetitions) {
    shit<-unique(sample(x, no)) # randomly selects the data points to be removed
    density.random2<-density.random1[-shit,] # creates new dataset
    ), data=density.random2, na.action=na.omit, control = glm.control(maxit=100)) # performs model
    random.anova<-anova.glm(random.model, test="Chisq")

The problem that I am having is that every so often a dataset will be 
created that generates the following error, which stops the function at 
the glm.nb call:

Error: NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message: Step size truncated due to divergence

I have a number of questions about this.

1/ How can I prevent it from exiting the function? I have tried "try", and 
this does not resolve the issue: if I place it at the glm.nb call, it 
results in an error:

Step size truncated due to divergence 
Error in "[<-"(`*tmp*`, countn, 
value = random.coefficients[1]) : 
	incompatible types

Is it possible to create an "if" step, i.e. if there is an error, ignore it 
and don't perform the assignment of data to the arrays, else continue?
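
One possible shape for such an "if" step, as a sketch rather than a definitive answer. When the wrapped call fails, `try` returns an object of class "try-error" instead of a fitted model, so its result has to be checked (for example with `inherits(res, "try-error")`) before anything is extracted from it. `tryCatch` allows the same check by returning a sentinel such as NULL. Everything below is an assumption made for illustration: the simulated data, the hypothetical function `flaky_fit` (which fails on purpose for some datasets and uses a Poisson glm in place of glm.nb), and the names `coefs` and `n_failed`.

```r
set.seed(42)
# Simulated stand-in for the original dataset (assumption)
full <- data.frame(x = runif(100))
full$y <- rpois(100, lambda = exp(0.5 + full$x))

# Hypothetical fitting function that signals an error for some datasets,
# mimicking the occasional glm.nb failure
flaky_fit <- function(d) {
  if (sum(d$y) %% 4 == 0) stop("simulated fitting failure")
  glm(y ~ x, family = poisson, data = d)
}

repetitions <- 40
coefs    <- matrix(NA_real_, nrow = repetitions, ncol = 2)
n_failed <- 0

for (i in seq_len(repetitions)) {
  reduced <- full[-sample(nrow(full), 10), ]
  # On error, tryCatch returns NULL instead of stopping the loop
  est <- tryCatch(coef(flaky_fit(reduced)), error = function(e) NULL)
  if (is.null(est)) {
    n_failed <- n_failed + 1   # skip the assignment; row i stays NA
  } else {
    coefs[i, ] <- est
  }
}

n_failed  # number of replicates whose fit failed
```

Keeping a count of the failed replicates, rather than discarding them silently, makes it possible to report how many datasets were excluded from the summary.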

2/ Given that a dataset that generates this error is still a valid 
dataset, what should I do about the coefficients etc. that are generated? 
Ignoring those datasets would introduce selection into my results.

3/ What is the actual cause of the error in the first place, with respect 
to the data and the model?

Any assistance would be very much appreciated.

I have searched through the archives and could not find a solution. I have 
to admit that I do not adequately understand error capture and handling in 
R, and have been unable to find any documentation that gives a good 
explanation of it.



Dr Colin Bleay
Dept. Biological Sciences,
University of Bristol,
Woodlands rd.,
BS8 1UG.

Tel: 44 (0)117 928 7470
Fax: 44 (0)117
