[R] package mgcv - predict with bam: Error in X[ind, ] : subscript out of bounds
may.katharina at googlemail.com
Mon Feb 3 12:59:57 CET 2014
many thanks for looking into this and making me understand the problem!
I'll adjust my factor levels right away...
On 3 February 2014 12:42, Simon Wood <s.wood at bath.ac.uk> wrote:
> Hi Katharina,
> Thanks for sending this.
> The problem is that the prediction data for site contain levels not
> available in the (useable non-NA) fit data...
>  "KRB" "NP.FOR" "WKS.FRE" "WKS.KRE" "WKS.RIE" "WKS.WUE"
>  "KRB" "NP.FOR" "RIE.2" "WKS.BBR" "WKS.FRE" "WKS.HOE" "WKS.KRE"
>  "WKS.RIE" "WKS.WUE"
> predict.lm has a check for this, and so fails with a rather more informative
> error message. e.g.
> m0 <- lm(sensor1 ~ sensor2 + site + site:NthSampling,
> ... factor site has new levels RIE.2, WKS.BBR
> I'll add a better check to predict.gam.
> ps. if you want predictions with the random effects for site set to zero
> then one trick is to use terms like s(site,bs="re",by=dum) in fitting with
> dum set to 1. Then in prediction you can set 'site' to any existing level,
> and dum to zero, in order to get a prediction for the missing level, with
> the 'site' effect set to zero.
> On 02/02/14 17:52, Katharina May wrote:
>> Hi Simon,
>> thank you for your reply, I really appreciate any help to understand
>> the problem here...
>> Unluckily the package upgrade didn't help with this issue.
>> An example reproducing the error, and a current sessionInfo() Output
>> can be found below.
>> Many thanks once again,
>> R Code Example
>> #retrieve xylemRohWeekXnn2011 test data frame
>> eval( expr = parse( text =
>> xylemRohWeekXnn.fit.bam <- bam(sensor1 ~ sensor2 + s(site, bs="re")
>> + s(site, NthSampling, bs="re") , data=xylemRohWeekXnn2011,
>> #subset data containing gaps for predicting
>> gapData <- xylemRohWeekXnn2011[is.na(xylemRohWeekXnn2011[,2]) &
>> !is.na(xylemRohWeekXnn2011[,11]),c(2:3,6:7, 11)]
>> xylemRohWeekXnnSite.fit <-
>> predict.gam(xylemRohWeekXnn.fit.bam,gapData, type="response", se=F)
>> My current Session Information (sessionInfo() Output - also confirming
>> that the problem exists on both Windows and Mac OS X):
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>  en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>> attached base packages:
>>  stats graphics grDevices utils datasets methods base
>> other attached packages:
>>  mgcv_1.7-28 nlme_3.1-113 RCurl_1.95-4.1 bitops_1.0-6
>> loaded via a namespace (and not attached):
>>  grid_3.0.2 lattice_0.20-24 Matrix_1.1-2 tools_3.0.2
>> On 31/01/14 12:57, Simon Wood wrote:
>>> Hi Katharina,
>>> Could you try upgrading to mgcv_1.7-28, please? There was an occasional
>>> problem to do with matching factor levels, which is fixed, but I'm not
>>> very confident that is what is going on.
>>> If upgrading doesn't work, is there any chance you could send me a small
>>> example dataset and code that produces the error, and I'll look at it?
>>> Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
>>> +44 (0)1225 386603 http://people.bath.ac.uk/sw283
> Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
> +44 (0)1225 386603 http://people.bath.ac.uk/sw283
BSc. Forest Science and Resource Management
IT Specialist (CCI)
More information about the R-help