[R] Problems with "predict" function

Simmering, Jacob E jacob-simmering at uiowa.edu
Wed Jan 31 18:29:35 CET 2018


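Your messages about masking come from attaching your data frames to the R session. attach() puts the columns on the search path, where they collide with objects of the same name in your workspace (such as “sales”) or in a previously attached data frame. In general, that is bad practice, because it becomes unclear which object a name refers to. It is typically better to pass the data frame through the “data” argument of functions like lm().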
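To make the masking concrete, here is a minimal sketch with made-up numbers (the values are hypothetical and are not taken from your simulation). It assumes a vector called sales already exists in the workspace when a data frame with a column of the same name is attached:

```r
# Hypothetical toy data: 'sales' exists both in the workspace and in the data frame.
sales  <- c(100, 200, 300)                                 # object in the global environment
highdf <- data.frame(month = 1:3, sales = c(10, 20, 30))

attach(highdf)     # reports that 'sales' is masked _by_ .GlobalEnv
masked <- sales    # resolves to the workspace vector, not to highdf$sales
detach(highdf)

# Passing the data frame explicitly removes the ambiguity:
# lm() looks up 'sales' and 'month' in highdf first.
fit  <- lm(sales ~ month, data = highdf)
used <- model.frame(fit)$sales   # the response values the model actually used
```

Here masked holds the workspace values while used holds the column from highdf, which is the kind of mix-up the messages are warning about.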

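As near as I can tell, your second set of predictions is not working because your call to lm() references the columns of highdf directly with bracket notation. The fitted model then has a predictor named “highdf[, 1]” rather than “month”, so predict() cannot find that variable in news and falls back to the original rows, giving one prediction per row of highdf. If you do this instead: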

h.lm <- lm(sales ~ month, data = highdf)
news <- data.frame(month = nrow(ptr) + 1)
hcs <- predict(h.lm, news, interval = "predict")

You should see the expected results. Note that here I’m directly referring to the variables “sales” and “month” and not using the bracket notation. 
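If it helps, the difference between the two formulas can be checked side by side. This is only a sketch on simulated data (the seed, slope and noise level are arbitrary assumptions, and bad.lm, bad_terms and bad_pred are names I made up for illustration):

```r
set.seed(1)  # arbitrary seed so the toy data are reproducible
highdf <- data.frame(month = 1:13,
                     sales = round(500 * (1:13) + rnorm(13, 0, 200)))
news <- data.frame(month = nrow(highdf) + 1)

# Bracket notation: the predictor term is literally named "highdf[, 1]",
# so the 'month' column in news is never used.
bad.lm    <- lm(highdf[, 2] ~ highdf[, 1])
bad_terms <- attr(terms(bad.lm), "term.labels")
bad_pred  <- suppressWarnings(predict(bad.lm, news, interval = "prediction"))

# Named variables plus data=: the term is "month", which news supplies.
h.lm <- lm(sales ~ month, data = highdf)
hcs  <- predict(h.lm, news, interval = "prediction")

nrow(bad_pred)  # one row per original observation
nrow(hcs)       # a single row for the new month
```

Without suppressWarnings(), the first predict() call also warns that newdata had 1 row but the variables found have 13 rows, which is a useful hint that the new data are being ignored.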

> On Jan 31, 2018, at 11:08 AM, WRAY NICHOLAS via R-help <r-help at r-project.org> wrote:
> 
> Hello,
> 
> I am synthesising some sales data over a twelve month period, and then trying to
> use the "predict" function, firstly to generate a thirteenth month forecast with
> upper and lower 95% confidence limits.  So far so good.
> 
> But what I then want to do is add the upper sales value at the 95th confidence
> limit to the vector of thirteen months and their respective sales to create a
> fourteenth month with a predicted sale and the 95% upper confidence limit for
> this, and so on.  The idea being to create a "trumpet" of extreme positions.
> 
> But I keep getting instead of one line of predictions for the fourteenth month,
> a whole set.  What I don't understand is why it works OK with my original
> synthetic set of twelve months, but doesn't like the set of thirteen sales data
> points, even though as far as I can see I'm just repeating the process, albeit
> with a different label.  I have tried to use different column labels in case that
> was the problem, but it doesn't seem to make any difference.
> 
> I am also getting these weird warning messages telling me that things are being
> "masked":
> 
> The following object is masked _by_ .GlobalEnv:
> 
> sales
> 
> The following object is masked from highdf (pos = 4):
> 
> sales
> Etc
> 
> Is it something to do with attaching the various data frames?  I am a bit at sea
> on this and would be thankful for any pointers
> 
> Nick
> 
> My code:
> 
> 
> m<-runif(1,0,1)
> m
> mres<-m*(seq(1,12))
> mres
> ssd<-rexp(1,1)
> ssd
> devs<-rep(0,length(mres))
> for(i in 1:length(mres)){devs[i]<-rnorm(1,0,ssd)}
> devs
> plot(-10,-10,xlim=c(1,24),ylim=c(0,20000))
> sales<-round((mres+devs)*1000)
> 
> points(sales,pch=19)
> 
> ptr<-cbind(1:length(sales),sales,sales,sales)
> 
> ptr
> sdf<-data.frame(cbind(1:nrow(ptr),sales))
> sdf
> 
> colnames(sdf)<-c("monat","mitte")
> sdf
> attach(sdf)
> s.lm<-lm(mitte~monat)
> 
> s.lm
> abline(s.lm,lty=2)
> news<-data.frame(monat=nrow(sdf)+1)
> news
> fcs<-predict(s.lm,news,interval="predict")
> fcs
> 
> points(1+nrow(ptr),fcs[,1],col="grey",pch=19)
> points(1+nrow(ptr),fcs[,2])
> points(1+nrow(ptr),fcs[,3])
> ptr<-rbind(ptr,c(1+nrow(ptr),fcs[2],fcs[1],fcs[3]))
> ptr
> 
> highdf<-data.frame(ptr[,c(1,4)])
> highdf
> colnames(highdf)<-c("month","sales")
> highdf
> 
> attach(highdf)
> h.lm<-lm(highdf[,2]~highdf[,1])
> h.lm
> abline(h.lm,col="gray",lty=2)
> news<-data.frame(month=nrow(ptr)+1)
> news
> hcs<-predict(h.lm,news,interval="predict")
> hcs


