[R] Newbie question: Linear regression with error bars.

ben@zoo.ufl.edu ben at zoo.ufl.edu
Tue Aug 29 15:05:35 CEST 2000


  Here are a few ideas to get you going.  The answers to your two basic
questions are (1) you can access the columns of a data frame with the
column number as x[,1], as an element in a list as x[[1]], and with the
column name as x$whatever; the column will be treated as a vector in any
case; (2) use the "weights" option to lm() [using 1/variance as the
weights] to do a weighted linear regression.

  I've included a confidence-interval plotter, originally written by Bill
Venables and hacked a bit, for your convenience.

  I don't have the context, but I would be careful with linear regression
in this context (there are obvious problems with the naive regression line
shown below).  It makes me nervous that the estimated error of the
intercept is so much smaller than the errors associated with the points
...

  Ben Bolker

## assuming that the column labels are the first line of your data file
x <- read.table("R.tmp",header=TRUE)  

plotCI <- function (x, y = NULL, uiw, liw = uiw, ylim=NULL, sfrac = 0.01, add=FALSE,
                    col=par("col"), lwd=par("lwd"), slty=par("lty"),
                    xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), ...)  {
# from Bill Venables, R-list
  if (is.list(x)) {
    y <- x$y
    x <- x$x
  }
  if (is.null(y)) {
    if (is.null(x)) 
      stop("both x and y NULL")
    y <- as.numeric(x)
    x <- seq(along = x)
  }
  ui <- y + uiw
  li <- y - liw
  if (is.null(ylim)) ylim <- range(c(y, ui, li), na.rm=TRUE)
  if (add) {
    points(x, y, col=col, lwd=lwd, ...)
  } else {
    plot(x, y, ylim = ylim, col=col, lwd=lwd, xlab=xlab, ylab=ylab, ...)
  }
  smidge <- diff(par("usr")[1:2]) * sfrac
  segments(x, li, x, ui, col=col, lwd=lwd, lty=slty)
  x2 <- c(x, x)
  ul <- c(li, ui)
  segments(x2 - smidge, ul, x2 + smidge, ul, col=col, lwd=lwd)
  invisible(list(x = x, y = y))
}

## linear regr. of Energy as a function of Timestep
lm0 <- lm(x$Energy ~ x$Timestep ,weights=1/x$Var)
summary(lm0)
plotCI(x$Timestep,x$Energy,sqrt(x$Var))
abline(lm0)  ## add a regression line to the plot
interc <- summary(lm0)$coef[1,]  ## estimate, std. error, etc.
plotCI(0,interc["Estimate"],interc["Std. Error"],col="red",add=TRUE)


On Mon, 28 Aug 2000, Alan Aspuru-Guzik wrote:

> 
> Hello guys,
> I am a total newbie on R, having downloaded it, read the documentation and
> started playing with it right now.
> 
> My general question is what 'lr' model can be used for doing a linear
> regression on points that have a variance associated with them (ie. Monte
> Carlo simulation results).
> 
> Actually my Data sets look like:
> 
> Timestep Energy Variance_of_the_Energy
> 0.0005 -14.876840 0.000670 
> 0.001 -14.883960 0.000670 
> 0.002 -14.887360 0.000700 
> 0.05 -14.888730 0.000430 
> 0.1 nan nan 
> 
> And what I want to do is to obtain the intercept of the best possible
> linear fit (weeding out outliers if necessary) but having the intercept
> with an associated error bar.
> 
> What I managed to do today is read the files from a table and learn how to
> access the columns of the data frame . And managed to do a linear
> regression of the type x ~ y with normal vectors.
> 
> I will eventually will like to use something like the leaps package
> (regression subset selection) and then do a linear regression including
> the error bars with the best points obtained by the 'leaps' package.
> 
> The specific questions are:
> x. How can I Convert a data frame's column into a vector?
> x. Now that I have the vectors, how can I specify a linear model of a
> linear regression with associated variances?
> 
> Thanks a lot in advance!!
> 
> Alan Aspuru-Guzik
> 
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 
> 

-- 
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list