[R] Need to fit a regression line using orthogonal residuals

Henrik Bengtsson hb at stat.berkeley.edu
Wed Jan 31 03:18:56 CET 2007


The iwpca() in Bioconductor package aroma.light takes a matrix of
column vectors and "fits an R-dimensional hyperplane using iterative
re-weighted PCA".  From ?iwpca.matrix:

"Arguments:
X 	N-times-K matrix where N is the number of observations and K is the
number of dimensions.

Details:
This method uses weighted principal component analysis (WPCA) to fit a
R-dimensional hyperplane through the data with initial internal
weights all equal. At each iteration the internal weights are
recalculated based on the "residuals". If method=="L1", the internal
weights are 1 / sum(abs(r) + reps). This is the same as
method=function(r) 1/sum(abs(r)+reps). The "residuals" are orthogonal
Euclidean distance of the principal components R,R+1,...,K. In each
iteration before doing WPCA, the internal weighted are multiplied by
the weights given by argument w, if specified."

Thus, in your case you want to do:

X <- cbind(x,y)
fit <- iwpca(X)

There are different ways of robustifying the estimate, cf. argument
'method'. For heteroscedastic noise, fitting in L1 is convenient since
there is no bandwidth parameter.

Hope this helps

Henrik

On 1/30/07, Bill.Venables at csiro.au <Bill.Venables at csiro.au> wrote:
> Jonathon Kopecky asks:
>
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Jonathon Kopecky
> Sent: Tuesday, 30 January 2007 5:52 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Need to fit a regression line using orthogonal residuals
>
> I'm trying to fit a simple linear regression of just Y ~ X, but both X
> and Y are noisy.  Thus instead of fitting a standard linear model
> minimizing vertical residuals, I would like to minimize
> orthogonal/perpendicular residuals.  I have tried searching the
> R-packages, but have not found anything that seems suitable.  I'm not
> sure what these types of residuals are typically called (they seem to
> have many different names), so that may be my trouble.  I do not want to
> use Principal Components Analysis (as was answered to a previous
> questioner a few years ago), I just want to minimize the combined noise
> of my two variables.  Is there a way for me to do this in R?
> [WNV] There's always a way if you are prepared to program it.  Your
> question is a bit like asking "Is there a way to do this in Fortran?"
> The most direct way to do it is to define a function that gives you the
> sum of the perpendicular distances and minimise it using, say, optim().
> E.g.
>         ppdis <- function(b, x, y) sum((y - b[1] - b[2]*x)^2/(1+b[2]^2))
>         b0 <- lsfit(x, y)$coef  # initial value
>         op <- optim(b0, ppdis, method = "BFGS", x=x, y=y)
>         op  # now to check the results
>         plot(x, y, asp = 1)  # why 'asp = 1'?? exercise
>         abline(b0, col = "red")
>         abline(op$par, col = "blue")
> There are a couple of things about this you should be aware of, though
> First, this is just a fiddly way of finding the first principal
> component, so your desire not to use Principal Component Analysis is
> somewhat thwarted, as it must be.
> Second, the result is sensitive to scale - if you change the scales of
> either x or y, e.g. from lbs to kilograms, the answer is different.
> This also means that unless your measurement units for x and y are
> comparable it's hard to see how the result can make much sense.  A
> related issue is that you have to take some care when plotting the
> result or orthogonal distances will not appear to be orthogonal.
> Third, the resulting line is not optimal for either predicting y for a
> new x or x from a new y.  It's hard to see why it is ever of much
> interest.
> Bill Venables.
>
>
> Jonathon Kopecky
> University of Michigan
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list