[R] optimization problem

Sun Jan 17 17:19:59 CET 2010

Klaus,

I am happy to know that the quadratic cost LSAP seems to work well for you.  

The Hungarian algorithm is a classic for solving linear sum assignment problem, which is closely related to matching in bipartite graphs.  You can google or wiki these terms to get papers and books on this topic. 

Also look at the PhD thesis by one of Nick Higham's students (Section 3.5):

 Parallel Solution of SVD-Related Problems, With Applications

at

http://www.maths.manchester.ac.uk/~higham/misc/past-students.php

This thesis has a good discussion of different variants of your problem, some of which are more general than yours.  

Your problem is one of the variants of "procrustes-type" problems found in multivariate statistics.  The very same problem that you posed is supposed to occur in multidimensional scaling.  So, you might also want to look in that literature.

Best,
Ravi.

____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu

----- Original Message -----
From: klausch at gmx.de
Date: Sunday, January 17, 2010 8:06 am
Subject: Re: [R] optimization problem
To: Ravi Varadhan <rvaradhan at jhmi.edu>, erwin.kalvelagen at gmail.com, hwborchers at googlemail.com
Cc: r-help at stat.math.ethz.ch

> Dear Erwin, Ravi and Hans Werner,
> 
> thanks a lot for your replies. I don't think I have access to Cplex 
> and therefore probably cannot try that out but will read about MIQP.
> 
> I played a bit then around with Ravi's suggestions and made also the 
> observation that the linear cost function often found the exact 
> solution but now always - but in my tests the quadratic cost function 
> version always found the correct result. But I'll test that still a 
> bit more detailed.
> 
> Would anyone of you know a good reference for an overview what 
> algorithms are there and for which problems they can be used?
> 
> Thank a lot again!
> 
> Klaus
> 
> -------- Original-Nachricht --------
> > Datum: Sat, 16 Jan 2010 23:42:08 -0500
> > Von: Ravi Varadhan <rvaradhan at jhmi.edu>
> > An: Erwin Kalvelagen <erwin.kalvelagen at gmail.com>
> > CC: r-help at stat.math.ethz.ch
> > Betreff: Re: [R] optimization problem
> 
> > 
> > Interesting! 
> > 
> > Now, if I change the "cost matrix", D,  in the LSAP formulation slightly
> > such that it is quadratic, it finds the best solution to your example:
> > 
> > 
> > pMatrix.min <- function(A, B) {
> > # finds the permutation P of A such that ||PA - B|| is minimum
> > # in Frobenius norm
> > # Uses the linear-sum assignment problem (LSAP) solver
> > # in the "clue" package
> > # Returns P%*%A and the permutation vector `pvec' such that
> > # A[pvec, ] is the permutation of A closest to B
> > 	n <- nrow(A)
> > 	D <- matrix(NA, n, n)
> > 	for (i in 1:n) {
> > 	for (j in 1:n) {
> > #	D[j, i] <- sqrt(sum((B[j, ] - A[i, ])^2))
> > 	D[j, i] <- (sum((B[j, ] - A[i, ])^2))  # this is better
> > 	} }
> > vec <- c(solve_LSAP(D))
> > list(A=A[vec,], pvec=vec)
> > }
> > 
> >  > X<-pMatrix.min(A,B)
> > > X$pvec
> > [1] 6 1 3 2 4 5
> > > dist(X$A, B)
> > [1] 10.50172
> > >
> > 
> > This should be fine.  Any counter-examples to this?!
> > 
> > Best,
> > Ravi.
> > ____________________________________________________________________
> > 
> > Ravi Varadhan, Ph.D.
> > Assistant Professor,
> > Division of Geriatric Medicine and Gerontology
> > School of Medicine
> > Johns Hopkins University
> > 
> > Ph. (410) 502-2619
> > email: rvaradhan at jhmi.edu
> > 
> > 
> > ----- Original Message -----
> > From: Erwin Kalvelagen <erwin.kalvelagen at gmail.com>
> > Date: Saturday, January 16, 2010 5:26 pm
> > Subject: Re: [R] optimization problem
> > To: Ravi Varadhan <rvaradhan at jhmi.edu>
> > Cc: r-help at stat.math.ethz.ch
> > 
> > 
> > > I believe this is a very good approximation but not a 100% correct
> > > formulation of the original problem.
> > > 
> > > E.g. for
> > > 
> > > 
> > > A <- matrix(c(
> > >    -0.62668585718,-0.78785288063,-1.32462887089, 0.63935994044,
> > >    -1.99878497801, 4.42667400292, 0.65534961645, 1.86914537669,
> > >    -0.97229929674, 2.37404268115, 0.01223810011,-1.24956493590,
> > >     0.92711756473,-1.51859351813, 3.76707054743,-2.30641777527,
> > >    -1.98918429361,-0.43634856903,-3.65989556798, 0.00073904070,
> > >    -1.44153837918, 1.42004180214,-1.81388322489,-1.92423923917,
> > >    -1.46322482727,-1.81783134835,-2.59801302663, 2.03462135096,
> > >     1.32546711550,-0.21519150247,-1.94319654347, 0.68773346973,
> > >    -2.75094791807,-1.44814080195, 3.14197123843,-0.52521369103
> > > ),nrow=6)
> > > 
> > > B<-diag(nrow=6)
> > > 
> > > X<-pMatrix.min(A,B)
> > > 
> > > dist(X$A, B)
> > > 
> > > X$pvec
> > > 
> > > bestpvec <- c(6,1,3,2,4,5)
> > > 
> > > dist(A[bestpvec,],B)
> > > 
> > > you get a norm of 10.58374 while the true optimal norm is 
> 10.50172.  For
> > > small problems you often get the optimal solution, but the error 
> > > caused by
> > > linearizing the objective becomes larger if the problems are 
> larger. 
> > > But the
> > > approximation is actually very good.
> > > 
> > > Erwin
> > > 
> > > 
> > > ----------------------------------------------------------------
> > > Erwin Kalvelagen
> > > Amsterdam Optimization Modeling Group
> > > erwin at amsterdamoptimization.com
> > > 
> > > ----------------------------------------------------------------
> > > 
> > > 
> > > On Sat, Jan 16, 2010 at 2:01 PM, Ravi Varadhan <rvaradhan at jhmi.edu>
> > wrote:
> > > 
> > > >
> > > > Yes, it can be formulated as an LSAP.  It works just fine.  I have
> > checked
> > > > it with several 3 x 3 examples.
> > > >
> > > > Here is another convincing example:
> > > >
> > > > n <- 50
> > > >
> > > > A <- matrix(rnorm(n*n), n, n)
> > > >
> > > > > # Find P such that ||PA - C|| is minimum
> > > >
> > > > vec <- sample(1:n, n, rep=FALSE)  # a random permutation
> > > >
> > > > C <- A[vec, ]  # the target matrix is just a permutation of 
> original 
> > > matrix
> > > >
> > > > B <- pMatrix.min(A, C)$A
> > > >
> > > > > dist(B, C)
> > > > [1] 0
> > > >
> > > > > dist(A, C)
> > > > [1] 69.60859
> > > >
> > > > Ravi.
> > > > ____________________________________________________________________
> > > >
> > > > Ravi Varadhan, Ph.D.
> > > > Assistant Professor,
> > > > Division of Geriatric Medicine and Gerontology
> > > > School of Medicine
> > > > Johns Hopkins University
> > > >
> > > > Ph. (410) 502-2619
> > > > email: rvaradhan at jhmi.edu
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: Erwin Kalvelagen <erwin.kalvelagen at gmail.com>
> > > > Date: Saturday, January 16, 2010 1:36 pm
> > > > Subject: Re: [R] optimization problem
> > > > To: Ravi Varadhan <rvaradhan at jhmi.edu>
> > > > Cc: r-help at stat.math.ethz.ch
> > > >
> > > >
> > > > > I also have doubts this can be formulated correctly as a linear
> > > > assignment
> > > > > problem. You may want to check the results with a small example.
> > > > >
> > > > > Erwin
> > > > >
> > > > > ----------------------------------------------------------------
> > > > > Erwin Kalvelagen
> > > > > Amsterdam Optimization Modeling Group
> > > > > erwin at amsterdamoptimization.com
> > > > >
> > > > > ----------------------------------------------------------------
> > > > >
> > > > >
> > > > > On Sat, Jan 16, 2010 at 9:59 AM, Ravi Varadhan <rvaradhan at jhmi.edu>
> > > > wrote:
> > > > >
> > > > > >
> > > > > > Thanks, Erwin, for pointing out this mistake.
> > > > > >
> > > > > > Here is the correct function for Frobenius norm.
> > > > > >
> > > > > > Klaus - Just replace the old `dist' with the following one.
> > > > > >
> > > > > > dist <- function(A, B) {
> > > > > >  # Frobenius norm of A - B
> > > > > >  n <- nrow(A)
> > > > > >  sqrt(sum((B - A)^2))
> > > > > >  }
> > > > > >
> > > > > > Ravi.
> > > > > >
> > > > > >
> > ____________________________________________________________________
> > > > > >
> > > > > > Ravi Varadhan, Ph.D.
> > > > > > Assistant Professor,
> > > > > > Division of Geriatric Medicine and Gerontology
> > > > > > School of Medicine
> > > > > > Johns Hopkins University
> > > > > >
> > > > > > Ph. (410) 502-2619
> > > > > > email: rvaradhan at jhmi.edu
> > > > > >
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > From: Erwin Kalvelagen <erwin.kalvelagen at gmail.com>
> > > > > > Date: Saturday, January 16, 2010 2:35 am
> > > > > > Subject: Re: [R] optimization problem
> > > > > > To: r-help at stat.math.ethz.ch
> > > > > >
> > > > > >
> > > > > > > Ravi Varadhan <rvaradhan <at> jhmi.edu> writes:
> > > > > > > > dist <- function(A, B) {
> > > > > > > > # Frobenius norm of A - B
> > > > > > > >     n <- nrow(A)
> > > > > > > >     sum(abs(B - A))
> > > > > > > > }
> > > > > > > >
> > > > > > >
> > > > > > > See  for a definition of the
> > > > > > > Frobenius norm.
> > > > > > >
> > > > > > >
> > > > > > > Erwin
> > > > > > >
> > > > > > > ----------------------------------------------------------------
> > > > > > > Erwin Kalvelagen
> > > > > > > Amsterdam Optimization Modeling Group
> > > > > > > erwin at amsterdamoptimization.com
> > > > > > >
> > > > > > >
> > > > > > > ______________________________________________
> > > > > > > R-help at r-project.org mailing list
> > > > > > >
> > > > > > > PLEASE do read the posting guide
> > > > > > > and provide commented, minimal, self-contained, reproducible
> > code.
> > > > > >
> > > >
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > 
> > PLEASE do read the posting guide
> > 
> > and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Preisknaller: GMX DSL Flatrate für nur 16,99 Euro/mtl.!
>