[R] Solving an optimization problem: selecting an "optimal" subset

Hans W Borchers hwborchers at googlemail.com
Sat Jan 30 14:08:16 CET 2010


Dimitri Shvorob <dimitri.shvorob <at> gmail.com> writes:

> 
> > Is it a subset of a vector containing 100 elements, or 10000ths? 
> 
> I need to pick 2-40 elements out of a 50-200-element-long vector.
> 
> > A random number of elements that should be chosen, or the best 10 values
> > which sums up to a defined value? 
> 
> The best 10 values. 
> 
> I still think that Rcplex is the way to go; what's missing is some
> linear-algebra expertise on my part to set up the problem as quadratic. 
> 

This is a "subset sum" problem and has been discussed here in December for
integers. For floats it is even more difficult as there is no reliable
stopping criterion.

Can you settle for an approximate solution? If you insist on the true
minimum, you will have to live with the combinatorial explosion searching
through all (appropriate) subsets.

Rcplex: This is a combinatorial problem and cannot be formulated as a
quadratic optimization problem.
It is NP-hard and cannot be solved via Dynamic Programming.
As a combinatorial problem, it also will not yield to GA approaches, at
least not to find the minimum reliably.

Hans Werner

P.S.:   Example (solution maybe is the best possible)

    set.seed(8232)
    x <- runif(100)

    # Find subset that sums up close to 2.0 !
    i <- c(84,54,11,53,88,12,26,45,25,62,96,23,78,77,66,1)
    sum(x[i])
    #=> 2.000451



More information about the R-help mailing list