[R] Subset sum problem.

Hans W Borchers hwborchers at googlemail.com
Thu Dec 10 05:20:39 CET 2009


Geert Janssens <janssens-geert <at> telenet.be> writes:
> 
> On Wednesday 9 December 2009, Hans W Borchers wrote:
> > Geert Janssens <janssens-geert <at> telenet.be> writes:
> > > [ ... ]
> > > Has anybody tackled this issue before in R ? If so, I would be very
> > > grateful if you could share your solution with me.
> >
> > Is it really true that you only want to see a "Yes or No" answer to this
> > question whether a subset sums up to s --- without learning which numbers
> > this subset is composed of (the pure SUBSET SUM problem)?
> > Then the following procedure does that in a reasonable amount of time
> > (returning 'TRUE' or 'FALSE' instead of "Y-or-N"):
> >
> Unfortunately no. I do need the numbers in the subset. But thank you for 
> presenting this code.
> 
> Geert
> 

Okay then, here we go. But don't tell later that your requirement was to
generate _all_ subsets that add up to a certain amount.  I will generate
only one (with largest elements).

For simplicity I assume that the set is prepared s.t. it is decreasingly
ordered, has no elements larger than the amount given, and has a total sum
larger than this amount.

    # Assume S decreasing, no elements > t, total sum >= t
    solveSubsetSum <- function(S, t) {
      L <- c(0)
      inds <- NULL
      for (i in 1:length(S)) {
        # L <- unique(sort(c(L, L + S[i])))
        L <- c(L, L+S[i])
        L <- L[L <= t]
        if (max(L) == t) {
          inds <- c(i)
          t <- t - S[i]
          while (t > 0) {
            K <- c(0)
            for (j in 1:(i-1)) {
              K <- c(K, K+S[j])
              if (t %in% K) break
            }
            inds <- c(inds, j)
            t <- t - S[j]
          }
          break
        }
      }
      return(inds)
    }

    # former example
    amount <- 4748652
    products <- 
    c(30500,30500,30500,30500,42000,42000,42000,42000,
      42000,42000,42000,42000,42000,42000,71040,90900,
      76950,35100,71190,53730,456000,70740,70740,533600,
      83800,59500,27465,28000,28000,28000,28000,28000,
      26140,49600,77000,123289,27000,27000,27000,27000,
      27000,27000,80000,33000,33000,55000,77382,48048,
      51186,40000,35000,21716,63051,15025,15025,15025,
      15025,800000,1110000,59700,25908,829350,1198000,1031655)

    # prepare set
    prods <- products[products <= amount]  # no elements > amount
    prods <- sort(prods, decreasing=TRUE)  # decreasing order

    # now find one solution
    system.time(is <- solveSubsetSum(prods, amount))
    #  user  system elapsed 
    # 0.320   0.032   0.359 

    prods[is]
    #  [1]   70740   70740   71190   76950   77382   80000   83800
    #  [8]   90900  456000  533600  829350 1110000 1198000

    sum(prods[is]) == amount
    # [1] TRUE

Note that running times and memory needs will be much higher when more
summands are needed.  To mention that too: I have not tested the code
extensively.

Regards
Hans Werner




More information about the R-help mailing list