[R] Optimization in R similar to MS Excel Solver

Berend Hasselman bhh at xs4all.nl
Sat Oct 20 11:49:00 CEST 2012


I do not know what algorithms the Excel solver function uses.

See inline for how to do what you want in R.
Forgive me if I have misinterpreted your request.

On 19-10-2012, at 16:25, Richard James wrote:

> Dear Colleagues,
> I am attempting to develop an optimization routine for a river suspended
> sediment mixing model. I have Calcium and Magnesium concentrations (%) for
> sediments from 4 potential source areas (Topsoil, Channel Banks, Roads,
> Drains) and I want to work out, based on the suspended sediment calcium and
> magnesium concentrations, what are the optimal contributions from each
> source area to river suspended sediments. The dataset is as follows:
> 
>                  Topsoil    Channel Bank    Roads     Drains
> Ca(%)            0.03       0.6             0.2       0.35
> Mg(%)            0.0073     0.0102          0.0141    0.012
> Contribution     0.25       0.25            0.25      0.25
> 
> and
> 
> Ca in river(%)         0.33
> Mg in river (%)      0.0114
> 

Read data (it would have been a lot more helpful if you had provided the result of dput for the data and the target).

basedat <- read.table(text="Topsoil         Channel-Bank          Roads          Drains
Ca(%)         0.03             0.6                0.2             0.35
Mg(%)         0.0073           0.0102             0.0141          0.012
Contribution  0.25             0.25               0.25            0.25", header=TRUE)

basedat
target <- c("Ca in river(%)"=0.33,"Mg in river (%)"=0.0114)
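As an aside, dput() prints an expression that recreates an object exactly, which is what makes an example reproducible. For instance, for the target vector:

```r
# dput() emits R code that rebuilds the object, names and all
target <- c("Ca in river(%)" = 0.33, "Mg in river (%)" = 0.0114)
dput(target)
# c(`Ca in river(%)` = 0.33, `Mg in river (%)` = 0.0114)
```

Pasting that output into a message lets anyone reconstruct your data with no guessing about layout.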

# convert to matrix and vector for later use

bmat   <- as.matrix(basedat[1:2,])
pstart <- as.numeric(basedat["Contribution",1:3])
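A quick sanity check before optimizing: the predicted river concentrations at the equal-contribution starting point (data repeated here so the snippet runs on its own):

```r
# bmat and target repeated so this snippet is self-contained
bmat <- matrix(c(0.03, 0.0073,
                 0.6,  0.0102,
                 0.2,  0.0141,
                 0.35, 0.012),
               nrow = 2,
               dimnames = list(c("Ca(%)", "Mg(%)"),
                               c("Topsoil", "Channel.Bank", "Roads", "Drains")))
target <- c(0.33, 0.0114)

# predicted river concentrations with all contributions at 0.25
drop(bmat %*% rep(0.25, 4))
# Ca 0.295, Mg 0.01095 -- close to the target but not equal,
# so there is something for the optimizer to do
```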

> I want to optimize the contribution values (currently set at 0.25) by
> minimizing the sum of squares of the relative errors of the mixing model,
> calculated as follows:
> 
> SSRE = (x1 - ((a1*c1)+(a2*c2)+(a3*c3)+(a4*c4))/x1)^2 +
>        (x2 - ((b1*c1)+(b2*c2)+(b3*c3)+(b4*c4))/x2)^2
> 

I do not understand why you are dividing by x1 and x2. It makes no sense to me to calculate (xi - (xi_calculated)/xi)^2.
Given your stated purpose, (xi - xi_calculated)^2 or something like (1 - xi_calculated/xi)^2 seems more appropriate.

> Where:
> x1 = calcium in river;
> x2 = magnesium in river;
> a1:a4 = Ca in topsoil, channel banks, roads, drains
> b1:b4 = Mg in topsoil, channel banks, roads, drains
> c1:c4 = Contribution to be optimized
> 
> I can generate a solution very quickly using the MS Excel Solver function,
> however later I want to be able to run a Monte-Carlo simulation to
> generate confidence intervals based on variance in the source area
> concentrations – hence my desire to use R to develop the mixing model.
> 
> So far I have been using the ‘optim’ function, however I’m confused as to
> what form the ‘par’ and ‘fn’ arguments should take from my data above. I am
> also unsure of how to write the model constraints – i.e. total contribution
> should sum to 1, and all values must be non-negative.
> 

The 'par' argument should be a vector of starting values for the parameters of your objective function.
The 'fn' argument is the function that computes a scalar given the parameter values.

Your 'par' is a vector with all elements between 0 and 1 and with a sum equal to 1.
optim cannot impose that equality constraint directly, but you can simulate the requirement by letting optim work with a three-element vector and defining the fourth value as
1 - sum(first three parameters). The requirement that all parameters must lie between 0 and 1 can be met by making 'fn' return a large value whenever it is violated.

Some code:

SSRE <- function(parx) {
    par <- c(parx, 1 - sum(parx))
    if (all(par > 0 & par < 1)) {  # parameters meet the requirements
        # a linear algebra version of your objective, without the division by xi
        sum((target - (bmat %*% par))^2)
        # or, if you want to divide by the target (see above; also see the
        # comment in the benchmark section):
        # sum(((target - (bmat %*% par))/target)^2)
    } else 1e7  # penalty for parameters not satisfying the constraints
}

SSRE(pstart)

z <- optim(pstart,SSRE)
z
c(z$par, 1-sum(z$par))  # final contributions


Results:

# > z <- optim(pstart,SSRE)
# > z
# $par
# [1] 0.1492113 0.2880078 0.2950191
# 
# $value
# [1] 2.157446e-12
# 
# $counts
# function gradient 
#      108       NA 
# 
# $convergence
# [1] 0
# 
# $message
# NULL

You can also try this:

z <- optim(pstart,SSRE,method="BFGS")
z
c(z$par, 1-sum(z$par))

z <- nlminb(pstart,SSRE)
z
c(z$par, 1-sum(z$par))
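Another option, if you prefer not to use a penalty value in the objective, is constrOptim(), which enforces linear inequality constraints of the form ui %*% par >= ci directly. A sketch (the constraint setup here is my suggestion, not something from your post); data repeated so it runs on its own:

```r
bmat <- matrix(c(0.03, 0.0073, 0.6, 0.0102, 0.2, 0.0141, 0.35, 0.012),
               nrow = 2)
target <- c(0.33, 0.0114)

# objective without any penalty; the fourth parameter is still implied
obj <- function(parx) {
    par <- c(parx, 1 - sum(parx))
    sum((target - drop(bmat %*% par))^2)
}

# constraints: each free parameter > 0, and their sum < 1
# (which keeps the implied fourth parameter positive)
ui <- rbind(diag(3), c(-1, -1, -1))
ci <- c(0, 0, 0, -1)

# the starting point must be strictly feasible; rep(0.25, 3) is
z <- constrOptim(rep(0.25, 3), obj, grad = NULL, ui = ui, ci = ci)
c(z$par, 1 - sum(z$par))
```

Note that constrOptim requires a strictly feasible starting point, so it cannot start on a constraint boundary.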

If you intend to run these optimizations many times I wouldn't use optim without specifying the method argument.
See this benchmark.

> library(rbenchmark)
> benchmark(optim(pstart,SSRE),optim(pstart,SSRE,method="BFGS"), nlminb(pstart,SSRE),
+           replications=1000, columns=c("test","replications","elapsed","relative"))
                                  test replications elapsed relative
3                 nlminb(pstart, SSRE)         1000   0.829    1.264
1                  optim(pstart, SSRE)         1000   1.647    2.511
2 optim(pstart, SSRE, method = "BFGS")         1000   0.656    1.000


If you use sum(((target - (bmat %*% par))/target)^2) as the objective you'll see different timings: nlminb then turns out to be the fastest.
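Finally, on the Monte-Carlo part of your question: a minimal sketch, assuming (purely for illustration, since you gave no variances) normally distributed source concentrations with a made-up 5% relative standard deviation:

```r
set.seed(1)  # reproducible draws
bmat <- matrix(c(0.03, 0.0073, 0.6, 0.0102, 0.2, 0.0141, 0.35, 0.012),
               nrow = 2)
target <- c(0.33, 0.0114)
bsd <- 0.05 * bmat  # hypothetical 5% relative standard deviations

# re-fit the contributions for one perturbed source matrix B
fit <- function(B) {
    f <- function(parx) {
        par <- c(parx, 1 - sum(parx))
        if (all(par > 0 & par < 1)) sum((target - drop(B %*% par))^2) else 1e7
    }
    z <- optim(rep(0.25, 3), f, method = "BFGS")
    c(z$par, 1 - sum(z$par))
}

nsim <- 200
sims <- replicate(nsim, fit(matrix(rnorm(8, mean = bmat, sd = bsd), nrow = 2)))

# percentile intervals for each source's contribution
apply(sims, 1, quantile, probs = c(0.025, 0.975))
```

Each replication perturbs the source concentrations, re-runs the optimization, and the row-wise quantiles give rough percentile intervals for the four contributions.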

good luck,

Berend

> Any help, advice, or suggestions on this problem would be very much
> appreciated.
> 
> 
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Optimization-in-R-similar-to-MS-Excel-Solver-tp4646759.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



