[R] Very slow optim()

Sat Mar 13 12:03:48 CET 2021

Hi, Deepayan:

On 2021-03-13 01:27, Deepayan Sarkar wrote:
> On Sat, Mar 13, 2021 at 10:08 AM Spencer Graves
> <spencer.graves using effectivedefense.org> wrote:
>>
>> TWO COMMENTS:
>>
>>
>> 1.  DID YOU ASSIGN THE OUTPUT OF "optim" to an object, like "est <-
>> optim(...)"?  If yes and if "optim" terminated normally, the 60,000+
>> parameters should be there as est$par.  See the documentation on "optim".
>>
>>
>> 2.  WHAT PROBLEM ARE YOU TRYING TO SOLVE?
>>
>>
>>            I hope you will forgive me for being blunt (or perhaps bigoted), but
>> I'm skeptical about anyone wanting to use optim to estimate 60,000+
>> parameters.  With a situation like that, I think you would be wise to
>> recast the problem as one in which those 60,000+ parameters are sampled
>> from some hyperdistribution characterized by a small number of
>> hyperparameters.  Then write a model where your observations are sampled
>> from distribution(s) controlled by these random parameters.  Then
>> multiply the likelihood of the observations by the likelihood of the
>> hyperdistribution and integrate out the 60,000+ parameters, leaving only
>> a small number of hyperparameters.
> 
> Just a comment on this comment: I think it's perfectly reasonable to
> optimize 60k+ parameters with conjugate gradient. CG was originally
> developed to solve linear equations of the form Ax=b. If x was not
> large in size, one would just use solve(A, b) instead of an iterative
> method.
> 
> Use of CG is quite common in image processing. A relatively small
> 300x300 image will give you 90k parameters.
> 
> -Deepayan
> 

	  Thanks for this.
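
	  To make the comparison between solve(A, b) and an iterative method concrete, here is a minimal sketch. It assumes a small symmetric positive-definite A that I made up for illustration (n = 50, nothing like 60,000+), and it uses the fact that minimising 0.5 * x'Ax - b'x is equivalent to solving Ax = b. The names f, gr, x_cg and x_direct are mine, not from the original problem.

```r
# Illustrative SPD system; the size and the data are assumptions
set.seed(1)
n <- 50
M <- matrix(rnorm(n * n), n, n)
A <- crossprod(M) + n * diag(n)   # symmetric positive definite by construction
b <- rnorm(n)

# Quadratic objective whose minimiser solves Ax = b, and its gradient
f  <- function(x) 0.5 * sum(x * (A %*% x)) - sum(b * x)
gr <- function(x) as.vector(A %*% x - b)

# Assign the result so the estimates can be retrieved afterwards
est <- optim(rep(0, n), f, gr, method = "CG",
             control = list(maxit = 1000, reltol = 1e-12))

est$convergence                   # 0 indicates normal termination
x_cg     <- est$par               # the estimated parameters
x_direct <- solve(A, b)           # direct solve, feasible because n is small
max(abs(x_cg - x_direct))         # largest disagreement between the two
```

	  Note that method = "CG" in optim is a general nonlinear conjugate gradient, so this only illustrates the idea. It is not a tuned linear solver.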

	  If both A and b are 300x300, then x will also be 300x300.

	  What do you do in this case if A is not square, or is ill-conditioned?

	  Do you care if you get only one of many possible or approximate 
solutions, and the algorithm spends most of its time making adjustments 
in a singular subspace that would have best been avoided?
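
	  A small sketch of what I mean by "one of many possible solutions", using a made-up 20x6 matrix whose last column is the sum of the first two. The matrix, the rank tolerance and the names x_min, x_other and null_vec are all assumptions for illustration. The minimum-norm least-squares solution is computed from the SVD after dropping the negligible singular value.

```r
# Illustrative rank-deficient, non-square system (data are assumptions)
set.seed(2)
A <- matrix(rnorm(20 * 5), 20, 5)
A <- cbind(A, A[, 1] + A[, 2])    # column 6 = column 1 + column 2
b <- rnorm(20)

qr(A)$rank                        # numerical rank, less than ncol(A)

s <- svd(A)
s$d                               # the last singular value is essentially zero

# Keep only singular values above a conventional relative tolerance
keep  <- s$d > max(dim(A)) * .Machine$double.eps * s$d[1]
x_min <- s$v[, keep, drop = FALSE] %*%
  (crossprod(s$u[, keep, drop = FALSE], b) / s$d[keep])

# Adding a null-space direction changes x but not the fitted values
null_vec <- c(1, 1, 0, 0, 0, -1)  # A %*% null_vec is zero by construction
x_other  <- x_min + null_vec
max(abs(A %*% x_min - A %*% x_other))   # difference in fitted values
c(sum(x_min^2), sum(x_other^2))         # squared norms of the two solutions
```

	  Both vectors fit the data equally well, so an iterative method that does not control the null-space component can return either one, or anything in between.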

	  Spencer