[R] Looking for R-code for non-negative matrix factorization in the presence of Gaussian or Poisson noise

Thomas Lumley tlumley at u.washington.edu
Mon Jun 11 15:06:32 CEST 2007


On Mon, 11 Jun 2007, christian.ritter at shell.com wrote:

>
> Hi all,
>
> Has any of you implemented code for non-negative matrix factorization to 
> solve
>
> Y = T P' + E; dim(Y) = (n, p); dim(T) = (n, nc); dim(P) = (p, nc); dim(E) = (n, p)
>
> where T and P must be non-negative and E either Gaussian or Poisson noise.
>
> I'm looking for two variants:
>
> 1. Easy (I think), T is known (that is, we just want to solve the general 
> inverse problem)

This is non-negative least squares, a quadratic programming problem, for 
which there is code (at least if n and nc are not too big).
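
A minimal sketch of the known-T case under Gaussian noise, to make the reduction concrete: with T fixed, each column of Y gives a separate non-negative least squares problem for the matching row of P. The function names `nnls_cd` and `solve_P` and the toy data are illustrative assumptions, and cyclic coordinate descent is used here only because it is short; it is not the quadratic programming code referred to above, and it does not cover the Poisson case.

```python
def nnls_cd(T, y, n_iter=500):
    """Minimise ||y - T p||^2 subject to p >= 0 by cyclic coordinate descent."""
    n, nc = len(T), len(T[0])
    # Normal-equation pieces: G = T'T (nc x nc) and b = T'y (length nc).
    G = [[sum(T[i][a] * T[i][c] for i in range(n)) for c in range(nc)]
         for a in range(nc)]
    b = [sum(T[i][a] * y[i] for i in range(n)) for a in range(nc)]
    p = [0.0] * nc
    for _ in range(n_iter):
        for k in range(nc):
            if G[k][k] == 0.0:
                continue  # an all-zero column of T carries no information
            # Half the partial derivative of the objective with respect to p[k].
            grad_k = sum(G[k][j] * p[j] for j in range(nc)) - b[k]
            # Exact one-dimensional minimiser, clipped at zero.
            p[k] = max(0.0, p[k] - grad_k / G[k][k])
    return p


def solve_P(T, Y):
    """Fit each column of Y separately; returns P with shape (p, nc)."""
    n_cols = len(Y[0])
    return [nnls_cd(T, [row[j] for row in Y]) for j in range(n_cols)]


# Hypothetical noise-free example with n = 3, nc = 2, p = 2.
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [[2.0, 1.0], [0.0, 3.0], [2.0, 4.0]]  # equals T P' for P = [[2, 0], [1, 3]]
P_hat = solve_P(T, Y)
```

In this toy example `P_hat` recovers the P used to build Y. For problems of realistic size a tested solver would be the safer choice than this loop.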

>
> 2. Harder (?), T is unknown (under some restrictions) [as an 
> intermediate, we may want to fix nc]
>

Even with fixed nc this is Distinctly Non-trivial. It often isn't 
identifiable, for a start.
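
One way to see the identifiability problem, sketched under assumed toy matrices: for any invertible A, (T A)(P A^{-T})' = T P', so whenever T A and P A^{-T} both stay non-negative they form a second, equally valid factorization of the same Y. The matrices below are hypothetical and chosen so that this happens with an A that is neither a rescaling nor a permutation.

```python
def matmul(A, B):
    """Plain-list matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


# Hypothetical non-negative factors: T is n x nc (4 x 2), P is p x nc (3 x 2).
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
P = [[2.0, 1.0], [3.0, 2.0], [1.0, 1.0]]
Y = matmul(T, transpose(P))

# An invertible mixing matrix A and the transpose of its inverse.
A = [[1.0, 0.5], [0.0, 1.0]]
A_inv_T = [[1.0, 0.0], [-0.5, 1.0]]

T2 = matmul(T, A)        # still non-negative, because T and A are
P2 = matmul(P, A_inv_T)  # non-negative here because P[j][0] >= 0.5 * P[j][1]
Y2 = matmul(T2, transpose(P2))
```

Here `Y2` equals `Y` although `T2` differs from `T`, so the data alone cannot distinguish the two solutions; extra restrictions on T or P are needed to pin one down.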

I've encountered this problem in air pollution source apportionment, where 
people use an algorithm due to Paatero (1999) JCGS 8:854-888, which is a 
conjugate gradient algorithm that handles the constraints by creative 
abuse of preconditioning.  The algorithm seems fairly well-behaved, 
although the statistical properties of the estimates are not 
well-understood [at least, I don't understand them, and I have simulations 
that appear to contradict the views of people who claim to understand 
them].
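
For readers who want something to experiment with when T is unknown, here is a minimal sketch of a different and simpler technique: the multiplicative updates of Lee and Seung for the least-squares (Gaussian) objective. This is not Paatero's algorithm and makes no claim about its behaviour. The function names, the starting values, the iteration count and the small constant `eps` are all assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Plain-list matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def frob_loss(Y, T, P):
    """Sum of squared residuals of Y - T P'."""
    R = matmul(T, transpose(P))
    return sum((y - r) ** 2 for yrow, rrow in zip(Y, R)
               for y, r in zip(yrow, rrow))


def nmf_mu(Y, nc, n_iter=200, eps=1e-12, seed=0):
    """Multiplicative updates for min ||Y - T P'||^2 with T, P >= 0."""
    rng = random.Random(seed)
    n, p = len(Y), len(Y[0])
    # Strictly positive random start (an arbitrary choice).
    T = [[rng.random() + 0.1 for _ in range(nc)] for _ in range(n)]
    P = [[rng.random() + 0.1 for _ in range(nc)] for _ in range(p)]
    for _ in range(n_iter):
        # P <- P * (Y'T) / (P T'T), elementwise.
        num = matmul(transpose(Y), T)
        den = matmul(P, matmul(transpose(T), T))
        P = [[P[j][k] * num[j][k] / (den[j][k] + eps) for k in range(nc)]
             for j in range(p)]
        # T <- T * (Y P) / (T P'P), elementwise.
        num = matmul(Y, P)
        den = matmul(T, matmul(transpose(P), P))
        T = [[T[i][k] * num[i][k] / (den[i][k] + eps) for k in range(nc)]
             for i in range(n)]
    return T, P


# Hypothetical data with an exact non-negative rank-2 factorization.
Y = [[2.0, 3.0, 1.0], [1.0, 2.0, 1.0], [4.0, 7.0, 3.0], [5.0, 8.0, 3.0]]
T0, P0 = nmf_mu(Y, nc=2, n_iter=0)  # the starting values only
T_hat, P_hat = nmf_mu(Y, nc=2)
```

The updates keep every entry non-negative and do not increase the squared error, but they find only a local solution that depends on the starting values, which is another face of the identifiability issue.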

The difficulty probably depends on the size of the problem -- the air 
pollution problems have n~1000, p~20, nc~7, or larger.

 	-thomas


