[BioC] PLGEM on metabolomic data?

Wed Oct 3 04:43:38 CEST 2012

Dear Victor,

Thanks for contacting me about PLGEM. There is a mailing list for
questions about Bioconductor packages. I suggest you sign up for it
(if you haven't done so yet) and post your question there. I will be
happy to answer you through that forum, so that the thread is archived
and the community can benefit from it. I am CCing the mailing list on
this reply for the same reason.

To give you a quick answer, PLGEM is not restricted to use with
microarray or proteomics data. Although these are the two data types
on which it has been validated and tested so far, I suspect there are
other types of data that can be described by a power law relationship
between the standard deviation and the mean. Metabolomics might be one
of them, but I have never tried to fit a PLGEM to a metabolomic
dataset. If you like, you can send me an example file and I can try to
see if it fits.
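
As a rough illustration of what checking for such a relationship could
look like, here is a minimal Python sketch. It is not the fitting
procedure of the plgem package: it simply runs an ordinary least-squares
fit of ln(sd) against ln(mean) across features. The function name
`fit_power_law`, the simulated matrix, and all parameter values (2000
features, 6 replicates, true slope 0.8) are assumptions made up for the
example.

```python
import numpy as np

def fit_power_law(data):
    """Least-squares fit of ln(sd) = slope * ln(mean) + intercept.

    `data` is a features x replicates matrix from a single condition.
    Returns (slope, intercept, r_squared).
    """
    means = data.mean(axis=1)
    sds = data.std(axis=1, ddof=1)      # sample standard deviation per feature
    keep = (means > 0) & (sds > 0)      # logarithms need positive values
    x = np.log(means[keep])
    y = np.log(sds[keep])
    slope, intercept = np.polyfit(x, y, 1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r_squared

# Simulated intensities (hypothetical): sd = 0.5 * mean ** 0.8
rng = np.random.default_rng(0)
true_means = 10 ** rng.uniform(2, 6, size=2000)
true_sds = 0.5 * true_means ** 0.8
data = rng.normal(true_means[:, None], true_sds[:, None], size=(2000, 6))

slope, intercept, r_squared = fit_power_law(data)
print(f"slope={slope:.2f}  intercept={intercept:.2f}  R^2={r_squared:.2f}")
```

On real data, a roughly straight cloud in the ln(mean) vs ln(sd)
scatter plot and a high R^2 would suggest that a power law is a
reasonable description; strong curvature would suggest it is not.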

Regarding normalization, this is very much data-type-specific. I have
no experience analyzing metabolomic data, so I have no specific
recommendations to make. I suggest you collaborate with a statistician
to find the best normalization method for your specific data. Again,
if you like, you are welcome to send me an example file, and I can try
to see if simple normalization methods might be "good enough", but I
will take no responsibility for your final choice. :-)

Is there anyone out there with experience in normalizing metabolomic
data who can chip in and give Victor some advice?

Hope this helps for now!
Good luck with your analysis.

Best,
Norman

From: Victor Nesati [mailto:lsivn at nus.edu.sg]
Sent: Monday, 1 October, 2012 3:55 PM
To: Norman Pavelka (SIgN)
Subject: PLGEM

Hello Norman,

Quite some time ago I read your paper on PLGEM and actually used a bit
of it in my former spectral-count-based quantitative proteomics
project back in Switzerland.

Now I find myself quite close to you, working on a somewhat different
MS-related project, though it seems to me that PLGEM may be quite
applicable here too. We are doing discovery-based metabolomics, and
after a number of Orbitrap runs comparing different samples I found
myself in the quite familiar territory of a mass vs. intensity matrix.

Having played with it for quite a while, I found that my results on
identifying statistically significant features vary greatly depending
on the choice of scaling, the normalization procedure applied before
the statistical test, and the statistical test itself. So I was
wondering about the possibility of using your PLGEM in this setting.
A brief literature search showed that microarray normalization
procedures that work for mRNA data, which contain more than 10,000
features, do not work as well for miRNA data, which contain far fewer.
We have around 6000-9000 masses in the matrix. What normalization and
statistical test would you suggest for identifying the most likely
candidates?

With best wishes

Victor