[BioC] imputing missing data for 70mer array platform, need advice

Sean Davis sdavis2 at mail.nih.gov
Thu Jan 18 21:54:24 CET 2007


On Thursday 18 January 2007 15:33, Betty Gilbert wrote:
> Thanks for getting back to me Dr. Davis,
>
> >1)  Why are the data "missing"?  Is it due to quality of the spot or due
> > to low intensity?  These are two related but different situations.
>
> Most are low-intensity spots, but since this dataset was created,
> normalized, and filtered with Genepix and Acuity software, the data
> matrix that is output no longer differentiates between those two
> situations.

Unfortunately, if you want to deal with missing data appropriately, you
will probably need to go back to the raw data to determine why each value
is questionable and how best to handle it.
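
As a rough illustration of what "going back to the raw data" buys you, here
is a minimal Python sketch that labels each spot with the reason it would be
dropped.  The field names (flag, fg, bg), the convention that negative flags
mean a bad spot, and the min_signal cutoff are all assumptions made for the
example, not the actual Genepix or Acuity output format.

```python
# Hypothetical per-spot records; "flag", "fg" and "bg" are illustrative
# field names, not the columns of any real image-analysis export.
def missing_reason(spot, min_signal=100.0):
    """Return why a spot would be treated as missing, or None if usable."""
    # Assumption: a negative flag marks a spot rejected for quality reasons.
    if spot["flag"] < 0:
        return "quality"
    # Assumption: background-corrected signal below min_signal is "too low".
    if spot["fg"] - spot["bg"] < min_signal:
        return "low_intensity"
    return None

spots = [
    {"id": "g1", "flag": 0, "fg": 1500.0, "bg": 200.0},
    {"id": "g2", "flag": -100, "fg": 900.0, "bg": 150.0},
    {"id": "g3", "flag": 0, "fg": 210.0, "bg": 190.0},
]

# Map each spot id to the reason it is missing (None means keep it).
reasons = {s["id"]: missing_reason(s) for s in spots}
```

Once the two cases are separated, they can be handled differently: a bad
spot carries no information, while a low-intensity spot may still be a real
(if noisy) measurement.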

> >2)  Why not use a package like limma, or some other package that can
> > account for missing data and/or downweight questionable values?  I don't
> > know about ACUITY, but it sounds like it may be doing something like
> > that.
> >
> >Sean
>
> I didn't use limma mainly because I didn't want to have to backtrack
> all the way back to image analysis. I was hoping to start with
> already normalized log ratio values and work from there. I'm assuming
> that ACUITY does in fact filter out missing data prior to the t test.
> The technical support for the program is not very forthcoming about
> how the program treats the data before doing a t test. It just spits
> out a corrected p-value at the end. I'm looking at limma now. Thank
> you for your suggestion.

Again, if you are unclear about how your data are being handled, and that
concerns you (as it evidently does), you may need to use a more transparent
software package.  I think learning to use limma is a reasonable way to go
(but not the only way).
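
To make "account for missing data and/or downweight questionable values"
concrete, here is a small Python sketch of a per-gene weighted one-sample
t-statistic on log ratios.  It is not limma's implementation (limma also
moderates the variance estimate across genes, which this does not do); the
function name weighted_t and the example weights are made up for
illustration.

```python
import math

def weighted_t(values, weights):
    """t-statistic for 'mean log ratio = 0' by weighted least squares.

    Missing values (None) and zero-weight spots are dropped; returns None
    when fewer than two usable observations remain.
    """
    pairs = [(x, w) for x, w in zip(values, weights)
             if x is not None and w > 0]
    n = len(pairs)
    if n < 2:
        return None
    total_w = sum(w for _, w in pairs)
    # Weighted mean of the log ratios.
    mean = sum(w * x for x, w in pairs) / total_w
    # Weighted residual variance on n - 1 degrees of freedom.
    s2 = sum(w * (x - mean) ** 2 for x, w in pairs) / (n - 1)
    if s2 == 0:
        return None
    # Standard error of the weighted mean is sqrt(s2 / total_w).
    return mean / math.sqrt(s2 / total_w)

# One gene across four arrays: the last value is missing.
t_missing = weighted_t([1.0, 2.0, 3.0, None], [1, 1, 1, 1])
# Same gene, but the fourth spot is present and given zero weight.
t_zero_weight = weighted_t([1.0, 2.0, 3.0, 10.0], [1, 1, 1, 0])
```

With all weights equal to 1 this reduces to the ordinary one-sample t-test
on the non-missing values; a weight between 0 and 1 lets a questionable spot
contribute less without discarding it.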

Sean


