[BioC] spot filtering
etbp2 at borcim.wustl.edu
Fri Jul 22 20:04:12 CEST 2005
Thank you for replying. That is very interesting. I am not a statistician,
but when I told some people that I had used a similar approach, leaving all
the data in and filtering later, they (mainly biologists) heavily criticized
it. They said that if you put junk into the system you'll get junk out. In my
opinion this would matter more if you have a lot of bad spots, but how many
is too many? Have you looked at the effect of leaving the "bad" data in,
particularly on the data and on the make-up of the lists you get out?
From: michael watson (IAH-C) [mailto:michael.watson at bbsrc.ac.uk]
Sent: Friday, July 22, 2005 12:01 PM
To: Brooke-Powell, Elizabeth; bioconductor at stat.math.ethz.ch
Subject: RE: spot filtering
Actually, I don't use any of the Bioconductor functions for reading in flags
or for weighting values depending on the quality of the spot, etc.
Generally, what I do is create a table of flags, with spots as the rows and
arrays as the columns. These flags are sometimes GenePix flags, sometimes
composite flags I made up.
Then I do all of my analysis in limma using all the data. I don't weight
anything, and I don't convert anything into NAs.
At the end, I output the data from topTable() into a text file, load it into
MySQL or MS Access, link it to the flags data, and decide which entries in my
topTable list I believe, according to the flags.
Note you *could* do this linking in R using the merge() function too.
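A minimal sketch of that merge() step in base R, for illustration only. The
two data frames are made-up stand-ins (a real run would use the output of
topTable() and your own flag table), and the column names "ID", "array1" etc.
and the convention that a negative flag marks a bad spot are assumptions, not
something limma dictates:

```r
# Hypothetical stand-in for topTable() output: one row per selected spot.
top <- data.frame(
  ID      = c("spot3", "spot1", "spot4"),
  logFC   = c(2.1, -1.8, 1.5),
  P.Value = c(0.001, 0.004, 0.020),
  stringsAsFactors = FALSE
)

# Hypothetical flag table: spots as rows, arrays as columns.
# Assumed convention: 0 = unflagged, negative = bad spot.
flags <- data.frame(
  ID     = c("spot1", "spot2", "spot3", "spot4"),
  array1 = c(0,  0, -1, 0),
  array2 = c(0, -1, -1, 0),
  array3 = c(0,  0,  0, 0),
  stringsAsFactors = FALSE
)

# Link the two tables on the spot identifier; all.x = TRUE keeps every
# row of the gene list even if a spot has no entry in the flag table.
linked <- merge(top, flags, by = "ID", all.x = TRUE)

# Count, for each spot in the list, on how many arrays it was flagged bad.
flag_cols <- setdiff(names(flags), "ID")
linked$n_bad <- rowSums(linked[, flag_cols] < 0)

# Restore the ranking by p-value, since merge() reorders the rows.
linked <- linked[order(linked$P.Value), ]
print(linked)
```

The n_bad column then plays the role of the database link: a spot that is
flagged on most arrays can be inspected or set aside by hand, while nothing
has been removed from the model fit itself.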
From: Brooke-Powell, Elizabeth [mailto:etbp2 at borcim.wustl.edu]
Sent: Fri 22/07/2005 5:41 PM
To: bioconductor at stat.math.ethz.ch
Cc: michael watson (IAH-C)
Subject: spot filtering
I was interested in how you flag your data. When you load your files, do you
read in your flag column as part of a standard GenePix-type output file, so
that limma uses it when the linear model is fit? I use BlueFuse, and its flag
column is quite different from GenePix and the like, so at present it cannot
be used in limma. I am wondering how to mark (flag) the bad data and either
leave it in or, if not, what to put in the data file to get the data ignored,
i.e. can you put NA in place of the data point and have it ignored? Is it as
simple as creating a new flag column that converts the BlueFuse flags into
GenePix-like flags? If I load the data file using the other file type option
in LimmaGUI, it doesn't allow me to tell it where there is a flag column. Is
this something that could be fixed, assuming the flag column conforms to the
GenePix style of 0, +1 and -1 calls?
Thanks for the help and insight,