[BioC] QC question using simpleaffy

Laurent Gatto L.Gatto at dnavision.be
Tue Jun 26 16:36:18 CEST 2007

Hi James,
I don't know how dispersed your percent.present values are, but I don't
think this is the best quality metric to rely on. I would suggest
starting by looking at the scale factors.
Concerning the percent.present values: if you see large differences,
you may start by separating the different groups you analyze and checking
whether the values are consistent within each group. To detect outliers,
you may plot the percent.present values as a boxplot, which gives an idea
of putative outliers. Have a look here [*] for an example.


[*] http://www.dnavision.be/lgatto/yaqc.png
If you are interested in generating such plots, let me know. I have a
package that generates them.
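
The per-group check described above can be sketched in base R as follows. This is only an illustration: the percent.present values, array names and group labels are made up, and the helper flag_outliers is hypothetical (it is not part of simpleaffy). With real data the values would be taken from the percent.present slot of the QC object.

```r
# Hypothetical percent.present values (in %) for twelve arrays.
pp <- c(a1 = 47.5, a2 = 47.9, a3 = 48.2, a4 = 31.0, a5 = 48.6, a6 = 49.1,
        b1 = 37.6, b2 = 37.9, b3 = 38.4, b4 = 38.8, b5 = 39.2, b6 = 39.5)

# Hypothetical experimental groups, one label per array.
group <- factor(rep(c("A", "B"), each = 6))

# Flag arrays whose value lies beyond the boxplot whiskers
# (the default 1.5 * hinge-spread rule) of their own group.
flag_outliers <- function(values, groups) {
  out <- unlist(lapply(split(values, groups), function(v) {
    names(v)[v %in% boxplot.stats(v)$out]
  }), use.names = FALSE)
  if (is.null(out)) character(0) else out
}

outliers <- flag_outliers(pp, group)
print(outliers)

# One box per group; flagged arrays show up as isolated points.
if (interactive()) {
  boxplot(pp ~ group, ylab = "percent.present (%)",
          main = "Percent present by group")
}
```

Comparing each array only to the other arrays of its group avoids flagging a whole group just because that cell type expresses fewer genes.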

Laurent Gatto
From: James Anderson [mailto:janderson_net at yahoo.com] 
Sent: Tuesday, June 26, 2007 3:12 PM
To: Laurent Gatto; bioconductor
Subject: RE: [BioC] QC question using simpleaffy


Yeah, I figured this out later. They are different. So in terms of
identifying outliers by percentage of present calls, would it be
reasonable to just pick those with the lowest percentage of present
calls? The manual says that the
absolute value of percent.present is not a good metric, since some cells
naturally express more genes than others. Is there any other way to
take this into account?


Laurent Gatto <L.Gatto at dnavision.be> wrote:
Dear James,

bioBCalls tells whether the bioB probe (one hybridization control probe,
AFFX-BioB-3_at or AFFX-r2-Ec-bioB-3_at depending on the array type) is
called present, whereas percent.present gives the percentage of all probe
sets that are called present on the array.
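
Because the two slots measure different things, they need not point to the same arrays. A small sketch with made-up values (the array names, calls and percentages below are hypothetical, and the "P"/"M"/"A" coding is assumed to follow the usual present/marginal/absent detection calls):

```r
# Hypothetical QC values for five arrays (not real data).
bioB_calls <- c(arr1 = "P", arr2 = "P", arr3 = "A", arr4 = "P", arr5 = "M")
percent_present <- c(arr1 = 45.1, arr2 = 29.8, arr3 = 44.6,
                     arr4 = 46.3, arr5 = 43.9)

# Arrays where the bioB hybridization control was not called present.
bioB_flagged <- names(bioB_calls)[bioB_calls != "P"]

# Array with the lowest overall percentage of present probe sets.
lowest_pp <- names(which.min(percent_present))

print(bioB_flagged)
print(lowest_pp)
```

Here the bioB call flags arr3 and arr5, which would suggest a hybridization problem, while the lowest percent.present belongs to arr2, so the two criteria disagree.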



Laurent Gatto

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of James
Sent: Monday, June 25, 2007 9:05 PM
To: bioconductor
Subject: [BioC] QC question using simpleaffy

I read in the simpleaffy QC manual that the object obtained is
called "qc". When I type
slotNames(qc)
I get

[1] "scale.factors"      "target"             "percent.present"
[4] "average.background" "minimum.background" "maximum.background"
[7] "spikes"             "qc.probes"          "bioBCalls"

I am trying to locate the arrays with quality problems.
When I type
which(qc@bioBCalls == "A"), the arrays returned are not fully consistent
with the arrays that have the lowest qc@percent.present. The manual says
that the absolute value of percent.present is not a good metric, since some
cells naturally express more genes than others. So what information other
than percent.present is used to determine whether an
array is "A" or "P"?

Many thanks,



Bioconductor mailing list
Bioconductor at stat.math.ethz.ch