[BioC] Detection Above Background (DABG) on Gene Level for Exon and Gene Arrays
michael.imbeault at sympatico.ca
Wed Sep 22 10:31:52 CEST 2010
I asked Affy support (who are very responsive) the same question 6
months ago, and here is the answer I got (the last part is the most relevant):
Hello Dr. Imbeault,
Thank you for contacting Affymetrix Technical Support. I have attached a paper that you will find useful. As far as DABG, DABG is an acronym for Detected Above Background. It is an algorithm that was intended to serve as a confidence score in lieu of a Detection call ("Present" or "Absent"). The first official application of DABG was introduced in the ExACT software for WT Exon Array data analysis. DABG is also represented in Expression Console.
However, based on an internal assessment of performance by our Bioinformatics team (Alan Williams primarily), DABG is not considered to be a very informative or robust metric. Customers should utilize signal estimates and secondary/tertiary analysis methods to determine robust confidence scores and ultimately gene expression results.
The individual probe p-values are computed based on the rank order against the background probe set intensities (probes in the .BGP file). The probe level p-values are combined into a probe set level p-value using the Fisher equation. There are DABG options available in APT to use a percentile (i.e. median) rather than the Fisher equation.
DABG Limitations in gene-level analyses?
The DABG algorithm evaluates detection by combining probe-level p-values that are assumed to be monitoring the same region of a transcript. This assumption is met for an exon-level probe. However, when combining all probes across a transcript in a gene-level analysis, this assumption is not guaranteed. Some of the probes may hit parts of the gene which are expressed while others may not, and yet the gene is still expressed. Since this type of scenario could generate misleading detection calls, DABG is not considered a robust gene-level metric. (from the WT QC WRC).
Please let me know if you have any other questions.
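For anyone who wants to see the mechanics described in the reply above, here is a minimal Python sketch of the three steps it mentions: a rank-based probe p-value against background intensities, Fisher's method for combining probe p-values into a probe set p-value, and the median as a percentile alternative. It also includes the exon-vote rule from the original question quoted further down (call a gene expressed when at least a given fraction of its exons are detected). This is not the APT or xps implementation: all function names, the add-one smoothing, the 0.05 threshold, and the single pooled background (no binning of background probes) are assumptions made for illustration.

```python
import math
import statistics
from bisect import bisect_left


def probe_pvalue(intensity, sorted_background):
    """Empirical p-value: share of background probes at least as bright as this probe.

    `sorted_background` must be sorted ascending. The add-one smoothing is an
    assumption of this sketch; it keeps p > 0 so that log(p) stays finite.
    """
    n = len(sorted_background)
    n_at_least = n - bisect_left(sorted_background, intensity)
    return (n_at_least + 1) / (n + 1)


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k d.f. under the null."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2
    if half_x == 0.0:  # every p-value equals 1
        return 1.0
    # Chi-square survival function for even d.f. = 2k has a closed form:
    # exp(-x/2) * sum_{i<k} (x/2)^i / i!  (each term is computed in log space).
    log_hx = math.log(half_x)
    tail = sum(math.exp(i * log_hx - half_x - math.lgamma(i + 1)) for i in range(k))
    return min(1.0, tail)


def median_combine(pvalues):
    """Percentile alternative to Fisher's method, here the median probe p-value."""
    return statistics.median(pvalues)


def gene_detected_by_vote(exon_pvalues, alpha=0.05, min_fraction=0.5):
    """Exon-vote rule: gene is 'expressed' if enough exons are detected at alpha."""
    detected = sum(p <= alpha for p in exon_pvalues)
    return detected / len(exon_pvalues) >= min_fraction


if __name__ == "__main__":
    background = sorted([20, 25, 31, 40, 42, 55, 60, 75, 90, 120])  # made-up intensities
    probes = [150, 95, 33, 200]                                     # made-up probe set
    pvals = [probe_pvalue(x, background) for x in probes]
    print("probe p-values:", pvals)
    print("Fisher:", fisher_combine(pvals))
    print("median:", median_combine(pvals))
```

Note that Fisher's method and the median can respond quite differently when only some of the probes are bright, which is exactly the situation the gene-level caveat is about, so it is worth trying both on your own data before trusting either at the transcript level.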
On 22/09/2010 4:17 AM, Pascal Gellert wrote:
> Dear Christian & Mark,
> Thanks for your answer.
>> HuGene arrays were originally designed by Affymetrix as arrays
>> measuring the transcript level and thus I assume they have tried to
>> select mainly probes which are not affected by alternative splicing,
>> so the statement may not apply.
> I am not sure, but at this point I think I have to disagree, because
> HuGene and HuExon have many (~65%) identical probes. This is from the
> Affymetrix Technical Note for HuGene:
> "The Human Gene 1.0 ST Array design, wherever possible, uses a subset
> of the same probes on the Human Exon 1.0 ST Array to interrogate the
> more focused, better-annotated content at the gene level."
> So this sounds like HuGene is designed from the core HuExon (but only
> two probes per probe set). It has additional probe sets for genes with
> few exons, but I am not sure if they avoided regions which are likely
>> I guess to be really thorough, one could use the Affymetrix sample
>> data that's been run on 133+2, Gene ST and Exon ST platforms to see
>> how DABG compares to the old mas5calls on the same RNA
> I think I really have to do this. It would also be interesting to see
> whether MAS5 for HuGene and HuExon with the XPS package performs better than
> Best regards,
> On 09/21/2010 11:18 PM, cstrato wrote:
>> Dear Pascal,
>> Due to its design, xps not only supports DABG calls at both the
>> probeset and transcript level, but also supports MAS5 detection calls
>> for both Exon 1.0 ST and Gene 1.0 ST Arrays.
>> To answer your question whether DABG is valid on the transcript level
>> I think you need to distinguish between HuExon and HuGene arrays:
>> For HuExon arrays the statement of Affymetrix is probably true for
>> genes where alternative splicing occurs. However, HuGene arrays were
>> originally designed by Affymetrix as arrays measuring the transcript
>> level and thus I assume they have tried to select mainly probes which
>> are not affected by alternative splicing, so the statement may not
>> apply.
>> I would also be interested to hear about users' experiences not only
>> with DABG on the transcript level but also with MAS5 calls on Exon ST
>> and Gene ST arrays. In my own experience the p-values between DABG
>> and MAS5 calls are almost identical for very low p-values but partly
>> tend to differ for larger p-values.
>> Best regards
>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a A.u.s.t.r.i.a
>> e.m.a.i.l: cstrato at aon.at
>> On 9/21/10 2:27 PM, Pascal Gellert wrote:
>>> Hi all,
>>> The detection above background algorithm calculates a p-value for each
>>> probe set, indicating whether this probe set is expressed or merely
>>> within the background noise.
>>> This is similar to the MAS5 detection calls, but Exon 1.0 ST and Gene
>>> 1.0 ST Arrays don't have mismatch probes, therefore MAS5 cannot be
>>> According to Affymetrix, the DABG is not valid on gene level:
>>> "There is a strong assumption in DABG
>>> that all the probes are measuring the same
>>> thing (i.e., the same transcript). This is not
>>> the case at the gene level due to alternative
>>> splicing. For example, probes for a cassette
>>> exon that is skipped will contribute to a
>>> misleadingly insignificant p-value."
>>> To determine whether a gene is expressed, all probe sets of the gene are
>>> often used: if fewer than e.g. 50% of the exons of the gene are detected
>>> at a given DABG threshold, the gene is considered not expressed.
>>> Nevertheless, the XPS package supports DABG at the gene level. Does anyone
>>> have experience with DABG at the gene level?
>>> Pascal Gellert
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> Search the archives: