[BioC] Limma

James W. MacDonald jmacdon at med.umich.edu
Wed Apr 6 17:59:00 CEST 2011


Hi Seraya,

On 4/6/2011 11:16 AM, Seraya Maouche wrote:
> Dear Jenny,
> Thank you,
> I agree that "absence" does not mean that the transcript is not expressed:
> if the probe sequence used to target that transcript is not performing
> well, the apparent non-expression results from a technical problem rather
> than from the abundance of the transcript in the sample analyzed. Other
> factors, such as image acquisition conditions, can also play a role.
>
> But for my analyses, as I have a large sample size (n > 1500), if a
> transcript is absent in all RNA samples, we would not expect this to be
> just an artifact.
> What I am doing is calculating detection calls, and then I have three
> situations: 1) probes present in all samples, 2) probes absent in all
> samples, and 3) probes absent in a subset of samples. In order not to lose
> information, I am not filtering out probes of category (3).
>
> I also examine the concordance of detection calls for all probes
> (splice variants or technical replicate probes) tagging the same gene, so
> if I have 3 probes for the same gene and all are absent, I consider this
> sufficient confidence to say that the gene is not expressed.
>
> In addition, I am using other gene expression datasets (for the same cell
> type), generated using different platforms, to check whether a probe absent
> in dataset A is also absent in datasets B, C, etc.
>
> As for the reviewer, he cannot understand how we can say that a gene is not
> expressed and, on the other hand, use the intensity for this gene to
> calculate a fold change. But in signal processing, we can measure a quantity
> of signal even if it corresponds to noise and not to the real signal.

I think the reviewer's point is that you are computing a fold-change 
based on something that is primarily due to signal divided by something 
that is primarily due to noise. In other words, the numerator is signal, 
and the denominator is noise.

In addition, the value due primarily to noise is very close to zero, so 
you end up with a huge fold change that can vary widely, depending on 
the variability of your noise signal. So the actual value of the ratio 
is probably not that meaningful, but the fact that it is reliably large 
probably means that the gene is truly differentially expressed. You just 
cannot say reliably by how much.
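
To make that concrete, here is a minimal simulation sketch in Python (not
limma code). Every number in it is made up for illustration: the intensity
means, the standard deviations, the floor of 1.0 and the function name
simulate_ratios are all assumptions, not values from any real Illumina data.
It compares the ratio you get when the denominator is essentially noise
with the ratio you get when the denominator is a real signal.

```python
import random


def simulate_ratios(num_mean, den_mean, den_sd, n=1000, seed=0):
    """Return n simulated fold changes (numerator / denominator).

    All distribution parameters are hypothetical illustration values,
    not estimates from any real microarray data set.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        numerator = rng.gauss(num_mean, 20.0)
        # Measured intensities are never exactly zero, so clamp the
        # denominator at a small positive floor (an assumption).
        denominator = max(1.0, rng.gauss(den_mean, den_sd))
        ratios.append(numerator / denominator)
    return ratios


# Case 1: the denominator is essentially noise, close to zero.
noise_ratios = simulate_ratios(num_mean=1000.0, den_mean=10.0, den_sd=5.0)
# Case 2: the denominator is a real signal well above the noise level.
signal_ratios = simulate_ratios(num_mean=1000.0, den_mean=250.0, den_sd=5.0)

# Spread of the fold change, expressed as largest ratio / smallest ratio.
noise_spread = max(noise_ratios) / min(noise_ratios)
signal_spread = max(signal_ratios) / min(signal_ratios)

print("noise denominator:  min %.1f  max %.1f"
      % (min(noise_ratios), max(noise_ratios)))
print("signal denominator: min %.1f  max %.1f"
      % (min(signal_ratios), max(signal_ratios)))
```

With these made-up settings the noise-denominator ratios should all be
large, but they spread over a much wider range than the signal-denominator
ratios, which is the sense in which the fold change is "reliably large" yet
not reliably quantified.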

Best,

Jim


>
> Best wishes,
> Seraya
>
>
> -----Original Message-----
> From: Jenny Drnevich [mailto:drnevich at illinois.edu]
> Sent: Wednesday, 6 April 2011 16:53
> To: Seraya Maouche; 'James W. MacDonald'; 'Wei Shi'
> Cc: Bioconductor at r-project.org
> Subject: Re: [BioC] Limma
>
> Hi Seraya,
>
> I think your explanation for the reviewer is on track. Despite the fact that
> some microarray platforms allow the calculation of a metric called "present"
> or "absent", they really cannot detect whether a gene is truly expressed or
> not.
> Instead, it's just a metric saying if the signal was above some noise
> threshold. Now, the signal measured will always be non-zero, so we can use
> these numbers to calculate fold-change regardless of the "present" or
> "absent" call. You are right to throw out only those genes that are called
> "absent" in all samples, but for the rest of the genes, the "present/absent"
> metric is not good enough to categorize "off" versus "on" genes, so we just
> use the numbers measured, calculate fold-changes and do statistical tests.
> Hopefully the reviewer will be satisfied with an explanation of this sort.
>
> Good luck,
> Jenny
>
> At 09:39 AM 4/6/2011, Seraya Maouche wrote:
>> Dear Jim, dear Wei,
>> Thanks for your help. It is not a two-color analysis; it is Illumina.
>>
>> Best wishes
>> seraya
>>
>> -----Original Message-----
>> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
>> Sent: Wednesday, 6 April 2011 15:55
>> To: Wei Shi
>> Cc: Seraya Maouche; Bioconductor at r-project.org
>> Subject: Re: [BioC] Limma
>>
>> Hi Wei,
>>
>> I think you misunderstood the OP. Seraya _didn't_ remove genes that
>> were only present in one condition. The problem is that the reviewer
>> didn't like ratios with a zero in the denominator, which is a fair
>> complaint.
>>
>> I don't do two color analyses, so don't know what the consensus is for
>> handling logratios where the denominator is really close to zero. Since
>> you guys do this stuff all the time, perhaps you have some pointers?
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On 4/5/2011 6:43 PM, Wei Shi wrote:
>>> Hi Seraya:
>>>
>>>        Genes which are present in one condition but not in the other
>>> should NOT be removed from your analysis. Only those genes which are
>>> absent in both conditions should be filtered out to improve the power to
>>> detect differentially expressed genes. It is very likely that a lot of
>>> genes of biological interest were not included in your analysis results
>>> due to the removal of genes which are present in one condition but not
>>> in the other. Have a look at the case study on processing Illumina
>>> BeadChip data in the limma user guide, in particular the part about
>>> probe filtering.
>>>
>>>        Hope this helps.
>>>
>>> Cheers,
>>> Wei
>>>
>>> On Apr 6, 2011, at 2:05 AM, Seraya Maouche wrote:
>>>
>>>> Dear Prof Gordon, dear Bioconductor members,
>>>>
>>>> I have performed a gene expression analysis using Limma (Illumina
>>>> human ref8) comparing two types of cells (referred to below as cond1
>>>> and cond2). Based on detection calls, I filtered out transcripts which
>>>> are absent in both types of cells. Transcripts which were expressed
>>>> only in one cell type were included in the analysis.
>>>>
>>>> I have received the comment below from a reviewer who seems not to
>>>> agree with calculating fold changes for genes expressed in only one
>>>> condition. Would it be possible to have your opinion about this?
>>>>
>>>> Thank you in advance for your time, S Maouche
>>>>
>>>> "There is a little conceptual difficulty related to the cond1/cond2
>>>> comparisons for genes that are considered not detected. If a gene
>>>> product is absent (0) in one cell then no fold change can be
>>>> computed (table 2). I don’t know how to circumvent this difficulty
>>>> except by saying that the “noise” is considered to reflect low
>>>> expression. The terms “not detected” and “not expressed” are often
>>>> used interchangeably while this is not the same. Detection is based
>>>> on the definition adopted and in many places of the manuscript it
>>>> should be used in place of expression."
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>> **********************************************************
>> Electronic Mail is not secure, may not be read every day, and should
>> not be used for urgent or sensitive issues
>>
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 


