[BioC] some questions about lumi package and Illumina arrays

Fri Mar 30 11:47:01 CEST 2012

Dear Javier,
You can summarise probes to genes either by selecting the probe with the highest expression for each gene or by averaging the expression values of all probes that map to the same gene. The data need to be annotated before you do this. limma has a function called avereps() that averages rows sharing the same ID, so passing gene identifiers as the ID gives one average expression value per gene.

This way you get a single representative value for each gene. This should be done before you run the limma analysis on your data.
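
As a rough sketch of the two options above, here is a small base R example. The matrix, the probe and gene names, and the variable names are all made up for illustration; with real data the probe-to-gene mapping would come from your annotation. The guarded call at the end shows how avereps() could be used for the averaging option, assuming limma is installed.

```r
# Toy log-expression matrix: rows are probes, columns are samples.
# All identifiers and values here are invented for illustration.
expr <- matrix(
  c(8, 9,
    6, 7,
    10, 12,
    5, 5),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("probe1", "probe2", "probe3", "probe4"),
                  c("sampleA", "sampleB"))
)

# Hypothetical probe-to-gene annotation, in the same row order as expr.
gene <- c("GENE1", "GENE1", "GENE2", "GENE1")

# Option 1: average all probes that map to the same gene.
# rowsum() sums the rows within each gene; dividing by the number of
# probes per gene turns those sums into means.
sums <- rowsum(expr, group = gene)
counts <- table(gene)
gene_mean <- sums / as.vector(counts[rownames(sums)])

# Option 2: keep, for each gene, the probe with the highest mean
# expression across all samples.
probe_mean <- rowMeans(expr)
best_probe <- vapply(
  split(names(probe_mean), gene),
  function(p) p[which.max(probe_mean[p])],
  character(1)
)
gene_top <- expr[best_probe, , drop = FALSE]
rownames(gene_top) <- names(best_probe)

# The averaging option via limma, only if the package is available.
if (requireNamespace("limma", quietly = TRUE)) {
  gene_mean_limma <- limma::avereps(expr, ID = gene)
}
```

Both gene_mean and gene_top have one row per gene. The first blends all probes for a gene, while the second keeps the values of a single real probe, which may matter if the probes target different transcript variants.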

Hope this is what you were looking for.

Best,
Ekta

-----Original Message-----
From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of Javier Pérez Florido
Sent: 30 March 2012 14:14
To: Wei Shi
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] some questions about lumi package and Illumina arrays

Thanks,

Yes, limma has its own pipeline for analyzing Illumina BeadChip data. 
However, all the steps before the detection of differentially expressed 
genes are handled by the lumi package in my case. As stated, I'm 
working with probe level data but instead of using the Sample Probe 
Profile file, I'm working with another file which provides the same 
information as sample probe profile plus information related to beads 
(average, std, etc.). However, this file has no annotation information, 
so I obtain it from the lumiHumanAll annotation database once limma 
has been applied.

I've checked all the annotation elements given by lumiHumanAll.db and, 
as far as I know, none of them provides the definition of probes. If I'm 
not mistaken, this definition can distinguish isoforms and other factors.
How can I get such information from the lumiHumanAll annotation file? 
For example, gene A1CF has three different probes, each corresponding 
to a different transcript variant, but I cannot obtain such 
information from the lumiHumanAll database.

I'm a computer scientist, so apologies for my limited knowledge of 
biology.

Thanks again,
Javier

On 30/03/2012 0:18, Wei Shi wrote:
> Dear Javier,
>
> Firstly, let me point out that limma has its own pipeline for analyzing Illumina BeadChip data, which includes data input, neqc normalization, differential expression analysis, etc. Section 11.7 in the limma user's guide gives a case study for analyzing BeadChip data. The analysis included was performed at probe level.
>
> We always perform probe level expression analysis for Illumina arrays. If you want to perform gene level analysis, you'll have to summarize probes to genes in some way. However, different probes for the same gene might have variable expression levels, due to multiple isoforms and other factors. This makes it hard to summarize probes, and doing so may give misleading results.
>
> On the other hand, you can choose a representative probe for each gene after you perform the probe level analysis, for example by choosing the one with the largest average expression value across all samples. This will give you one expression value for each gene in each sample, if this is what you want.
>
> Cheers,
> Wei
>
> On Mar 30, 2012, at 3:39 AM, Javier Pérez Florido wrote:
>
>> Dear list,
>> I have a few questions about the lumi package:
>>
>>   * How can the control plots be interpreted? For example:
>>     plotControlData, plotHousekeepingGene and plotStringencyGene. I
>>     don't know what to look for in these plots in order to detect
>>     bad samples.
>>   * What is the meaning of the density plot of coefficient of variance?
>>     How can bad arrays be detected? Similar to traditional density plots
>>     (that is, different distributions)?
>>   * In the lumi vignette, as well as in other packages, it is recommended to
>>     work with probe information instead of gene information. For
>>     example, when the sample probe profile file is opened, there might
>>     be several probes that target the same gene. If I work with such a
>>     sample probe profile file, then after the variance-stabilizing
>>     transform and normalization the resulting expression set can be used
>>     as input to limma. My question is: does limma detect
>>     differentially expressed probes or differentially expressed genes?
>>     From a practical point of view, is it the same?
>>
>>
>> Thanks,
>> Javier