[BioC] how deal with multiplicate affy probes?

Sun Mar 21 12:20:35 CET 2004

Hi Jonathan,

Interesting question. Basically, if I'm just interested in the set of
differentially regulated genes, I ignore redundant affy probe sets. I.e. if at
least one probe set for a given gene fulfills the selection criteria (fold
change, p-value ...), I include the *gene* in my list.
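
A rough sketch of this "at least one probe set passes" rule, written in Python
purely for illustration. The probe set names, gene labels, values and cutoffs
below are all made up, and the log2 scale is an assumption; substitute your
own selection criteria.

```python
# Hypothetical per-probe-set results:
# (probe set, gene label, log2 fold change, p-value). All values are invented.
results = [
    ("ps1_at", "geneA", 1.8, 0.001),
    ("ps2_at", "geneA", 0.1, 0.60),   # second probe set for geneA, no change
    ("ps3_at", "geneB", 0.2, 0.40),
    ("ps4_at", "geneC", -1.5, 0.004),
]


def passes(log2_fc, p_value, min_abs_log2_fc=1.0, max_p=0.01):
    """Selection criteria for one probe set (the cutoffs are assumptions)."""
    return abs(log2_fc) >= min_abs_log2_fc and p_value <= max_p


def selected_genes(rows):
    """Return genes for which at least one probe set passes the criteria."""
    hits = set()
    for _probe_set, gene, log2_fc, p_value in rows:
        if passes(log2_fc, p_value):
            hits.add(gene)
    return sorted(hits)


print(selected_genes(results))  # ['geneA', 'geneC']
```

geneA is kept even though its second probe set shows no change, which is
exactly the behaviour described above.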

I usually convert all affy probe set accession codes into their corresponding
LocusLink IDs, from which I then remove duplicates. You could also use
UniGene cluster accession codes. Most of this info is provided by NetAffx.
However, not all probe sets can be mapped to UniGene or LocusLink (I consider
these as orphans and treat them as single genes each).
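
The mapping step could look roughly like the following Python sketch. The
annotation table here is a hand-written dictionary with invented IDs; in
practice it would be built from the NetAffx annotation file, whose exact
layout is not assumed here.

```python
def probe_sets_to_genes(probe_sets, probe_to_locus):
    """Collapse probe sets to unique gene keys, keeping first-seen order.

    Probe sets with no LocusLink ID become "orphans" and are each treated
    as a single gene, keyed by their own probe set name.
    """
    genes = []
    seen = set()
    for probe_set in probe_sets:
        locus = probe_to_locus.get(probe_set)
        key = ("locus", locus) if locus else ("orphan", probe_set)
        if key not in seen:
            seen.add(key)
            genes.append(key)
    return genes


# Hypothetical annotation: two probe sets share one LocusLink ID,
# and ps4_at has no mapping at all.
annotation = {"ps1_at": "LL100", "ps2_at": "LL100", "ps3_at": "LL200"}
hits = ["ps1_at", "ps2_at", "ps3_at", "ps4_at"]

print(probe_sets_to_genes(hits, annotation))
# [('locus', 'LL100'), ('locus', 'LL200'), ('orphan', 'ps4_at')]
```

Tagging each key as "locus" or "orphan" keeps an unmapped probe set name from
ever colliding with a real gene ID.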

Calculating a fold change for a gene that has > 1 probe set is a nasty
problem. Alternative splicing may play a role, too! I suggest keeping the
most extreme fold change of the corresponding probe sets, since fold changes
of the probe sets within a gene can be very different (also with different
significance for differential expression).
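
As a minimal illustration of "keep the most extreme fold change", here is a
Python sketch. It assumes fold changes are on a log2 scale, so that up- and
down-regulation are symmetric around zero and "most extreme" means largest
absolute value; the gene labels and numbers are invented.

```python
def most_extreme_per_gene(rows):
    """For each gene, keep the log2 fold change with the largest magnitude.

    rows: iterable of (gene, log2_fold_change) pairs, one per probe set.
    The sign is preserved, so the direction of regulation is not lost.
    """
    best = {}
    for gene, log2_fc in rows:
        if gene not in best or abs(log2_fc) > abs(best[gene]):
            best[gene] = log2_fc
    return best


# Hypothetical probe-set-level values; geneB's probe sets disagree in sign.
rows = [("geneA", 1.8), ("geneA", -0.3), ("geneB", 0.9), ("geneB", -2.1)]

print(most_extreme_per_gene(rows))  # {'geneA': 1.8, 'geneB': -2.1}
```

With ratios on the raw (non-log) scale, a 0.25-fold change would wrongly look
less extreme than a 2-fold one, hence the log2 assumption.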

	regards,

	Arne

--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch]On Behalf Of Johnnidis,
> Jonathan
> Sent: 20 March 2004 16:40
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] how deal with multiplicate affy probes?
> 
> 
> Hi Bioconductor,
> 
> I'm a new list member and am not quite sure if this question 
> is appropriate for the list, but will shoot anyway. I'm 
> analyzing a bunch of data from Affy MgU74Av2 chips and am a 
> bit perplexed as to how to treat conflicting expression data 
> from multiplicate probe sets (that is, a gene that has >1 
> probe set designed against it; for example, 97569_r_at and 
> 97658_r_at are both probes for the Insulin gene). 
> 
> Specifically, if probe #1 for geneX indicates significant 
> fold change for that gene, but probe #2 indicates something 
> else (no fold change, or even fold change in the opposite 
> direction! (rare, but possible)), how can the expression 
> status of geneX be properly evaluated?  Can one probe's 
> measurement be considered more reliable than another's (and 
> thus toss the one you suspect is wrong (although this could 
> introduce experimental bias))? Or is it most appropriate to 
> average the signal values for multiplicate probes together? 
> Or is there some other method?
> 
> On the MgU74Av2 chip at least, by my calculations there are 
> at least 1079 genes that have >1 probe against them (2323 
> probes total that are 'multiplicates'), so the numbers are 
> great enough to potentially impact my analysis.  Any 
> ideas/suggestions/criticisms will be much appreciated.
> 
> with thanks,
> 
> Jonathan Johnnidis