[BioC] Averaging duplicate probes in an RGList object

Tom Wenseleers tom.wenseleers at bio.kuleuven.be
Thu Mar 25 08:16:40 CET 2010


Dear all,
On my microarray some genes, but not all, are represented by duplicate 
probes. Before I do the stats I would like to average the duplicate 
probes (e.g. GB10001-RA and GB10006-RA in the example below).
I have a limma RGList object, with the gene names stored in 
RG$genes$GeneName, and I would like to average the R / G expression 
values (on a log scale) over all probes for which I have replicate 
spots. Ideally it should work for any number of replicate spots (no 
replicates, or 1, 2 or more replicates). Does anybody have any ideas 
on the most elegant way to average expression values based on the 
occurrence of duplicate entries in RG$genes$GeneName? 
Limma can only average duplicates if every probe has the same number 
of replicates, so that doesn't work for me... For the same reason I 
also can't use the duplicateCorrelation() function...

cheers,
Tom

 > RG[1:5,]
An object of class "RGList"
$R
         sample1   sample2   sample3    sample4 ... sample16
[1,]    26.78257   29.0004   25.0079   26.39580
[2,]    43.38434   28.8024   26.6724   25.95790
[3,] 10714.26000 9939.5930 8176.1090 7799.02200
[4,]   104.95700  111.3113  106.5972   82.49293
[5,]  8381.44300 7420.4830 7276.8480 8260.76500

$G
        sample1    sample2    sample3     sample4 ... sample16
[1,]   23.70600   27.56034   26.02305    28.05054
[2,]   43.21391   27.92983   27.30349    27.08347
[3,] 7163.33400 7767.61200 9167.84500 10383.52000
[4,]   53.17878   73.86636  129.05350    92.27209
[5,] 5974.27400 8053.11300 7722.92500  7193.23000

$targets
[1] "sample1.txt" "sample2.txt" "sample3.txt" "sample4.txt" ... "sample16.txt"

$genes
      FeatureNum Row Col       ProbeName ControlType   GeneName Description SystematicName Status
10005      10015 123  11 MAF_Amel_000001           0 GB10001-RA     Unknown     GB10001-RA   gene
11060      11071 136   1 MAF_Amel_000002           0 GB10001-RA     Unknown     GB10001-RA   gene
7355        7363  90  65 MAF_Amel_000003           0 GB10006-RA     Unknown     GB10006-RA   gene
1433        1435  18  41 MAF_Amel_000004           0 GB10006-RA     Unknown     GB10006-RA   gene
13775      13789 169  13 MAF_Amel_000005           0 GB10007-RA     Unknown     GB10007-RA   gene

$source
[1] "agilent"
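
To make the request concrete, here is a rough base R sketch of the kind of averaging I mean. It is only an illustration on made-up toy data, not limma's own method: the helper average_by_id() and the objects R and gene_name are hypothetical stand-ins for one channel (RG$R or RG$G) and RG$genes$GeneName. It takes the mean of the log2 intensities for every set of rows sharing a gene name, whatever the number of rows per gene, and transforms back. limma's documentation also lists an avereps() function for averaging over irregularly replicated probes, which may be worth checking first.

```r
# Toy stand-ins (values made up for illustration; not taken from the real array)
R <- matrix(c(2, 8, 4, 16, 32,
              4, 16, 8, 32, 64),
            nrow = 5,
            dimnames = list(NULL, c("sample1", "sample2")))
gene_name <- c("GB10001-RA", "GB10001-RA",
               "GB10006-RA", "GB10006-RA",
               "GB10007-RA")

# Hypothetical helper: average rows of x that share the same id,
# working on the log2 scale and returning values on the raw scale.
average_by_id <- function(x, id) {
  # Sum the log2 values within each id, keeping first-occurrence order
  sums <- rowsum(log2(x), group = id, reorder = FALSE)
  # Number of rows contributing to each id, matched by name
  counts <- as.vector(table(id)[rownames(sums)])
  # Mean on the log2 scale, then back-transform
  2^(sums / counts)
}

avg <- average_by_id(R, gene_name)
print(avg)
```

The result has one row per unique gene name, so the annotation would need collapsing in the same way, for example by keeping the first row per gene with RG$genes[!duplicated(RG$genes$GeneName), ]. Missing values are not handled in this sketch.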


