[BioC] Combining data from different versions of Illumina HumanHT-12 v3

Fri Apr 15 12:52:36 CEST 2011

Dear Wei,

I've identified the duplicate probe:

common <- intersect(TB1$genes, TB2$genes)
test <- common == TB1$genes

Inspection of test shows that the repeated probe is "ILMN_2038777" at
position 324 in the vector.
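
For the archive: duplicated() gives a more direct way to find a repeated
ID, without comparing two vectors of unequal length. The following is only
a minimal sketch, assuming the probe IDs are held in a plain character
vector (as TB1$genes appears to be here). The vector ids is made-up toy
data, not the real probe list.

```r
# Toy stand-in for TB1$genes: a character vector with one repeated ID.
# (Hypothetical IDs, for illustration only.)
ids <- c("probe_A", "probe_B", "probe_C", "probe_B")

# duplicated() flags the second and later occurrences of each value.
dup_ids <- unique(ids[duplicated(ids)])

# Positions of every occurrence of the repeated ID(s).
dup_pos <- which(ids %in% dup_ids)

print(dup_ids)
print(dup_pos)
```

On the real data, the equivalent call would be
TB1$genes[duplicated(TB1$genes)].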

I also discovered that the order of the probes differs between batches,
so just deleting the duplicated probe will not allow me to use cbind().
I checked the help file for cbind(), but it states only that the probe
order must be the same between EListRaw objects. The limma manual seems
to suggest the merge() command, but help(merge) says it only takes
RGList or MAList objects. Looking at the code for read.ilmn(), however,
it seems to use cbind() to combine the input from different files
anyway, so in theory merge() should also work?

I also tried combine(), but it doesn't seem to be defined for EListRaw
objects.
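
One possible workaround, sketched here on plain matrices rather than on
EListRaw objects: drop the repeated probe, restrict both batches to their
common probe IDs, index both by ID so the rows line up, and only then call
cbind(). The matrices E1 and E2 and their probe names are toy data invented
for illustration. Whether the same row indexing carries over directly to
EListRaw objects is an assumption I have not verified.

```r
# Toy stand-ins for the expression matrices of two batches.
# Row names play the role of probe IDs. Batch 1 contains one probe
# twice, and the two batches list the probes in different orders.
E1 <- matrix(c(10, 20, 30, 21), ncol = 1,
             dimnames = list(c("probe_A", "probe_B", "probe_C", "probe_B"),
                             "s1"))
E2 <- matrix(c(31, 11, 22), ncol = 1,
             dimnames = list(c("probe_C", "probe_A", "probe_B"), "s2"))

# Keep only the first occurrence of each probe ID in each batch.
E1u <- E1[!duplicated(rownames(E1)), , drop = FALSE]
E2u <- E2[!duplicated(rownames(E2)), , drop = FALSE]

# Probes present in both batches, in the order of batch 1.
common <- intersect(rownames(E1u), rownames(E2u))

# Index both matrices by probe ID so the rows line up, then bind columns.
combined <- cbind(E1u[common, , drop = FALSE],
                  E2u[common, , drop = FALSE])
print(combined)
```

Keeping the first occurrence of the repeated probe discards the values of
the second one. Averaging the two rows would be an alternative if both
measurements matter.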

Gavin.

On 15 April 2011 10:54, Gavin Koh <gavin.koh at gmail.com> wrote:
> Dear Wei,
>
> A little more information: the difference seems to be a single duplicated probe.
> Just comparing two batches (TB1 and TB2) with different probe numbers:
>> length(TB1$genes)
> [1] 48804
>> length(TB2$genes)
> [1] 48803
>> length(unique(TB2$genes))
> [1] 48803
>> length(unique(TB1$genes))
> [1] 48803
>> setdiff(TB1$genes,TB2$genes)
> character(0)
>> setequal(TB1$genes,TB2$genes)
> [1] TRUE
>
> That still leaves me the problem that I don't know how to identify the
> repeated probe or how to cbind TB1 and TB2... :-(
>
> Gavin
>
> On 15 April 2011 02:38, Wei Shi <shi at wehi.edu.au> wrote:
>> Hi Gavin:
>>
>> The number of probes that are present in one batch but not in the others should be very small, so you can use the probes that are common to all batches for your analysis.
>>
>>        Hope this helps.
>>
>> Cheers,
>> Wei
>>
>> On Apr 15, 2011, at 1:20 AM, Gavin Koh wrote:
>>
>>> I am trying to analyse data from ArrayExpress E-GEOD-22098 (published
>>> Dec last year).
>>> According to the study methods, the data are Illumina HumanHT-12 v3
>>> Expression BeadChips, but the hybridisation seems to have been done in
>>> several batches, with different numbers of probes in each batch,
>>> alternating between 48803 and 48804. Can anyone tell me how to combine
>>> these different batches into the same file, please? I am trying to
>>> read the probe data using the read.ilmn() function in limma, but
>>> failing, because cbind complains the matrices are not the same length
>>> (precise error is "Error in cbind(out$E, objects[[i]]$E) : number of
>>> rows of matrices must match (see arg 2)").
>>>
>>> Thank you in advance,
>>>
>>> Gavin Koh

-- 
Hofstadter's Law: It always takes longer than you expect, even when
you take into account Hofstadter's Law.
—Douglas Hofstadter (in Gödel, Escher, Bach, 1979)