[BioC] Beadarray - problem with BSData object created using 'summarize'

Mark Dunning mark.dunning at gmail.com
Mon Jan 17 16:02:21 CET 2011


Hi Kasia,

Many thanks for the bug report. It looks like a typo in our code, which I'll correct.

Regards,

Mark

On Fri, Jan 14, 2011 at 6:33 PM, Kasia Stepien <kasia at cmmt.ubc.ca> wrote:
> Hi Mark,
>
> Thanks a lot! Nice that it is an easy fix. I also managed to get
> around the problem by summarizing the BLData for each chip
> independently, then combining them afterwards.
>
> Another bug in the beadarray package that I found when running
> summarize, specific to ratv1 arrays, is that summarize looks for the
> file 'ratv1BeadLevelMapping.rda', which does not exist in the library.
> However, the file "ratBeadLevelMapping.rda" does, so we made a copy,
> renamed it, and saved it in the directory, which seemed to solve the
> problem temporarily.
>
>> BSData <- summarize(BLDataBsh, list(greenChannel), useSampleFac = FALSE)
> No sample factor specified. Summarizing each section separately
> Finding list of unique probes in beadLevelData
> 23401  unique probeIDs found
> Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
> In addition: Warning message:
> In readChar(con, 5L, useBytes = TRUE) :
>  cannot open compressed file
> '/usr/lib64/R/library/beadarray/extdata/ratv1BeadLevelMapping.rda',
> probable reason 'No such file or directory'
>> load("/usr/lib64/R/library/beadarray/extdata/ratBeadLevelMapping.rda")
>> ratv1BeadLevelMapping<-ratBeadLevelMapping
>> save(ratv1BeadLevelMapping, file="/usr/lib64/R/library/beadarray/extdata/ratv1BeadLevelMapping.rda")
>
> Cheers,
> Kasia
>
> On Tue, Jan 11, 2011 at 8:19 AM, Mark Dunning <mark.dunning at gmail.com> wrote:
>> Hi Kasia,
>>
>> This problem was due to some recent functionality added to beadarray.
>> In short, the summarize function tries to be clever and work out which
>> sections should be combined, and renames columns accordingly. It seems
>> there was a bug that got the names confused when multiple chips are
>> present in the bead-level object.
>>
>> As a simple workaround for your data, if you try
>>
>>> BSData <- summarize(BLDataCombo, list(greenChannel), useSampleFac = FALSE)
>>
>> then it will not try this automatic grouping and naming of samples, and
>> you should get the column names you expect. The bug will be fixed in
>> the devel and release versions of beadarray.
>>
>> Best wishes,
>>
>> Mark
>>
>> On Fri, Jan 7, 2011 at 7:02 PM, Kasia Stepien <kasia at cmmt.ubc.ca> wrote:
>>> Hello!
>>>
>>> I am using beadarray v2.0.2 to analyze bead-level data for Illumina
>>> RatRef-12 whole genome gene expression arrays (I have 8 chips, with 12
>>> samples each, for a total of 96 arrays).
>>>
>>> When trying to create a bead summary object using "summarize", the
>>> section names from the BLData object appear to be recycled (e.g.
>>> "5398636011_A", "5398636011_A.1", rather than "5398636011_A",
>>> "5398636011_B", etc.). At first I thought the arrays themselves were
>>> being reused, but the similarly named columns do not appear to be
>>> identical (see below).
>>>
>>> What could the reason be for this? Is there some argument for
>>> summarize that I can use to get around this problem?
>>>
>>> This is what it looks like for 2 chips, with 12 samples each:
>>>
>>>> BLData1 = readIllumina(dir="/home/kasia/kasiadata/5398636011filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
>>> Processing section 5398636011_A
>>> Processing section 5398636011_B
>>> Processing section 5398636011_C
>>> Processing section 5398636011_D
>>> Processing section 5398636011_E
>>> Processing section 5398636011_F
>>> Processing section 5398636011_G
>>> Processing section 5398636011_H
>>> Processing section 5398636011_I
>>> Processing section 5398636011_J
>>> Processing section 5398636011_K
>>> Processing section 5398636011_L
>>>> BLData2 = readIllumina(dir="/home/kasia/kasiadata/5398636033filtered/", useImages=FALSE, illuminaAnnotation="Ratv1")
>>> Processing section 5398636033_A
>>> Processing section 5398636033_B
>>> Processing section 5398636033_C
>>> Processing section 5398636033_D
>>> Processing section 5398636033_E
>>> Processing section 5398636033_F
>>> Processing section 5398636033_G
>>> Processing section 5398636033_H
>>> Processing section 5398636033_I
>>> Processing section 5398636033_J
>>> Processing section 5398636033_K
>>> Processing section 5398636033_L
>>>> BLDataCombo = combine(BLData1, BLData2)
>>>> is(BLDataCombo)
>>> [1] "beadLevelData"
>>>> sectionNames(BLDataCombo)
>>>  [1] "5398636011_A" "5398636011_B" "5398636011_C" "5398636011_D" "5398636011_E"
>>>  [6] "5398636011_F" "5398636011_G" "5398636011_H" "5398636011_I" "5398636011_J"
>>> [11] "5398636011_K" "5398636011_L" "5398636033_A" "5398636033_B" "5398636033_C"
>>> [16] "5398636033_D" "5398636033_E" "5398636033_F" "5398636033_G" "5398636033_H"
>>> [21] "5398636033_I" "5398636033_J" "5398636033_K" "5398636033_L"
>>>
>>>> myMean = function(x) mean(x, na.rm = TRUE)
>>>> mySd = function(x) sd(x, na.rm = TRUE)
>>>> greenChannel = new("illuminaChannel", logGreenChannelTransform, illuminaOutlierMethod, myMean, mySd, "G")
>>>> BSData <- summarize(BLDataCombo, list(greenChannel))
>>>> str(exprs(BSData))
>>>  num [1:23350, 1:24] 9.92 7.42 7.43 7.5 12.34 ...
>>>  - attr(*, "dimnames")=List of 2
>>>  ..$ : chr [1:23350] "ILMN_2039396" "ILMN_2040732" "ILMN_2039699" "ILMN_2038916" ...
>>>  ..$ : chr [1:24] "5398636011_A" "5398636033_B" "5398636011_C" "5398636033_D" ...
>>>
>>>> colnames(exprs(BSData))
>>>  [1] "5398636011_A"   "5398636033_B"   "5398636011_C"   "5398636033_D"
>>>  [5] "5398636011_E"   "5398636033_F"   "5398636011_G"   "5398636033_H"
>>>  [9] "5398636011_I"   "5398636033_J"   "5398636011_K"   "5398636033_L"
>>> [13] "5398636011_A.1" "5398636033_B.1" "5398636011_C.1" "5398636033_D.1"
>>> [17] "5398636011_E.1" "5398636033_F.1" "5398636011_G.1" "5398636033_H.1"
>>> [21] "5398636011_I.1" "5398636033_J.1" "5398636011_K.1" "5398636033_L.1"
>>>
>>>> identical(exprs(BSData)[,1],exprs(BSData)[,13])
>>> [1] FALSE
>>>
>>>> head(cbind(exprs(BSData)[,1],exprs(BSData)[,13]))
>>>                  [,1]      [,2]
>>> ILMN_2039396  9.917607  9.817626
>>> ILMN_2040732  7.415167  7.436922
>>> ILMN_2039699  7.432883  7.423043
>>> ILMN_2038916  7.504111  7.619327
>>> ILMN_1374916 12.342863 13.377692
>>> ILMN_1353986  7.210915  7.211393
>>>
>>>
>>>> sessionInfo()
>>> R version 2.12.0 (2010-10-15)
>>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>>
>>> locale:
>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] illuminaRatv1BeadID.db_1.8.0 org.Rn.eg.db_2.4.6
>>> [3] RSQLite_0.9-2                DBI_0.2-5
>>> [5] AnnotationDbi_1.12.0         beadarray_2.0.2
>>> [7] Biobase_2.8.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] KernSmooth_2.23-4 limma_3.4.5       tools_2.12.1
>>>
>>>
>>>
>>> Thank you!
>>> Kasia
>>>
>>> --
>>> Kasia Stepien, M.Sc. Candidate
>>> Department of Medical Genetics
>>> University of British Columbia
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>
>
>
> --
> Kasia Stepien, M.Sc. Candidate
> Department of Medical Genetics
> University of British Columbia
>
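
For anyone trying to recognise the symptom in the original report: the ".1" suffixes are what base R's make.unique() appends to duplicated names. The sketch below reproduces the reported column names by pairing a recycled vector of two chip IDs with a recycled set of section letters. It is an illustration only and makes no claim about how beadarray builds the names internally; the variable names chips, sections, bad_names and good_names are made up for this example.

```r
# Illustration only: this is not beadarray's code. It shows how vector
# recycling plus make.unique() can yield names like "5398636011_A.1".
chips    <- c("5398636011", "5398636033")   # chip IDs from the report
sections <- LETTERS[1:12]                   # sections A to L on each chip

# Recycled pairing: paste() recycles the 2 chip IDs against 24 letters,
# so chip and letter fall out of step and the names repeat after 12 columns.
bad_names <- make.unique(paste(chips, rep(sections, times = 2), sep = "_"))

# Intended pairing: all twelve section letters within each chip in turn.
good_names <- paste(rep(chips, each = length(sections)), sections, sep = "_")

print(bad_names[c(1, 2, 13, 14)])
print(good_names[c(1, 2, 13, 14)])
```

The second vector matches the sectionNames() output quoted above, which is a quick way to check whether a summarized object has kept its columns in step with the bead-level sections.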

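
The copy-and-rename workaround quoted above can also be written without a hard-coded library path. The sketch below demonstrates the same load, rename and save pattern on a stand-in object in a temporary directory, so it runs without beadarray installed. In a real session the directory would presumably be located with system.file("extdata", package = "beadarray"), and write access to the R library is needed. The stand-in data frame and the ext_dir and env variables are assumptions for illustration.

```r
# Sketch of the load / rename / save workaround, using a temporary
# directory and a stand-in object instead of the real beadarray files.
ext_dir <- tempfile("extdata")
dir.create(ext_dir)

# Stand-in for the object stored in ratBeadLevelMapping.rda (made-up data)
ratBeadLevelMapping <- data.frame(id = 1:3, value = c("a", "b", "c"))
save(ratBeadLevelMapping, file = file.path(ext_dir, "ratBeadLevelMapping.rda"))
rm(ratBeadLevelMapping)

# Load into a private environment so nothing in the workspace is overwritten
env <- new.env()
load(file.path(ext_dir, "ratBeadLevelMapping.rda"), envir = env)

# Save the same object under the file name the error message asked for,
# mirroring the workaround quoted in the thread
ratv1BeadLevelMapping <- get("ratBeadLevelMapping", envir = env)
save(ratv1BeadLevelMapping,
     file = file.path(ext_dir, "ratv1BeadLevelMapping.rda"))
```

Loading into a separate environment is a design choice rather than a requirement; it simply avoids clobbering an object of the same name that may already exist in the session.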

