[BioC] Problem with extracting subset of samples for 500K in aroma affymetrix

James W. MacDonald jmacdon at med.umich.edu
Mon Jul 19 18:59:51 CEST 2010


Hi Manisha,

This isn't the appropriate list for your query - aroma.affymetrix isn't 
a Bioconductor package. In addition, there is a mailing list for that 
project that would be an appropriate place to ask your question. See

http://www.aroma-project.org/

and/or the list

aroma-affymetrix at googlegroups.com

Best,

Jim



On 7/16/2010 2:38 PM, Manisha Brahmachary wrote:
> Hello,
>
> If anybody in this forum uses the aroma affymetrix, I will be very obliged
> if you can help me with this query:
> I am doing an unpaired copy number analysis with 500K arrays using the
> CRMAv2. I want to use a subset of my samples the reference, instead of
> the all the samples as robust average for CBS segmentation. So I want
> to run CBS as CbsModel(ref,tumor). I am stuck at the point where I
> cannot extract the subset of samples for reference properly. Since the
> CnChipEffectSet is a list of two enzyme set, how do I extract out the
> subset of samples for both enzymes
>
> I have also listed the error I get below:
>
>
>   Here is the code I run:
>
>
> library(aroma.affymetrix)
>
>
> log<- Arguments$getVerbose(-4, timestamp=TRUE)
>
>
> dataSetName<- "Test"
> chipTypes<- c("Mapping250K_Nsp","Mapping250K_Sty")
>
>
> cdfs<- lapply(chipTypes, FUN=function(chipType) {
>    AffymetrixCdfFile$byChipType(chipType)
>
>
> })
>
>
> print(cdfs)
>
> gis<- lapply(cdfs, getGenomeInformation)
> print(gis)
>
>
> sis<- lapply(cdfs, getSnpInformation)
> print(sis)
>
>
> acs<- lapply(lapply(cdfs, getChipType), FUN=function(chipType) {
> AromaCellSequenceFile$byChipType(chipType)})
> print(acs)
>
>
> cesNList<- list()
> chipType<- chipTypes[1]
> cs<- AffymetrixCelSet$byName(dataSetName,chipType=chipType)
> cs<- extract(cs,!isDuplicated(cs))
> print(cs)
>
>
> acc<- AllelicCrosstalkCalibration(cs,model="CRMAv2");
>    print(acc);
>    csAcc<- process(acc, verbose=log);
>    bpn<- BasePositionNormalization(csAcc,target="zero")
>    print(bpn)
>    csN<- process(bpn,verbose=log)
> plm<- RmaCnPlm(csN,mergeStrands=TRUE, combineAlleles=TRUE, shift=
> +300);
>    print(plm);
>    fit(plm, verbose=log);
>
>
>    if (length(findUnitsTodo(plm))>  0) {
>    units<- fitCnProbes(plm, verbose=verbose)
>    str(units)
>     units<- fit(plm, verbose=verbose)
>    str(units)
>     }
>     ces<- getChipEffectSet(plm)
>     print(ces)
>
>
>     fln<- FragmentLengthNormalization(ces, target="zero")
>     print(fln)
>
>
> cesNList[[chipType]]<- process(fln, verbose=verbose)
> ##########################################################################
> #-#
>     # This is for the second chipType
>      chipType<- chipTypes[2]
>      cs<- AffymetrixCelSet$byName(dataSetName,chipType=chipType)
>      cs<- extract(cs,!isDuplicated(cs))
>      print(cs)
>
>
>     acc<- AllelicCrosstalkCalibration(cs,model="CRMAv2");
>     print(acc);
>     csAcc<- process(acc, verbose=log);
>
>
>    bpn<- BasePositionNormalization(csAcc,target="zero")
>    print(bpn)
>    csN<- process(bpn,verbose=log);
>
>
>    plm<- RmaCnPlm(csN,mergeStrands=TRUE, combineAlleles=TRUE, shift=
> +300);
>    print(plm);
>    fit(plm, verbose=log);
> if (length(findUnitsTodo(plm))>  0) {
>    units<- fitCnProbes(plm, verbose=verbose)
>    str(units)
>    units<- fit(plm, verbose=verbose)
>    str(units)
>     }
>     ces<- getChipEffectSet(plm)
>     print(ces);
>
>
>     ces<- getChipEffectSet(plm)
>     print(ces)
>
>
>     fln<- FragmentLengthNormalization(ces, target="zero")
>     print(fln)
>
>
>     cesNList[[chipType]]<- process(fln, verbose=verbose)
> ============================================================
> cesNList is my CnChipEffectSet
> Now my CnChipEffectSet: looks like this
>
>
> $Mapping250K_Nsp
> CnChipEffectSet:
> Name: Test
> Tags: ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/Test,ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Nsp
> Platform: Affymetrix
> Chip type: Mapping250K_Nsp,monocell
> Number of arrays: 80
> Names: AA-HGG024, AA-HGG045, ..., OT-HGG155
> Time period: 2010-07-14 15:46:23 -- 2010-07-14 15:46:40
> Total file size: 764.52MB
> RAM: 0.14MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
>
>
> $Mapping250K_Sty
> CnChipEffectSet:
> Name: Test
> Tags: ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/Test,ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 79
> Names: AA-HGG024, AA-HGG045, ..., OT-HGG155
> Time period: 2010-07-15 17:19:56 -- 2010-07-15 17:20:13
> Total file size: 687.18MB
> RAM: 0.14MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
> =================================================================
> To extract the subset of arrays from cesNList
>
>
> RefCes<- extract(cesNList, 49:77)
>
>
> Error in list(`extract(cesNList, 49:77)` =<environment>,
> `extract.default(cesNList, 49:77)` =<environment>,  :
>
>
> [2010-07-15 22:05:21] Exception: Do not know how to unwrap object:
> list
>    at throw(Exception(...))
>    at throw.default("Do not know how to unwrap object: ", class(x)[1])
>    at throw("Do not know how to unwrap object: ", class(x)[1])
>    at extract.default(cesNList, 49:77)
>    at extract(cesNList, 49:77)
> ==============================================================
> I also tried to first extract the array names I want to use as
> reference set and I did this:
> for (chiptype in names(cesNList)) {
>           ces<- cesNList[[chiptype]];
>           cesnames<- getNames(ces); }
>
>
> refcols<- cesnames[49:77]
> extract(cesNList,refcols)
>   extract(cesNList,refcols)
> Error in list(`extract(cesNList, refcols)` =<environment>,
> `extract.default(cesNList, refcols)` =<environment>,  :
>
>
> [2010-07-15 22:34:24] Exception: Do not know how to unwrap object:
> list
>    at throw(Exception(...))
>    at throw.default("Do not know how to unwrap object: ", class(x)[1])
>    at throw("Do not know how to unwrap object: ", class(x)[1])
>    at extract.default(cesNList, refcols)
>    at extract(cesNList, refcols)
>
>
> Since the CnChipEffectSet is a list of two enzyme set, how do I
> extract out the subset of samples for both enzymes and do the
> downstream analysis such as:
>
>
> ceR<- getAverageFile(RefCes, verbose=verbose)
> cesT<- extract(cesNList,1:48)
> cbs<- CbsModel(ceR, cesT)
>
>
> ==============================================================
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods
> base
>
>
> other attached packages:
>   [1] DNAcopy_1.18.0         preprocessCore_1.6.0
> aroma.affymetrix_1.5.0
>   [4] aroma.apd_0.1.7        affxparser_1.16.0
> R.huge_0.2.0
>   [7] aroma.core_1.5.0       aroma.light_1.16.0
> matrixStats_0.2.1
> [10] R.rsp_0.3.6            R.cache_0.3.0
> R.filesets_0.8.1
> [13] digest_0.4.2           R.utils_1.4.0
> R.oo_1.7.2
> [16] R.methodsS3_1.2.0
>
>
> Thanks
> Manisha
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 



More information about the Bioconductor mailing list