[BioC] Unable to Generate QC Report for mogene10stv1

James W. MacDonald jmacdon at med.umich.edu
Fri Jan 7 21:47:50 CET 2011


Hi Rick,

What happens if you load the simpleaffy package first?
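
In case it helps, here is a rough sketch of what I mean. It assumes a fresh R session with the CEL files in the working directory, as in your session below; I have not run it on your data, and calling qc() on its own is only my guess at a useful extra check, since your traceback shows the failure inside plot(qc(object)):

```r
# Sketch only: run in a fresh R session so nothing else is attached yet.
library(simpleaffy)    # attach simpleaffy first (this also attaches affy)
library(affyQCReport)

# Read the CEL files from the working directory, as in the original session.
mydata <- ReadAffy()

# QCReport() failed inside plot(qc(object)), so call qc() by itself first.
# try() keeps the session alive and captures the error message if it fails.
qcres <- try(qc(mydata))

# Only build the full report if qc() itself succeeded.
if (!inherits(qcres, "try-error")) {
  QCReport(mydata, file = "ExampleQC.pdf")
}
```

If qc(mydata) fails by itself, the message it prints should be more informative than the one from plot().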

Best,

Jim

On 1/7/2011 2:14 PM, Rick Frausto wrote:
> Hi James,
>
> Below is the information that you requested: the output of traceback() and
> sessionInfo(). It doesn't seem like much to me, but perhaps you can help. As
> you answer a lot of e-mails, I thought I'd remind you that this is in regard
> to the "some row.names duplicated" error.
>
> Hope your holidays were good!
>
> -Rick
>
> [R.app GUI 1.35 (5632) x86_64-apple-darwin9.8.0]
>
> [Workspace restored from /Users/rickfrausto/.RData]
> [History restored from /Users/rickfrausto/.Rapp.history]
>
>> library(affy)
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>    Vignettes contain introductory material. To view, type
>    'openVignette()'. To cite Bioconductor, see
>    'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> mydata<- ReadAffy()
>> eset<- rma(mydata)
> Background correcting
> Normalizing
> Calculating Expression
>> write.exprs(eset, file="mydata.txt")
>> mypm<- pm(mydata)
>> mymm<- mm(mydata)
>> myaffyids<- probeNames(mydata)
>> result<- data.frame(myaffyids, mypm, mymm)
>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
> Loading required package: lattice
> Warning message:
> In data.row.names(row.names, rowsi, i) :
>    some row.names duplicated:
> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,5
> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,1
> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,1
> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,1
> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,2
> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,2
> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,2
> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,3
> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,3
> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,4
> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,4
> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,4
> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [...
> truncated]
> Error in plot(qc(object)) :
>    error in evaluating the argument 'x' in selecting a method for function
> 'plot'
>> traceback()
> 2: plot(qc(object))
> 1: QCReport(mydata, file = "ExampleQC.pdf")
>> sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] affyQCReport_1.28.1   lattice_0.19-13        mogene10stv1cdf_2.7.0
> [4] affy_1.28.0           Biobase_2.10.0
>
> loaded via a namespace (and not attached):
>   [1] affyio_1.18.0         affyPLM_1.26.0        annotate_1.28.0
>   [4] AnnotationDbi_1.12.0  Biostrings_2.18.2     DBI_0.2-5
>   [7] gcrma_2.22.0          genefilter_1.32.0     grid_2.12.0
> [10] IRanges_1.8.7         preprocessCore_1.12.0 RColorBrewer_1.0-2
> [13] RSQLite_0.9-4         simpleaffy_2.26.1     splines_2.12.0
> [16] survival_2.36-2       tools_2.12.0          xtable_1.5-6
>>
>
>
>
>
> On 20/12/10 6:33 AM, "James W. MacDonald"<jmacdon at med.umich.edu>  wrote:
>
>> Hi Rick,
>>
>> On 12/17/2010 9:24 PM, Rick Frausto wrote:
>>> Hey Jim,
>>>
>>> Ok, I will give that a go. The only problem is that an ExpressionSet
>>> contains all of the information needed for further analysis (e.g.
>>> phenoData, featureData and annotation, including treatment type, cell
>>> type, time points and replicates). I am still learning how to include all
>>> of these in a complete ExpressionSet. As a starting point I've loaded a txt
>>> file containing some of this information (gene abbreviation, ontology,
>>> probeset ID), which I created using Affymetrix's Expression Console
>>> software, but without replicate, time point and cell type info. Doing this,
>>> I've gotten as far as creating a minimal ExpressionSet, which I guess is
>>> what the functions you mention below do, but using only the information
>>> contained in the CEL files.
>>>
>>> In any case, since, as you say, the functions in the online manual create a
>>> proper ExpressionSet, why would I get the issue of duplication?
>>
>> Oh yeah, the original question ;-D. Try running QCReport() again, and
>> when it errors out run traceback() and send the output. Also include the
>> output of sessionInfo().
>>
>> Jim
>>
>>
>>>
>>> In regard to the 64-bit discussion: it may very well have made enough of a
>>> difference, as the memory error did not come up the last time I tried it.
>>> I'm going to upgrade to 8GB RAM anyway; it can't hurt.
>>>
>>> Cheers,
>>> Rick
>>>
>>>
>>> On 17/12/10 7:20 AM, "James W. MacDonald"<jmacdon at med.umich.edu>   wrote:
>>>
>>>> Hi Rick,
>>>>
>>>> On 12/16/2010 4:13 PM, Rick Frausto wrote:
>>>>> Hi Jim,
>>>>>
>>>>> How do I run an RMA analysis without a proper ExpressionSet? Honest answer:
>>>>> I don't know. I just put in a command line from a manual I found online and
>>>>> it spit out some result; see "#3 Affy packages" in the following link
>>>>> (http://manuals.bioinformatics.ucr.edu/home/R_BioCondManual#biocon_intro).
>>>>
>>>> You are mistaken. All of the functions mentioned there result in a
>>>> proper ExpressionSet. And if you just do
>>>>
>>>> abatch<- ReadAffy()
>>>> eset<- rma(abatch)
>>>>
>>>> Then you will 100% surely get an ExpressionSet.
>>>>
>>>>>
>>>>> Perhaps you don't need an ExpressionSet until after the preprocessing, at
>>>>> least that is what I get from the "An Introduction to Bioconductor's
>>>>> ExpressionSet Class" written by Seth Falcon, Martin Morgan and Robert
>>>>> Gentleman. Everything seemed to be going smoothly until I tried to get a QC
>>>>> Report.
>>>>>
>>>>> Now, the answer for why I would want to do such a thing is easy: simply
>>>>> that I don't know any better :) I just started working with R a few days
>>>>> ago, but I'm learning.
>>>>>
>>>>>
>>>>> Apparently Snow Leopard running as 32-bit can only utilize about 3.2GB of
>>>>> RAM, whereas 64-bit can make use of all 4GB. I'll switch to the 64-bit OS
>>>>> and see if it makes a difference.
>>>>
>>>> Well, it won't be much different. The reason a 32-bit OS can only use
>>>> about 3.2 Gb of RAM is that the OS needs some to run. The 64-bit OS also
>>>> needs to use some RAM, so you won't get all 4 Gb there either. The issue
>>>> is how much RAM can be allocated to a single process, and on a 64-bit OS
>>>> that gets bumped up significantly.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>>
>>>>> Thanks for your insight!
>>>>>
>>>>> Cheers,
>>>>> Rick
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 16/12/10 11:31 AM, "James W. MacDonald"<jmacdon at med.umich.edu>    wrote:
>>>>>
>>>>>> Hi Rick,
>>>>>>
>>>>>> On 12/16/2010 12:57 PM, Rick Frausto wrote:
>>>>>>> Thanks Jim! How much memory would I need? I currently have 4GB, but have
>>>>>>> quite a few other programs running in the background... I'll see if
>>>>>>> closing them helps. Perhaps setting up an "ExpressionSet" would solve the
>>>>>>> problem. I just started reading up on how to set one of these up
>>>>>>> yesterday. I will do this and see if the duplicates go away.
>>>>>>>
>>>>>>> The "mydata" originates from CEL files and then I run the RMA analysis on
>>>>>>> it, but I didn't actually set up a proper ExpressionSet. I'm guessing
>>>>>>> that doing this might reduce the QCReport PDF file size quite
>>>>>>> considerably, since I won't have any duplication, and will make further
>>>>>>> analysis easier.
>>>>>>
>>>>>> How do you run an RMA analysis without setting up a proper
>>>>>> ExpressionSet? The default behavior is to create one. In addition, why
>>>>>> would you want to do such a thing? The ExpressionSet class is
>>>>>> specifically designed to contain these sorts of data.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I'm running Snow Leopard OSX, which can be set up as 64-bit. Would running
>>>>>>> as 64-bit still necessitate more RAM?
>>>>>>
>>>>>> Probably. The difference isn't efficiency, but the ability to address
>>>>>> more RAM. A 32-bit OS can still address all the available memory that
>>>>>> you will have with just 4 Gb RAM, so you need to bump that up if you
>>>>>> want to do all the chips together. As for how much, I don't know. Since
>>>>>> RAM isn't that expensive these days, you might look at maxing your box
>>>>>> out.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> Rick
>>>>>>>
>>>>>>>
>>>>>>> On 15/12/10 7:45 AM, "James W. MacDonald"<jmacdon at med.umich.edu> wrote:
>>>>>>>
>>>>>>>> Hi Rick,
>>>>>>>>
>>>>>>>> On 12/14/2010 3:55 PM, Rick Frausto wrote:
>>>>>>>>> Dear All,
>>>>>>>>>
>>>>>>>>> I have recently entered the world of R. Through some trial and error I'm
>>>>>>>>> becoming more familiar with R and the relevant Bioconductor Affy
>>>>>>>>> packages. I'm a molecular and cell biologist with rudimentary
>>>>>>>>> statistical knowledge and even less knowledge with respect to R.
>>>>>>>>>
>>>>>>>>> When I enter the following:
>>>>>>>>>
>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf")
>>>>>>>>>
>>>>>>>>> I get some errors in return.
>>>>>>>>>
>>>>>>>>> Loading required package: lattice
>>>>>>>>> Error: cannot allocate vector of size 437.4 Mb
>>>>>>>>
>>>>>>>> This indicates that you need more RAM, as you are running out of memory.
>>>>>>>>
>>>>>>>>> In addition: Warning message:
>>>>>>>>> In data.row.names(row.names, rowsi, i) :
>>>>>>>>>        some row.names duplicated:
>>>>>>>>> 4,8,9,13,14,15,16,24,25,26,27,28,29,30,31,36,37,38,39,47,48,49,50,51,52,53,5
>>>>>>>>> 4,58,59,60,64,65,66,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,102,1
>>>>>>>>> 03,104,108,109,110,111,114,119,120,121,122,127,134,136,137,138,139,141,142,1
>>>>>>>>> 47,148,149,152,153,156,157,158,159,162,163,164,165,166,167,168,169,170,171,1
>>>>>>>>> 73,175,176,179,180,183,184,185,186,191,192,195,197,198,199,200,202,206,207,2
>>>>>>>>> 10,219,220,227,228,229,230,233,234,235,240,241,243,245,246,248,249,250,251,2
>>>>>>>>> 52,253,257,259,260,266,271,272,276,277,280,281,284,286,287,289,290,291,292,2
>>>>>>>>> 96,297,298,302,304,305,306,310,311,312,313,317,318,319,321,322,324,334,337,3
>>>>>>>>> 38,339,340,341,345,346,350,351,356,359,362,364,366,367,370,371,373,376,378,3
>>>>>>>>> 82,383,384,385,386,387,388,389,391,394,395,397,398,399,400,402,403,405,406,4
>>>>>>>>> 07,409,410,411,415,416,418,419,425,431,432,433,434,435,440,441,443,445,447,4
>>>>>>>>> 49,450,452,454,455,456,461,464,466,470,472,473,481,487,488,491,492,493,494,4
>>>>>>>>> 95,496,497,498,499,501,502,504,506,507,509,511,513,515,516,51 [...
>>>>>>>>> truncated]
>>>>>>>>
>>>>>>>> What exactly is 'mydata', and how did you generate it? The above warning
>>>>>>>> indicates that you have duplicate row names, which IIRC isn't possible
>>>>>>>> to do with an ExpressionSet.
>>>>>>>>
>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>> *** error: can't allocate region
>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>> R(9062,0xa05c5540) malloc: *** mmap(size=458665984) failed (error code=12)
>>>>>>>>> *** error: can't allocate region
>>>>>>>>> *** set a breakpoint in malloc_error_break to debug
>>>>>>>>
>>>>>>>> More lack of memory errors.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Error in help(dt[i], package = pkg[i], htmlhelp = TRUE) :
>>>>>>>>>        unused argument(s) (htmlhelp = TRUE)
>>>>>>>>> In addition: Warning messages:
>>>>>>>>> 1: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>        datasets have been moved from package 'base' to package 'datasets'
>>>>>>>>> 2: In data(package = .packages(all.available = TRUE)) :
>>>>>>>>>        datasets have been moved from package 'stats' to package 'datasets'
>>>>>>>>> starting httpd help server ... done
>>>>>>>>>
>>>>>>>>> Would someone be able to diagnose the problem and suggest a solution?
>>>>>>>>
>>>>>>>> First, get more RAM. Second, you will be better off using a 64-bit OS.
>>>>>>>> Depending on your hardware, you might be able to just install a 64-bit
>>>>>>>> version of R.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Jim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If it is useful, I am using the following R software: R for Mac OS X GUI
>>>>>>>>> 1.35-dev Leopard build 32-bit. If there is any other info that would be
>>>>>>>>> useful please let me know.
>>>>>>>>>
>>>>>>>>> I had a read of the AffyQCReport Package pdf and I have added the
>>>>>>>>> following line: QCReport(ReadAffy(widget=TRUE)). Then I tried
>>>>>>>>> library(affyQCReport); QCReport(mydata, file="ExampleQC.pdf") again. It
>>>>>>>>> now seems to be doing something, in other words it doesn't go to the
>>>>>>>>> error, yet, but it's been processing for about 10 minutes. I am
>>>>>>>>> analyzing 35 chips.
>>>>>>>>>
>>>>>>>>> Perhaps it would work if I tried to generate each QCReport page
>>>>>>>>> separately rather than as a whole.
>>>>>>>>>
>>>>>>>>> Cordially,
>>>>>>>>> Rick
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Bioconductor mailing list
>>>>>>>>> Bioconductor at r-project.org
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>> Search the archives:
>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>>
>>>
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 


