[BioC] R: Puma question

Manca Marco (PATH) m.manca at maastrichtuniversity.nl
Thu Oct 21 11:28:09 CEST 2010



Dear Richard and dear BioC fellows

I'm following up on my previous help request.

I have corrected the flaw in my phenoData which now looks as follows:

> phenoData(Data)
An object of class "AnnotatedDataFrame"
  sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
  ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL  (12 total)
  varLabels and varMetadata description:
    Group: Group
    Angiotensin: Angiotensin administration

> pData(Data)
                          Group Angiotensin
02_(MoGene-1_0-st-v1).CEL    WT           0
13_(MoGene-1_0-st-v1).CEL    WT           0
23_(MoGene-1_0-st-v1).CEL    WT           0
07_(MoGene-1_0-st-v1).CEL    KO           0
08_(MoGene-1_0-st-v1).CEL    KO           0
18_(MoGene-1_0-st-v1).CEL    KO           0
31_(MoGene-1_0-st-v1).CEL    WT           1
10_(MoGene-1_0-st-v1).CEL    WT           1
11_(MoGene-1_0-st-v1).CEL    WT           1
09_(MoGene-1_0-st-v1).CEL    KO           1
20_(MoGene-1_0-st-v1).CEL    KO           1
30_(MoGene-1_0-st-v1).CEL    KO           1


But puma still gives me the same error:

> Data.mmgmos<-mmgmos(Data)
Error in exprs(object)[mmIndex, ] <- value : 
  NAs are not allowed in subscripted assignments
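
For readers hitting the same message: base R raises it whenever a subscripted assignment with a multi-element replacement value uses an index that contains NA, so the text suggests that mmIndex held at least one NA here. The sketch below reproduces that behaviour with a toy matrix. The names toy_exprs, toy_index and toy_value are made up, and nothing in it touches puma or the real data. Why mmIndex would contain NAs (for example, a chip definition without mismatch probes) is an assumption that this thread does not confirm.

```r
# Toy stand-in for an expression matrix: 4 probes x 2 samples.
toy_exprs <- matrix(0, nrow = 4, ncol = 2)

# Hypothetical index vector with a missing entry, standing in for mmIndex.
toy_index <- c(2L, NA, 4L)
toy_value <- matrix(1, nrow = 3, ncol = 2)

# Attempt the assignment and capture the error message instead of stopping.
err_msg <- tryCatch({
  toy_exprs[toy_index, ] <- toy_value
  NA_character_
}, error = function(e) conditionMessage(e))

print(err_msg)

# A cheap check to run on any index vector before such an assignment.
print(any(is.na(toy_index)))
```

Running the same any(is.na(...)) check on the real index vector would show whether this is what happens inside mmgmos.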


I have also performed RMA normalization as a reference, as Richard suggested, and that runs fine:

> Data.rma<-rma(Data)
Background correcting
Normalizing
Calculating Expression

> Data.rma
ExpressionSet (storageMode: lockedEnvironment)
assayData: 34760 features, 12 samples 
  element names: exprs 
phenoData
  sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
  ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL  (12 total)
  varLabels and varMetadata description:
    Group: Group
    Angiotensin: Angiotensin administration
featureData
  featureNames: 10338001, 10338003, ..., 10608724  (34760 total)
  fvarLabels and fvarMetadata description: none
experimentData: use 'experimentData(object)'
Annotation: mogene10sttranscriptcluster.db 

> phenoData(Data.rma)
An object of class "AnnotatedDataFrame"
  sampleNames: 090320 Blanche 02_(MoGene-1_0-st-v1).CEL, 090320 Blanche 13_(MoGe
  ne-1_0-st-v1).CEL, ..., 090320 Blanche 30_(MoGene-1_0-st-v1).CEL  (12 total)
  varLabels and varMetadata description:
    Group: Group
    Angiotensin: Angiotensin administration

> pData(Data.rma)
                          Group Angiotensin
02_(MoGene-1_0-st-v1).CEL    WT           0
13_(MoGene-1_0-st-v1).CEL    WT           0
23_(MoGene-1_0-st-v1).CEL    WT           0
07_(MoGene-1_0-st-v1).CEL    KO           0
08_(MoGene-1_0-st-v1).CEL    KO           0
18_(MoGene-1_0-st-v1).CEL    KO           0
31_(MoGene-1_0-st-v1).CEL    WT           1
10_(MoGene-1_0-st-v1).CEL    WT           1
11_(MoGene-1_0-st-v1).CEL    WT           1
09_(MoGene-1_0-st-v1).CEL    KO           1
20_(MoGene-1_0-st-v1).CEL    KO           1
30_(MoGene-1_0-st-v1).CEL    KO           1


Have you got any idea what is going on here? I'm sincerely lost =P

Thank you in advance for your attention and for any feedback.

Marco.


P.S.: The code I used to patch the pData and varMetadata is the following:

# Rebuild the sample annotation, reusing the existing row names
pData(Data) <- data.frame(
  "Group" = c("WT","WT","WT","KO","KO","KO","WT","WT","WT","KO","KO","KO"),
  "Angiotensin" = c("0","0","0","0","0","0","1","1","1","1","1","1"),
  row.names = rownames(pData(Data)))
# Attach a description to each annotation column
varMetadata(Data) <- data.frame(labelDescription = c("Group", "Angiotensin administration"))
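
The same annotation table can be built more compactly with rep(), which makes the 2 x 2 design (Group by Angiotensin, three samples per cell) easier to check by eye. This is only a base R sketch: the row names are placeholders rather than the real CEL file names, the names pheno, meta and sample_ids are made up, and storing the columns as factors is an illustrative choice, not something the thread prescribes.

```r
# Placeholder row names; the real ones come from rownames(pData(Data)).
sample_ids <- sprintf("sample_%02d.CEL", 1:12)

# Same ordering as the patch: WT x3, KO x3 untreated, then WT x3, KO x3 treated.
pheno <- data.frame(
  Group       = factor(rep(rep(c("WT", "KO"), each = 3), times = 2),
                       levels = c("WT", "KO")),
  Angiotensin = factor(rep(c("0", "1"), each = 6), levels = c("0", "1")),
  row.names   = sample_ids
)

# One description per column, keyed by column name.
meta <- data.frame(
  labelDescription = c("Group", "Angiotensin administration"),
  row.names        = names(pheno)
)

# Sanity checks: 12 samples and no missing values.
stopifnot(nrow(pheno) == 12, !any(is.na(pheno)))

# Cross-tabulate the design to confirm three samples per cell.
print(table(pheno$Group, pheno$Angiotensin))
```

The cross-tabulation is a quick way to confirm that no sample was assigned to the wrong group before the table is attached to the AffyBatch.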


My session info is, again:
> sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8    
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=en_US.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] puma_1.12.0              mogene10stv1.r3cdf_2.5.0 affy_1.24.2             
[4] Biobase_2.6.1           

loaded via a namespace (and not attached):
[1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.10.1





--
Marco Manca, MD
University of Maastricht
Faculty of Health, Medicine and Life Sciences (FHML)
Cardiovascular Research Institute (CARIM)

Mailing address: PO Box 616, 6200 MD Maastricht (The Netherlands)
Visiting address: Experimental Vascular Pathology group, Dept of Pathology - Room5.08,  Maastricht University Medical Center, P. Debyelaan 25, 6229  HX Maastricht

E-mail: m.manca at maastrichtuniversity.nl
Office telephone: +31(0)433874633
Personal mobile: +31(0)626441205
Twitter: @markomanka


*********************************************************************************************************************

This email and any files transmitted with it are confidential and solely for the use of the intended recipient.

It may contain material protected by privacy or attorney-client privilege. If you are not the intended recipient or the person responsible for

delivering to the intended recipient, be advised that you have received this email in error and that any use is STRICTLY PROHIBITED.

If you have received this email in error please notify us by telephone on +31626441205 Dr Marco MANCA

*********************************************************************************************************************
________________________________________
From: Richard Pearson [richard.pearson at well.ox.ac.uk]
Sent: Thursday, 14 October 2010 18:28
To: Manca Marco (PATH)
Cc: bioconductor mailing list
Subject: Re: [BioC] Puma question

Hi Marco

My guess is that there is a problem with your targets.csv file. It seems that your "adf" object has no varMetadata:

   varLabels and varMetadata description:
     X.Group.:
     X.Treatment.:
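
Labels of the form X.Group. are what base R's make.names() produces from a header field that still carries its double quotes, which can happen when a quoted, tab-separated file is read with quoting disabled. The row names in the adf printout also appear with literal quote characters, which would be consistent with this. The sketch below shows the effect with plain read.table() on a made-up two-sample file. The file contents and the quote = "" setting are assumptions for illustration, not a reconstruction of the actual targets.csv or of how read.AnnotatedDataFrame was configured.

```r
# Hypothetical miniature targets file: quoted fields, tab-separated,
# header one field shorter than the rows (first column becomes row names).
targets_file <- tempfile(fileext = ".csv")
writeLines(c(
  '"Group"\t"Treatment"',
  '"a.CEL"\t"WT"\t"0"',
  '"b.CEL"\t"KO"\t"1"'
), targets_file)

# Quoting disabled: the quote characters stay in the data and in the
# header, and the header is then sanitised by make.names().
raw <- read.table(targets_file, header = TRUE, sep = "\t", quote = "")
print(names(raw))      # "X.Group." "X.Treatment."
print(rownames(raw))   # row names keep their literal double quotes

# Default quoting: the quote characters are stripped while reading.
clean <- read.table(targets_file, header = TRUE, sep = "\t")
print(names(clean))    # "Group" "Treatment"
print(rownames(clean)) # "a.CEL" "b.CEL"

unlink(targets_file)
```

If the real file behaves like the first call, stripping the quotes from the file or enabling quote handling when reading it should give clean column names.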

To confirm this is not a problem specifically with puma, could you try a different summarisation method on your AffyBatch object, e.g. what do you get
if you try:

rma(Data)

If you'd like to send your targets.csv file to me I could have a quick look to see if I can spot the problem.

Best wishes

Richard


On 14/10/2010 10:49, Manca Marco (PATH) wrote:
>
>
> Dear BioC members,
>
> I'm trying to perform an analysis of a set of mouse microarrays (Affymetrix Mouse Gene 1.0-ST Array Transcriptcluster) using the package puma. I'm quite new to this package, so I'm trying to follow the vignette, but I'm getting stuck on a very early error that I'm unable to interpret or tackle:
>
>> Data.mmgmos<- mmgmos(Data)
> Error in exprs(object)[mmIndex, ]<- value :
>    NAs are not allowed in subscripted assignments
>
> Below I'm attaching my whole sequence of commands and sessionInfo, for your convenience
>
>
>> library("affy", "mogene10stv1.r3cdf")
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>    Vignettes contain introductory material. To view, type
>    'openVignette()'. To cite Bioconductor, see
>    'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> getwd();
> [1] "/home/..."
>> workingDir = "/home/...";
>> setwd(workingDir);
>> #loading the data
>> Data<-read.affybatch("02_(MoGene-1_0-st-v1).CEL","07_(MoGene-1_0-st-v1).CEL","08_(MoGene-1_0-st-v1).CEL","09_(MoGene-1_0-st-v1).CEL","10_(MoGene-1_0-st-v1).CEL","11_(MoGene-1_0-st-v1).CEL","13_(MoGene-1_0-st-v1).CEL","18_(MoGene-1_0-st-v1).CEL","20_(MoGene-1_0-st-v1).CEL","23_(MoGene-1_0-st-v1).CEL","30_(MoGene-1_0-st-v1).CEL","31_(MoGene-1_0-st-v1).CEL", cdfname = "mogene10stv1.r3cdf")
> Warning message:
> In read.affybatch("02_(MoGene-1_0-st-v1).CEL", "07_(MoGene-1_0-st-v1).CEL",  :
>    Incompatible phenoData object. Created a new one.
>
>> annotation(Data) = "mogene10sttranscriptcluster.db"
>> Data
> AffyBatch object
> size of arrays=1050x1050 features (21 kb)
> cdf=mogene10stv1.r3cdf (34760 affyids)
> number of samples=12
> number of genes=34760
> annotation=mogene10sttranscriptcluster.db
> notes=
>> adf<- read.AnnotatedDataFrame("targets.csv",header=TRUE, sep="\t")
>> adf
> An object of class "AnnotatedDataFrame"
>    rowNames: "02_(MoGene-1_0-st-v1).CEL", "07_(MoGe
>    ne-1_0-st-v1).CEL", ..., "08_(MoGene-1_0-st-v1).CEL"  (12 total)
>    varLabels and varMetadata description:
>      X.Group.:
>      X.Treatment.:
>> phenoData(Data)<-adf
>> library("puma")
>> Data.mmgmos<- mmgmos(Data)
> Error in exprs(object)[mmIndex, ]<- value :
>    NAs are not allowed in subscripted assignments
>
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>   [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
>   [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
>   [7] LC_PAPER=en_US.utf8       LC_NAME=C
>   [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] puma_1.12.0              mogene10stv1.r3cdf_2.5.0 affy_1.24.2
> [4] Biobase_2.6.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.10.1
>
>
> Thank you in advance for your attention. Any comment or suggestion would be highly appreciated.
>
> Marco
>
> ________________________________________
> From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] on behalf of Sean Davis [sdavis2 at mail.nih.gov]
> Sent: Tuesday, 12 October 2010 11:40
> To: Georgia Tsiliki
> Cc: Bioconductor Newsgroup
> Subject: Re: [BioC] GEOquery question
>
> On Tue, Oct 12, 2010 at 4:51 AM, Georgia Tsiliki <g_tsiliki at hotmail.com> wrote:
>
>>   Dear Dr Davis,
>> I am a biostatistician at BRFAA, Athens. I am currently using the
>> 'GEOquery' package with Bioconductor/R. I had a problem with GSE3494 series;
>> particularly, I cannot download the 'Data Table of the Clinicopathological
>> variables of the Upsala cohort header description' and the 'GEO Sample
>> accession numbers and associated Patient IDs header description' files. Both
>> of them are included in the GEO accession Viewer with an option to download
>> them, but I'm not sure how I can do that via the GEOquery package. I don't
>> think there's a soft file for that particular series, do you think that
>> might be the problem?
>>
>>
> Hi, Georgia.
>
> I realized a few months ago that this GSE (and others like it) existed.  I
> added a function to GEOquery to grab the GSE data tables.  In the case of
> GSE3494, there are two of these data tables, so the function will return a
> list of two data.frames.
>
> gsedt = getGSEDataTables('GSE3494')
>
> Now, gsedt is a list of length 2 and holds each of the GSE data tables in
> the list.  You can use getGEOSuppFiles to get the actual raw data.  With the
> two pieces, it is not difficult to generate an ExpressionSet using the
> normal affy/Bioc tools.
>
> Hope that helps.
>
> Sean
>
>
>
>> Thank you very much for your time,
>> Georgia Tsiliki
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>

--
Dr Richard D Pearson                       richard.pearson at well.ox.ac.uk
Wellcome Trust Centre for Human Genetics   http://www.well.ox.ac.uk/~rpearson
University of Oxford                       Tel: +44 (0)1865 617890
Roosevelt Drive                            Mob: +44 (0)7971 221181
Oxford OX3 7BN, UK                         Fax: +44 (0)1865 287664


