[BioC] (no subject)

Sat Mar 13 18:38:31 CET 2010

Actually, I did not express myself well.  I was not talking about 
error correlation.

People usually look at the correlation in M between arrays.  Only the 
differentially expressed genes should be correlated - the remaining 
M-values are just noise and so should be uncorrelated across 
arrays.  For this reason the correlation should be small in 
magnitude.  Since M represents A - B on one array and B - A on the other 
(with treatments A and B), the correlation should be negative.
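
A minimal sketch of this argument in base R, using simulated data rather 
than real arrays.  The number of genes, the fraction of differentially 
expressed genes, the effect sizes and the noise SD are all assumptions 
chosen for illustration.  It simulates M = R - G for one dye-swap pair 
and compares the correlation before and after re-orienting the second 
array to A - B.

```r
# Simulated illustration only: every number below is an assumption.
set.seed(1)
n_genes  <- 5000   # hypothetical number of probes
n_de     <- 250    # hypothetical number of differentially expressed genes
noise_sd <- 1      # hypothetical SD of the non-biological noise in M

# True log-ratio for treatment A - treatment B: zero for most genes.
true_AB <- numeric(n_genes)
true_AB[seq_len(n_de)] <- rnorm(n_de, mean = 0, sd = 2)

# Array 1: A labelled red, B labelled green, so M = R - G estimates A - B.
M1 <- true_AB + rnorm(n_genes, sd = noise_sd)

# Array 2 (dye swap): B labelled red, A labelled green, so M estimates B - A.
M2 <- -true_AB + rnorm(n_genes, sd = noise_sd)

cor_RG <- cor(M1, M2)    # correlation of the raw R - G values
cor_AB <- cor(M1, -M2)   # correlation after re-orienting array 2 to A - B

print(c(R_minus_G = cor_RG, A_minus_B = cor_AB))
```

With these assumed settings the R - G correlation should come out small 
and negative, because only the few differentially expressed genes 
contribute to it.  The same value with a positive sign is what one would 
expect to see when both arrays are oriented as treatment A - treatment B.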

Great suggestions for reading.

--Naomi

At 08:54 AM 3/13/2010, James W. MacDonald wrote:
>Hi Ana,
>
>You need to re-read what Naomi said. A correlation between dye swaps 
>would be expected. What she was warning about was a correlation 
>between treatments.
>
>As for a good book, have you read the BioC monograph?
>
>http://bioconductor.org/pub/docs/mogr/
>
>The case studies might be of interest as well
>
>http://bioconductor.org/pub/biocases/
>
>Best,
>
>Jim
>
>
>
>Ana Staninska wrote:
>>Dear Naomi, Thank you very much for your very helpful answers. 
>>Could you tell me what you meant when you said that I am in trouble 
>>if the dye-swap correlation between two treatments is -0.2? What is 
>>considered a good dye-swap correlation (I calculate it using the 
>>duplicateCorrelation function in limma)? Also, what is considered 
>>a good correlation between duplicate spots (after normalization)? 
>>I know that the easiest way out is to ask a statistician to do the 
>>analysis, but I would like to learn to do it myself (I am a 
>>mathematician, so I think I should be able to learn it). Could you 
>>point me to some literature from which I could learn a proper 
>>way of dealing with any kind of microarray?
>>Thank you very much once more, Best, Ana
>>
>>>Date: Fri, 12 Mar 2010 16:29:03 -0500
>>>To: staninska at hotmail.com; naomi at stat.psu.edu; 
>>>bioconductor at stat.math.ethz.ch
>>>From: naomi at stat.psu.edu
>>>Subject: RE: [BioC] (no subject)
>>>
>>>Dear Ana,
>>>I actually meant that you should average dye swaps, not spots, 
>>>although either is OK as long as you use corfit for the other.
>>>
>>>If there are no technical replicates for some biological reps, the 
>>>analysis is much more complicated.  This really requires a 
>>>statistical consultant and someone who will do some detailed 
>>>preliminary analyses.
>>>
>>>Naomi
>>>
>>>p.s. I hope that the correlation of -0.2 for the dye swaps is for 
>>>R-G.  If it is for treatment A - treatment B, you have a problem.
>>>
>>>At 03:08 PM 3/12/2010, Ana Staninska wrote:
>>>>Dear Naomi,
>>>>
>>>>Thank you very much for your answer. I just have a few follow-up questions.
>>>>
>>>>How high should the correlation between my duplicate spots be in order 
>>>>to "safely" average them?
>>>>Before normalization, the correlation between my duplicate spots 
>>>>is around 0.7-0.8, but after normalization
>>>>it is only around 0.4-0.6, which I think is not ideal.
>>>>I should probably mention that the correlation of the dye-swapped 
>>>>arrays is around -0.2.
>>>>
>>>>Also, for some of the experiments we had to remove certain 
>>>>arrays, and therefore not all of my biological replicates are dye swapped.
>>>>In that case I think I should use the contrast matrix to average 
>>>>the treated vs non-treated comparisons.
>>>>Isn't it then better to use corfit$consensus on my duplicate spots?
>>>>
>>>>Thank you very much in advance,
>>>>
>>>>All the best,
>>>>Ana
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>Date: Fri, 12 Mar 2010 12:28:06 -0500
>>>>>To: staninska at hotmail.com; bioconductor at stat.math.ethz.ch
>>>>>From: naomi at stat.psu.edu
>>>>>Subject: Re: [BioC] (no subject)
>>>>>
>>>>>The estimated error variance used for the test denominator will be an
>>>>>average of technical and biological replication, and therefore not
>>>>>really appropriate for your analysis. However, you could average the
>>>>>2 technical replicates prior to running limma which would give you
>>>>>the right error structure.
>>>>>
>>>>>--Naomi
>>>>>
>>>>>At 12:04 PM 3/12/2010, Ana Staninska wrote:
>>>>>
>>>>>>Dear Bioconductor,
>>>>>>I have a simple experiment that I have to analyze in order to find
>>>>>>differentially expressed genes. I have 10 biological replicates, and
>>>>>>each biological replicate has two technical replicates, which are
>>>>>>dye swapped. So in total I have 20 arrays. Each probe is
>>>>>>spotted twice on the array (on the left and on the right hand side).
>>>>>>I use limma to do my analysis. I know that at the moment it is not
>>>>>>possible to handle duplicate spots, technical replicates and
>>>>>>biological replicates together, but I thought that if I use the
>>>>>>duplicateCorrelation function on my duplicate spots, and then use
>>>>>>a contrast matrix to average all of the Treated vs Non-treated
>>>>>>biological samples, I could address all 3 levels of replication.
>>>>>>Am I correct?
>>>>>>
>>>>>>I am sending a copy of my code, in case someone could look at it and
>>>>>>tell me whether I have made a mistake somewhere.
>>>>>>Thank you very much in advance,
>>>>>>Sincerely Ana Staninska
>>>>>>
>>>>>>
>>>>>>library(limma)
>>>>>>library(statmod)
>>>>>>library(marray)
>>>>>>library(convert)
>>>>>>library(hexbin)
>>>>>>library(gridBase)
>>>>>>library(RColorBrewer)
>>>>>>
>>>>>>targets <- readTargets("Lysi_270705.txt")
>>>>>>
>>>>>>### Only manually removed or absent spots are given 0 weight ###
>>>>>>RGa <- read.maimages(targets, source="genepix",
>>>>>>    wt.fun=wtflags(weight=0, cutoff=-75),
>>>>>>    other.columns=c("F635 SD","B635 SD","F532 SD","B532 SD",
>>>>>>                    "B532 Mean","B635 Mean","F Pixels","B Pixels"))
>>>>>>## Read LYSI270705_1_200905.gpr    Read LYSI270705_1dw_200905.gpr
>>>>>>## Read LYSI270705_2_200905.gpr    Read LYSI270705_2dw_200905.gpr
>>>>>>## Read LYSI270705_3_121005.gpr    Read LYSI270705_3dw_121005.gpr
>>>>>>## Read LYSI270705_4_121005.gpr    Read LYSI270705_4dw_121005.gpr
>>>>>>## Read LYSI270705_5_121005.gpr    Read LYSI270705_5dw__121005.gpr
>>>>>>## Read LYSI270705_6_121005.gpr    Read LYSI270705_6dw__121005.gpr
>>>>>>## Read LYSI270705_7_151001.gpr    Read LYSI270705_7dw_151005.gpr
>>>>>>## Read LYSI270705_8_151005.gpr    Read LYSI270705_8dw_151005.gpr
>>>>>>## Read LYSI270705_9_151005.gpr    Read LYSI270705_9dw_151005.gpr
>>>>>>## Read LYSI270705_10_151005.gpr   Read LYSI270705_10dw_151005.gpr
>>>>>>
>>>>>>for(i in 1:nrow(RGa)){
>>>>>>    for(j in 1:ncol(RGa)){
>>>>>>        if(RGa$Rb[i,j]+RGa$R[i,j]+RGa$G[i,j]+RGa$Gb[i,j] == 0)
>>>>>>            RGa$weights[i,j] <- 0
>>>>>>    }
>>>>>>}
>>>>>>
>>>>>>####################################################
>>>>>>### Background Correction = Normexp + offset 25 ####
>>>>>>####################################################
>>>>>>
>>>>>>RG <- backgroundCorrect(RGa, method="normexp",
>>>>>>    normexp.method="mle", offset=25)
>>>>>>## Green channel
>>>>>>## Corrected array 1 ... Corrected array 20
>>>>>>## Red channel
>>>>>>## Corrected array 1 ... Corrected array 20
>>>>>>
>>>>>>####################################################
>>>>>>##### normalize Within arrays #########
>>>>>>####################################################
>>>>>>
>>>>>>MA <- normalizeWithinArrays(RG, method="loess")
>>>>>>
>>>>>>####################################################
>>>>>>###### Contrast Matrix ############
>>>>>>####################################################
>>>>>>
>>>>>>design <- cbind(
>>>>>>    MU1vsWT1=c(1,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>>>>>>    MU2vsWT2=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>>>>>>    MU3vsWT3=c(0,0,0,0,1,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
>>>>>>    MU4vsWT4=c(0,0,0,0,0,0,1,-1,0,0,0,0,0,0,0,0,0,0,0,0),
>>>>>>    MU5vsWT5=c(0,0,0,0,0,0,0,0,1,-1,0,0,0,0,0,0,0,0,0,0),
>>>>>>    MU6vsWT6=c(0,0,0,0,0,0,0,0,0,0,1,-1,0,0,0,0,0,0,0,0),
>>>>>>    MU7vsWT7=c(0,0,0,0,0,0,0,0,0,0,0,0,1,-1,0,0,0,0,0,0),
>>>>>>    MU8vsWT8=c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,-1,0,0,0,0),
>>>>>>    MU9vsWT9=c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,-1,0,0),
>>>>>>    MU10vsWT10=c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,-1))
>>>>>>
>>>>>>cont.matrix <-
>>>>>>
>>>>>>####################################################
>>>>>>### Duplicate Correlations on duplicate spots ####
>>>>>>####################################################
>>>>>>
>>>>>>corfit <- duplicateCorrelation(MA, ndups=2, spacing=192)
>>>>>>
>>>>>>####################################################
>>>>>>##### Linear Fit Model and Contrasts fit #######
>>>>>>####################################################
>>>>>>
>>>>>>fit <- lmFit(MA, design, ndups=2, spacing=192,
>>>>>>    cor=corfit$consensus)
>>>>>>
>>>>>>fit <- contrasts.fit(fit, cont.matrix)
>>>>>>
>>>>>>####################################################
>>>>>>######### eBayes Statistics ###############
>>>>>>####################################################
>>>>>>
>>>>>>fit <- eBayes(fit)
>>>>>>
>>>>>>####################################################
>>>>>>### Writing the Results ######
>>>>>>####################################################
>>>>>>
>>>>>>TTnew <- topTable(fit, coef=1, number=100, adjust="BH")
>>>>>>
>>>>>>
>>>>>>Ana Staninska
>>>>>>Helmholtz-Zentrum Muenchen
>>>>>>Department of Scientific Computing
>>>>>>Neuherberg, Deutschland
>>>>>>+49 (0) 89 3187 2656
>>>>>>
>>>>>>_______________________________________________
>>>>>>Bioconductor mailing list
>>>>>>Bioconductor at stat.math.ethz.ch
>>>>>>https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>Search the archives:
>>>>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>Naomi S. Altman 814-865-3791 (voice)
>>>>>Associate Professor
>>>>>Dept. of Statistics 814-863-7114 (fax)
>>>>>Penn State University 814-865-1348 (Statistics)
>>>>>University Park, PA 16802-2111
>>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111