[altirriba@hotmail.com: [BioC] Design in factDesign] (fwd)

Jordi Altirriba Gutiérrez altirriba at hotmail.com
Tue Apr 13 11:10:07 CEST 2004


Thank you very much for your answers! And thanks for updating the factDesign 
so fast!
Yours sincerely,

Jordi Altirriba
IDIBAPS - Hospital Clínic - Barcelona

>From: Denise Scholtens <dscholte at hsph.harvard.edu>
>To: Kasper Daniel Hansen <k.hansen at biostat.ku.dk>
>CC: Jordi Altirriba Gutiérrez 
><altirriba at hotmail.com>,bioconductor at stat.math.ethz.ch
>Subject: Re: [altirriba at hotmail.com: [BioC] Design in factDesign] (fwd)
>Date: Mon, 12 Apr 2004 11:30:20 -0400 (EDT)
>
>I'm sorry I missed the difference between "*" and ":" previously - I
>always forget which means which.
>
>Kaspar is right - in statistics, it's not a good idea to fit an
>interaction between two terms in a linear model without including the main
>effects in the model as well.  In the situation you describe below, you
>could compare the "D+T+D:T" model to the "T" to find genes that are
>differentially expressed for diabetic patients compared to nondiabetic
>patients in either the presence or absence of treatment.  This will take
>account of the observations under all four treatment conditions.  You
>could then further analyze specific tests of contrasts for the parameters
>in the model to get at more directed questions about the behavior of the
>genes under the experimental conditions - the factDesign vignette contains
>examples of this.
>
>BTW - factDesign is now updated so that the contrastTest function returns
>the F-value referred to previously.
>
>Denise
>
>On Fri, 9 Apr 2004, Kasper Daniel Hansen wrote:
>
> > On Thu, Apr 08, 2004 at 08:35:08PM +0200, Jordi Altirriba Gutiérrez 
>wrote:
> > > But there is something that I don't understand (sorry if it is a very 
>basic
> > > question), because when I want to get the genes that are 
>differentially
> > > expressed due to diabetes I "fit" my data to the functions: lm(y ~ 
>DIABETES
> > > + TREATMENT + DIABETES * TREATMENT) and lm(y ~ TREATMENT). Therefore 
>the
> > > genes that "fit" better to the first function are differentially 
>expressed
> > > due to diabetes, but why don't I fit my data to the functions: lm(y ~
> > > DIABETES + TREATMENT + DIABETES * TREATMENT) and lm(y ~ TREATMENT +
> > > DIABETES * TREATMENT)? I know that the parameter DIABETES * TREATMENT 
>is
> > > the intersection of the other two parameters, but it should be 
>independent
> > > of these parameters.
> >
> > (I have not read the rest of the discussion)
> >
> > In R, T * D in a model formula is short hand for T + D + T:D. Using this 
>we get
> >   T + D + T*D "=" T + D + T + D + T:D "=" T + D + T:D
> > (since you only need one occurence of a term), and
> >   T + T*D "=" T + T + D + T:D "=" T + D + T:D
> > so the two formulas are equal.
> >
> > However, if I understand the intention behind the question, you want to 
>exclude a main effect in the presence of an interaction (or
> > to be presice, you want to test T + T:D vs T*D). This is something which 
>makes no sense at all. I suggest you pick up a basic book on
> > statistics and read on main effects and interactions.
> >
> > But ok, a very quick explanation: an interaction between diabetes and 
>treatment means that the effect of diabetes (on gene
> > expression) is different for the different treatment groups (eg. the 
>effect of diabetes may disappear amongst the treated patients).
> > Hence you have some effect of treatment as well as diabetes.
> >
> > /Kasper
> >
> > > Jordi Altirriba
> > > IDIBAPS - Hospital Clinic (Barcelona, Spain)
> > > >
> > > >Hello Jordi,
> > > >
> > > >My comments to your questions are below.  I hope this helps.  -Denise
> > > >
> > > 
> >__________________________________________________________________________
> > > >Denise Scholtens
> > > >Department of Biostatistics
> > > >Harvard School of Public Health
> > > >dscholte at hsph.harvard.edu
> > > >
> > > >Hi all!
> > > >I’ve been using RMA and LIMMA to analyse my data and I am currently 
>trying
> > > >to analyse it with the package factDesign. My design is a 2x2 
>factorial
> > > >design with 4 groups: diabetic treated, diabetic untreated, health 
>treated
> > > >and health untreated with 3 biological replicates in each group. I 
>want to
> > > >know what genes are differentially expressed due to diabetes,  to the
> > > >treatment and to the combination of both (diabetes + treatment).
> > > >My phenoData is:
> > > >>pData(eset)
> > > >         DIABETES    TREATMENT
> > > >DNT1     TRUE       FALSE
> > > >DNT2     TRUE       FALSE
> > > >DNT3     TRUE       FALSE
> > > >DT1        TRUE       TRUE
> > > >DT2        TRUE       TRUE
> > > >DT3        TRUE       TRUE
> > > >SNT1      FALSE     FALSE
> > > >SNT2      FALSE     FALSE
> > > >SNT3      FALSE     FALSE
> > > >ST1         FALSE     TRUE
> > > >ST2         FALSE     TRUE
> > > >ST3         FALSE     TRUE
> > > >
> > > >Are these commands correct to get the results listed below? Where are 
>the
> > > >errors?
> > > >>lm.full<-function(y) lm(y ~ DIABETES + TREATMENT + DIABETES * 
>TREATMENT)
> > > >>lm.diabetes<-function(y) lm(y ~ DIABETES)
> > > >>lm.treatment<-function(y) lm(y ~ TREATMENT)
> > > >>lm.diabetestreatment<-function(y) lm(y ~ DIABETES + TREATMENT)
> > > >>lm.f<-esApply(eset, 1, lm.full)
> > > >>lm.d<-esApply(eset, 1, lm.diabetes)
> > > >>lm.t<-esApply(eset, 1, lm.treatment)
> > > >>lm.dt<-esApply(eset, 1, lm.diabetestreatment)
> > > >
> > > >#####
> > > ># Yes, these commands look correct for making the linear models and
> > > ># running them for the exprSet called eset.
> > > >######
> > > >
> > > >## To get the genes characteristics of the treatment:
> > > >>Fpvals<-rep(0, length(lm.f))
> > > >>for (i in 1:length(lm.f)) {Fpvals[i]<-anova(lm.d[[i]], 
>lm.f[[i]])$P[2]}
> > > >>Fsub<-which(Fpvals<0.01)
> > > >>eset.Fsub<-eset[Fsub]
> > > >>lm.f.Fsub<-lm.f[Fsub]
> > > >>betaNames<-names(lm.f[[1]] [["coefficients"]])
> > > >>lambda<-par2lambda(betaNames, c("TREATMENTTRUE"), c(1)) ## I get the 
>same
> > > >>genes if I write : > lambda2 <- par2lambda (betaNames,
> > > >>list(c("TREATMENTTRUE" , "DIABETESTRUE:TREATMENTTRUE")),list( 
>c(1,1)))
> > > >>mainTR<-function(x) contrastTest(x,lambda,p=0.1) [[1]]
> > > >>mainTRgenes<-sapply(lm.f.Fsub, FUN=mainES)
> > > >
> > > >#####
> > > ># I think the problem is the use of mainES rather than mainTR in the 
>last
> > > ># sapply. mainES is a function that is defined in the factDesign 
>vignette
> > > ># - your own function should be used here instead.  If you define the
> > > ># function differently for different contrasts, my guess is you will 
>see
> > > ># different gene lists for the lambda and lambda2 defined above.
> > > >#####
> > > >
> > > >
> > > >## To get the genes characteristics of the diabetes:
> > > >>for (i in 1:length(lm.f)) {Fpvals[i]<-anova(lm.t[[i]], 
>lm.f[[i]])$P[2]}
> > > >>Fsub<-which(Fpvals<0.01)
> > > >>eset.Fsub<-eset[Fsub]
> > > >>lm.f.Fsub<-lm.f[Fsub]
> > > >>betaNames<-names(lm.f[[1]] [["coefficients"]])
> > > >>lambda<-par2lambda(betaNames, c("DIABETESTRUE"), c(1)) ## I get also 
>the
> > > >>same genes if I consider the intersection 
>DIABETESTRUE:TREATMENTTRUE.
> > > >>mainDI<-function(x) contrastTest(x,lambda,p=0.1) [[1]]
> > > >>mainDIgenes<-sapply(lm.f.Fsub, FUN=mainES)
> > > >
> > > >#####
> > > ># See above comments.
> > > >#####
> > > >
> > > >## To get the genes characteristics of the diabetes+treatment:
> > > >>for (i in 1:length(lm.f)) {Fpvals[i]<-anova(lm.dt[[i]], 
>lm.f[[i]])$P[2]}
> > > >>Fsub<-which(Fpvals<0.01)
> > > >>eset.Fsub<-eset[Fsub]
> > > >>lm.f.Fsub<-lm.f[Fsub]
> > > >>  betaNames<-names(lm.f[[1]] [["coefficients"]])
> > > >>lambda<-par2lambda(betaNames, c("DIABETESTRUE:TREATMENTTRUE"), c(1))
> > > >>mainDT<-function(x) contrastTest(x,lambda,p=0.1) [[1]]
> > > >>mainDTgenes<-sapply(lm.f.Fsub, FUN=mainES) ## I don’t get any “fail 
>to
> > > >>reject” gene.
> > > >
> > > >#####
> > > ># Again, I think changing mainES to mainDT will do the trick.
> > > >#####
> > > >
> > > >
> > > >When I get the “rejected” and the “failed to reject” genes, can I 
>classify
> > > >them by their Fvalues? How?
> > > >
> > > >#####
> > > ># Currently, the contrastTest function only returns the contrast 
>estimate
> > > ># (cEst), the pvalue from the F-test (pvalue), and a statement of 
>either
> > > ># "REJECT" or "FAIL TO REJECT" based on the p-value cutoff you 
>specify.
> > > ># This can be changed to return the F-value as well, and I'm happy to 
>put
> > > ># this change into the package.  Then you can use the Fvalues for 
>whatever
> > > ># you would like.
> > > >#
> > > ># One thing to consider if you are going to use p-values from the F 
>tests
> > > ># to select genes - you will want to corrent for multiple testing.  
>The
> > > ># multtest package is very useful for this.
> > > >######
> > > >
> > > >
> > > >Thank you very much for your comments and help.
> > > >Yours sincerely,
> > > >
> > > >Jordi Altirriba
> > > >IDIBAPS-Hospital Clinic (Barcelona, Spain)
> > >
> > > _________________________________________________________________
> > > Déjanos tu CV y recibe ofertas de trabajo en tu buzón. Multiplica tus
> > > oportunidades con MSN Empleo. http://www.msn.es/Empleo/
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at stat.math.ethz.ch
> > > https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
> >
> > --
> > Kasper Daniel Hansen, Research Assistant
> > Department of Biostatistics, University of Copenhagen
> >
> >
>
>
>__________________________________________________________________________
>Denise Scholtens
>Department of Biostatistics
>Harvard School of Public Health
>Office: M1B11, DFCI
>Phone: 617.632.4494
>dscholte at hsph.harvard.edu

_________________________________________________________________
Horóscopo, tarot, numerología... Escucha lo que te dicen los astros.



More information about the Bioconductor mailing list