[R] Determining variance components of classed covariates

Mon Jan 12 18:53:04 CET 2009

You might want to try the
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
list next time for mixed model questions.

At any rate the Variance column figures are variances, not percentages.

We can use anova with REML=FALSE to make comparisons among
models.  Below we find that removing the rs7074431 term makes very
little difference so we can drop it but removing the rs11834524 term
makes a big difference.  Thus modx0 can be used.

> modxx <- lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431), input_new, REML = FALSE)
> modx0 <- lmer(Expression ~ 1 + (1|rs11834524), input_new, REML = FALSE)
> mod0x <- lmer(Expression ~ 1 + (1|rs7074431), input_new, REML = FALSE)
> anova(modxx, modx0)
Data: input_new
Models:
modx0: Expression ~ 1 + (1 | rs11834524)
modxx: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
      Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
modx0  3 111.386 119.460 -52.693
modxx  4 110.288 121.053 -51.144 3.0986      1    0.07836 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(modxx, mod0x)
Data: input_new
Models:
mod0x: Expression ~ 1 + (1 | rs7074431)
modxx: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
      Df      AIC      BIC   logLik  Chisq Chi Df Pr(>Chisq)
mod0x  3  206.652  214.726 -100.326
modxx  4  110.288  121.053  -51.144 98.365      1  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

On Mon, Jan 12, 2009 at 11:36 AM, Stephen Montgomery <sm8 at sanger.ac.uk> wrote:
> Hi -
>
> I am interested in solving variance components for the data below with
> respect to the response variable, Expression within R.
>
> However, the covariates aren't independent and they also have a class
> (of which the total variance explained by covariates in that class I am
> most interested in).
>
> Very naively, I have tried to look at each individual covariates
> variance like this
>
>> lm<-lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431),
> data=input_new)
>> lm
> Linear mixed-effects model fit by REML
> Formula: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
>   Data: input
>   AIC   BIC logLik MLdeviance REMLdeviance
>  108.4 116.5 -51.22      102.5        102.4
> Random effects:
>  Groups     Name        Variance Std.Dev.
>  rs11834524 (Intercept) 0.485538 0.69681
>  rs7074431  (Intercept) 0.013720 0.11713
>  Residual               0.128853 0.35896
> number of obs: 109, groups: rs11834524, 3; rs7074431, 3
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)   9.9524     0.4098   24.29
>
> My assumption is that this is telling me that rs11834524 explains
> 0.485538 of the variance and rs7074431 explains 0.013720 of the variance
> in Expression when looked at independently.
>
> However, I would like to know how to write a model where I know how much
> of the total variance (in Expression) is described by covariates
> rs11834524, rs1682421, rs13383869 and rs9457141 (call it class A) and
> covariates rs9459617, rs7074431, rs12450785, rs592724 (call it class B).
> Assuming an additive model within the class.  The caveats are that there
> is missing data and again that there may be correlation between all the
> covariates.
>
> Such that a theoretical result may be that
> Class A: Explains 60% of the total variance in expression (response)
> Class B: Explains 10% of the total variance in expression
>
> Thanks for the help!  I am sorry I am R challenged here...I really
> appreciate the guidance!
>
> Stephen
>
>
>> dump("input_new", file=stdout())
> input_new <-
> structure(list(Individual = structure(1:109, .Label = c("NA06984",
> "NA06985", "NA06986", "NA06989", "NA06993", "NA06994", "NA07000",
> "NA07022", "NA07037", "NA07045", "NA07051", "NA07055", "NA07056",
> "NA07345", "NA07346", "NA07347", "NA07357", "NA07435", "NA11829",
> "NA11830", "NA11831", "NA11832", "NA11839", "NA11840", "NA11843",
> "NA11881", "NA11882", "NA11892", "NA11893", "NA11894", "NA11917",
> "NA11918", "NA11919", "NA11920", "NA11930", "NA11931", "NA11992",
> "NA11993", "NA11994", "NA11995", "NA12003", "NA12005", "NA12006",
> "NA12043", "NA12044", "NA12056", "NA12057", "NA12144", "NA12145",
> "NA12146", "NA12154", "NA12155", "NA12156", "NA12234", "NA12239",
> "NA12248", "NA12249", "NA12264", "NA12272", "NA12273", "NA12274",
> "NA12275", "NA12282", "NA12283", "NA12286", "NA12287", "NA12340",
> "NA12341", "NA12342", "NA12343", "NA12347", "NA12348", "NA12383",
> "NA12399", "NA12400", "NA12414", "NA12489", "NA12546", "NA12716",
> "NA12718", "NA12748", "NA12749", "NA12750", "NA12751", "NA12760",
> "NA12761", "NA12762", "NA12763", "NA12775", "NA12776", "NA12777",
> "NA12778", "NA12812", "NA12813", "NA12814", "NA12815", "NA12827",
> "NA12828", "NA12829", "NA12830", "NA12842", "NA12843", "NA12872",
> "NA12873", "NA12874", "NA12875", "NA12889", "NA12891", "NA12892"
> ), class = "factor"), Expression = c(9.46026823453575, 10.0788903323991,
>
> 9.20330296497174, 10.038741467793, 9.33092349416463, 11.0273957217919,
> 10.5498875891745, 9.81137299592747, 11.2023261987976, 9.90559354069027,
> 10.1524696609679, 10.3171767665993, 9.02155519577685, 9.84917871051438,
> 10.658877473136, 9.88895551011107, 8.62335008726357, 9.21529114100886,
> 10.7896248923916, 10.1302992505869, 8.64584282787018, 9.56057795866654,
> 9.89810004078774, 10.2557482141576, 8.95588077688637, 9.56452454115857,
> 9.26525135092154, 10.5438780642797, 9.8468571349548, 10.7416169225352,
> 10.5623721612979, 10.6565276881443, 9.67758493445612, 9.75385553511462,
> 8.997797236767, 11.0106882086179, 10.362578597992, 9.2745507212906,
> 10.7453355016181, 9.75998268015348, 9.45003620116962, 10.055504292376,
> 10.7072220720564, 10.0934686444392, 10.0472832129727, 10.1185615033486,
> 10.3340911031131, 9.70618910683157, 10.5953304905529, 10.4246307909547,
> 9.91463202635336, 10.249081562168, 10.9252022586474, 10.295544143525,
> 11.4838109797985, 10.5286570234792, 9.78692800868132, 10.0397050809162,
> 9.27914623343747, 10.37600233389, 9.27341681588134, 9.40195375611303,
> 10.8979822929135, 9.03922228977389, 10.3911745662505, 10.4345408213054,
> 9.8548491618724, 10.1897729275437, 10.2881888849609, 8.9656977165014,
> 9.81595398472166, 10.1856794532084, 9.3763789479684, 10.1712420020647,
> 10.2964594680427, 10.3515965292101, 8.94492585275159, 11.2529257614993,
> 9.25146912450726, 10.1904309237525, 10.7490591053023, 10.3883924463568,
> 10.097023765247, 10.0824730785217, 10.0828512817661, 10.6371064852226,
> 10.5831044752098, 10.4484786486601, 8.50264408341596, 10.3468670812262,
> 9.46061433005316, 8.90027436167269, 9.73630671555279, 9.40555522408144,
> 10.3220768104446, 8.55132985773453, 10.1678182524815, 10.6145417864386,
> 10.4169948161073, 10.0253039670548, 10.2568017077865, 10.5045847076951,
> 9.75993936712448, 8.99997092895909, 10.6742222414794, 10.8640943324257,
> 10.4295384371541, 10.1987862649656, 10.6744617172313), rs11834524 =
> structure(c(1L,
> 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 2L, 1L,
> 1L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 2L,
> 1L, 1L, 3L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
> 2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 1L, 3L, 1L, 2L,
> 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 3L, 1L, 2L, 3L,
> 2L, 3L, 2L, 1L, 3L, 3L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
> 2L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 3L, 3L), .Label = c("AA",
> "AG", "GG"), class = "factor"), rs1682421 = structure(c(1L, 2L,
> 1L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
> 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 3L, 2L, 3L, 1L, 1L,
> 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
> 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 2L,
> 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L,
> 3L, 1L, 1L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
> 1L, 2L, 2L, 1L, 1L, NA, 3L, 2L, 3L, 2L, 2L), .Label = c("CC",
> "CT", "TT"), class = "factor"), rs13383869 = structure(c(2L,
> 2L, 2L, 2L, 2L, NA, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 3L,
> 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 2L,
> 2L, 3L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 1L,
> 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 3L, 2L, NA, 2L, 2L, 3L, 2L,
> 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 3L, 2L,
> 2L, 1L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L,
> 1L, 1L, 2L, 2L, NA, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("AA",
> "AG", "GG"), class = "factor"), rs9457141 = structure(c(1L, 2L,
> 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
> 3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
> 2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 1L,
> 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, NA, 2L, 1L, 2L, NA, 1L,
> 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
> "CT", "TT"), class = "factor"), rs9459617 = structure(c(1L, 2L,
> 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
> 3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
> 2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, NA, 1L, 3L, 3L, 2L, 1L,
> 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 2L, 2L, 1L,
> 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
> "CT", "TT"), class = "factor"), rs7074431 = structure(c(2L, 3L,
> 2L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 3L, 2L,
> 2L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 3L, 2L,
> 1L, 2L, 1L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 1L,
> 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 2L, 1L,
> 1L, 1L, 3L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L,
> 1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L), .Label = c("CC",
> "CT", "TT"), class = "factor"), rs12450785 = structure(c(2L,
> 2L, 2L, 2L, 2L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 2L, 1L, 2L, 3L, 2L,
> 3L, 1L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L,
> 2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L,
> 1L, 2L, 2L, 1L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 1L, 3L, 2L,
> 2L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 2L,
> 1L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 1L,
> 2L, 2L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("AA",
> "AG", "GG"), class = "factor"), rs592724 = structure(c(1L, 2L,
> 1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 2L,
> 2L, 2L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 3L,
> 1L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
> 3L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 1L, 3L, 3L,
> 2L, 2L, 1L, 1L, 3L, 2L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 2L,
> 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
> 1L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L), .Label = c("CC",
> "CT", "TT"), class = "factor"), Grp = structure(c(1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class =
> "factor")), .Names = c("Individual",
> "Expression", "rs11834524", "rs1682421", "rs13383869", "rs9457141",
> "rs9459617", "rs7074431", "rs12450785", "rs592724", "Grp"), row.names =
> c(NA,
> -109L), class = "data.frame")
>
>
>
> Stephen B. Montgomery
> Postdoctoral Researcher, Population and Comparative Genomics
> Wellcome Trust Sanger Institute
> Hinxton, Cambridge CB10 1SA
> Skype: stephen.b.montgomery
>
>
>
>
> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>