[R] Interpreting lmp() results

Tue Aug 20 19:43:17 CEST 2013

Thanks, Bert.  Perhaps it would serve me to be more direct and
concise.  Why am I unable to replicate lm() results using lmp() when
the documentation indicates that the results should be the same when
perm="", seqs=TRUE, center=FALSE?  Some of the estimates are
different. In addition, the standard errors, t-values, and P-values
differ for these models.

# load libraries
library(lmPerm)
library(lattice)

# simulate data
x1 <- c(rnorm(60, 150, 50),rnorm(60, 150, 50),rnorm(60, 150, 50))
y1 <- c(30-0.1*x1[1:60], rep(10, 60), 0.1*x1[121:180])
factor.levels1 <- c(rep("DOWN", 60), rep("FLAT", 60), rep("UP", 60))

# plot simulated data
xyplot(y1 ~ x1, groups = factor.levels1, auto.key = TRUE)

# run lmp() and lm() models
lmp.model.1 <- lmp(y1 ~ x1*factor.levels1 - 1,  perm = "", seqs =
TRUE, center = FALSE)
lm.model.1 <- lm(y1 ~ x1*factor.levels1 - 1)

# review output
summary(lmp.model.1)
summary(lm.model.1)

# some lmp() results match the lm() results, but others do not
# those that do not match:
# slope for UP (should be 0.1)
summary(lmp.model.1)$coef[4]
summary(lm.model.1)$coef[1] +summary(lm.model.1)$coef[6]

# those that match:
# intercept for DOWN
summary(lmp.model.1)$coef[1]
summary(lm.model.1)$coef[2]
# intercept for FLAT
summary(lmp.model.1)$coef[2]
summary(lm.model.1)$coef[3]
# intercept for UP (not exactly the same, but very close)
summary(lmp.model.1)$coef[3]
summary(lm.model.1)$coef[4]
# slope for DOWN (-0.1)
summary(lmp.model.1)$coef[4] + summary(lmp.model.1)$coef[5]
summary(lm.model.1)$coef[1]
# slope for FLAT (0)
summary(lmp.model.1)$coef[4] + summary(lmp.model.1)$coef[6]
summary(lm.model.1)$coef[1] +summary(lm.model.1)$coef[5]

I am wondering if I am doing something incorrectly because I cannot
reconcile the differences in these results.  Any help would be
appreciated.

Thanks! AM

Ann Marie Reinhold | Doctoral Candidate
Montana Cooperative Fishery Research Unit
Department of Ecology | Montana State University
Box 173460 | Bozeman, MT 59717
Email: reinhold [AT] montana [DOT] edu | Office: (406) 994-6643

On Tue, Aug 20, 2013 at 2:03 AM, Bert Gunter <gunter.berton at gene.com> wrote:
> I suggest you post (a shortened version of?) your tome to the
> r-sig-ecology list instead.
>
> Cheers,
> Bert
>
> On Mon, Aug 19, 2013 at 3:47 PM, Ann Marie Reinhold
> <reinhold at montana.edu> wrote:
>> I am running permutation regressions in package lmPerm using lmp().  I
>> am getting what I find to be confusing results and I would like help
>> understanding what is going on.  To illustrate my problem, I created a
>> simple example and am running lmp() such that the results of the lmp()
>> models should be identical to that of lm().  I'm referring to the
>> notes section of the lmp() documentation where it says that the
>> "function will behave identically to lm() if the following parameters
>> are set: perm="", seqs=TRUE, center=FALSE."
>>
>> Here is an example wherein I am unable to match my lmp() results to my
>> lm() results.
>>
>> library(lmPerm)
>> library(lattice)
>>
>> x1 <- c(rnorm(60, 150, 50),rnorm(60, 150, 50),rnorm(60, 150, 50))
>> y1 <- c(30-0.1*x1[1:60], rep(10, 60), 0.1*x1[121:180])
>> factor.levels1 <- c(rep("DOWN", 60), rep("FLAT", 60), rep("UP", 60))
>>
>> xyplot(y1 ~ x1, groups = factor.levels1, auto.key = TRUE)
>>
>> lmp.model.1 <- lmp(y1 ~ x1*factor.levels1 - 1,  perm = "", seqs =
>> TRUE, center = FALSE)
>> summary(lmp.model.1)
>> lm.model.1 <- lm(y1 ~ x1*factor.levels1 - 1)
>> summary(lm.model.1)
>>
>> Here are the results:
>>> summary(lmp.model.1)
>> Call:
>> lmp(formula = y1 ~ x1 * factor.levels1 - 1, perm = "", seqs = TRUE,
>>     center = FALSE)
>> Residuals:
>>        Min         1Q     Median         3Q        Max
>> -1.509e-13 -1.700e-16  4.277e-17  9.558e-16  1.621e-14
>> Coefficients:
>>                      Estimate Std. Error    t value Pr(>|t|)
>> factor.levels1DOWN  3.000e+01  7.359e-15  4.077e+15   <2e-16 ***
>> factor.levels1FLAT  1.000e+01  4.952e-15  2.019e+15   <2e-16 ***
>> factor.levels1UP   -5.809e-16  5.095e-15 -1.140e-01   0.9094
>> x1                  4.096e-17  2.137e-17  1.917e+00   0.0569 .
>> x1:factor.levels11 -1.000e-01  3.391e-17 -2.949e+15   <2e-16 ***
>> x1:factor.levels12 -4.500e-17  2.792e-17 -1.612e+00   0.1089
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> Residual standard error: 1.226e-14 on 174 degrees of freedom
>> Multiple R-Squared:     1,      Adjusted R-squared:     1
>> F-statistic: 3.721e+31 on 6 and 174 DF,  p-value: < 2.2e-16
>>
>>> summary(lm.model.1)
>> Call:
>> lm(formula = y1 ~ x1 * factor.levels1 - 1)
>> Residuals:
>>        Min         1Q     Median         3Q        Max
>> -3.141e-14 -3.190e-15 -9.880e-16  8.920e-16  1.905e-13
>> Coefficients:
>>                         Estimate Std. Error    t value Pr(>|t|)
>> x1                    -1.000e-01  5.638e-17 -1.774e+15   <2e-16 ***
>> factor.levels1DOWN     3.000e+01  9.099e-15  3.297e+15   <2e-16 ***
>> factor.levels1FLAT     1.000e+01  6.123e-15  1.633e+15   <2e-16 ***
>> factor.levels1UP      -3.931e-15  6.300e-15 -6.240e-01    0.533
>> x1:factor.levels1FLAT  1.000e-01  6.826e-17  1.465e+15   <2e-16 ***
>> x1:factor.levels1UP    2.000e-01  6.931e-17  2.886e+15   <2e-16 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> Residual standard error: 1.515e-14 on 174 degrees of freedom
>> Multiple R-squared:      1,     Adjusted R-squared:      1
>>
>>
>> I thought that the results of summary(lmp.model.1) would be the same
>> as summary(lm.model.1).  However, I am concerned that I am
>> interpreting the results incorrectly because I can't get the results
>> to match.  Specifically, I simulated data with a slope for UP of 0.1,
>> the slope for FLAT of 0, and the slope for DOWN of -0.1. I can recover
>> these values in lm.model.1, but not lmp.model.1.  In the output for
>> the lmp.model.1, I am estimating the slope for DOWN to be
>> approximately -0.1 (4.096e-17-1.000e-01) and the slope of the FLAT to
>> be approximately 0 (4.096e-17-4.500e-17); however, the slope of UP
>> (what I think is equal to the reference level x1) is 4.096e-17.  Am I
>> interpreting the x1 term incorrectly?  Why are the lmp() results not
>> identical to the lm() results?
>>
>> I ran a similar example using a modification of the above data wherein
>> factor level A is equal to FLAT, factor level B is equal to DOWN, and
>> factor level C is equal to UP.  Again, I was unable to match the
>> results from lm() and lmp().
>>
>> x2 <- c(rnorm(60, 150, 50), rnorm(60, 150, 50),rnorm(60, 150, 50))
>> y2 <- c(rep(10, 60), 30-0.1*x2[61:120], 0.1*x2[121:180])
>> factor.levels2 <- c(rep("A", 60), rep("B", 60), rep("C", 60))
>>
>> xyplot(y2 ~ x2, groups = factor.levels2, auto.key = TRUE)
>> lmp.model.2 <- lmp(y2 ~ x2*factor.levels2 - 1,  perm = "", seqs =
>> TRUE, center = FALSE)
>> summary(lmp.model.2)
>> lm.model.2 <- lm(y2 ~ x2*factor.levels2 - 1)
>> summary(lm.model.2)
>>
>> Here are the results:
>>> summary(lmp.model.2)
>> Call:
>> lmp(formula = y2 ~ x2 * factor.levels2 - 1, perm = "", seqs = TRUE,
>>     center = FALSE)
>> Residuals:
>>        Min         1Q     Median         3Q        Max
>> -1.284e-13 -6.772e-16  1.439e-16  1.581e-15  4.323e-14
>> Coefficients:
>>                      Estimate Std. Error    t value Pr(>|t|)
>> factor.levels2A     1.000e+01  5.545e-15  1.803e+15  < 2e-16 ***
>> factor.levels2B     3.000e+01  4.707e-15  6.373e+15  < 2e-16 ***
>> factor.levels2C     1.556e-15  4.994e-15  3.120e-01 0.755688
>> x2                  6.840e-17  1.860e-17  3.677e+00 0.000314 ***
>> x2:factor.levels21  1.030e-16  2.734e-17  3.767e+00 0.000226 ***
>> x2:factor.levels22 -1.000e-01  2.550e-17 -3.921e+15  < 2e-16 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> Residual standard error: 1.131e-14 on 174 degrees of freedom
>> Multiple R-Squared:     1,      Adjusted R-squared:     1
>> F-statistic: 4.719e+31 on 6 and 174 DF,  p-value: < 2.2e-16
>>
>> I would expect the reference level slope (term x2 in lmp.model.2,
>> which I believe is the slope for factor level C) to be 0.1.  However,
>> it is 6.840e-17. Am I interpreting the reference levels for the lmp()
>> models incorrectly?  Perhaps I am specifying the models incorrectly.
>> Any help would be very much appreciated.
>>
>>
>> My session info is as follows:
>> R version 3.0.1 (2013-05-16)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> locale:
>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> other attached packages:
>> [1] lattice_0.20-15 lmPerm_1.1-2
>> loaded via a namespace (and not attached):
>> [1] grid_3.0.1  tools_3.0.1
>>
>> Thanks,
>> Ann Marie
>>
>>
>>
>> Ann Marie Reinhold | Doctoral Candidate
>> Montana Cooperative Fishery Research Unit
>> Department of Ecology | Montana State University
>> Box 173460 | Bozeman, MT 59717
>> Email: reinhold [AT] montana [DOT] edu | Office: (406) 994-6643
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>