[Rd] Ad: Re: Ad: Re: R crashes for large formulas in lm() (PR#8181)

ripley@stats.ox.ac.uk ripley at stats.ox.ac.uk
Wed Oct 5 16:01:34 CEST 2005


On Wed, 5 Oct 2005 Hallgeir.Grinde at elkem.no wrote:

> Yes.
> so (x1*x2*x3*x4*x5*x6*x7*x8)^2 = (x1+x2+x3+x4+x5+x6+x7+x8)^8 ?

Yes in the sense that the simplified formula given by terms() is the same.

> and there is a difference in
> (x1*x2*x3*x4*x5*x6*x7*x8)^2
> and
> (x1*x2*x3*x4*x5*x6*x7*x8)
> although the resulting formulas are the same, or?

The first is reduced to the second by terms().
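
A scaled-down sketch of the two answers above, using three hypothetical variables a, b and c rather than the eight in the report (terms() works symbolically, so the variables need not exist). It only compares the term labels that terms() reports, and it makes no claim about how terms() arrives at them:

```r
# Sorted term labels that terms() reports for a formula
term_labels <- function(f) sort(attr(terms(f), "term.labels"))

crossed         <- term_labels(y ~ a * b * c)
crossed_squared <- term_labels(y ~ (a * b * c)^2)
summed_cubed    <- term_labels(y ~ (a + b + c)^3)

print(crossed)

# All three formulas reduce to the same set of terms
stopifnot(identical(crossed, crossed_squared),
          identical(crossed, summed_cubed))
```

The same comparison with x1 to x8 is the case discussed in this thread, so it is best tried on the small version first.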

> This fixes my problem, but R still crashes for the large formula. It may
> be due to stack overflow, but I guess this can be altered manually?

On Unix-alikes, at least.
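
As a rough illustration, and assuming a current version of R (Cstack_info() was not available in the R 2.1.1 under discussion), the C stack limit that R sees can be inspected from within a session. Whether the crash reported here really is a stack overflow is not established in this thread.

```r
# Cstack_info() reports the C stack size (in bytes, NA if unknown),
# current usage, growth direction and evaluation depth.
info <- Cstack_info()
print(info)

# On Unix-alikes the limit is inherited from the launching shell, so it
# would be raised there (for example with `ulimit -s`) before starting R;
# it is not something that can be changed from inside a running session.
```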

The way the calculation is done can be improved, but is anyone
intentionally going to write a 100,000-term formula?
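
To put numbers on the difference between the formulas in this thread, here is a small sketch that counts the terms terms() reports after simplification. The helper n_terms is a made-up name for illustration only. It deliberately avoids the squared crossing (x1*...*x8)^2 that triggered the report.

```r
# Number of terms that terms() reports for a right-hand side given as text
n_terms <- function(rhs) {
  f <- as.formula(paste("y ~", rhs))
  length(attr(terms(f), "term.labels"))
}

vars <- paste0("x", 1:8)                    # x1 ... x8, as in the thread

# Full crossing x1*x2*...*x8: one term per non-empty subset of the variables
full <- n_terms(paste(vars, collapse = "*"))

# Main effects plus pairwise interactions: (x1 + ... + x8)^2
pairwise <- n_terms(paste0("(", paste(vars, collapse = " + "), ")^2"))

cat("full crossing:", full, "terms\n")      # 2^8 - 1 = 255
cat("pairwise only:", pairwise, "terms\n")  # 8 + choose(8, 2) = 36
```

The pairwise count of 36 matches the formula written out by hand in the quoted message below.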

>
> Prof Brian Ripley <ripley at stats.ox.ac.uk>
> 05.10.2005 12:50
>
>        To:      Hallgeir.Grinde at elkem.no
>        cc:      Uwe Ligges <ligges at statistik.uni-dortmund.de>,
> R-bugs at biostat.ku.dk
>        Subject: Re: Ad: Re: [Rd] R crashes for large formulas in lm()
> (PR#8180)
>
>
> On Wed, 5 Oct 2005 Hallgeir.Grinde at elkem.no wrote:
>
>> And some more information I forgot.
>> R does not crash if I write out the formula:
>>
>> set.seed(123)
>> x1 <- runif(1000)
>> x2 <- runif(1000)
>> x3 <- runif(1000)
>> x4 <- runif(1000)
>> x5 <- runif(1000)
>> x6 <- runif(1000)
>> x7 <- runif(1000)
>> x8 <- runif(1000)
>> y <- rnorm(1000)
>> fit <- lm(y~(x1*x2*x3*x4*x5*x6*x7*x8)^2)
>> -> R crashes
>>
>> fit <- lm(y~x1+x2+x3+x4+x5+x6+x7+x8
>>                +x1:x2+x1:x3+x1:x4+x1:x5+x1:x6+x1:x7+x1:x8
>>                +x2:x3+x2:x4+x2:x5+x2:x6+x2:x7+x2:x8
>>                +x3:x4+x3:x5+x3:x6+x3:x7+x3:x8
>>                +x4:x5+x4:x6+x4:x7+x4:x8
>>                +x5:x6+x5:x7+x5:x8
>>                +x6:x7+x6:x8
>>                +x7:x8)
>> -> R does not crash
>> This is the same formula, at least it should be.
>
> It is not the same formula at all.  Try
>
>> terms(y~(x1*x2*x3*x4*x5*x6*x7*x8)^2, simplify=TRUE)
> y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x1:x2 + x1:x3 + x1:x4 +
>     x1:x5 + x1:x6 + x1:x7 + x1:x8 + x2:x3 + x2:x4 + x2:x5 + x2:x6 +
>     x2:x7 + x2:x8 + x3:x4 + x3:x5 + x3:x6 + x3:x7 + x3:x8 + x4:x5 +
>     x4:x6 + x4:x7 + x4:x8 + x5:x6 + x5:x7 + x5:x8 + x6:x7 + x6:x8 +
>     x7:x8 + x1:x2:x3 + x1:x2:x4 + x1:x3:x4 + x1:x2:x5 + x1:x3:x5 +
> ...
>     x1:x3:x4:x5:x6:x7:x8 + x2:x3:x4:x5:x6:x7:x8 + x1:x2:x3:x4:x5:x6:x7:x8
>
> Did you actually want lm(y~(x1+x2+x3+x4+x5+x6+x7+x8)^2) ?
>
>>
>>
>>
>>
>>
>> Uwe Ligges <ligges at statistik.uni-dortmund.de>
>> 05.10.2005 12:13
>>
>>        To:      Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>        cc:      hallgeir.grinde at elkem.no, R-bugs at biostat.ku.dk
>>        Subject: Re: [Rd] R crashes for large formulas in lm() (PR#8180)
>>
>>
>> Prof Brian Ripley wrote:
>>
>>> On Wed, 5 Oct 2005 hallgeir.grinde at elkem.no wrote:
>>>
>>>
>>>> Full_Name: Hallgeir Grinde
>>>> Version: 2.1.1
>>>> OS: Windows XP
>>>> Submission from: (NULL) (144.127.1.1)
>>>>
>>>>
>>>> While using lm(y~(x*z*c*...*v)^2) R crashes/closes if the number of
>>>> variables is at least 8.
>>>
>>>
>>> OK, let's try to reproduce that:
>>>
>>>
>>>> x1 <- runif(1000)
>>>> x2 <- runif(1000)
>>>> x3 <- runif(1000)
>>>> x4 <- runif(1000)
>>>> x5 <- runif(1000)
>>>> x6 <- runif(1000)
>>>> x7 <- runif(1000)
>>>> x8 <- runif(1000)
>>>> y <- rnorm(1000)
>>>> fit <- lm(y~(x1*x2*x3*x4*x5*x6*x7*x8)^2)
>>>
>>>
>>> No crash, a quite reasonable fit.
>>>
>>> Can we please have a reproducible example, as we do ask?
>>>
>>
>> Hmm, crashes for me as well with R-2.1.1 and R-2.2.0 beta (2005-09-27
>> r35682M) on WinNT 4.0, SP6.
>>
>>
>> Let's make it reproducible:
>>
>> set.seed(123)
>> x1 <- runif(1000)
>> x2 <- runif(1000)
>> x3 <- runif(1000)
>> x4 <- runif(1000)
>> x5 <- runif(1000)
>> x6 <- runif(1000)
>> x7 <- runif(1000)
>> x8 <- runif(1000)
>> y <- rnorm(1000)
>> fit <- lm(y~(x1*x2*x3*x4*x5*x6*x7*x8)^2)
>>
>>
>> Uwe Ligges
>>
>>
>>
>>
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
>

