[Rd] maximum supported lag is 350

Matteo Bertini matteo at naufraghi.net
Thu Jul 23 19:01:54 CEST 2009


Il 22-07-2009 16:26, Matteo Bertini ha scritto:
> I have found a strange error:
>
>  > fit = Arima(flow, c(1,0,1), list(order=c(0,1,1), period=96*7))
> Error in makeARIMA(trarma[[1L]], trarma[[2L]], Delta, kappa) :
> maximum supported lag is 350
>
> Is in fact quite common to have a lag > 350 using a weekly period in
> 15min steps = 672 (standard in traffic flow prediction litterature for
> example).
>
> Is there any reasons against doing what is suggested in the comment and
> avoid the hardcoded limit?
>
> Thanks,
> Matteo Bertini
>
>
>
> # https://svn.r-project.org/R/trunk/src/library/stats/src/arima.c
> SEXP getQ0(SEXP sPhi, SEXP sTheta)
> {
>
> [----8<----]
>
> /* This is the limit using an int index. We could use
> size_t and get more on a 64-bit system,
> but there seems no practical need. */
> if(r > 350) error(_("maximum supported lag is 350"));
>

Can someone review this analysis?
I can propose a patch to fix the problem if it is ok.

As I can see from the code:

     /* NB: nrbar could overflow */
     int r = max(p, q + 1);
     size_t np = r * (r + 1) / 2, nrbar = np * (np - 1) / 2;

where we want

     nrbar <= SIZE_MAX/2

considering that:

     np = r * (r + 1) / 2 < (r + 1)^2 / 2 = r2

and

     nrbar = np * (np - 1) / 2 < np^2 / 2 < r2^2 / 2

we have a protective:

     nrbar < (r + 1)^4 / 8

and we can check for:

     (r + 1)^4 / 8 <= SIZE_MAX/2

matteo at zarathustra:~/Documents/src$ cat size_max.c
#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define MAX_LAG (int)(sqrt(sqrt(8)) * sqrt(sqrt(SIZE_MAX/2)) - 1)

int main(void) {
	int r;
	size_t np, nrbar;
	printf("MAX_LAG = %d\n", MAX_LAG);
	for (r=MAX_LAG-5; r<MAX_LAG+5; ++r) {
		np = (size_t)r * (r + 1) / 2; // <-- size_t cast
		nrbar = np * (np - 1) / 2;
		printf("r=%d, np=%zd, nrbar=%zd\n", r, np, nrbar);
	}
}
matteo at zarathustra:~/Documents/src$ gcc size_max.c && ./a.out
MAX_LAG = 361
r=356, np=63546, nrbar=2019015285
r=357, np=63903, nrbar=2041764753
r=358, np=64261, nrbar=2064705930
r=359, np=64620, nrbar=2087839890
r=360, np=64980, nrbar=2111167710
r=361, np=65341, nrbar=2134690470
r=362, np=65703, nrbar=10925605
r=363, np=66066, nrbar=34841497
r=364, np=66430, nrbar=58955587
r=365, np=66795, nrbar=83268967
matteo at zarathustra:~/Documents/src$ gcc -arch x86_64 size_max.c && ./a.out
MAX_LAG = 92680
r=92675, np=4294374150, nrbar=9220824667946924175
r=92676, np=4294466826, nrbar=9221222657660023725
r=92677, np=4294559503, nrbar=9221620660256523753
r=92678, np=4294652181, nrbar=9222018675736702290
r=92679, np=4294744860, nrbar=9222416704100837370
r=92680, np=4294837540, nrbar=9222814745349207030
r=92681, np=4294930221, nrbar=9223212799482089310
r=92682, np=4295022903, nrbar=238829644986445
r=92683, np=4295115586, nrbar=636909547728097
r=92684, np=4295208270, nrbar=1035002335816507


Thanks,
Matteo Bertini


PS
I was thinking the right limit was something like
nrbar <= SIZE_MAX
but perhaps I was missing some detail given the overflow occurs at 
SIZE_MAX/2



More information about the R-devel mailing list