[R] How numerical data is stored inside ts time series objects

Paul Paul.Domaskis at gmail.com
Tue Apr 21 01:04:45 CEST 2015


I'm getting familiar with the stl function in the stats package by
trying it on an example from Brockwell & Davis's 2002 "Introduction to
Time Series and Forecasting".  Specifically, I'm using a subset of their
red wine sales data.  It's a detour from the stl material at
http://www.stat.pitt.edu/stoffer/tsa3/R_toot.htm (at some point, I
have to stop simply following and try to make it work with new data).

I use 36 wine sales data points (three years of monthly data), since
stl complains about the series having fewer than two periods when it
is given too little data.  The data is in ~/tmp/wine.txt:

    464
    675
    703
    887
    1139
    1077
    1318
    1260
    1120
    963
    996
    960
    530
    883
    894
    1045
    1199
    1287
    1565
    1577
    1076
    918
    1008
    1063
    544
    635
    804
    980
    1018
    1064
    1404
    1286
    1104
    999
    996
    1015

My sourced test code is buried in a repeat loop so that I can use a
break command to circumvent the final error-causing statement that I'm
trying to figure out:

    repeat{

        # Clear variables (from stackexchange)
        rm( list=setdiff( ls( all.names=TRUE ), lsf.str(all.names=TRUE ) ) )
        ls()

        head( wine <- read.table("~/tmp/wine.txt") )
        ( x <- ts(wine[[1]],frequency=12) )
        ( y <- ts(wine,frequency=12) )
        ( a=stl(x,"per") )
        #break
        ( b=stl(y,"per") )
    }

The final statement causes the error 'Error in stl(y, "per") : only
univariate series are allowed'.  I found an explanation at
http://stackoverflow.com/questions/10492155/time-series-and-stl-in-r-error-only-univariate-series-are-allowed.
That's how I came up with the assignment to x using wine[[1]].  I
found an explanation of the need for double square brackets at
http://www.r-tutor.com/r-introduction/list/named-list-members.
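
To make the bracket distinction concrete, here is a minimal sketch.  It
assumes a small stand-in data frame holding only the first four values
of the file, and the names single and double are purely illustrative:

```r
# Hypothetical stand-in for the first few rows of ~/tmp/wine.txt
wine <- data.frame(V1 = c(464, 675, 703, 887))

single <- wine[1]     # single brackets: still a data frame with one column
double <- wine[[1]]   # double brackets: the column itself, a plain vector

class(single)         # "data.frame"
class(double)         # "numeric"
dim(single)           # 4 1  (rows, columns)
dim(double)           # NULL (a bare vector has no dim attribute)
```

So wine[[1]] hands ts a plain vector, whereas wine (or wine[1]) hands it
a one-column table.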

My problem is that it's not very clear what is happening inside the ts
structures x and y.  If I simply print them, they look 100% identical:

    | > x
    |    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
    | 1  464  675  703  887 1139 1077 1318 1260 1120  963  996  960
    | 2  530  883  894 1045 1199 1287 1565 1577 1076  918 1008 1063
    | 3  544  635  804  980 1018 1064 1404 1286 1104  999  996 1015
    | > y
    |    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
    | 1  464  675  703  887 1139 1077 1318 1260 1120  963  996  960
    | 2  530  883  894 1045 1199 1287 1565 1577 1076  918 1008 1063
    | 3  544  635  804  980 1018 1064 1404 1286 1104  999  996 1015

Whatever their differences, they don't cause R to misinterpret the
data; that is, they each look like a single series of numerical data.

Can anyone illuminate the difference in the data inside the ts data
structures?  The potential incompatibility with stl is just one
symptom.  Right now, the "solution" is black magic to me, and I would
like to get a clearer picture so that I know when else (and how) to
watch out for this.
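
For anyone wanting to reproduce this, the following is a minimal sketch
of how the two objects could be inspected beyond plain printing.  It
assumes the same 36 values are entered inline rather than read from
~/tmp/wine.txt, and it only uses dim(), is.matrix() and str() from base
R:

```r
# The 36 values from ~/tmp/wine.txt, entered inline so the sketch is self-contained
vals <- c( 464,  675,  703,  887, 1139, 1077, 1318, 1260, 1120,  963,  996,  960,
           530,  883,  894, 1045, 1199, 1287, 1565, 1577, 1076,  918, 1008, 1063,
           544,  635,  804,  980, 1018, 1064, 1404, 1286, 1104,  999,  996, 1015)
wine <- data.frame(V1 = vals)

x <- ts(wine[[1]], frequency = 12)  # built from a plain vector
y <- ts(wine, frequency = 12)       # built from a one-column data frame

dim(x)        # NULL: no dim attribute, so a plain univariate series
dim(y)        # 36 1: a matrix with a single column
is.matrix(x)  # FALSE
is.matrix(y)  # TRUE

str(x)        # compact view of the internal structure
str(y)

# Selecting the column should drop the matrix dimension again
y1 <- y[, 1]
is.matrix(y1) # FALSE
```

As far as I can tell, stl rejects any series for which is.matrix() is
TRUE, even when the matrix has only one column, which would explain why
x is accepted and y is not.  If that is right, stl(y[, 1], "per") should
behave like stl(x, "per").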

I've posted this to the R-help mailing list
(http://news.gmane.org/gmane.comp.lang.r.general) and to Stack Overflow
at
http://stackoverflow.com/questions/29759928/how-numerical-data-is-stored-inside-ts-time-series-objects.


