[R] Functional data analysis for unequal length and unequal width time series

sour@ m@ili@g off i@st@te@edu
Mon Dec 17 17:50:09 CET 2018


Dear All,
            I apologize if you have already seen this on Stack Overflow.
I have not received any response there, so I am posting here for help.

I have data on 1318 time series. Many of these series are of unequal
length. In addition, the series are not all observed at the same time
points. For example, consider the following four series:

t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
-0.0273, -0.0589)
ser1 <- cbind(t1, V1)

t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
ser2 <- cbind(t2, V2)

t3 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
ser3 <- cbind(t3, V3)

t4 <- c(24.5, 24.67, 24.71, 24.98, 25.17)
V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)
ser4 <- cbind(t4, V4)

Here t1, t2, t3, t4 are the time points and V1, V2, V3, V4 are the
observations made at those time points. The time points in the actual
data are Julian dates, so they look like these except that they are
much larger decimal figures, such as 2452450.6225.

I am trying to cluster these time series using a functional data
approach, for which I am using the "funFEM" package in R. The examples
provided are for equally spaced, equal-length time series, so I am not
sure how to use the package with my data. Initially I tried making all
the time series equal in length to the one with the most observations
(here ser3) by padding the shorter series with NA's. Following this
approach, I built the padded version of ser2 as

t2_n <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V2_na <- c(V2, rep(NA, 6))
ser2_na <- cbind(t2_n, V2_na)

Note that to make t2 equal in length to t3 I took the last 6 time
points from t3. To make V2 equal in length to V3 I appended NA's.

Then I created my data matrix (V1_na and V4_na are padded in the same
way as V2_na) as

dat <- rbind(V1_na, V2_na, V3, V4_na)
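For completeness, here is a sketch of how the padding can be done for
all four series at once in base R. The helper pad_na is a name I made
up for this illustration (it is not from any package), and the row
names of the result differ from the V1_na, V2_na names used above.

```r
# Observed values of the four example series
V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
        -0.0273, -0.0589)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
        0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)

# pad_na: hypothetical helper that appends NA's to x until it has length n
pad_na <- function(x, n) c(x, rep(NA_real_, n - length(x)))

series <- list(V1 = V1, V2 = V2, V3 = V3, V4 = V4)
n_max  <- max(lengths(series))   # length of the longest series (V3)

# One row per series, NA-padded on the right
dat <- do.call(rbind, lapply(series, pad_na, n = n_max))

dim(dat)          # rows = series, columns = padded length
sum(is.na(dat))   # number of padded (missing) cells
```

This only reproduces the padding step. It does nothing about the fact
that the series are observed at different time points.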

The code I used was

library(funFEM)
basis <- create.fourier.basis(c(min(t3), max(t3)), nbasis = 25)
fdobj <- smooth.basis(c(min(t3), max(t3)), dat, basis)$fd

Note that the range is constructed using the minimum and maximum time
points of the ser3 series.

res <- funFEM(fdobj, K = 2:9, model = "all", crit = "bic",
              init = "random")

But this gives me the following error:

Error in svd(X) : infinite or missing values in 'x'
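
My guess is that the NA padding is what triggers this, although I have
not confirmed that this is the exact place where funFEM fails. A
minimal check in base R, using a small made-up matrix rather than my
actual data, produces the same message from svd():

```r
# Small made-up matrix with one missing value (not my actual data)
X <- matrix(c(1, 2, NA, 4), nrow = 2)

anyNA(X)   # reports whether X contains any missing value

# svd() stops on non-finite input; capture the message instead of failing
msg <- tryCatch(svd(X), error = function(e) conditionMessage(e))
msg
```

If that is the cause, running anyNA() on the padded data matrix before
smoothing should flag the problem early.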

Can anyone please tell me how to handle this dataset with this package,
or suggest an alternative package?

Sincerely,
Souradeep


