[R] Handling nonexistent observations in R for time series analysis and forecasting

David Winsemius dwinsemius at comcast.net
Tue Mar 28 00:51:26 CEST 2017


> On Mar 27, 2017, at 7:15 AM, Paul Bernal <paulbernal07 at gmail.com> wrote:
> 
> Dear friends,
> 
> Hope you are all doing great. I am trying to model historical data on
> transits, and the dates are in the following format: 1985-10-01
> 00:00:00.000 (this would be October 1985).
> The data comes from an SQL Server Database and there are several missing
> observations. The problem is that, for example, there are dates for which
> no transit was recorded (because no transit took place) and instead of
> having that date recorded with an NA value, that date does not appear,
> resulting in a sequence like this:
> 1985-01-01 00:00:00.000, 1985-02-01 00:00:00.000, 1985-05-01 00:00:00.00
> In this example you start in January 1985, then February 1985, and then the
> next available observation is in May 1985.
> I know R's tsclean(data) function takes care of missing values, but that
> only works if you at least have the non available dates recorded with a
> value of NA, but what if I do not have those missing observations?
> 
> Any help will be greatly appreciated,

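And the other readers of this list will greatly appreciate a working example and plain text postings. Assuming you have these date-times in a data frame named `dat`, in a column named `time`, you can merge it with a complete monthly sequence so that the months with no record appear as rows with NA: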

> merge(x=data.frame(time=seq(min(dat$time), max(dat$time), by="month")), y=dat,all.x=TRUE, by.y='time')
        time X.placeholder.
1 1985-01-01    placeholder
2 1985-02-01    placeholder
3 1985-03-01           <NA>
4 1985-04-01           <NA>
5 1985-05-01    placeholder
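
Since no working example was posted, here is a minimal self-contained sketch of the same idea. The data frame contents and the column name `transits` are made up for illustration, and the conversion to a monthly `ts` object at the end is only one possible next step, not something taken from the original data:

```r
# Hypothetical data: three monthly records, with March and April absent.
# The strings mimic the SQL Server format quoted in the question.
dat <- data.frame(
  time = as.Date(c("1985-01-01 00:00:00.000",
                   "1985-02-01 00:00:00.000",
                   "1985-05-01 00:00:00.000"),
                 format = "%Y-%m-%d"),   # time-of-day part is ignored
  transits = c(10, 12, 9)                # made-up values
)

# Every month between the first and last observed date
full <- data.frame(time = seq(min(dat$time), max(dat$time), by = "month"))

# Left join: months absent from `dat` get NA in `transits`
filled <- merge(full, dat, by = "time", all.x = TRUE)

# Monthly ts object starting at the first observed year and month
first <- min(filled$time)
transits_ts <- ts(filled$transits,
                  start = c(as.integer(format(first, "%Y")),
                            as.integer(format(first, "%m"))),
                  frequency = 12)
```

With the gaps now explicit as NA, the series is in the form that the question says tsclean() needs.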
> 
> Best regards,
> 
> Paul
> 
> 	[[alternative HTML version deleted]]
> 

David Winsemius
Alameda, CA, USA


