[R] Adding Records to a Table in R

Paul Bernal paulbernal07 at gmail.com
Wed Nov 8 14:46:12 CET 2017


Dear Eric,

Hope you are doing great. I also tried the following:

#First I created the complete date sequence

TransitDateFrame <- data.frame(TransitDate=seq(as.Date(dataset1[1,1]),
as.Date(dataset1[nrow(dataset1),1]), by = "month"))

#Then I did the merging

 dataset1NEW <- merge(TransitDateFrame, dataset1, by="TransitDate",
all.x=TRUE)

Now it has, as expected the total number of rows. The problem is, it filled
absolutely everything with NAs, and this shouldn´t be the case since there
are dates that actually have data.

why is this happening?

I am attaching the dataset1 table as a .csv document for your reference.
Basically what I want is to bring all the values in dataset1 and only add
the dates missing with value 0.

Best regards,

Paul

2017-11-01 15:21 GMT-05:00 Eric Berger <ericjberger at gmail.com>:

> Hi Paul,
>
> #First I set up some sample data since I don't have a copy of your data
> dtOrig <- as.Date( c("1985-04-01","1985-07-01","1985-12-01","1986-04-01"))
> dfOrig <- data.frame( TransitDate=dtOrig, Transits=c(100,100,500,325),
> CargoTons=c(1000,1080,3785,4200) )
>
> #Generate the complete set of dates as a data frame
> dfDates<- data.frame( TransitDate=seq(from=as.Date("1985-04-01"),by="1
> month",length=13) )
>
> # do the merge adding the "missing" rows (where NA will appear)
> dfNew  <- merge(dfDates, dfOrig, by="TransitDate", all.x=TRUE )
>
> # replace the NA's by zero
> dfNew[is.na(dfNew)] <- 0
>
> HTH,
> Eric
>
>
> On Wed, Nov 1, 2017 at 9:45 PM, Paul Bernal <paulbernal07 at gmail.com>
> wrote:
>
>> Dear R friends,
>>
>> I am currently working with time series data, and I have a table(as data
>> frame) that has looks like this (TransitDate are in format = "%e-%B-%Y") :
>>
>> TransitDate       Transits      CargoTons
>> 1985-04-01        100            2500
>> 1985-05-01        135            4500
>> 1985-06-01        120            1750
>> 1985-07-01        100            3750
>> 1985-08-01        200            1250
>>
>> The problem is, that there are several periods that don´t exist in the
>> table, so it has the following behavior:
>>
>> TransitDate        Transits      CargoTons
>> 1985-04-01        100             1000
>> 1985-07-01        100             1080
>> 1985-12-01        500             3785
>> 1986-04-01        325             4200
>> .
>> .
>> 2017-09-01        400             2350 (*this is the last observation)
>>
>> You can see in the last table fragment that the series jumps from
>> 1985-04-01 to 1985-07-01, then it jumps from there to 1985-12-01 making
>> the
>> time series quite irregular (non-constant chronologically speaking).
>>
>> What I want to do is create a dummy table that has the sequence from the
>> first observation (1985-04-01) up to the last one (2017-09-01) and then
>> develop a code that checks if the dates contained in the dummy table exist
>> in the original table, if they don´t exist then add those dates and put
>> zeroes on the fields.
>>
>> How can I achieve this?
>>
>> Any help will be greatly appreciated,
>>
>> Best regards,
>>
>> Paul
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>


More information about the R-help mailing list