[R] Looping Through DataFrames with Differing Lengths

Paul Bernal paulbernal07 at gmail.com
Tue Mar 28 16:12:10 CEST 2017


Dear friends Ng Bo Lin, Mark and Ulrik, thank you all for your kind and
valuable replies,

I am trying to reformat a date as follows:

Data<-read.csv("Container.csv")

DataFrame<-data.frame(Data)

DataFrame$TransitDate<-as.Date(DataFrame$TransitDate, "%Y-%m-%d")

#trying to put it in YYYY-MM-DD format

However, when I do this, I get a bunch of NAs for the dates.

I am providing a sample dataset as a reference.

Any help will be greatly appreciated,

Best regards,

Paul
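
One possible cause of the NAs described above can be sketched as follows. It assumes that TransitDate is stored in some layout other than YYYY-MM-DD, which the thread does not confirm because the contents of Container.csv are not shown. The format argument of as.Date() describes how the input strings look, not how the output should look, so a mismatched format yields NA. The values in raw are hypothetical.

```r
# Hypothetical stand-in for the TransitDate column of Container.csv;
# the real file's date layout is not shown in the thread.
raw <- c("10/01/1985", "11/01/1985", "12/01/1985")

# A format that does not match the input strings yields NA for every element.
wrong <- as.Date(raw, format = "%Y-%m-%d")

# Describe the layout of the input instead. The result is a Date object,
# which prints as YYYY-MM-DD by default.
parsed <- as.Date(raw, format = "%m/%d/%Y")

# format() converts the Date back to text in any desired layout.
as_text <- format(parsed, "%Y-%m-%d")
```

Inspecting head(DataFrame$TransitDate) and class(DataFrame$TransitDate) before converting would show which layout the file actually uses.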

2017-03-28 8:15 GMT-05:00 Ng Bo Lin <ngbolin91 at gmail.com>:

> Hi Paul,
>
> Using the example provided by Ulrik, where
>
> exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
>                              "1986-01-01"), Transits = c(NA, NA, NA, NA))
> exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
>                     Transits = c(15, 20))
>
> You could also try the following loop:
>
> # Append each row of exdf1 whose Date is not yet present in exdf2
> for (i in seq_len(nrow(exdf1))) {
>         if (!exdf1[i, 1] %in% exdf2[, 1]) {
>                 exdf2 <- rbind(exdf2, exdf1[i, ])
>         }
> }
>
> The loop iterates over the rows of exdf1 and checks whether the Date of
> each row already exists in the Date column of exdf2. If so, it skips the
> row. Otherwise, it binds the row to exdf2.
>
> Hope this helps!
>
>
> Side note: in terms of computational efficiency, I think Ulrik’s answer is
> probably better. Its presentation is also much better.
>
> Regards,
> Bo Lin
>
> > On 28 Mar 2017, at 5:22 PM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
> >
> > Hi Paul,
> >
> > does this do what you want?
> >
> > exdf1 <- data.frame(Date = c("1985-10-01", "1985-11-01", "1985-12-01",
> >                              "1986-01-01"), Transits = c(NA, NA, NA, NA))
> > exdf2 <- data.frame(Date = c("1985-10-01", "1986-01-01"),
> >                     Transits = c(15, 20))
> >
> > tmpdf <- subset(exdf1, !Date %in% exdf2$Date)
> >
> > rbind(exdf2, tmpdf)
> >
> > HTH,
> > Ulrik
> >
> > On Tue, 28 Mar 2017 at 10:50 Paul Bernal <paulbernal07 at gmail.com> wrote:
> >
> > Dear friend Mark,
> >
> > Great suggestion! Thank you for replying.
> >
> > I have two dataframes, dataframe1 and dataframe2.
> >
> > dataframe1 has two columns, one with the dates in YYYY-MM-DD format and
> > the other column with the number of transits (all of which were set to
> > NA). dataframe1 starts at 1985-10-01 (October 1st, 1985) and ends at
> > 2017-03-01 (March 1st, 2017).
> >
> > dataframe2 has the same two columns, one with the dates in YYYY-MM-DD
> > format and the other with the number of transits. dataframe2 has the
> > same start and end dates; however, it is missing some dates between
> > them, so it has fewer observations.
> >
> > dataframe1 has a total of 378 observations and dataframe2 has a total of
> > 362 observations.
> >
> > I would like to come up with code that does the following:
> >
> > Get the dates of dataframe1 that are missing in dataframe2 and add them as
> > records to dataframe2, but with NA values.
> >
> > dataframe1                  dataframe2
> >
> > Date          Transits      Date          Transits
> > 1985-10-01    NA            1985-10-01    15
> > 1985-11-01    NA            1986-01-01    20
> > 1985-12-01    NA            1986-02-01    5
> > 1986-01-01    NA
> > 1986-02-01    NA
> > 2017-03-01    NA
> >
> > I would like to fill in the missing dates in dataframe2, with NA as the
> > value for the missing transits, so that I could end up with a dataframe3
> > looking as follows:
> >
> > dataframe3
> >
> > Date          Transits
> > 1985-10-01    15
> > 1985-11-01    NA
> > 1985-12-01    NA
> > 1986-01-01    20
> > 1986-02-01    5
> > 2017-03-01    NA
> >
> > This is what I want to accomplish.
> >
> > Thanks in advance for your help,
> >
> > Best regards,
> >
> > Paul
> >
> >
> > 2017-03-27 15:15 GMT-05:00 Mark Sharp <msharp at txbiomed.org>:
> >
> >> Make some small dataframes of just a few rows that illustrate the
> >> problem structure. Make a third that has the result you want. You will
> >> get an answer very quickly. Without a self-contained reproducible
> >> problem, results vary.
> >>
> >> Mark
> >> R. Mark Sharp, Ph.D.
> >> msharp at TxBiomed.org
> >>
> >>
> >>
> >>
> >>
> >>> On Mar 27, 2017, at 3:09 PM, Paul Bernal <paulbernal07 at gmail.com> wrote:
> >>>
> >>> Dear friends,
> >>>
> >>> I have one dataframe which contains 378 observations, and another one,
> >>> containing 362 observations.
> >>>
> >>> Both dataframes have two columns, one date column and another one with
> >>> the number of transits.
> >>>
> >>> I wanted to come up with code so that I could fill in the dates that
> >>> are missing in one of the dataframes and set the transits column to NA
> >>> for those dates.
> >>>
> >>> I have tried several things, but R complains that the lengths of the
> >>> dataframes are different.
> >>>
> >>> How can I solve this?
> >>>
> >>> Any guidance will be greatly appreciated,
> >>>
> >>> Best regards,
> >>>
> >>> Paul
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
>

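For the original question quoted above, a merge-based alternative to the rbind answers can be sketched as follows. This is a different technique from the ones posted in the thread, shown only as one possible approach. It uses the small example tables from the quoted message; setting stringsAsFactors = FALSE and keeping the dates as character strings are assumptions made to keep the sketch simple.

```r
# Example data taken from the tables in the quoted message.
dataframe1 <- data.frame(
  Date = c("1985-10-01", "1985-11-01", "1985-12-01",
           "1986-01-01", "1986-02-01", "2017-03-01"),
  Transits = NA,
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  Date = c("1985-10-01", "1986-01-01", "1986-02-01"),
  Transits = c(15, 20, 5),
  stringsAsFactors = FALSE
)

# Keep every date of dataframe1 (all.x = TRUE) and take Transits from
# dataframe2; dates with no match in dataframe2 get NA.
# Only the Date column of dataframe1 is used, so no duplicated
# Transits.x / Transits.y columns appear in the result.
dataframe3 <- merge(dataframe1["Date"], dataframe2,
                    by = "Date", all.x = TRUE)
```

Because merge() sorts the result on the key column by default, dataframe3 comes back in date order, whereas rbind appends the missing dates at the end and would need a separate ordering step.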
