[R] R issue with unequal large data frames with multiple columns
jholtman at gmail.com
Thu May 2 11:43:49 CEST 2013
On May 2, 2013, at 2:28, Adeel Amin <adeel.amin at gmail.com> wrote:
> I'm a bit of an amateur R programmer. I can do simple R scenarios but my
> handle on complex grammatical issues isn't steady.
> I have 12 CSV files that I've read into dataframes. Each has 8 columns and
> over 2000000 rows. The rows in each dataframe are identified by a date
> component and a time component, stored in two columns:
> X.DATE and then X.TIME
> X.DATE is in the format MMDDYYYY and X.TIME is in the format HHMM. The issue
> is that even though each dataframe begins and ends with the same X.DATE and
> X.TIME values, each dataframe has a different number of rows. One may have
> as many as 100000 rows more than another.
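Before comparing the data frames, it usually helps to combine the two columns into one real timestamp. What follows is only a sketch under assumptions: `make_stamp` and the `toy` data are made up for illustration, and I am guessing that read.csv() read X.DATE and X.TIME as integers, which drops leading zeros (01022013 becomes 1022013). The time zone is set to UTC only to avoid daylight-saving surprises; use whatever matches your data.

```r
# Hypothetical helper: build a POSIXct timestamp from X.DATE (MMDDYYYY)
# and X.TIME (HHMM).
make_stamp <- function(df) {
  # Pad the leading zeros back in case the columns were read as integers.
  d <- sprintf("%08d", as.integer(df$X.DATE))
  t <- sprintf("%04d", as.integer(df$X.TIME))
  as.POSIXct(paste(d, t), format = "%m%d%Y %H%M", tz = "UTC")
}

# Tiny made-up example, not the poster's data
toy <- data.frame(X.DATE = c(1022013, 12312012), X.TIME = c(930, 1545))
toy$stamp <- make_stamp(toy)
```

With a single `stamp` column, range selection and comparison across data frames become ordinary vector operations.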
> I want to do two things:
> 1) I want to extract a certain portion of data depending on date and time
> 2) In lock step with number 1, I want to eliminate values from the data
> frame that are a) redundant or b) do not appear in the other data sets.
> When step 2 is done, all the time/date data within all 12 dataframes will
> be the same.
> Suggestions? Thanks R Community --
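One possible approach to the two steps above, as a rough sketch rather than a tested solution for your files: keep the data frames in a list, select the window you want, drop repeated timestamps, then keep only the timestamps that occur in every data frame. The helper `add_stamp`, the toy data frames, and the window bounds are all made up for illustration, and "redundant" is assumed here to mean a repeated date/time pair.

```r
# Hypothetical helper; assumes the X.DATE / X.TIME columns from the post.
add_stamp <- function(df) {
  df$stamp <- as.POSIXct(
    paste(sprintf("%08d", as.integer(df$X.DATE)),
          sprintf("%04d", as.integer(df$X.TIME))),
    format = "%m%d%Y %H%M", tz = "UTC")
  df
}

# Made-up stand-ins for the 12 data frames
dfs <- list(
  a = data.frame(X.DATE = c(1022013, 1022013, 1022013, 1022013),
                 X.TIME = c(930, 931, 931, 933), v = 1:4),
  b = data.frame(X.DATE = c(1022013, 1022013, 1022013),
                 X.TIME = c(930, 931, 932), v = 5:7),
  c = data.frame(X.DATE = c(1022013, 1022013, 1022013),
                 X.TIME = c(929, 930, 931), v = 8:10)
)
dfs <- lapply(dfs, add_stamp)

# Step 1: keep only rows inside a chosen window (bounds are illustrative)
from <- as.POSIXct("2013-01-02 09:30", tz = "UTC")
to   <- as.POSIXct("2013-01-02 16:00", tz = "UTC")
dfs <- lapply(dfs, function(d) d[d$stamp >= from & d$stamp <= to, ])

# Step 2a: drop rows whose timestamp repeats (keeps the first occurrence)
dfs <- lapply(dfs, function(d) d[!duplicated(d$stamp), ])

# Step 2b: keep only timestamps present in every data frame.
# Comparing as numbers avoids class issues when intersecting POSIXct values.
common  <- Reduce(intersect, lapply(dfs, function(d) as.numeric(d$stamp)))
aligned <- lapply(dfs, function(d) d[as.numeric(d$stamp) %in% common, ])
```

After this, every element of `aligned` should carry the same set of timestamps. With 12 frames of 2 million rows each this should still be workable in base R, though I have not timed it on data of that size.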
> R-help at r-project.org mailing list