[R] R issue with unequal large data frames with multiple columns

PIKAL Petr petr.pikal at precheza.cz
Thu May 2 11:47:07 CEST 2013


Without real data I can only suggest that you look at ?merge, or maybe ?aggregate.
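
A rough sketch of the idea, under assumptions: the three small data frames below are made up (only the column names X.DATE and X.TIME come from your post), and both columns are assumed to be character, e.g. read with colClasses = "character" so that leading zeros survive. merge() on c("X.DATE", "X.TIME") would give the matching rows of two data frames joined into one; since you seem to want the 12 data frames kept separate, the sketch uses duplicated() and intersect() on a combined date/time key instead. Keeping the first row of each duplicated key is also an assumption; ?aggregate may suit you better if duplicates should be summarised.

```r
# Hypothetical toy data standing in for the 12 data frames.
make_df <- function(dates, times) {
  data.frame(X.DATE = dates, X.TIME = times,
             value = seq_along(dates), stringsAsFactors = FALSE)
}
dfs <- list(
  df1 = make_df(c("05012013", "05012013", "05012013", "05022013"),
                c("0930", "0930", "0931", "0930")),  # first key is duplicated
  df2 = make_df(c("05012013", "05012013", "05022013", "05022013"),
                c("0930", "0931", "0930", "0931")),
  df3 = make_df(c("05012013", "05022013", "05022013"),
                c("0930", "0930", "0931"))
)

# One key per row, built from date and time.
key_of <- function(d) paste(d$X.DATE, d$X.TIME)

# a) Drop redundant rows: keep the first row for each date/time key.
dfs <- lapply(dfs, function(d) d[!duplicated(key_of(d)), , drop = FALSE])

# b) Keep only the keys that are present in every data frame.
common <- Reduce(intersect, lapply(dfs, key_of))
dfs <- lapply(dfs, function(d) d[key_of(d) %in% common, , drop = FALSE])

# Extract a portion by date and time: parse MMDDYYYY + HHMM into POSIXct.
stamp_of <- function(d) {
  as.POSIXct(paste(d$X.DATE, d$X.TIME), format = "%m%d%Y %H%M", tz = "UTC")
}
from <- as.POSIXct("2013-05-01 00:00", tz = "UTC")
to   <- as.POSIXct("2013-05-01 23:59", tz = "UTC")
windowed <- lapply(dfs, function(d) {
  s <- stamp_of(d)
  d[s >= from & s <= to, , drop = FALSE]
})
```

After step b) every element of dfs has the same set of date/time keys, so sapply(dfs, nrow) should return the same number for each of them.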


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Adeel Amin
> Sent: Thursday, May 02, 2013 8:28 AM
> To: r-help at r-project.org
> Subject: [R] R issue with unequal large data frames with multiple columns
>
> I'm a bit of an amateur R programmer.  I can do simple R scenarios but
> my handle on complex grammatical issues isn't steady.
> I have 12 CSV files that I've read into dataframes.  Each has 8 columns
> and over 2000000 rows.  Each dataframe has data associated by time
> component and a date component in the format of:
> X.DATE and then X.TIME
> X.DATE is in the format of MMDDYYYY and X.TIME is format HHMM.  The
> issue is that even though each dataframe begins and ends with the same
> X.DATE and X.TIME values, each data frame has different number of rows.
> One may have as many as 100000 rows more than another.
> I want to do two things:
> 1) I want to extract a certain portion of data depending on date and
> time (easy)
> 2) In lock step with number 1, I want to eliminate values from the data
> frame that are a) redundant or b) do not appear in the other data sets.
> When step 2 is done, all the time/date data within all 12 dataframes
> will be the same.
> Suggestions?  Thanks R Community --
