[R] Processing key_column, begin_date, end_date in R

Matt Gross grossm at gmail.com
Wed Feb 25 22:18:58 CET 2015


Hi,

I am trying to process a large dataset in R.  The dataset contains the
following three columns:

key_column - a unique key identifier
begin_date - the start date of the active period
end_date - the end date of the active period


Example data is here:

key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06

I want to build a condensed table of key_column values with their
begin_date and end_date values.  As the example data above shows, some
begin_date/end_date periods overlap with other periods for the same
key_column.  Where periods overlap, I want a single record for that
key_column with the min(begin_date) and the max(end_date).

Can anyone help me build the commands to process this data in R?
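
For reference, the merge rule described above could be sketched in base R
roughly as follows.  This is only one possible approach under stated
assumptions, not a definitive solution: the data frame name `dat` and the
helper `merge_periods` are hypothetical names introduced for illustration,
the dates are assumed to be in YYYY-MM-DD format, and periods that merely
touch (a begin_date equal to an earlier end_date) are assumed to count as
overlapping.

```r
# Example data from the post, read from inline text.
dat <- read.csv(text = "key_column,begin_date,end_date
123456,2013-01-01,2014-01-01
123456,2013-07-01,2014-07-01
789102,2012-03-01,2014-03-01
789102,2015-02-01,2016-02-01
789102,2015-02-06,2016-02-06", stringsAsFactors = FALSE)
dat$begin_date <- as.Date(dat$begin_date)
dat$end_date   <- as.Date(dat$end_date)

# Hypothetical helper: collapse the overlapping periods of a single key.
merge_periods <- function(d) {
  # Sort by start date so overlaps can be detected in one pass.
  d <- d[order(d$begin_date, d$end_date), ]
  n <- nrow(d)
  b <- as.numeric(d$begin_date)
  e <- as.numeric(d$end_date)
  # Latest end date seen so far (running maximum).
  run_end <- cummax(e)
  # A new group starts when a period begins after every earlier period ended.
  # Assumption: begin == earlier end counts as overlap, hence the strict ">".
  new_group <- c(TRUE, b[-1] > run_end[-n])
  grp <- cumsum(new_group)
  data.frame(
    key_column = d$key_column[1],
    # Rows are sorted, so the first row of each group holds min(begin_date).
    begin_date = d$begin_date[!duplicated(grp)],
    end_date   = as.Date(as.vector(tapply(e, grp, max)), origin = "1970-01-01")
  )
}

# Apply the helper per key and stack the results.
result <- do.call(rbind, lapply(split(dat, dat$key_column), merge_periods))
rownames(result) <- NULL
print(result)
```

On the example data this yields three records: one for 123456
(2013-01-01 to 2014-07-01) and two for 789102 (2012-03-01 to 2014-03-01
and 2015-02-01 to 2016-02-06).  For a very large dataset the per-key
split may be slow, so a grouped approach in a package such as data.table
might be worth considering instead.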

Thanks,
Matt





More information about the R-help mailing list