[R] data.frame: adding a column that is based on ranges of values in another column

Bill.Venables at csiro.au
Tue Jul 6 01:48:43 CEST 2010


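Here is one way: build a lookup table that pairs every date with its fortnight label, then match DF$Date against it.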

> checkList <- data.frame(Day = c(f.n1, f.n2),
+                         FN = rep(c("FN1", "FN2"),
+                                  c(length(f.n1), length(f.n2))))
> m <- match(DF$Date, checkList$Day)
> DF <- cbind(DF, Fortnight = checkList$FN[m])
> DF
          X        Y       Date Fortnight
1  114.5508 47.14094 2009-01-01       FN1
2  114.6468 46.98874 2009-01-03       FN1
3  114.6596 46.91235 2009-01-05       FN1
4  114.6957 46.88265 2009-01-10       FN1
5  114.6828 46.80584 2009-01-14       FN1
6  114.8903 46.67022 2009-01-15       FN2
7  114.9519 46.53264 2009-01-16       FN2
8  114.8842 46.47727 2009-01-17       FN2
9  114.8579 46.46457 2009-01-22       FN2
10 114.8489 46.47032 2009-01-29       FN2
> 
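A range-based alternative, offered only as a sketch and not as part of the reply above: instead of listing every day of each fortnight, give cut() the start date of each fortnight plus the day after the last one. The name `breaks` and the closing cut point 2009-02-01 are assumptions made for this illustration; the data are copied from the original post.

```r
# Data from the original post.
DF <- data.frame(
  X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828,
        114.8903, 114.9519, 114.8842, 114.8579, 114.8489),
  Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584,
        46.67022, 46.53264, 46.47727, 46.46457, 46.47032),
  Date = as.Date(c("2009-01-01", "2009-01-03", "2009-01-05", "2009-01-10",
                   "2009-01-14", "2009-01-15", "2009-01-16", "2009-01-17",
                   "2009-01-22", "2009-01-29"))
)

# Hypothetical cut points: the first day of each fortnight, followed by
# the day after the last fortnight ends.
breaks <- as.Date(c("2009-01-01", "2009-01-15", "2009-02-01"))

# right = FALSE makes each interval include its start date and exclude
# the next cut point, so 2009-01-14 is FN1 and 2009-01-15 is FN2.
DF$Fortnight <- cut(DF$Date, breaks = breaks, labels = c("FN1", "FN2"),
                    right = FALSE)
DF
```

Dates outside the range covered by `breaks` come back as NA, and the new column is a factor. Adding more fortnights means adding one cut point and one label per period.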

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Abdi, Abdulhakim
Sent: Tuesday, 6 July 2010 6:01 AM
To: r-help at r-project.org
Subject: [R] data.frame: adding a column that is based on ranges of values in another column

Dear List,

I've been looking tirelessly for a solution to this dilemma but without success. Perhaps someone has an idea that will guide me in the right direction.

Suppose I have the following data.frame:

DF = data.frame(X = c(114.5508, 114.6468, 114.6596, 114.6957, 114.6828, 114.8903, 114.9519, 114.8842,
114.8579, 114.8489), Y = c(47.14094, 46.98874, 46.91235, 46.88265, 46.80584, 46.67022, 46.53264, 46.47727,
46.46457, 46.47032), Date = as.Date(c('2009-01-01', '2009-01-03', '2009-01-05', '2009-01-10', '2009-01-14',
'2009-01-15', '2009-01-16', '2009-01-17', '2009-01-22', '2009-01-29')))

DF
          X        Y       Date
1  114.5508 47.14094 2009-01-01
2  114.6468 46.98874 2009-01-03
3  114.6596 46.91235 2009-01-05
4  114.6957 46.88265 2009-01-10
5  114.6828 46.80584 2009-01-14
6  114.8903 46.67022 2009-01-15
7  114.9519 46.53264 2009-01-16
8  114.8842 46.47727 2009-01-17
9  114.8579 46.46457 2009-01-22
10 114.8489 46.47032 2009-01-29

I also have two objects that contain the dates of the first and second fortnights of January 2009.

s.d1 = '2009-01-01'
e.d1 = '2009-01-14'
f.n1 = seq(from = as.Date(s.d1)  , to =  as.Date(e.d1), by = 1)

f.n1
[1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05" "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10" "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14"

s.d2 = '2009-01-15'
e.d2 = '2009-01-31'
f.n2 = seq(from = as.Date(s.d2)  , to =  as.Date(e.d2), by = 1)

f.n2
[1] "2009-01-15" "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20" "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25" "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30" "2009-01-31"


I'm trying to add a column called "Fortnight" to the existing data.frame, with values derived from the existing "Date" column: if the date falls within the first fortnight (f.n1), "Fortnight" should be "FN1"; if it falls within the second fortnight (f.n2), "Fortnight" should be "FN2"; and so on.
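The rule described above can be sketched directly with %in%, assuming only two fortnights. The vector `dates` and the result name `fortnight` are hypothetical and used purely for illustration; f.n1 and f.n2 are built as in this message.

```r
# The two fortnights, one element per calendar day.
f.n1 <- seq(from = as.Date("2009-01-01"), to = as.Date("2009-01-14"), by = 1)
f.n2 <- seq(from = as.Date("2009-01-15"), to = as.Date("2009-01-31"), by = 1)

# Hypothetical dates to classify; the last one lies outside both fortnights.
dates <- as.Date(c("2009-01-03", "2009-01-14", "2009-01-15", "2009-02-02"))

# FN1 if the date is in f.n1, FN2 if it is in f.n2, otherwise NA.
fortnight <- ifelse(dates %in% f.n1, "FN1",
                    ifelse(dates %in% f.n2, "FN2", NA_character_))
fortnight
```

Nesting ifelse() becomes unwieldy once there are more than a few periods, which is where a lookup table or interval-based approach scales better.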

The end result should look like:

          X        Y       Date Fortnight
1  114.5508 47.14094 2009-01-01       FN1
2  114.6468 46.98874 2009-01-03       FN1
3  114.6596 46.91235 2009-01-05       FN1
4  114.6957 46.88265 2009-01-10       FN1
5  114.6828 46.80584 2009-01-14       FN1
6  114.8903 46.67022 2009-01-15       FN2
7  114.9519 46.53264 2009-01-16       FN2
8  114.8842 46.47727 2009-01-17       FN2
9  114.8579 46.46457 2009-01-22       FN2
10 114.8489 46.47032 2009-01-29       FN2

I entered the "Fortnight" values above manually to illustrate my point; doing that would be quite tiresome for 500+ rows of data ;-)

The only similar issue I found on the list was https://stat.ethz.ch/pipermail/r-help/2008-February/153995.html but that problem is slightly different from what I'm trying to accomplish here.

I appreciate your time and assistance.

Thanks in advance.

Regards,


Hakim Abdi



_________________________________
Abdulhakim Abdi, M.Sc.
Research Intern

Conservation GIS/Remote Sensing Lab
Smithsonian Conservation Biology Institute
1500 Remount Road
Front Royal, VA 22630
phone: +1 540 635 6578
mobile: +1 747 224 7006
fax: +1 540 635 6506 (Attn:GIS Lab)
email: abdia at si.edu
http://nationalzoo.si.edu/SCBI/ConservationGIS/





