[R] Subsetting data by date

Gabor Grothendieck ggrothendieck at gmail.com
Mon Jul 21 15:55:37 CEST 2008


Try this:

Lines <- "Date,Temp
1-Apr-1997,50
3-Sept-2001,60"

library(zoo)

# reduce 4-character month abbreviations (e.g. Sept) to 3 characters (Sep)
convert.date <- function(x, format) as.Date(sub("(-...).-", "\\1-", x), format)

# z <- read.zoo("myfile.csv", header = TRUE, sep = ",",
#   FUN = convert.date, format = "%d-%b-%Y")
z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",",
  FUN = convert.date, format = "%d-%b-%Y")

plot(z)
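
To see what convert.date does to the date strings before as.Date parses them, here is a small sketch using only base R. The vector name x and the variable fixed are just for illustration. Note that %b matches month abbreviations in the current locale, so the as.Date step assumes an English locale.

```r
# the two sample dates from the question
x <- c("1-Apr-1997", "3-Sept-2001")

# "(-...).-" matches a dash, three characters, one extra character and a
# dash; the extra character is dropped, so "-Sept-" becomes "-Sep-".
# "1-Apr-1997" has no four-letter month and is left unchanged.
fixed <- sub("(-...).-", "\\1-", x)
print(fixed)

# parse the cleaned strings (assumes English month abbreviations)
print(as.Date(fixed, format = "%d-%b-%Y"))
```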

If the months are actually abbreviated to three letters, i.e. Sep and not
Sept, then you could eliminate convert.date and simplify the read.zoo line to:

z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",",
  format = "%d-%b-%Y")

See the zoo package documentation and its three vignettes, as well as
?read.zoo, ?strptime and ?plot.zoo, and also look at the dates article in
R News 4/1.
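
Once the data are in a zoo object, one possible way to keep only the summer months (May 1 to September 30, as defined in the question) is to extract the month number from the index and subset on it. This is only a sketch: the sample rows, the variable mon and the name z.summer are made up for illustration, and the months are assumed to be three-letter English abbreviations.

```r
library(zoo)

# made-up sample data in the same layout as the question
Lines <- "Date,Temp
1-Apr-1997,50
15-Jun-1999,65
3-Sep-2001,60
20-Dec-2005,40"

z <- read.zoo(textConnection(Lines), header = TRUE, sep = ",",
  format = "%d-%b-%Y")

# month number (1 to 12) of each date in the index
mon <- as.numeric(format(time(z), "%m"))

# keep May (5) through September (9) of every year
z.summer <- z[mon >= 5 & mon <= 9]
print(z.summer)
```

The same logical condition on the month number should also work for the rows of a data.frame that has a Date column.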


On Mon, Jul 21, 2008 at 9:31 AM, Williams, Robin
<robin.williams at metoffice.gov.uk> wrote:
> Hi all,
>  Firstly I apologise if this question has been answered previously,
> however searching of the archives and the internet generally has not
> yielded any results.
>
>  I am looking into the effects of summer weather conditions
> (temperature, humidity etc), on the incidences of a breathing disorder
> brought on through smoking (COPD). I am fairly new to R and completely
> new to the idea of writing R scripts, subsetting dataframes etc. I am
> working on a 12 week summer placement at the Met Office, UK, having just
> finished my second year of a mathematics course at university.
>
>  Basically I have data between January 1 1997 and December 31 2007.
> However as I am only interested in the summer months (which I have defined
> to be between May 1 and September 30), I would like to extract the
> relevant data in R in a timely manner. Obviously I could go and open my
> csv files in excel, cut and paste the relevant data, etc, however I
> would like to maximise R's potential as I feel it will stand me in
> better stead in the long run.
>  Currently the dates are in the form
> 1-Apr-1997,
> 3-Sept-2001,
> etc.
>  I will create a data.frame with date as one of the variables, the
> others being (initially) temperature, humidity, and Admissions (the
> number of hospital admissions for COPD exacerbations).
>  Please could somebody tell me if there is a simple way to extract the
> data I want, and if so perhaps a sample command to get me going? Do I
> first need to format the dates to some numeric-only format? As I say, I
> could use Excel to create the files in the right format, but I will be
> dealing with a lot more variables in the future (perhaps up to 8) and so
> this will become a painstaking process.
>
>  Please reply either on or off list.
>
> Many thanks for any help.
> Robin Williams
> Met Office summer intern - Health Forecasting
> robin.williams at metoffice.gov.uk