[R] filling up holes

Bill.Venables at csiro.au Bill.Venables at csiro.au
Wed Dec 29 04:27:04 CET 2010


Dear 'analyst41' (it would be a courtesy to know who you are)

Here is a low-level way to do it.  

First create some dummy data

> allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50) 
> client_ID <- sample(LETTERS[1:5], 50, replace = TRUE)
> value <- 1:50
> date <- sample(allDates)
> clientData <- data.frame(client_ID, date, value)

At this point clientData has 50 rows, with 5 clients, each with a sample of dates.  Everything is in random order except "value".

Now write a little function to fill out a subset of the data consisting of one client's data only:
 
> fixClient <- function(cData) {
+   dateRange <- range(cData$date)
+   dates <- seq(dateRange[1], dateRange[2], by = 1)
+   fullSet <- data.frame(client_ID = as.character(cData$client_ID[1]),
+                         date = dates, value = NA)
+ 
+   fullSet$value[match(cData$date, dates)] <- cData$value
+   fullSet  
+ }
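
To see what fixClient does in isolation, here is a minimal sketch on a tiny hand-made data frame for a single client. The object names oneClient and filled, and the toy dates and values, are made up for illustration only; fixClient is repeated so the snippet runs on its own.

```r
# fixClient as defined above, repeated so this snippet is self-contained
fixClient <- function(cData) {
  dateRange <- range(cData$date)
  dates <- seq(dateRange[1], dateRange[2], by = 1)
  fullSet <- data.frame(client_ID = as.character(cData$client_ID[1]),
                        date = dates, value = NA)
  fullSet$value[match(cData$date, dates)] <- cData$value
  fullSet
}

# Hypothetical toy data: one client observed on three non-consecutive days
oneClient <- data.frame(client_ID = "A",
                        date = as.Date(c("2010-01-01", "2010-01-04", "2010-01-06")),
                        value = c(10, 20, 30))

# The result has one row per day from the first to the last date,
# with NA wherever no value was observed
filled <- fixClient(oneClient)
print(filled)
```

The match() call gives the position of each observed date within the full daily sequence, so only those rows receive a value and the rest stay NA.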

Now split up the data, apply the fixClient function to each section and recombine the pieces:

> allData <- do.call(rbind,
+                    lapply(split(clientData, clientData$client_ID), fixClient))
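
For comparison, the same result can probably be had with merge() instead of match(). This is an alternative sketch, not the method used above: it builds a grid with one row per client per day and left-joins the observed data onto it. The names grid and allData2 and the seed are assumptions for illustration, and the dummy data are re-created so the snippet is self-contained.

```r
# Re-create dummy data of the same shape as above (seed chosen arbitrarily)
set.seed(1)
allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50)
clientData <- data.frame(client_ID = sample(LETTERS[1:5], 50, replace = TRUE),
                         date = sample(allDates),
                         value = 1:50)

# One row per client per calendar day between that client's first and last date
grid <- do.call(rbind,
                lapply(split(clientData, clientData$client_ID, drop = TRUE),
                       function(d) {
                         data.frame(client_ID = as.character(d$client_ID[1]),
                                    date = seq(min(d$date), max(d$date), by = 1))
                       }))

# Left join: days with no observation come back with value = NA
allData2 <- merge(grid, clientData, by = c("client_ID", "date"), all.x = TRUE)
head(allData2)
```

The merge() version is shorter to read, but it relies on each client/date pair occurring at most once in the input; duplicated pairs would produce duplicated rows.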

Check:

> head(allData)
    client_ID       date value
A.1         A 2010-01-04    36
A.2         A 2010-01-05    18
A.3         A 2010-01-06    NA
A.4         A 2010-01-07    NA
A.5         A 2010-01-08    NA
A.6         A 2010-01-09    49
> 

Seems OK.  At this point the data are in sorted order by client and date, but that should not matter.
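
A more systematic check than eyeballing head() might look like the sketch below. It assumes the same construction as above, repeated here with an arbitrary seed so it runs on its own; the names gapFree and nObserved are made up for illustration.

```r
# Rebuild the example end to end (seed chosen arbitrarily)
set.seed(42)
allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50)
clientData <- data.frame(client_ID = sample(LETTERS[1:5], 50, replace = TRUE),
                         date = sample(allDates),
                         value = 1:50)

fixClient <- function(cData) {
  dateRange <- range(cData$date)
  dates <- seq(dateRange[1], dateRange[2], by = 1)
  fullSet <- data.frame(client_ID = as.character(cData$client_ID[1]),
                        date = dates, value = NA)
  fullSet$value[match(cData$date, dates)] <- cData$value
  fullSet
}

allData <- do.call(rbind,
                   lapply(split(clientData, clientData$client_ID, drop = TRUE),
                          fixClient))

# Every client should now have consecutive daily dates (no holes left)
gapFree <- sapply(split(allData$date, allData$client_ID),
                  function(d) all(diff(d) == 1))
print(all(gapFree))

# No original observation should have been lost or duplicated
nObserved <- sum(!is.na(allData$value))
print(nObserved == nrow(clientData))
```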

Bill Venables.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of analyst41 at hotmail.com
Sent: Wednesday, 29 December 2010 10:45 AM
To: r-help at r-project.org
Subject: [R] filling up holes

I have a data frame with three columns

client ID | date | value


For each client ID I want to determine Min date and Max date and for
any dates in between that are missing I want to insert a row

Client ID | date| NA

Any help would be appreciated.



