[R] filling small gaps of N/A

R. Michael Weylandt michael.weylandt at gmail.com
Tue Apr 3 15:26:20 CEST 2012


Sorry -- left out a major detail: most of these functions have maxgap
arguments which allow you to leave larger gaps of NAs as NAs.
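
A minimal sketch of how that might look, assuming a toy series invented here for illustration (it is not the original poster's data). The threshold maxgap = 4 follows the "maximum 4 N/A" wording in the question, and na.rm = FALSE is an added choice so the result keeps the same length as the input:

```r
library(zoo)

# Toy series (assumed for illustration): one short gap of 2 NAs
# and one long gap of 6 NAs
x <- zoo(c(1, 2, NA, NA, 5, 6, NA, NA, NA, NA, NA, NA, 13, 14),
         Sys.Date() + 1:14)

# Interpolate only runs of at most 4 consecutive NAs;
# longer runs are left as NA
filled <- na.approx(x, maxgap = 4, na.rm = FALSE)
filled
```

Here the two-NA gap is interpolated linearly, while the six-NA run is left untouched. The same maxgap argument can be passed to na.spline or na.locf if a different fill rule is preferred.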

Best,
Michael

On Tue, Apr 3, 2012 at 9:24 AM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> It seems like you could benefit from using a zoo [time series] object
> to hold your data -- then you have a variety of NA filling functions
> which work for arbitrarily long gaps. E.g.,
>
> library(zoo)
> x <- zoo(1:100, Sys.Date() + 1:100)
> x[2:60] <- NA
>
> # Most of these look the same because the data is simple; they will give
> # different results for more complicated examples
> na.approx(x)
> na.locf(x)
> na.spline(x)
> na.aggregate(x)
> na.fill(x, fill = 0) # Takes more arguments, e.g. the fill value
>
> Hope this helps,
> Michael
>
> On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
> <geoffrey_klein at etu.u-bourgogne.fr> wrote:
>> Hi everybody,
>>
>> I'm a new French R user. Sorry if my English is not perfect. I hope you'll
>> understand my problem ;)
>>
>> I have to work on temperature data (35000 lines in one file) containing some
>> missing data (N/A). Sometimes I have only 2 or 3 N/A in a row,
>> but sometimes I also have 100 or 200 N/A in a row. Here's an
>> example of my data, where there are only small gaps of missing data (2 or 3
>> N/A):
>>
>> 09/01/2008 12:00   2   1.93   2.93   4.56   5.43
>> 09/01/2008 12:15   2   3.93   3.25   4.93   5.56
>> 09/01/2008 12:30   2     NA   3.5    5.06   5.56
>> 09/01/2008 12:45   2     NA   3.68   5.25   5.68
>> 09/01/2008 13:00   2   4.93   3.87   5.56   5.93
>> 09/01/2008 13:15   2   5.93   4.25   5.75   6.06
>> 09/01/2008 13:30   2   3.93   4.56   5.93   6.18
>>
>> My question is: how can I replace these small gaps of N/A with numeric values?
>> I would like a function which only fills the small gaps (2 or 3 N/A) in my
>> data, but not the big gaps (more than 5 N/A following each other).
>>
>> For the moment, I'm trying to do it by working with the time gap between the
>> 2 numeric values surrounding the N/A, as follows:
>>
>> imputation <- function(x){
>>    met = NULL
>>
>>    temp <- met[1] <- x[1]
>>
>>    ind_temp <- 1
>>
>>    tps <- time(x)
>>
>>    for (i in 2:(length(x)) ){
>>    if((tps[i]-tps[ind_temp] > 1)&(tps[i]-tps[ind_temp] <=
>> 4)&(is.na(x[i]))){
>>    met[i] <- na.approx(x)
>>    }
>>    else {
>>    temp <- met[i] <- x[i]
>>    ind_temp <- i
>>    }
>>    }
>>
>>    return(met)
>>    }
>>
>> In this example, I would like to apply the function na.approx(x) to my N/A,
>> but only when I have at most 4 N/A following each other.
>> There's no error, but it doesn't work (it worked the other way around, when
>> I had to detect aberrant data and replace it with N/A, but not now). This may
>> not be the right way to solve this problem. I don't have a lot of
>> experience in R. Maybe there is an easier way to do it...
>> Does anybody have an idea that could help me?
>> Thanks a lot!


