[R] filling small gaps of N/A

R. Michael Weylandt michael.weylandt at gmail.com
Tue Apr 3 15:24:08 CEST 2012


It seems like you could benefit from using a zoo [time series] object
to hold your data -- then you have a variety of NA filling functions
which work for arbitrarily long gaps. E.g.,

library(zoo)
x <- zoo(1:100, Sys.Date() + 1:100)
x[2:60] <- NA

# Most of these look the same because the data is simple; they will give
# different results for more complicated examples
na.approx(x)
na.locf(x)
na.spline(x)
na.aggregate(x)
na.fill(x, fill = "extend") # na.fill() also needs a 'fill' argument
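
To fill only the short gaps and leave the long ones alone, which is what
the original question asks for, na.approx() takes a maxgap argument: runs
of more than maxgap consecutive NAs are left unchanged. A minimal sketch
with made-up values and a threshold of 4 chosen only for illustration:

```r
library(zoo)

# Toy series (made-up values): one short gap (2 NAs) and one long gap (10 NAs)
y <- zoo(as.numeric(1:30), Sys.Date() + 1:30)
y[5:6]   <- NA   # short gap
y[15:24] <- NA   # long gap

# Interpolate only runs of at most 4 consecutive NAs;
# na.rm = FALSE keeps the series at its original length
filled <- na.approx(y, maxgap = 4, na.rm = FALSE)

coredata(filled)[5:6]     # short gap is now interpolated
is.na(filled)[15:24]      # long gap is still NA
```

na.locf() and na.spline() accept maxgap as well, so the same idea carries
over if linear interpolation is not what you want.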

Hope this helps,
Michael

On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
<geoffrey_klein at etu.u-bourgogne.fr> wrote:
> Hi everybody,
>
> I'm a new French R user. Sorry if my English is not perfect. I hope you'll
> understand my problem ;)
>
> I have to work on temperature data (35000 lines in one file) containing some
> missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
> but I have also sometimes 100 or 200 N/A following each other. Here's an
> example of my data, when I have only small gaps of missing data (2 or 3
> N/A):
>
> 09/01/2008 12:00   2   1.93     2.93   4.56   5.43
> 09/01/2008 12:15   2   *3.93*   3.25   4.93   5.56
> 09/01/2008 12:30   2   NA       3.5    5.06   5.56
> 09/01/2008 12:45   2   NA       3.68   5.25   5.68
> 09/01/2008 13:00   2   *4.93*   3.87   5.56   5.93
> 09/01/2008 13:15   2   5.93     4.25   5.75   6.06
> 09/01/2008 13:30   2   3.93     4.56   5.93   6.18
>
> My question is: how can I replace these small gaps of N/A with numeric values?
> I would like a function which only replaces the small gaps (2 or 3 N/A) in my
> data, but not the big gaps (more than 5 consecutive N/A).
>
> For the moment, I'm trying to do it by working with the time gap between the
> 2 numeric values surrounding the N/A, as follows:
>
> imputation <- function(x){
>    met = NULL
>
>    temp <- met[1] <- x[1]
>
>    ind_temp <- 1
>
>    tps <- time(x)
>
>    for (i in 2:(length(x)) ){
>    if((tps[i]-tps[ind_temp] > 1)&(tps[i]-tps[ind_temp] <=
> 4)&(is.na(x[i]))){
>    met[i] <- na.approx(x)
>    }
>    else {
>    temp <- met[i] <- x[i]
>    ind_temp <- i
>    }
>    }
>
>    return(met)
>    }
>
> In this example, I would like to apply the function na.approx(x) to my N/A,
> but only when I have at most 4 consecutive N/A.
> There's no error, but it doesn't work (it worked the other way round, when
> I had to detect aberrant data and replace it with N/A, but not now). This
> may not be the right way to solve the problem. I don't have a lot of
> experience in R. Maybe there is an easier way to do it...
> Does anybody have an idea that could help me?
> Thanks a lot!
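
One thing to note about the loop above: na.approx(x) returns the whole
interpolated series, so assigning it to the single element met[i] cannot
do what is intended. If you would rather control the gap length yourself
than rely on a built-in option, a rough sketch is to measure the NA runs
with rle(), interpolate everything once, and then put the long gaps back.
The helper name fill_short_gaps and its max_run argument are made up for
this illustration (they are not part of zoo), and the run of six NAs in
the test vector is invented; the other values come from your example.

```r
library(zoo)

# Hypothetical helper (not part of zoo): interpolate NA runs of length
# <= max_run and leave longer runs as NA
fill_short_gaps <- function(x, max_run = 4) {
  runs <- rle(is.na(x))
  # TRUE for every element that belongs to an NA run longer than max_run
  too_long <- rep(runs$values & runs$lengths > max_run, runs$lengths)
  filled <- na.approx(x, na.rm = FALSE)  # interpolate all interior gaps once
  filled[too_long] <- NA                 # then restore the long gaps
  filled
}

# Values from the example data, plus an invented run of six NAs
temp <- c(1.93, 3.93, NA, NA, 4.93, 5.93, rep(NA, 6), 3.93)
res <- fill_short_gaps(temp)
res
```

The two NAs between 3.93 and 4.93 get interpolated, while the run of six
stays NA because it is longer than max_run.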


