[R] Combine two dataframe with different row number and interpolation between values

PIKAL Petr petr@p|k@| @end|ng |rom prechez@@cz
Wed Aug 31 08:51:46 CEST 2022


Hello,

Interpolating missing values is a tricky business; the right method depends
on the underlying process.

Maybe na.locf from the zoo package?

Or approxfun or splinefun?
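
To make the difference between these options concrete, here is a minimal sketch on a made-up 3-hourly series (the numbers are invented for illustration, not taken from your data). It uses only base R: approx() with method = "constant" stands in for the carry-forward behaviour of na.locf, so zoo is not needed to run it.

```r
# Toy series: one value every 3 hours (values are made up for illustration)
h_obs  <- c(3, 6, 9, 12)
ws_obs <- c(20, 23, 20, 26)
h_out  <- 3:12                       # hourly grid inside the observed range

# Linear interpolation between neighbouring observations
lin <- approx(h_obs, ws_obs, xout = h_out)$y

# Step function: carry the last observation forward (LOCF-style fill)
locf <- approx(h_obs, ws_obs, xout = h_out, method = "constant")$y

# Cubic spline through the observations; it may overshoot between points
spl <- splinefun(h_obs, ws_obs)(h_out)

cmp <- data.frame(h = h_out, linear = lin, locf = locf, spline = round(spl, 2))
print(cmp)
```

Which column is "right" depends on how the measured quantity behaves between readings, so treat this as a comparison aid rather than a recommendation.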

Cheers
Petr


> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of javad bayat
> Sent: Wednesday, August 31, 2022 8:09 AM
> To: r-help using r-project.org
> Subject: [R] Combine two dataframe with different row number and
> interpolation between values
> 
> Dear all,
> I am trying to combine two large data frames in order to make a data frame
> with exactly the dimensions of the second data frame.
> The first df is as follows:
> 
> df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920),
>       d = rep(c(1:365,1:365,1:365,1:365,1:365), each = 8),
>       h = rep(seq(3, 24, by = 3), times = 5 * 365),
>       ws = rnorm(14600, mean = 20))
> > head(df1)
>      y d  h       ws
> 1 2010 1  3 20.71488
> 2 2010 1  6 19.70125
> 3 2010 1  9 21.00180
> 4 2010 1 12 20.29236
> 5 2010 1 15 20.12317
> 6 2010 1 18 19.47782
> 
> The data in the "ws" column were measured every 3 hours, and I need data at
> one-hour frequency. I have made a second df as follows, with one-hour
> frequency for the "ws" column.
> 
> df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760),
>       d = rep(c(1:365,1:365,1:365,1:365,1:365), each = 24),
>       h = rep(1:24, times = 5 * 365),
>       ws = NA_real_)  # numeric NA rather than the string "NA"
> > head(df2)
>      y d h ws
> 1 2010 1 1 NA
> 2 2010 1 2 NA
> 3 2010 1 3 NA
> 4 2010 1 4 NA
> 5 2010 1 5 NA
> 6 2010 1 6 NA
> 
> What I am trying to do is combine these two data frames so that each row of
> df1 is copied into the row of the new df (df3) whose "y", "d" and "h" values
> match exactly.
> For example, the first row of the first data frame was measured at 3 o'clock
> on the first day of 2010, so it must be placed in the third row of the second
> data frame, which has the same key values (2010, 1, 3), as in the table
> below:
>      y d h       ws
> 1 2010 1 1       NA
> 2 2010 1 2       NA
> 3 2010 1 3 20.71488
> 4 2010 1 4       NA
> 5 2010 1 5       NA
> 6 2010 1 6 19.70125
> 
> For the rows of df2 that have no "ws" value (for example at 4 and 5
> o'clock), I need to interpolate between the preceding and following values
> to fill in the missing data.
> I have tried the following code, but it did not work correctly.
> 
> > df3 = merge(df1, df2, by = "y")
> Error: cannot allocate vector of size 487.9 Mb
> or
> > library(dplyr)
> > df3 <- df1 %>% full_join(df2)
> 
> 
> Is there any way to do this?
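
One possible way, sketched in base R. Merging on "y" alone pairs every df1 row of a year with every df2 row of that year (2920 * 8760 rows per year, about 128 million in total), which is where the allocation error comes from; full_join() without `by` joins on every shared column, "ws" included, so measured and empty rows never match. Joining on all three keys avoids both. The sketch below uses small stand-in data frames (2 years of 3 days, an assumption made so it runs quickly), a running row index as the time axis, and linear interpolation; rule = 2 is one choice for filling hours 1 and 2 of the first day, which have no earlier measurement.

```r
# Small stand-ins for df1/df2 (2 years x 3 days instead of 5 years x 365 days)
years <- 2010:2011
days  <- 1:3
df1 <- expand.grid(h = seq(3, 24, by = 3), d = days, y = years)[, c("y", "d", "h")]
set.seed(1)
df1$ws <- rnorm(nrow(df1), mean = 20)
df2 <- expand.grid(h = 1:24, d = days, y = years)[, c("y", "d", "h")]

# Join on all three keys; all.x = TRUE keeps every hourly row of df2
df3 <- merge(df2, df1, by = c("y", "d", "h"), all.x = TRUE)
df3 <- df3[order(df3$y, df3$d, df3$h), ]
rownames(df3) <- NULL

# Rows are now in time order and evenly spaced, so the row index is the time axis
t_idx <- seq_len(nrow(df3))
obs   <- !is.na(df3$ws)

# Linear interpolation between measured values; rule = 2 repeats the nearest
# measured value outside the observed range (the first two hours here)
df3$ws <- approx(t_idx[obs], df3$ws[obs], xout = t_idx, rule = 2)$y

head(df3)
```

With the full-size data the same merge returns 43800 rows, the dimension of df2. If a spline suits the process better, spline() or splinefun() can replace approx() in the last step.
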
> Sincerely
> 
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat194 using yahoo.com