[R] replacing values of rows with identical row names in two dataframes

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Sat May 7 04:20:11 CEST 2016


Please use reply-all to keep the mailing list in the loop, and use plain text rather than HTML to make sure your message gets through uncorrupted.

?merge
?lapply

# untested
# align rows by date
df1a <- merge( df1, df2, by="date", all.x=TRUE )
# like-named columns have .x or .y appended
df1an0  <- grep( "\\.x$", names( df1a ), value=TRUE )
df1an <- substr( df1an0, 1, nchar( df1an0 ) - 2 )
# make a list of updated columns
df1b <- lapply( df1an, function(nm) { 
   nmx  <- paste0( nm, ".x" )
   nmy  <- paste0( nm, ".y" )
   ifelse( is.na( df1a[[ nmx ]] ), df1a[[ nmy ]], df1a[[ nmx ]] )
 } )
# set the names of the fixed columns
df1b <- setNames( df1b, df1an )
# figure out the names of the non-duped columns
df1an1 <- grep( "\\.[xy]$", names( df1a ), invert=TRUE )
# make a new data frame
df1c  <- data.frame( df1a[ , df1an1, drop=FALSE ], df1b )
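
For anyone who wants to try the idea on the sample data from the question, here is a minimal self-contained sketch. It assumes "date" is an ordinary column rather than the row names, and it uses made-up unique column names (A1, A2, B, C1, C2) in place of the duplicated 11A/11A/21B/3CC/3CC headers, since duplicated names would make the .x/.y suffix matching ambiguous. The names value_cols, m, filled and df3 are only for illustration.

```r
# Sample data from the question. Column names A1, A2, B, C1, C2 are
# hypothetical stand-ins for the duplicated headers 11A, 11A, 21B, 3CC, 3CC.
df1 <- data.frame(
  date = c(20040101, 20040115, 20040131, 20040205,
           20040228, 20040301, 20040315, 20040331),
  A1 = c(100, 200, NA, NA, NA, 150, NA, NA),
  A2 = c(150, NA, 165, NA, NA, 155, NA, NA),
  B  = c(NA, 200, 180, NA, NA, 170, 180, NA),
  C1 = c(NA, NA, 190, NA, NA, 150, 190, 175),
  C2 = c(140, NA, 190, NA, NA, 160, 200, 180)
)
df2 <- data.frame(
  date = c(20040131, 20040228, 20040331),
  A1 = c(170, 140, NA),
  A2 = c(NA, 145, 145),
  B  = c(NA, 165, 160),
  C1 = c(NA, 150, NA),
  C2 = c(NA, 155, NA)
)

# Align rows by date, keeping every row of df1; shared columns get
# ".x" (from df1) and ".y" (from df2) suffixes.
m <- merge(df1, df2, by = "date", all.x = TRUE)

# For each value column, keep the df1 value unless it is NA,
# in which case take the df2 value (which may itself be NA).
value_cols <- setdiff(names(df1), "date")
filled <- lapply(value_cols, function(nm) {
  x <- m[[paste0(nm, ".x")]]
  y <- m[[paste0(nm, ".y")]]
  ifelse(is.na(x), y, x)
})

# Reassemble the result with the original column names.
df3 <- data.frame(date = m$date, setNames(filled, value_cols))
print(df3)
```

Note that merge() sorts the result by the key column by default, so the row order of df3 follows the sorted dates rather than necessarily the original order of df1.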

-- 
Sent from my phone. Please excuse my brevity.

On May 6, 2016 4:32:15 PM PDT, Saba Sehrish <sabasehrish at yahoo.com> wrote:
>No. If there is some other way, i would like to go for it.
>Regards
>Saba
>
>On Saturday, 7 May 2016, 11:30, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
> 
>
> Why would you want to use a for loop? Is this homework? 
>-- 
>Sent from my phone. Please excuse my brevity.
>
>On May 6, 2016 4:15:09 PM PDT, Saba Sehrish via R-help
><r-help at r-project.org> wrote:
>
>
>Hi 
>
>I have two data frames (df1, df2) with an equal number of columns (1566)
>but fewer rows in df2 (2772 in df1 and 40 in df2). Row names (dates) are
>identical in both data frames. I want to replace the NAs in df1 with
>the values from df2 for all rows with identical row names (dates),
>without affecting values that already exist in those rows of df1.
>
>Please see below: 
>
>df1: 
>date     11A  11A   21B   3CC   3CC 
>20040101  100   150   NA   NA   140 
>20040115   200   NA   200   NA   NA 
>20040131   NA   165   180   190   190 
>20040205   NA   NA   NA   NA   NA 
>20040228   NA   NA   NA   NA   NA 
>20040301  150   155   170   150   160 
>20040315   NA   NA   180   190   200 
>20040331   NA   NA   NA   175   180 
>
>df2: 
>date     11A  11A   21B   3CC   3CC 
>20040131   170   NA   NA   NA   NA 
>20040228   140   145   165   150   155 
>20040331   NA   145   160   NA   NA 
>
>I want the resulting dataframe to be: 
>
>df3: 
>date         11A  11A   21B   3CC   3CC 
>20040101      100   150   NA   NA   140 
>20040115      200   NA   200   NA   NA 
>20040131      170   165   180   190   190 
>20040205      NA   NA   NA   NA   NA 
>20040228      140   145   165   150   155 
>20040301      150   155   170   150   160 
>20040315      NA   NA   180   190   200 
>20040331      NA   145   160   175   180 
>
>If possible, I would prefer to use a "for" loop and the "which"
>function to achieve the result. 
>
>Please guide me in this regard. 
>
>Thanks 
>Saba
>
>
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>
>
>   

	[[alternative HTML version deleted]]
