[R] Adding a new conditional column to a list of dataframes

Duncan Murdoch
Sun Apr 15 16:37:24 CEST 2018


On 15/04/2018 7:08 AM, Allaisone 1 wrote:
> 
> Hi all,
> 
> 
> I have a list of 7000 dataframes with similar column headers and I wanted to add a new column to each dataframe based on a certain condition which is the same for all dataframes.
> 
> 
> When I extract one dataframe and apply my code to it, it works as intended:
> 
> 
> First suppose this is my first dataframe in the list
> 
>> OneDF <- Mylist[[1]]
> 
>> OneDF
> 
> 
> ID   Pdate        Tdate
> 1    2010-09-30   2011-05-10
> 2    2011-11-07   2009-09-31
> 3    2012-01-05   2008-06-23
> 
> 
> I want to add a new column that contains "C" only where the date in the
> "Tdate" column is earlier than the first date (row 1) in the "Pdate" column;
> otherwise NA is written. I wrote this code to do so:
> 
> 
> OneDF$NewCol [ OneDF[ ,3] <  OneDF[ 1,2] ] <- "C"
> 
> 
> This gave me what I want as follows :-
> 
> 
> ID   Pdate        Tdate        NewCol
> 1    2010-09-30   2011-05-10   NA
> 2    2011-11-07   2009-09-31   C
> 3    2012-01-05   2008-06-23   C
> 
> 
> However, when I put this code in a function and apply that function
> to all dataframes using lapply(), I do not get what I want.
> 
> 
> I wrote this function first :-
> 
> 
> MyFunction <- function(x) x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C"
> 
> 
> Then I wrote this code to apply my function to all dataframes in "Mylist" :
> 
> 
> NewList <- lapply(names(Mylist), function(x) MyFunction(Mylist[[x]]))
> 
> 
> This returned a list of 7000 elements, each of which contains only the letter "C".
> Each dataframe has become a vector holding "C", which is not what I need.
> I expected a list of my 7000 dataframes, each looking like the output
> I have shown above with the new column.
> 
> 
> I have spent a lot of time trying to find the mistake in these last two pieces of code
> but could not identify it.
> 
> 
> Could you please tell me what my mistake is and how to correct my syntax?


Your function should return x after modifying it.  As it is, it returns 
the value of x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C", which is "C".  So 
change it to

MyFunction <- function(x) {
   x$NewCol [ x[ ,3] <  x[ 1,2] ] <- "C"
   x
}
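
To see the difference in isolation, here is a minimal sketch that runs both versions on a small stand-in for Mylist. The sample data, the helper make_df, the list names "a" and "b", and the name BadFunction are all made up for illustration. The dates are stored as Date objects, which is an assumption about how the real columns are typed.

```r
# Hypothetical sample data standing in for Mylist (made-up, valid dates).
make_df <- function() {
  data.frame(
    ID    = 1:3,
    Pdate = as.Date(c("2010-09-30", "2011-11-07", "2012-01-05")),
    Tdate = as.Date(c("2011-05-10", "2009-09-30", "2008-06-23"))
  )
}
Mylist <- list(a = make_df(), b = make_df())

# Original version: the body is a single assignment, so the function's
# value is the assigned value "C", not the modified data frame.
BadFunction <- function(x) x$NewCol[x[, 3] < x[1, 2]] <- "C"

# Corrected version: modify the local copy of x, then return it.
MyFunction <- function(x) {
  x$NewCol[x[, 3] < x[1, 2]] <- "C"
  x
}

bad <- BadFunction(Mylist[[1]])        # just the string "C"
NewList <- lapply(Mylist, MyFunction)  # a list of data frames with NewCol
```

Passing Mylist directly to lapply() also keeps the list names on the result, whereas iterating over names(Mylist) returns an unnamed list.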

Duncan Murdoch



