[R] Adding column to existing data frame

Ralf B ralf.bierig at gmail.com
Wed Aug 4 04:54:18 CEST 2010


Actually it does; one just has to assign the result back to the
original variable:

add.col <- function(df, vec, namevec) {
    if (nrow(df) < length(vec)) {
        # pad the data frame with rows of NAs if vec is longer
        df <- rbind(df, matrix(NA, length(vec) - nrow(df), ncol(df),
                               dimnames = list(NULL, names(df))))
    }
    length(vec) <- nrow(df)  # pads vec with NAs if it is shorter
    df[, namevec] <- vec     # names the new column properly
    return(df)
}
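
As a side note on the naming step, here is a minimal sketch of why the
"[<-" form picks up the name while cbind(df, namevec=vec) does not. The
variables df, colname and vec below are made up purely for illustration:

```r
df <- data.frame(a = 1:3)
colname <- "myNewColumn"   # hypothetical column name, for illustration only
vec <- c(10, 20, 30)

# In cbind() the argument name is taken literally, so the new column
# is called "colname", not "myNewColumn".
literal <- names(cbind(df, colname = vec))
literal   # "a" "colname"

# Indexing with the character variable uses its value as the name.
df[, colname] <- vec
by.value <- names(df)
by.value  # "a" "myNewColumn"
```

df[[colname]] <- vec should behave the same way for a single column.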

mydata <- NULL
mydata <- data.frame(userid = c(5, 6, 5, 6, 5, 6), taskid = c(1, 1, 2, 2, 3, 3),
      stuff = 11:16)
mydata  <- add.col(mydata, c(1,2,3,4),"test1")
mydata  <- add.col(mydata, c(1,2,3,4,5,6,7,8),"test2")
mydata
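
Since the function above has not been tested with columns of varying
types, here is a rough alternative sketch that avoids the rbind() with a
matrix. It is not David's method: it assumes that indexing a data frame
with row numbers beyond nrow(df) yields all-NA rows, and the name
add_col_padded is hypothetical.

```r
# Hypothetical variant: pads rows by indexing past the end of df.
add_col_padded <- function(df, vec, name) {
    n <- max(nrow(df), length(vec))
    df <- df[seq_len(n), , drop = FALSE]  # out-of-range rows come back as NA
    rownames(df) <- NULL                  # reset the "NA", "NA.1" row names
    length(vec) <- n                      # pads vec with NAs if it is shorter
    df[[name]] <- vec                     # name is taken from the variable
    df
}

# Mixed column types: integer and character
d <- data.frame(id = 1:3, g = c("a", "b", "c"), stringsAsFactors = FALSE)
longer  <- add_col_padded(d, c(10, 20, 30, 40, 50), "x")  # 5 rows, d padded
shorter <- add_col_padded(d, 1, "y")                      # 3 rows, y padded
```

As before, the result has to be assigned back to a variable for the
change to stick.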


Thanks a lot, David and all the others here who made the effort!
Ralf


On Tue, Aug 3, 2010 at 10:37 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 3, 2010, at 10:35 PM, David Winsemius wrote:
>
>>
>> On Aug 3, 2010, at 8:32 PM, Ralf B wrote:
>>
>>> Hi experts,
>>>
>>> I am trying to write a very flexible method that allows me to add a
>>> new column to an existing data frame. This is what I have so far:
>>>
>>> add.column <- function(df, new.col, name) {
>>>        n.row <- dim(df)[1]
>>>        length(new.col) <- n.row
>>>        names(new.col) <- name
>>>        return(cbind(df, new.col))
>>> }
>>>
>>> df <- NULL
>>> df <- data.frame(a=c(1,2,3))
>>> df
>>> # correct: added NA to new column
>>> df <- add.column(df,c(1,2),'myNewColumn2')
>>> df
>>> # problem: not added, data frame should be extended with NAs
>>> add.column(df,c(1,2,3,4),'myNewColumn3')
>>> df
>>>
>>>
>>> However, there are two problems:
>>>
>>> 1) The column name is not set correctly but is always
>>> 'new.col'. Surely this could be done outside the function, but it
>>> would be better if it's self-contained.
>>
>> Try this:
>>
>> add.col <- function(df, vec, namevec){
>>                         length(vec) <- nrow(df) # pads with NA's
>>                         cbind(df, namevec=vec)} # names new col properly
>>
> Actually it doesn't name the column correctly... see below for a method
> with "[<-".
>
>>> 2) It does not work for cases where new.col is longer than the number
>>> of rows in the data frame. In such cases, I would like to add NA's to
>>> the data frame if it has fewer rows.
>>
>> Don't have a compact answer to this. (Tried re-dimensioning with "dim()
>> <-" but it was not accepted by the interpreter.) Would need to add a test
>> at the beginning and then pad with rows of NA's using rbind before cbinding
>> as above.
>>
>> add.col <- function(df, vec, namevec){
>>     if (nrow(df) < length(vec) ){ df <-  # pads rows if needed
>>         rbind(df, matrix(NA, length(vec)-nrow(df), ncol(df),
>>                          dimnames=list( NULL, names(df) ) ) ) }
>>     length(vec) <- nrow(df) # pads with NA's
>>     df[, namevec] <- vec; # names new col properly
>>     return(df)}
>>
>>>
>>> Any ideas on how to solve this?
>>
>> Has not been tested with columns of varying types.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>


