[R] Merging two columns of unequal length

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Tue Dec 13 15:23:23 CET 2016


I frequently work with mismatched-length data, but I would rarely want this behaviour, because there is no compelling reason to believe that all of the NA values belong at the end of the data, as you suggest. Normally there is a second column that controls where things should line up, and the merge function handles that reliably. If merge is not appropriate, I usually take that as a warning that the data should be stacked with rbind rather than placed side by side with cbind.
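
As a minimal sketch of the merge approach applied to the loop in the original question: the data frame below is a made-up stand-in for mydata (the lake names and values are hypothetical), and it assumes the unequal lengths come from lm dropping rows with missing values. The Year column is the key that decides where each residual belongs.

```r
# Hypothetical stand-in for the poster's data: Year plus two lake columns with gaps.
mydata <- data.frame(
  Year  = 1950:1955,
  LakeA = c(120, 118, NA, 121, 119, 117),
  LakeB = c(NA, 130, 128, NA, 127, 126)
)

# Start with the key column only; each lake's residuals are merged onto it.
res <- data.frame(Year = mydata$Year)

for (lake in c("LakeA", "LakeB")) {
  # lm() silently drops rows where the response is NA.
  fit <- lm(as.formula(paste(lake, "~ Year")), data = mydata)

  # Pair each residual with the Year of the row it was computed from.
  new.res <- data.frame(Year = model.frame(fit)$Year,
                        resid = residuals(fit))
  names(new.res)[2] <- lake

  # all.x = TRUE keeps every Year; years with no residual get NA in place.
  res <- merge(res, new.res, by = "Year", all.x = TRUE)
}

print(res)
```

Here the NA values land in the years that were actually missing, not at the bottom of each column. If the only source of mismatch is missing values, fitting with lm(..., na.action = na.exclude) is another option, since residuals() then returns a full-length vector padded with NA.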

I think Hadley Wickham's paper on tidy data [1] describes this philosophy well. 

[1] https://www.jstatsoft.org/article/view/v059i10
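
When there is no key column to align on, the stacking alternative can be sketched as follows. The vectors and the column names lake and residual are hypothetical, chosen only for illustration; the point is that a long layout has no need for equal lengths.

```r
# Hypothetical residual vectors of different lengths, one per lake.
res_a <- c(0.5, -1.2, 0.7)
res_b <- c(2.1, -2.1)

# Stack into long format: one row per observation, with a column
# recording which lake each value came from.
long <- rbind(
  data.frame(lake = "LakeA", residual = res_a),
  data.frame(lake = "LakeB", residual = res_b)
)

print(long)
```

Per-lake summaries can then be computed from the long table, for example with tapply(long$residual, long$lake, mean).
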
-- 
Sent from my phone. Please excuse my brevity.

On December 13, 2016 2:15:15 AM PST, William Michels via R-help <r-help at r-project.org> wrote:
>You should review "The Recycling Rule in R" before attempting to
>apply functions to two or more vectors of unequal length:
>
>https://cran.r-project.org/doc/manuals/R-intro.html#The-recycling-rule
>
>Most often, the "Recycling Rule" does exactly what the researcher
>intends (automatically). And in many cases, applying functions to
>data of unequal (or not evenly divisible) lengths is either 1) an
>indication of problems with the input data, or 2) an indication that
>the researcher is unnecessarily 'forcing' data into a rectangular data
>structure, when another approach might be better (e.g. the use of the
>tapply function).
>
>However, if you see no other way, the functions "cbind.na" and/or
>"rbind.na" available from Andrej-Nikolai Spiess perform binding of
>vectors without recycling:
>
>http://www.dr-spiess.de/Rscripts.html
>
>All you have to do is download and source the correct R-script, and
>call the function:
>
>> cbind(1:5, 1:2)
>     [,1] [,2]
>[1,]    1    1
>[2,]    2    2
>[3,]    3    1
>[4,]    4    2
>[5,]    5    1
>
>Warning message:
>In cbind(1:5, 1:2) :
>  number of rows of result is not a multiple of vector length (arg 2)
>
>> source("/Users/myhomedirectory/Downloads/cbind.na.R")
>> cbind.na(1:5, 1:2)
>     [,1] [,2]
>[1,]    1    1
>[2,]    2    2
>[3,]    3   NA
>[4,]    4   NA
>[5,]    5   NA
>>
>
>This issue arises so often that Dr. Spiess's two scripts, "rbind.na" and
>"cbind.na", have my vote for inclusion in the base-R distribution.
>
>Best of luck,
>
>W Michels, Ph.D.
>
>
>On Mon, Dec 12, 2016 at 3:41 PM, Bailey Hewitt <bailster at hotmail.com>
>wrote:
>>
>> Dear R Help,
>>
>>
>> I am trying to put together two columns of unequal length in a data
>> frame. Unfortunately, so far I have been unsuccessful with the functions
>> I have tried (such as cbind). The code I am currently using is below (I
>> have highlighted the code that is not working):
>>
>>
>> y <- mydata[, 2:75]
>> year <- mydata$Year
>> res <- data.frame()
>>
>> for (i in 1:74) {
>>   y.val <- y[, i]
>>   lake.lm <- lm(y.val ~ year)
>>   lake.res <- residuals(lake.lm)
>>   new.res <- data.frame(lake.res = lake.res)
>>   colnames(new.res) <- colnames(y)[i]
>>   # cbind doesn't work because of the unequal lengths of my data columns
>>   res <- cbind(res, new.res)
>>   print(res)
>> }
>>
>>
>> mydata is a csv file with "Year" (from 1950 on) as the first column;
>> each subsequent column is named for a lake and holds a day of year
>> (a single number) in each row.
>>
>>
>> Please let me know if there is any more information I can provide, as
>> I am new to emailing this list. Thank you for your time!
>>
>>
>> Bailey Hewitt
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>


