[R] combining unequal dataframes based on a common grouping factor

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Wed Dec 3 22:40:58 CET 2014


Posting in HTML format doesn't work nearly as well as you think it does... Your email is pretty mixed up. Please use plain text format and use dput to make your data usable in R.

I expect the best answer to your problem is going to be to use the merge function instead of your for loops.. but the actual data can affect how well any solution works so giving us dput output is crucial.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On December 3, 2014 1:14:16 PM PST, Brock Huntsman <brockhunts at gmail.com> wrote:
>I apologize if this is a relatively easy problem, but have been stuck
>on
>this issue for a few days. I am attempting to combine values from 2
>separate dataframes. Each dataframe contains a shared identifier
>(GROUP).
>Dataframe 1 (3272 rows x 3 columns) further divides this shared
>grouping
>factor into unique identifiers (ID), as well as contains the proportion
>of
>the GROUP area of which the unique identifier consists (PROP_AREA).
>Dataframe 2 (291 x 14976) in addition to consisting of the shared
>identifier, also has numerous columns consisting of values (VALUE1,
>VALUE2). I would like to multiply the PROP_AREA in dataframe 1 by each
>value in dataframe 2 (VALUE1 through VALUE14976) based on the GROUP
>factor,
>constructing a final dataframe of size 3272 x 14976. An example of the
>data
>frames are as follows:
>
>
>frame1:
>
>ID
>
>GROUP
>
>PROP_AREA
>
>1
>
>A
>
>0.33
>
>2
>
>A
>
>0.33
>
>3
>
>A
>
>0.33
>
>4
>
>B
>
>0.50
>
>5
>
>B
>
>0.50
>
>6
>
>C
>
>1.00
>
>7
>
>D
>
>1.00
>
>
>
>frame2:
>
>GROUP
>
>VALUE1
>
>VALUE2
>
>A
>
>10
>
>5
>
>B
>
>20
>
>10
>
>C
>
>30
>
>15
>
>D
>
>40
>
>20
>
>
>
> Desired dataframe
>
>frame3:
>
>ID
>
>VALUE1
>
>VALUE2
>
>1
>
>3.3
>
>1.65
>
>2
>
>3.3
>
>1.65
>
>3
>
>3.3
>
>1.65
>
>4
>
>10
>
>5
>
>5
>
>10
>
>5
>
>6
>
>30
>
>15
>
>7
>
>40
>
>20
>
>
>
>
>
>I assume I would need to use the %in% function or if statements, but am
>unsure how to write the code. I have attempted to construct a for loop
>with
>an if statement, but have not been successful as of yet.
>
>
>for(i in 1:nrow(frame1)) {
>
>  for(j in 2:ncol(frame2)) {
>
>    if (frame1$GROUP[i] == frame2$GROUP[i]) {
>
>      frame3[i,j+1] <- frame1$PROP_AREA[i]*frame2[i,j+1]
>
>    }
>
>  }
>
>}
>
>
>Any advice on suggested code or packages to read up on would be much
>appreciated.
>
>Brock
>
>	[[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list