[R] help with merging two dataframes function of "egrep"-like formulas
Bogdan Tanasa
t@n@@@ @end|ng |rom gm@||@com
Thu Jul 19 04:50:56 CEST 2018
it looks great, thank you very much Jeff for your time and kind help !
On Wed, Jul 18, 2018 at 7:51 PM, Jeff Newmiller <jdnewmil using dcn.davis.ca.us>
wrote:
> The traditional (SQL) way to attack this problem is to make the data
> structure simpler so that faster comparisons can be utilized:
>
> ################
> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>
> library(dplyr)
> #>
> #> Attaching package: 'dplyr'
> #> The following objects are masked from 'package:stats':
> #>
> #> filter, lag
> #> The following objects are masked from 'package:base':
> #>
> #> intersect, setdiff, setequal, union
> library(tidyr)
> Bx <- ( B
> %>% mutate( z_B = as.character( z ) )
> %>% rename( t_B = t )
> %>% separate_rows( z, sep="::" )
> )
> Bx
> #> z t_B z_B
> #> 1 a*b 1 a*b::x*y
> #> 2 x*y 1 a*b::x*y
> #> 3 c 2 c
> #> 4 3
> #> 5 g*h 4 g*h
> result <- ( A
> %>% mutate( z = as.character( z ) )
> %>% rename( t_A = t )
> %>% inner_join( Bx, by="z" )
> )
> result
> #> z t_A t_B z_B
> #> 1 a*b 1 1 a*b::x*y
>
> #' Created on 2018-07-18 by the [reprex package](http://reprex.tidyver
> se.org) (v0.2.0).
> ################
>
> Note that this is preferable if you can avoid ever creating the complex
> data z in B, but Bx is much more flexible and less error prone than B.
> (Especially if you don't have to create B$z_B at all, but have some other
> unique identifier(s) for the groupings represented by each row in B.)
>
>
> On Wed, 18 Jul 2018, Bogdan Tanasa wrote:
>
> Thanks a lot ! It looks that I am getting the same results with :
>>
>> B %>% regex_left_join(A, by = c(z = 'z'))
>>
>> On Wed, Jul 18, 2018 at 3:57 PM, Riley Finn <rileyfinn3 using gmail.com> wrote:
>>
>> please may I ask for a piece of advise regarding merging two dataframes :
>>>
>>>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>>> function of the criteria :
>>>> if "the elements in the 1st column of A could be found among the
>>>> elements
>>>> of the 1st column of B" i.e.
>>>> for the example above, we shall combine in the results only the row with
>>>> "a*b" of A with the row with "a*b::x*y" of B.
>>>>
>>>
>>>
>>> This may be what you are looking for:
>>>
>>> library(fuzzyjoin)
>>>
>>> The inner join returns just the one row where the string matches.
>>> B %>%
>>> regex_inner_join(A, by = c(z = 'z'))
>>>
>>> While the full join returns NA's where the string does not match.
>>> B %>%
>>> regex_full_join(A, by = c(z = 'z'))
>>>
>>> On Wed, Jul 18, 2018 at 5:20 PM Bogdan Tanasa <tanasa using gmail.com> wrote:
>>>
>>> Dear all,
>>>>
>>>> please may I ask for a piece of advise regarding merging two dataframes
>>>> :
>>>>
>>>> A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
>>>>
>>>> B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
>>>>
>>>> function of the criteria :
>>>>
>>>> if "the elements in the 1st column of A could be found among the
>>>> elements
>>>> of the 1st column of B" i.e.
>>>>
>>>> for the example above, we shall combine in the results only the row with
>>>> "a*b" of A with the row with "a*b::x*y" of B.
>>>>
>>>> thank you,
>>>>
>>>> bogdan
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/
>>>> posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> ------------------------------------------------------------
> ---------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil using dcn.davis.ca.us> Basics: ##.#. ##.#. Live
> Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> ------------------------------------------------------------
> ---------------
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list