[R] Ids with matching number combinations?

Jeff Newmiller jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Fri Oct 7 17:20:28 CEST 2022


The merge function doesn't require a package.

But inner_join may be faster than merge.
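
As a rough sketch of how the merge route could look for the question quoted below, using base R only. The seed, the column names item and rownr, and the 100-row layout of df are my assumptions for illustration, not taken from the original post. The idea is to join df2 to the (id, item) pairs once on column a, then join again on id and column b, so only ids holding both numbers survive.

```r
set.seed(42)  # assumed seed, only so the example is reproducible

# Assumed layout: 10 ids with 10 items each
df  <- data.frame(id = rep(1:10, each = 10),
                  item = sample(1:20, 100, replace = TRUE))
df2 <- data.frame(a = c(8, 8, 10, 9, 5, 1, 2, 1),
                  b = c(16, 18, 11, 19, 18, 11, 17, 12))
df2$rownr <- seq_len(nrow(df2))  # remember the df2 row number

# One row per distinct (id, item) pair, so repeated items do not multiply rows
items <- unique(df)

# Step 1: every (df2 row, id) pair where the id contains number a
hit_a <- merge(df2, items, by.x = "a", by.y = "item")

# Step 2: keep only the pairs where the same id also contains number b
hit_ab <- merge(hit_a, items,
                by.x = c("id", "b"), by.y = c("id", "item"))

# Matching id and df2 row number, sorted for readability
result <- unique(hit_ab[order(hit_ab$id, hit_ab$rownr), c("id", "rownr")])
print(result)
```

With more columns to match, the same pattern should extend by adding one merge per column. The two merge calls could be swapped for dplyr::inner_join if speed matters, at the cost of a package dependency.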

On October 7, 2022 8:16:11 AM PDT, "Ebert,Timothy Aaron" <tebert using ufl.edu> wrote:
>Would an inner_join work? If not, please describe why so that we can improve our answer. This answer requires the dplyr package.
>https://statisticsglobe.com/r-dplyr-join-inner-left-right-full-semi-anti
>
>Regards,
>Tim
>
>-----Original Message-----
>From: R-help <r-help-bounces using r-project.org> On Behalf Of PIKAL Petr
>Sent: Friday, October 7, 2022 10:02 AM
>To: Marine Andersson <marine.andersson using ki.se>; r-help using r-project.org
>Subject: Re: [R] Ids with matching number combinations?
>
>Hallo Marine
>
>Could you please make your example more reproducible by using set.seed (and maybe smaller)?
>
>If I understand correctly, you want to know whether, say, both items in row 1
>of df2 (8, 16) appear in the item column for a specific id?
>
>If I am correct in guessing, I cannot find any solution other than splitting your df according to id:
>x <- split(df, df$id)[[1]]
>
>and for each row of df2 test if within the specified id you can find both numbers.
>sum(is.element(df2[1,], x$item))==2
>[1] FALSE
>
>So basically two loops, one over the ids in df and the other over the rows of df2.
>
>But maybe somebody will give you a more ingenious answer.
>
>Cheers
>Petr
>
>
>> -----Original Message-----
>> From: R-help <r-help-bounces using r-project.org> On Behalf Of Marine 
>> Andersson
>> Sent: Friday, October 7, 2022 1:58 PM
>> To: r-help using r-project.org
>> Subject: [R] Ids with matching number combinations?
>>
>> Hi,
>>
>> If I have two datasets like this:
>> df=data.frame("id"=rep(1:10,10, each=10), "item1"=sample(1:20, 100,
>> replace=TRUE))
>> df2=data.frame("a"=c(8, 8,10,9, 5, 1,2,1), "b"=c(16,18,11, 19,18,
>> 11,17,12))
>>
>> How do I find out which ids in the df dataset have a match for both
>> the numbers occurring in the same row of the df2 data frame? In the
>> output I would like to get the matching id and the row number from df2.
>>
>> Output something like this
>> Id                        Rownr
>> 2                         1
>> 5                         1
>> 7                         4
>>
>> My actual problem is more complex with even more columns to be matched 
>> and the datasets are large, hence the solution needs to be efficient.
>>
>> Kind regards,
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.


