[R] create groups from data with duplicates, such that each group has a duplicate represented once

Thu Jan 17 09:55:59 CET 2019

Hi

Instead of an attachment, which the list usually removes, you should use dput.

Something like the output from
dput(head(yourdata, 30))
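As a rough illustration of why dput output is easier for readers than an attachment, here is a minimal sketch. The data frame below is a hypothetical stand-in for yourdata, with column names borrowed from the example table in the quoted message.

```r
# Hypothetical stand-in for the poster's data (not the real dataset);
# column names follow the example table in the quoted message.
yourdata <- data.frame(
  ID   = 180:182,
  TagA = c("MID03", "MID04", "MID05"),
  TagB = c("MID10", "MID06", "MID07"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; paste that text into the email.
dput(head(yourdata, 30))

# A reader rebuilds the object by evaluating the pasted text.
# Here the round trip is simulated by capturing the printed code.
txt <- paste(capture.output(dput(head(yourdata, 30))), collapse = "\n")
rebuilt <- eval(parse(text = txt))
str(rebuilt)
```

Because the text is plain R code, it survives the list's attachment filter and keeps column types intact.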

To remove duplicate values, see

?unique or ?duplicated
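A possible starting point, sketched on the nine example rows from the quoted message. duplicated flags repeated tags; the helper assign_groups is hypothetical (not from any package) and assumes that rows are processed in their existing order and that a new group should start as soon as any tag from either column has already been used in the current group.

```r
# The nine example rows from the quoted message
samples <- data.frame(
  ID   = 180:188,
  TagA = c("MID03", "MID04", "MID05", "MID03", "MID04",
           "MID05", "MID01", "MID02", "MID03"),
  TagB = c("MID10", "MID06", "MID07", "MID09", "MID10",
           "MID06", "MID06", "MID07", "MID08"),
  stringsAsFactors = FALSE
)

# duplicated() marks the second and later occurrences of each value
duplicated(samples$TagA)

# Hypothetical helper: walk the rows in order and open a new group
# whenever a tag (from either column) was already seen in the current group.
assign_groups <- function(tagA, tagB) {
  group   <- integer(length(tagA))
  current <- 1L
  seen    <- character(0)
  for (i in seq_along(tagA)) {
    tags <- c(tagA[i], tagB[i])
    if (any(tags %in% seen)) {
      current <- current + 1L   # a tag would repeat: start the next group
      seen    <- character(0)
    }
    seen     <- c(seen, tags)
    group[i] <- current
  }
  paste0("group", group)
}

samples$group <- assign_groups(samples$TagA, samples$TagB)
print(samples)
```

This keeps the samples in their original order, so it does not try to fill each group with all 26 tags; if that matters, the rows would need to be reordered first.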

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Kevin Wamae
> Sent: Thursday, January 17, 2019 1:29 AM
> To: r-help using r-project.org
> Subject: [R] create groups from data with duplicates, such that each group has
> a duplicate represented once
>
> Hi, I have a sequencing run with ~3000 samples (attached dataset). The
> samples were initially tagged and amplified by PCR in duplicate. The tags used
> range from MID01 to MID26.
>
> MID01-MID13 were used for pair 1 while MID14-MID26 were used for pair 2.
> The tags are re-used to allow samples to be pooled.
>
> The pooling process will involve mixing one set of samples tagged MID01-26
> into the first group, the next set tagged MID01-26 into the second group, and
> so on.
>
> I'm hoping to get an R script that can create these groups such that, within
> each group, each tag appears only once. An example is shown below.
>
> ID    TagA    TagB    group
> 180   MID03   MID10   group1
> 181   MID04   MID06   group1
> 182   MID05   MID07   group1
> 183   MID03   MID09   group2
> 184   MID04   MID10   group2
> 185   MID05   MID06   group2
> 186   MID01   MID06   group3
> 187   MID02   MID07   group3
> 188   MID03   MID08   group3
>
>
>