[R] data.table/ifelse conditional new variable question

Jorge I Velez jorgeivanvelez at gmail.com
Sun Aug 17 00:48:31 CEST 2014


Dear Kate,

Assuming you have nuclear families (exactly one father and one mother per
Family.ID), one option in base R would be:

x <- read.table(text = "Family.ID Sample.ID Relationship
14           62  sibling
14          94  father
14           63  sibling
14           59 mother
17         6004  father
17           6003 mother
17         6005   sibling
17         368   sibling
130           202 mother
130           203  father
130           204   sibling
130           205   sibling
130           206   sibling
222         9 mother
222         45  sibling
222         34  sibling
222         10  sibling
222         11  sibling
222         18  father", header = TRUE)

# Split by family, fill in the parental IDs within each family, then recombine.
xs <- split(x, x$Family.ID)
res <- do.call(rbind, lapply(xs, function(l) {
  # Parents keep 0 for both parental IDs.
  l$PID <- l$MID <- 0
  father  <- l$Relationship == 'father'
  mother  <- l$Relationship == 'mother'
  sibling <- l$Relationship == 'sibling'
  # Assumes exactly one father and one mother per family.
  l$PID[sibling] <- l$Sample.ID[father]
  l$MID[sibling] <- l$Sample.ID[mother]
  l
}))
res
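
Since you asked about ifelse, here is a rough sketch of the same idea
without splitting the data. It looks up each row's family in a table of
fathers (and of mothers) with match(), then uses ifelse() to keep the
parental ID only for siblings. The names add_parents and ped are made up
for this illustration, and it still assumes at most one father and one
mother per family; a sibling whose family has no such parent gets NA.

```r
# Hypothetical helper (not part of the code above): add PID/MID columns
# to a data frame with Family.ID, Sample.ID and Relationship columns.
add_parents <- function(d) {
  fathers <- d[d$Relationship == "father", ]
  mothers <- d[d$Relationship == "mother", ]
  sib <- d$Relationship == "sibling"
  # match() gives, for every row, the position of its family among the
  # fathers (or mothers); ifelse() then sets non-siblings to 0.
  d$PID <- ifelse(sib, fathers$Sample.ID[match(d$Family.ID, fathers$Family.ID)], 0)
  d$MID <- ifelse(sib, mothers$Sample.ID[match(d$Family.ID, mothers$Family.ID)], 0)
  d
}

# Small made-up subset of the example data, just to try the helper.
ped <- data.frame(
  Family.ID    = c(14, 14, 14, 14, 17, 17, 17),
  Sample.ID    = c(62, 94, 63, 59, 6004, 6003, 6005),
  Relationship = c("sibling", "father", "sibling", "mother",
                   "father", "mother", "sibling"),
  stringsAsFactors = FALSE
)
out <- add_parents(ped)
out
```

Unlike the split/rbind version, this keeps the original row order and
row names, which may be convenient when writing the file out for plink.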

HTH,
Jorge.-


Best regards,
Jorge.-



On Sun, Aug 17, 2014 at 5:42 AM, Kate Ignatius <kate.ignatius at gmail.com>
wrote:

> Hi,
>
> I have a data.table question (as well as an if/else statement query).
>
> I have a large list of families (the file has 935 individuals, sorted
> by family, and the families vary in size).  At the moment the file has
> the columns:
>
> SampleID FamilyID Relationship
>
> To avoid having to make a pedigree file by hand (i.e., adding a
> PaternalID and a MaternalID one by one), I want to try to write a
> script that will quickly do this for me (I eventually want to run this
> through a program such as plink).  Is there a way to use data.table
> (maybe in conjunction with ifelse) to do this effectively?
>
> An example of the file is something like:
>
> Family.ID Sample.ID Relationship
> 14           62  sibling
> 14          94  father
> 14           63  sibling
> 14           59 mother
> 17         6004  father
> 17           6003 mother
> 17         6005   sibling
> 17         368   sibling
> 130           202 mother
> 130           203  father
> 130           204   sibling
> 130           205   sibling
> 130           206   sibling
> 222         9 mother
> 222         45  sibling
> 222         34  sibling
> 222         10  sibling
> 222         11  sibling
> 222         18  father
>
> But the goal is to have a file like this:
>
> Family.ID Sample.ID Relationship PID MID
> 14           62  sibling 94 59
> 14          94  father 0 0
> 14           63  sibling 94 59
> 14           59 mother 0 0
> 17         6004  father 0 0
> 17           6003 mother 0 0
> 17         6005   sibling 6004 6003
> 17         368   sibling 6004 6003
> 130           202 mother 0 0
> 130           203  father 0 0
> 130           204   sibling 203 202
> 130           205   sibling 203 202
> 130           206   sibling 203 202
> 222         9 mother 0 0
> 222         45  sibling 18 9
> 222         34  sibling 18 9
> 222         10  sibling 18 9
> 222         11  sibling 18 9
> 222         18  father 0 0
>
> I've tried searching for this but with no luck.  I'd greatly appreciate
> any help, even if it's just a link to a great example/solution!
>
> Thanks!
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
