[R] data.table/ifelse conditional new variable question

(Ted Harding) Ted.Harding at wlandres.net
Sun Aug 17 20:41:48 CEST 2014


On 17-Aug-2014 03:50:33 John McKown wrote:
> On Sat, Aug 16, 2014 at 9:02 PM, Kate Ignatius <kate.ignatius at gmail.com>
> wrote:
> 
>> Actually - your code is not wrong... because this is a large file I
>> went through the file to see if there was anything wrong with it -
>> looks like there are two fathers or three mothers in some families.
>> Taking these duplicates out fixed the problem.
>>
>> Sorry about the confusion!  And thanks so much for your help!
>>
>>
> Kate,
> I hope you don't mind, but I have a curiosity question on my part.
> Were the families with multiple fathers or mothers a mistake, just
> duplicates (same Family.ID & Sample.ID), or more like an "intermixed"
> family due to divorce and remarriage. Or even, like in some countries,
> a case of polygamy? Sorry, I just get curious about the strangest
> things sometimes.
> -- 
> There is nothing more pleasant than traveling and meeting new people!
> Genghis Khan
> 
> Maranatha! <><
> John McKown

When Kate first posted her query, similar thoughts to John's occurred
to me. The potential for convoluted ancestry and kinship is enormous!

For perhaps (or perhaps not) ultimate convolution, try reconstructing
a canine pedigree from a breeding register of thoroughbreds, where
again the primary data for each individual is
  * ID of individual
  * ID of litter the individual was born in ("family")
  * ID of male parent
  * ID of female parent
(as, for instance, registered with the UK Kennel Club).

Similar convolutions can be found with race-horses.

But even humans can compete. Here is a little challenge for anyone
who has an R program that will work out a pedigree from data such as
described above. I have used Kate's notation. Individuals (Sample.ID)
are numbered from 01 up, with a gap; families (Family.ID) from 101 up.
Relationships are "sibling", "father", "mother".

ID for father/mother may be "NA" (data not given).

Family.ID Sample.ID Relationship
101       01        sibling
101       02        father
101       03        mother

102       02        sibling
102       04        father
102       05        mother

103       03        sibling
103       06        father
103       07        mother

104       04        sibling
104       08        father
104       09        mother

104       05        sibling
104       08        father
104       09        mother

104       06        sibling
104       08        father
104       09        mother

104       15        sibling
104       08        father
104       09        mother

105       07        sibling
105       04        father
105       15        mother

106       08        sibling
106       16        father
106       17        mother

106       18        sibling
106       16        father
106       17        mother

106       19        sibling
106       16        father
106       17        mother

107       09        sibling
107       18        father
107       19        mother

108       16        sibling
108       NA        father
108       NA        mother

109       17        sibling
109       NA        father
109       NA        mother

That's the data. Now a little quiz question: Can you guess the
identity of the person with Sample.ID = 01?
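
For anyone taking up the challenge, here is one possible starting point
in base R. It is only a sketch, not a full pedigree program. It assumes
that the rows always come in sibling/father/mother trios, in that order,
as in the listing above; the names long, ped, ancestors() and same are
made up for illustration.

```r
# The data from the listing above (blank lines dropped).
txt <- "Family.ID Sample.ID Relationship
101 01 sibling
101 02 father
101 03 mother
102 02 sibling
102 04 father
102 05 mother
103 03 sibling
103 06 father
103 07 mother
104 04 sibling
104 08 father
104 09 mother
104 05 sibling
104 08 father
104 09 mother
104 06 sibling
104 08 father
104 09 mother
104 15 sibling
104 08 father
104 09 mother
105 07 sibling
105 04 father
105 15 mother
106 08 sibling
106 16 father
106 17 mother
106 18 sibling
106 16 father
106 17 mother
106 19 sibling
106 16 father
106 17 mother
107 09 sibling
107 18 father
107 19 mother
108 16 sibling
108 NA father
108 NA mother
109 17 sibling
109 NA father
109 NA mother"

# Read IDs as character so that "01" keeps its leading zero.
long <- read.table(text = txt, header = TRUE, colClasses = "character")

# Assumption: every record is three consecutive rows (sibling, father, mother).
n <- nrow(long) / 3
stopifnot(identical(long$Relationship,
                    rep(c("sibling", "father", "mother"), n)))

# One row per individual: child, father, mother.
i <- seq(1, nrow(long), by = 3)
ped <- data.frame(child  = long$Sample.ID[i],
                  father = long$Sample.ID[i + 1],
                  mother = long$Sample.ID[i + 2],
                  stringsAsFactors = FALSE)

# All known ancestors of one individual, found generation by generation.
ancestors <- function(id, ped) {
  found <- character(0)
  todo <- id
  while (length(todo) > 0) {
    rows <- ped[ped$child %in% todo, ]
    parents <- unique(c(rows$father, rows$mother))
    parents <- parents[!is.na(parents)]
    todo <- setdiff(parents, found)   # only follow newly seen ancestors
    found <- union(found, todo)
  }
  sort(found)
}

# Individuals whose father and mother share both of their own parents.
pf <- ped[match(ped$father, ped$child), c("father", "mother")]
pm <- ped[match(ped$mother, ped$child), c("father", "mother")]
same <- pf$father == pm$father & pf$mother == pm$mother

print(ancestors("01", ped))
print(ped$child[which(same)])   # which() drops the NA comparisons
```

The one-row-per-child layout in ped is what makes both questions easy
to ask; comparisons involving an unknown parent come out as NA and are
simply dropped rather than counted as matches.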

Best wishes to all,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 17-Aug-2014  Time: 19:41:38
This message was sent by XFMail


