[R] mutually exclusive events

Bert Gunter gunter.berton at gene.com
Sat Aug 2 21:09:39 CEST 2014


Homework?

There is a no-homework policy here.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Sat, Aug 2, 2014 at 11:47 AM, Don McKenzie <dmck at u.washington.edu> wrote:
> David’s answer assumes a more complicated objective, but obviously we are both unclear as to what you want.  Are you trying to find out which clusters have a unique pattern of mutation? (probably all of them, with so few clusters and so many genes?)
>
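> For either objective, this is not a statistical test, but a problem of identification.  For the simpler question, create a data frame with one row per cluster holding that cluster's 150 1s and 0s, and use duplicated() to flag repeated rows. (A row that returns “FALSE” is either unique or the first occurrence of a repeated pattern; to flag every cluster that shares its pattern, also check duplicated() with fromLast=TRUE.)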
>
> Untested
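>
> A minimal sketch of that approach, assuming the long-format example data from David's message quoted below, trimmed to three columns (`wide`, `shared`, and `unique_clusters` are illustrative names, not from the original post):
>
> ```r
> # Example data in long format, trimmed to the columns used here
> # (copied from the example quoted below).
> dat <- read.table(text="Cluster Gene Mutated
> 1 G1 1
> 1 G2 1
> 1 G3 0
> 1 G4 0
> 1 G5 1
> 2 G1 0
> 2 G2 1
> 2 G3 1
> 2 G4 0
> 2 G5 1", header=TRUE, stringsAsFactors=FALSE)
>
> # One row per cluster, one 0/1 column per gene
> wide <- as.data.frame.matrix(xtabs(Mutated ~ Cluster + Gene, data=dat))
>
> # TRUE for every cluster whose pattern also occurs in another cluster
> shared <- duplicated(wide) | duplicated(wide, fromLast=TRUE)
>
> # Clusters with a pattern found nowhere else
> unique_clusters <- rownames(wide)[!shared]
> unique_clusters
> ```
>
> With the two example clusters the patterns differ, so both clusters come back as unique.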
>
> On Aug 2, 2014, at 11:41 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
>>
>> On Aug 2, 2014, at 11:11 AM, Adrian Johnson wrote:
>>
>>> Hi:
>>>
>>> I am trying to identify mutually exclusive events from the following
>>> example:
>>>
>> #-------------
>> dat <- read.table(text="Cluster      Gene      Mutated    not_mutated
>>  1             G1             1              0
>>  1             G2             1              0
>>  1             G3             0              1
>>  1             G4             0              1
>>  1             G5             1              0
>>  2             G1             0              1
>>  2             G2             1              0
>>  2             G3             1              0
>>  2             G4             0              1
>>  2             G5             1              0", header=TRUE, stringsAsFactors=FALSE)
>>
>> with(dat, table(Cluster, Gene, Mutated)  )
>> #----------------
>> , , Mutated = 0
>>
>>       Gene
>> Cluster G1 G2 G3 G4 G5
>>      1  0  0  1  1  0
>>      2  1  0  0  1  0
>>
>> , , Mutated = 1
>>
>>       Gene
>> Cluster G1 G2 G3 G4 G5
>>      1  1  1  0  0  1
>>      2  0  1  1  0  1
>> #--------------
>> Or:
>> xtabs(Mutated ~ Cluster+Gene, data=dat)
>> #----------------
>>       Gene
>> Cluster G1 G2 G3 G4 G5
>>      1  1  1  0  0  1
>>      2  0  1  1  0  1
>>
>>
>> I'm a bit unclear about your goals. Are you trying to identify the genes that are mutated in only one cluster as the "G1-G3" events, and the genes that are mutated in both clusters as the "G2-G5" events?
>>
>> If so, you can choose the columns that have a sum of 1 for the first and the columns that have a sum of 2 for the second.
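>>
>> A minimal sketch of the column-sum idea, assuming the example data trimmed to three columns (`n_mut`, `one_cluster`, and `all_clusters` are illustrative names):
>>
>> ```r
>> # Example data in long format, trimmed to the columns used here
>> dat <- read.table(text="Cluster Gene Mutated
>> 1 G1 1
>> 1 G2 1
>> 1 G3 0
>> 1 G4 0
>> 1 G5 1
>> 2 G1 0
>> 2 G2 1
>> 2 G3 1
>> 2 G4 0
>> 2 G5 1", header=TRUE, stringsAsFactors=FALSE)
>>
>> # Cluster-by-gene table of mutation indicators
>> tab <- xtabs(Mutated ~ Cluster + Gene, data=dat)
>>
>> # Number of clusters in which each gene is mutated
>> n_mut <- colSums(tab)
>>
>> # Genes mutated in exactly one cluster
>> one_cluster <- names(n_mut)[n_mut == 1]
>>
>> # Genes mutated in every cluster
>> all_clusters <- names(n_mut)[n_mut == nrow(tab)]
>> ```
>>
>> On the example, `one_cluster` holds G1 and G3 and `all_clusters` holds G2 and G5; G4 is mutated in neither cluster and falls into neither group.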
>>>
>>>
>>> In cluster 1 :  G1, G2, G5 are mutated
>>>
>>> In cluster 2:    G2, G3, G5 are mutated.
>>>
>>>
>>> I am interested in finding such G2-G5 events and G1-G3 events.
>>>
>>> In total I have 8 clusters and 150 genes (1200 rows x 4 columns).
>>>
>>> What test would be appropriate to identify such pairs?
>>>
>>> In my naive understanding, would Fisher's exact test give such
>>> combinations?
>>
>> It's even less clear what sort of "test" you propose. `fisher.test` is a test of association. It doesn't identify combinations.
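>>
>> A minimal sketch of identifying pairs by counting rather than testing. It assumes "mutually exclusive" means two genes that are each mutated in some cluster but never in the same one; that reading, and the names `tab`, `co`, and `pairs`, are assumptions rather than anything stated in the original post:
>>
>> ```r
>> # Cluster-by-gene 0/1 matrix, same values as the xtabs() output above
>> tab <- matrix(c(1, 1, 0, 0, 1,
>>                 0, 1, 1, 0, 1),
>>               nrow=2, byrow=TRUE,
>>               dimnames=list(Cluster=c("1", "2"), Gene=paste0("G", 1:5)))
>>
>> # co[i, j] = number of clusters in which genes i and j are both mutated
>> co <- crossprod(tab)
>>
>> # Keep pairs where both genes are mutated somewhere but never together;
>> # upper.tri() reports each pair once
>> mutated <- diag(co) > 0
>> excl <- co == 0 & outer(mutated, mutated, "&") & upper.tri(co)
>> idx <- which(excl, arr.ind=TRUE)
>> pairs <- paste(rownames(co)[idx[, 1]], colnames(co)[idx[, 2]], sep="-")
>> pairs
>> ```
>>
>> On the two-cluster example this leaves only G1-G3. G2 and G5 are mutated together in both clusters, so `co["G2", "G5"]` is 2.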
>>>
>>> Thanks a lot.
>>>
>>> -Adrian
>>>
>>>      [[alternative HTML version deleted]]
>>
>> This is a plain text mailing list.
>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>
> Don McKenzie
> Research Ecologist
> Pacific Wildland Fire Sciences Lab
> US Forest Service
>
> Affiliate Professor
> School of Environmental and Forest Sciences
> University of Washington
> dmck at uw.edu