[R] mutually exclusive events

John McKown john.archie.mckown at gmail.com
Sat Aug 2 22:19:43 CEST 2014


On Sat, Aug 2, 2014 at 1:11 PM, Adrian Johnson
<oriolebaltimore at gmail.com> wrote:
> Hi:
>
> I am trying to identify mutually exclusive events from the following
> example:
>
>
> Cluster      Gene      Mutated    not-mutated
>   1             G1             1              0
>   1             G2             1              0
>   1             G3             0              1
>   1             G4             0              1
>   1             G5             1              0
>   2             G1             0              1
>   2             G2             1              0
>   2             G3             1              0
>   2             G4             0              0
>   2             G5             1              0
>
>
> In cluster 1 :  G1, G2, G5 are mutated
>
> In cluster 2:    G2, G3, G5 are mutated.
>
>
> I am interested in finding such G2-G5 event and G1-G3 events.
>
> In total I have 8 clusters and 150 genes (1200 rows x 4 columns).
>
> What test would be appropriate to identify such pairs?
>
> In my naive understanding, would a Fisher's exact test give such
> combinations?
>
> Thanks a lot.
>
> -Adrian

I am having trouble visualizing your data. How about a sample? The
easy way is to do something like:

temp <- head(realData,10);
dput(temp);

Then cut'n'paste the output from the dput() into another email here.

But, assuming I have a bit of a grasp, you have four columns (example
only shows 3). If you have a set of columns which are 0 & 1 or FALSE
and TRUE, then you can create a "temp" column which encodes them
simply by considering them to be binary digits in a number. I.e.
tempColumn = 1 * column1 + 2 * column2 + 4 * column3 + 8 * column4. You
can then "group" the data by this value. All rows with the same value
are in the same "group". But I don't know what you want your output to
look like. As an aside, any value other than 0, 1, 2, 4, or 8 could be
considered invalid because it means that more than one column is TRUE,
which violates your constraint.
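
A minimal sketch of that encoding, assuming the flag columns are 0/1
and using only the two flag columns from the posted example (so the
weights are just 1 and 2). The names dat, flagCols, code and valid are
made up for illustration, and the data frame is retyped from the table
in the question rather than taken from the real data:

```r
# Hypothetical data frame retyped from the example table in the question.
dat <- data.frame(
  Cluster    = rep(1:2, each = 5),
  Gene       = rep(paste0("G", 1:5), times = 2),
  Mutated    = c(1, 1, 0, 0, 1, 0, 1, 1, 0, 1),
  NotMutated = c(0, 0, 1, 1, 0, 1, 0, 0, 0, 0)
)

# Columns to treat as binary digits (illustrative name).
flagCols <- c("Mutated", "NotMutated")

# One weight per flag column: 1, 2, 4, 8, ... for as many columns as given.
weights <- 2^(seq_along(flagCols) - 1)

# Encode each row as a single number: sum of weight * flag.
dat$code <- as.vector(as.matrix(dat[, flagCols]) %*% weights)

# Rows with the same code end up in the same group.
groups <- split(dat, dat$code)

# A code that is neither 0 nor exactly one of the weights means that
# more than one flag column was set in that row.
dat$valid <- dat$code %in% c(0, weights)
```

With more flag columns, adding their names to flagCols extends the
weights automatically. Whether grouping by this code gives the output
you are after is still an open question.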


-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown


