[R] Fisher

Ambrosini Alessandro klavan at tiscalinet.it
Fri May 24 13:06:10 CEST 2002

```
Hello.
I have a large collection of Web pages, which I have divided into
clusters. Each page is either relevant or not relevant. I made a table:
            relevant    non relevant
cluster1        1            20
cluster2        0            15
cluster3        3            35
    .           .             .
    .           .             .
    .           .             .
In cluster1 I have 21 Web pages: 1 relevant and 20 not relevant.
I want to find out whether relevant pages tend to concentrate in some
clusters, that is, whether there is a dependence between relevance and
cluster. The problem is that I have few relevant pages per cluster: 1, 2, 3,
at most 5 per cluster, so I can't use Pearson's chi-square test.
Tell me one thing: suppose I have
> a
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,]    0    0    1    0    0    0    0    0    0     1     0
[2,]   21   20   33   17   12   18   12   10   11    10    28
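
# A minimal sketch for anyone who wants to reproduce this: the matrix
# above can be entered with rbind(), with the counts copied from the
# printout (row 1 = relevant pages, row 2 = second row of the printout).
a <- rbind(c( 0,  0,  1,  0,  0,  0,  0,  0,  0,  1,  0),
           c(21, 20, 33, 17, 12, 18, 12, 10, 11, 10, 28))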

In this case each column is a cluster, and the first row holds the number of
relevant pages
...
If I do

fisher.test(a)

Fisher's Exact Test for Count Data

data:  a
p-value = 0.5611
alternative hypothesis: two.sided

How can I interpret this output? How should I read the p-value? Do I have to
compare it with something? In the case of perfect dependence, is the p-value
equal to 1? I can't solve this problem by myself, and my work cannot go on
until I solve it.
Thank you
Alessandro

