[R] Help with significance. T-test?

Tue Jul 28 18:30:45 CEST 2009

Look up the McNemar test. It is meant for paired right/wrong outcomes like
yours, so that sounds right...
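
For reference, here is a minimal sketch of an exact McNemar test in Python, using only the standard library. It assumes the two result vectors are paired question by question. The function name `mcnemar_exact` is made up for this illustration, and the ten values are the sample quoted in the original message below. Only the questions on which the two versions disagree carry information; under the null hypothesis each disagreement is equally likely to favour either version. In R, `mcnemar.test()` in the stats package offers the chi-squared form of the same test.

```python
from math import comb


def mcnemar_exact(results_a, results_b):
    """Two-sided exact McNemar test for paired 0/1 outcomes.

    Returns (b, c, p_value), where b counts questions only version A
    answered correctly and c counts questions only version B answered
    correctly.
    """
    if len(results_a) != len(results_b):
        raise ValueError("results must be paired, one entry per question")

    # Discordant pairs: the two versions disagree on these questions.
    b = sum(1 for x, y in zip(results_a, results_b) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(results_a, results_b) if x == 0 and y == 1)
    n = b + c
    if n == 0:
        # No disagreements at all, so there is nothing to test.
        return b, c, 1.0

    # Under H0 the smaller count follows Binomial(n, 0.5);
    # double the lower tail for a two-sided p-value, capped at 1.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Sample values from the original message (illustration only).
results1 = [0, 1, 1, 1, 0, 1, 1, 0, 1, 0]
results2 = [0, 0, 1, 1, 0, 0, 1, 1, 1, 0]

b, c, p = mcnemar_exact(results1, results2)
print(b, c, p)
```

With only ten questions there are just three disagreements, so the test cannot show anything; on the full 500 questions the discordant counts would be what decides the outcome.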

Daniel 

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of mik07
Sent: Tuesday, July 28, 2009 10:49 AM
To: r-help at r-project.org
Subject: [R] Help with significance. T-test?

Hi,

I think this is more of a general statistics question.

I am working on a system that automatically answers user questions (such
systems are commonly called "Question Answering systems").
I evaluated different versions of the same system on a publicly available
test set.
This set contains 500 questions. For each question, the answer is either
wrong or right, coded as "0" (wrong) or "1" (correct). Adding up all the
values and dividing by the number of questions in the test set (500) gives
a measure of how well the system performs, commonly called accuracy.
As mentioned, I evaluated two different versions of the system and obtained
two different accuracy values. Now I want to know whether the difference is
statistically significant.

Can I use a t-test? I know it has certain requirements, for example a
somewhat normal distribution. That's difficult of course when the values in
question are only "0" and "1"...

Does anybody have any ideas?

Thanks a lot,
Mika

PS:

The data I have looks something like this (of course I actually have 500
values, not only 10):

results1:  0,1,1,1,0,1,1,0,1,0    accuracy: 0.6
results2:  0,0,1,1,0,0,1,1,1,0    accuracy: 0.5
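
The accuracy figures above, and the paired tabulation that a test for paired outcomes would need, can be reproduced with a few lines of Python. This is only a sketch on the ten sample values; the variable names are illustrative, and it assumes both lists are in the same question order.

```python
from collections import Counter

# The ten sample values shown above (the real data has 500 entries).
results1 = [0, 1, 1, 1, 0, 1, 1, 0, 1, 0]
results2 = [0, 0, 1, 1, 0, 0, 1, 1, 1, 0]

# Accuracy is the share of questions answered correctly.
accuracy1 = sum(results1) / len(results1)
accuracy2 = sum(results2) / len(results2)
print(accuracy1, accuracy2)

# Cross-tabulate the outcomes per question: (version 1, version 2).
pairs = Counter(zip(results1, results2))
for outcome in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(outcome, pairs[outcome])
```

The (1, 0) and (0, 1) cells count the questions on which the two versions disagree; the two accuracy values differ only because of those questions.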