[R] Mann-Whitney U

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Aug 15 01:23:38 CEST 2007


Prof Brian Ripley wrote:
> On Tue, 14 Aug 2007, Natalie O'Toole wrote:
>
>   
>> Hi,
>>
>> Could someone please tell me how to perform a Mann-Whitney U test on a
>> dataset with 2 groups where one group has more data values than another?
>>
>> I have split up my 2 groups into 2 columns in the .txt file I'm using with
>> R. Here is the code I have so far...
>>
>> group1 <- c(LeafArea2)
>> group2 <- c(LeafArea1)
>> wilcox.test(group1, group2)
>>
>> This code works for datasets with the same number of data values in each
>> column, but not when there is a different number of data values in one
>> column than another column of data.
>>     
>
> There is an example of that scenario on the help page for wilcox.test, so 
> it does 'work'.  What exactly went wrong for you?
>
>   
>> Is the solution that I have to have a null value in the data column with
>> the fewer data values?
>>
>> I'm testing for significant differences between the 2 groups, and the
>> result I'm getting in R with the uneven values is different from what I'm
>> getting in SPSS.
>>     
>
> We need a worked example.  As the help page says, definitions do differ. 
> If you can provide a reproducible example in R and the output from SPSS we 
> may be able to tell you how to relate that to what you see in R.
>
> [...]
>
>   
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>     
>
> As it says, we really need such code (and the output you get) to be able 
> to help you.
>
>   
Also, storing "two variables of different length in two columns" is not a 
good idea. Data read in as parallel columns would usually imply pairing. 
If one column is shorter, you may be reading different data than you 
think. Check e.g. the "sleep" data for a better format: one column holds 
the values and another identifies the group.
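
A minimal sketch of that layout, assuming two groups of unequal size. The 
data frame "leaf", its columns "area" and "group", and all the numbers are 
invented for illustration; they are not the original poster's data.

```r
# Hypothetical leaf-area measurements: 6 values in group A, 3 in group B.
# One row per observation, with a factor saying which group it belongs to.
leaf <- data.frame(
  area  = c(12.1, 15.3, 11.8, 14.6, 13.9, 16.2, 10.4, 9.8, 11.1),
  group = factor(c(rep("A", 6), rep("B", 3)))
)

# Confirm the group sizes are what you expect before testing.
table(leaf$group)

# Formula interface: response ~ grouping factor. Unequal sizes are fine.
res <- wilcox.test(area ~ group, data = leaf)
print(res)

# The two-vector form gives the same result; the vectors need not have
# the same length, and no padding with missing values is required.
a <- leaf$area[leaf$group == "A"]
b <- leaf$area[leaf$group == "B"]
res2 <- wilcox.test(a, b)

# The built-in "sleep" data uses the same one-value-per-row layout.
head(sleep)
```

With the data in this form there is no shorter column to pad, so a file 
read with read.table() cannot silently misalign the two groups.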


