[R] Comparing distributions

Thu Jun 24 04:52:49 CEST 2010

The diagram only serves as a rough example to give you an idea.

To be more precise, here is more detail: the data represent movements
from two types of pointing device (e.g. mouse, pointer) along an axis.
The data come with different parameters, such as different pointing
devices, different axes, and splits by different experimental
conditions, but the problem is always the same: I would like to find
out whether their distributions are similar, and I would like some kind
of 'objective' measure or test for that. (Yes, I know, nothing is
objective, but eye-balling isn't either.) These would be accompanied by
Q-Q plots and density plots to give a general feeling of what is going
on, and would become part of the discussion. I don't expect a solution
from here, but perhaps a general direction in which my kind of problem
is understood.
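
One possible direction, as a rough sketch only: compare the empirical
CDFs with a two-sample Kolmogorov-Smirnov test, estimate the shift with
the Hodges-Lehmann estimator that wilcox.test() reports when
conf.int = TRUE, and then check whether the confidence interval for the
shift falls inside a chosen band. The data below are simulated
stand-ins, and the names rmix, band and within_band as well as the band
width of 100 are assumptions made purely for illustration. The band
check is an informal equivalence-style check, not a calibrated test.

```r
## Simulated stand-ins for the two samples (hypothetical data, not the
## real measurements): a bimodal mixture and a copy shifted by 50 units.
set.seed(1)
n <- 10000
rmix <- function(n) {
  ifelse(runif(n) < 0.4, rnorm(n, 200, 30), rnorm(n, 400, 40))
}
x <- rmix(n)
y <- rmix(n) + 50

## Kolmogorov-Smirnov: largest vertical gap between the empirical CDFs.
D <- unname(ks.test(x, y)$statistic)

## Hodges-Lehmann estimate of the location shift (y - x) with a 95% CI.
## A subsample keeps the pairwise-difference computation cheap.
idx <- sample(n, 2000)
wt <- wilcox.test(y[idx], x[idx], conf.int = TRUE)
shift <- unname(wt$estimate)
ci <- wt$conf.int

## After removing the estimated shift, do the shapes still differ?
D_aligned <- unname(ks.test(x, y - shift)$statistic)

## 'Band' check: is the whole CI for the shift inside +/- band units?
band <- 100  # assumed tolerance, pick one that suits the experiment
within_band <- ci[1] > -band && ci[2] < band

## Visual companions: empirical CDFs and a Q-Q plot.
plot(ecdf(x), main = "Empirical CDFs", do.points = FALSE)
plot(ecdf(y), add = TRUE, col = "red", do.points = FALSE)
qqplot(x, y)
abline(0, 1, lty = 2)
```

If D_aligned is much smaller than D, most of the difference between the
two samples is a plain shift. In the Q-Q plot a pure shift shows up as
points running parallel to the dashed identity line.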

Ralf

On Wed, Jun 23, 2010 at 10:07 PM, Robert A LaBudde <ral at lcfltd.com> wrote:
> Your "*" curve apparently dominates your "+" curve.
>
> If they each contain the same total number of data points, as you say,
> they cannot both sum to the same value (e.g., N = 10000 or 1.000).
>
> So there is something going on that you aren't mentioning.
>
> Try comparing CDFs instead of pdfs.
>
> At 03:33 PM 6/23/2010, Ralf B wrote:
>>
>> I am trying to do something in R and would appreciate a push in the
>> right direction. I hope some of you experts can help.
>>
>> I have two distributions, each obtained from about 10000 data points.
>> Both are non-normal with a multimodal shape (judging by eye-balling
>> the densities), but other than that I know little about the
>> distributions. When plotting the two together I can see
>> that the two densities are alike, separated by a certain distance
>> (e.g. 50 units on the X axis). I tried to draw a simplified picture of
>> the density plot below:
>>
>>
>>
>>
>> |
>> |                                                         *
>> |                                                      *     *
>> |                                                   *    +   *
>> |                                              *     +     +  *
>> |                     *        +           *   +            +  *
>> |                 *        +*     +   *  +                   + *
>> |              *       +       *     +                           +*
>> |           *       +                                               +*
>> |        *       +                                                    +*
>> |     *      +                                                          + *
>> |  *      +                                                                 + *
>> |___________________________________________________________________
>>
>>
>> What I would like to do is to formally test their similarity, or
>> otherwise measure it more reliably than by just showing and discussing
>> a plot. Is there a general approach other than a Mann-Whitney test,
>> which is very strict and seems to assume a perfect match? Is there a
>> test that allows for a certain 'band' (e.g. 50, 100, 150 units on X),
>> or are there any other similarity measures that could give me a
>> statistic for how close these two distributions are to each other? All
>> I can say from eye-balling is that they seem to follow each other, and
>> it appears that one distribution is shifted by some amount from the
>> other. Any ideas?
>>
>> Ralf
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ================================================================
> Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: ral at lcfltd.com
> Least Cost Formulations, Ltd.            URL: http://lcfltd.com/
> 824 Timberlake Drive                     Tel: 757-467-0954
> Virginia Beach, VA 23464-3239            Fax: 757-467-2947
>
> "Vere scire est per causas scire"
> ================================================================
>
>