[R] ecdf

gj gawesh at gmail.com
Mon Oct 17 02:48:41 CEST 2011


David is right. I am looking for the ECDF of fs$numstudents. The
other column is just an id.

I guess I don't know how to read the R documentation when it comes to functions.

Looking at the documentation, I now notice that it says "Compute an
empirical cumulative distribution function", and not a vector.

But I would still have assumed that in ecdf(x) ... the x is the argument.

So ecdf(fs$numstudents)(unique(fs$numstudents))
   =================== ========================
        function              arguments
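
To check my reading of that line, here is a minimal sketch that splits it into two steps. It uses only the ten example rows from my original post rather than the real CSV file, and the names Fn, pts and probs are ones I made up for illustration.

```r
# Example rows from the original post (an assumption: not the full data set).
numstudents <- c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)

# Step 1: ecdf(x) takes the data and returns a function, not a vector.
Fn <- ecdf(numstudents)
is.function(Fn)

# Step 2: call the returned function on the points of interest.
pts <- sort(unique(numstudents))
probs <- Fn(pts)

# The one-line form ecdf(x)(pts) does both steps at once.
identical(probs, ecdf(numstudents)(pts))

# Show each distinct value next to its cumulative proportion.
data.frame(numstudents = pts, ecdf = probs)
```

If this is right, ecdf(x) on its own is a complete call; its result just happens to be another function that still has to be called.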

Yes? But I can't read that from the documentation. I suspect it has
something to do with those dots (...) in the arguments, which I don't
understand.

Why does it say the usage is ecdf(x) when that's clearly not the case?

I don't get it.

Gawesh


On Sun, Oct 16, 2011 at 11:02 PM, David Winsemius
<dwinsemius at comcast.net> wrote:
>
> On Oct 16, 2011, at 3:53 PM, Dennis Murphy wrote:
>
>> Hi:
>>
>> I don't understand what you're attempting to do. Wouldn't courseid be
>> a categorical variable with a numeric label? If that is so, why are
>> you trying to compute an EDF? An EDF computes cumulative relative
>> frequency of a random variable, which by definition is numeric. If we
>> were talking about EDFs for a distribution of student course grades on
>> a numeric point system by course, that would make some sense, but I
>> don't see how the course IDs themselves qualify as being on an
>> interval scale of measurement. Could you clarify your intent?
>
> Huh? gawesh asked for ecdf on numstudents (not courseid)  ... pretty
> clearly a numeric value for which an ECDF should make sense.
>
> --
> David.
>
> --
>>
>> Dennis
>>
>> On Sun, Oct 16, 2011 at 8:31 AM, gj <gawesh at gmail.com> wrote:
>>>
>>> Hi,
>>> Newbie here. I read R for Beginners but I still don't get this.
>>>
>>> I have the following data (this is just an example) in a CSV file:
>>>
>>>   courseid numstudents
>>>       101         209
>>>       141          13
>>>       246         140
>>>       263           8
>>>       321          10
>>>       361          10
>>>       364          28
>>>       365          25
>>>       366          23
>>>       367          34
>>>
>>> I load my data using:
>>>
>>> fs <- read.csv(file="C:\\num_students_inallmodules.csv", header=TRUE, sep=',')
>>>
>>> I want to get the ecdf. So, I looked at the ?ecdf which says
>>> usage:ecdf(x)
>>>
>>> So I expected ecdf(fs$numstudents) to work
>>>
>>> Instead it just returned:
>>> Call: ecdf(fs$numstudents)
>>>  x[1:210] =      1,      2,      3,  ...,   3717,   4538
>>>
>>> After Googling, got this to work:
>>> ecdf(fs$numstudents)(unique(fs$numstudents))
>>>
>>> But I don't understand why if the ?ecdf says usage is ecdf(x) ... I
>>> need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this
>>> to work?
>>>
>>> Can somebody explain this to me?
>>>
>>> Regards
>>> Gawesh
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>


