[Rd] [R] Semantics of sequences in R

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Mon Feb 23 11:31:16 CET 2009


Berwin A Turlach wrote:
> On Mon, 23 Feb 2009 08:52:05 +0100
> Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>  
>   
>> Berwin A Turlach wrote:
>>     
>>> G'day Stavros,
>>>       
>> <snip>
>>     
>>>> In many cases, the orthogonal design is pretty straightforward.
>>>> And in the cases where the operation is currently an error (e.g.
>>>> sort(list(...))), I'd hope that wouldn't break existing code. [...]
>>>>     
>>>>         
>>> This could actually be an example that would break a lot of existing
>>> code.
>>>
>>> sort is a generic function, and for sort(list(...)) to work, it
>>> would have to dispatch to a function called sort.list; and as
>>> Patrick Burns' "The R Inferno" points out, such a function exists
>>> already and it is not for sorting list.  
>>>   
>>>       
>> and you mean that sort.list not being applicable to lists is a) good
>> design, and b) something that by no means should be fixed, right?
>>     
>
> I neither said nor meant this and I do not see how what I said could be
> interpreted in such a way.  I was just commenting to Stavros that the
> example he picked, hoping that it would not break existing code, was
> actually a bad one which potentially will break a lot (?) of existing
> code.
>   

would it, really?  if sort.list were, in addition to sorting atomic
vectors (can-be-considered-lists), able to sort lists, how likely would
this be to break old code?  can you give one concrete example, and
suggest how to estimate how much old code would involve the same issue?

sort.list, to be applied to an atomic vector, must be called explicitly
on the vector, because calling sort will not automatically dispatch to
sort.list (right?).  so allowing sort.list to sort lists does not
change anything in this respect.  the one exception is that, as i
suggested, if sort.list required an explicit comparator, you'd have to
add one wherever sort.list is called;  but to accommodate old code,
sort.list could check whether the argument is an atomic vector and
require the comparator only when it is not.
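
a minimal sketch of such a check, for illustration only.  the helper
name order_with, its less argument, and the choice to return a
permutation (as sort.list does for atomic vectors) are assumptions made
here, not anything in base r:

```r
# hypothetical helper, not part of base R: name and signature are illustrative
order_with <- function(x, less = NULL) {
  if (is.atomic(x)) {
    # old code path: atomic input behaves exactly as sort.list does today
    return(sort.list(x))
  }
  if (!is.function(less)) {
    stop("a comparator is required for non-atomic input")
  }
  idx <- seq_along(x)
  # stable insertion sort on the index vector, driven by the comparator
  for (i in seq_along(idx)[-1]) {
    j <- i
    while (j > 1 && less(x[[idx[j]]], x[[idx[j - 1]]])) {
      tmp <- idx[j]
      idx[j] <- idx[j - 1]
      idx[j - 1] <- tmp
      j <- j - 1
    }
  }
  idx
}

# atomic input: no comparator needed, same permutation as sort.list
p_atomic <- order_with(c(3, 1, 2))

# list input: the caller supplies the comparator
xs <- list(1:3, 1, 1:2)
p_list <- order_with(xs, function(a, b) length(a) < length(b))
```

existing calls on atomic vectors would keep working unchanged under this
scheme;  only calls on lists, which currently fail, would need the extra
argument.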

how much old code could be relying on the fact that sort.list raises an
error when given a list?  i suspect it's fairly unlikely that any single
piece of code does;  and if none does, allowing sort.list to sort lists
would not change anything here either.




> Also, until reading Patrick Burns' "The R Inferno" I was not aware of
> sort.list.  That function had not registered with me since I hardly
> used it.  

which hints that "potentially will break a lot (?) of existing code" is
a rather unlikely event.

> And I also have no need of calling sort() on lists.  For me a
> list is a flexible enough data structure such that defining a sort()
> command for them makes no sense; it could only work in very specific
> circumstances.
>   

i don't understand the first part:  "flexible enough data structure such
that defining a sort() command for them makes no sense" makes no sense.

as to "it could only work in very specific circumstances" -- no, it
would work for any list whatsoever, provided the user has a correctly
implemented comparator.  for example, i'd like to sort a list of vectors
by the vectors' length -- is this a very exotic idea?
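
for illustration, the same result can be had today without a comparator,
by computing a key per element and indexing with order.  this is a
key-based workaround, not the comparator interface discussed above, and
the data are made up:

```r
# made-up example data: a list of vectors of different lengths
xs <- list(a = 1:4, b = c(2.5, 1.0), c = letters[1:3])

# key: the length of each element
len <- vapply(xs, length, integer(1))

# reorder the list by that key, shortest first
sorted <- xs[order(len)]
names(sorted)
```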


>   
>>> In fact, currently you get:
>>>
>>> R> cc <- list(a=runif(4), b=rnorm(6))
>>> R> sort(cc)
>>> Error in sort.list(cc) : 'x' must be atomic for 'sort.list'
>>> Have you called 'sort' on a list?
>>>   
>>>       
>> one of the most funny error messages you get in r.  note also that,
>> following rolf turner's lists and vectors unproven theorem, a vector
>> can be considered a list 
>>     
>
> I do not remember the exact context of Rolf's comments, but I believe
> he was talking in a more general sense and not in technical terms. 

indeed, he was blurring the concepts instead of referring to concrete
documentation with clearly specified meanings of the terms he used.


>  I
> find it perfectly valid, even when talking about R, to say something
> like "vectors are stored as a list of numbers in consecutive memory
> locations in memory".  

yes;  and you can always say that 'vectors can be considered electrical
charges', or better, 'vectors can be considered electrical charges, in
some sense'.

what sense of 'list' are you using here?  i'd rather use the term
'array', unless confusing the user is the real purpose.  (and to be
really picky, you do not store numbers.)


> Clearly, in a phrase like this, we are not
> talking about "vectors" and "list" as defined by the "R Language
> Definition" or "R Internals", or what functions like is.vector(),
> is.list() &c return for various R objects.
>   

clearly, you can say anything you like, and then add 'i was not talking
about x as defined by y'.  the art is to talk about x as defined by y.
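
as a quick illustration of the terms as r's own predicates report them
(a sketch;  the two objects are made up):

```r
v <- c(1.5, 2.5, 3.5)       # atomic (double) vector
l <- list(1.5, "a", TRUE)   # generic vector, i.e. a list

# an atomic vector is a vector but not a list
v_flags <- c(vector = is.vector(v), list = is.list(v), atomic = is.atomic(v))

# a list is a vector and a list, but not atomic
l_flags <- c(vector = is.vector(l), list = is.list(l), atomic = is.atomic(l))

v_flags
l_flags
```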

> BTW, as I mentioned once before, you might want to consider to lose
> these chips on your shoulders.
>   

berwin, it's been a tradition on this list to discourage people from
commenting on the design and implementation of r whenever they think
it's wrong.  you really should be doing the opposite.  as a chinese
proverb says, a gem cannot be polished without friction.  friction seems
to be what you fear a lot.


>   
>> -- hence sort.list should raise the error on any vector input, no?
>>     
>
> You will have to take that up with the designers of sort.list.
>   
>   
>>> Thus, to make sort(list()) work, you would have to rename the
>>> existing sort.list and then change every call to that function to
>>> the new name. I guess this might break quite a few packages on CRAN.
>>>   
>>>       
>> scary!  it's much preferred to confuse new users.
>>     
>
> I usually learn a lot when I get confused about some issues/concept.
> Confusion forces one to sit down, think deeply and, thus, gain some
> understanding.  So I am not so much concerned with new users being
> confused.  It is, of course, a problem if the new user never comes out
> of his or her confusion.
>   

the problem is, r users have to learn lots and lots of *bad* and
*messy* design to get up and running without devils catching them behind
every corner.  in principle, you're absolutely right;  the problem lies
in the amount of effort a user has to make to avoid confusion while
using r (where 'using' means a bit more than simply fitting and
plotting a model).

cheers,
vQ


