[Rd] [R] Semantics of sequences in R

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Mon Feb 23 13:27:08 CET 2009


Berwin A Turlach wrote:

<snip>

>> can you give one concrete example, and suggest how to estimate how
>> much old code would involve the same issue?
>>     
>
> Check out the svn source of R, run configure, do whatever change you
> want to sort.list, "make", "make check FORCE=FORCE".  That should give
> you an idea how much would break.  
>   

it's not just making changes to sort.list, berwin.  sort.list calls
.Internal order, and this one would have to be modified in order to
accommodate the additional comparator argument.  not that it is
impossible or even difficult, i just haven't found time yet to learn how
to implement internal functions in r.

> Additionally, you could try to install all CRAN packages with your
> modified version and see how many of them break when their
> examples/demos/&c is run.  
>   

that's not a good benchmark;  this is third-party stuff, and where
people are willing to rely on poor design they should be prepared to
suffer.  but maybe not in r, where protection of old code seems more
important than progress.


> AFAIK, Brian is doing something like this on his machine.  I am sure
> that if you ask nicely he will share his scripts with you.
>   

:)


> If this sounds too time consuming, you might just want to unpack the
> sources and grep for "sort.list" on all .R files;  I am sure you know
> how to use find and grep to do this.
>   

of course i've done it in the first place;  there are 52 such entries in
r-devel, and i can't see any where allowing sort.list to sort lists
would break the code.  it does not mean, of course, that it wouldn't.

>   
>>> Also, until reading Patrick Burns' "The R Inferno" I was not aware
>>> of sort.list.  That function had not registered with me since I
>>> hardly used it.  
>>>       
>> which hints that "potentially will break a lot (?) of existing code"
>> is a rather unlikely event.
>>     
>
> Only for code that I wrote; other people's need and knowledge of R may
> vary.
>   

hence 'hints', not 'proves'.

>  
>   
>>> And I also have no need of calling sort() on lists.  For me a
>>> list is a flexible enough data structure such that defining a
>>> sort() command for them makes no sense; it could only work in very
>>> specific circumstances.
>>>   
>>>       
>> i don't understand the first part:  "flexible enough data structure
>> such that defining a sort() command for them makes no sense" makes no
>> sense.
>>     
>
> lists are a very flexible structure whose components need not be of
> equal type.  So how do you want to compare components?  How do you
> compare a vector of numbers to a vector of character strings?  Or a
> list of lists?  
>   

*very* easy:  by applying a suitable comparator.

in your specific example, the possibilities are virtually limitless. 
you can convert the numbers to strings, or parse the strings into
numbers.  you can compute the lengths of the lists, compare their
elements pairwise, or whatever you wish.  all this done by a comparator
which is a function fulfilling just one requirement:  that it returns,
say, -1, 0, or 1, depending on how the two items it gets compare.  (and
it has to work for the type of items you happen to have in your lists --
all this is *your* business, not sort's.)
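
to make this concrete, here is a minimal sketch in r.  order_with and
sort_with are hypothetical helper names, not anything in base r, and the
insertion sort is there only for brevity;  the point is that the
comparator alone decides what the ordering means.

```r
# hypothetical helpers for illustration; neither exists in base r.
# cmp(a, b) must return a negative number, 0, or a positive number.
order_with <- function(x, cmp) {
  idx <- seq_along(x)
  # plain insertion sort on the index vector: stable, O(n^2)
  for (i in seq_along(x)[-1]) {
    j <- i
    while (j > 1L && cmp(x[[idx[j - 1L]]], x[[idx[j]]]) > 0) {
      idx[c(j - 1L, j)] <- idx[c(j, j - 1L)]  # swap neighbours
      j <- j - 1L
    }
  }
  idx
}

# sorting is just indexing by the permutation, so names are kept
sort_with <- function(x, cmp) x[order_with(x, cmp)]

# one possible comparator: compare items by their length
by_length <- function(a, b) sign(length(a) - length(b))

# a list mixing numbers, strings, a nested list and an empty vector
items <- list(c(3, 1, 2), "a", list(1, 2), character(0))
sapply(sort_with(items, by_length), length)  # 0 1 2 3
```

a comparator that converts numbers to strings, or parses strings into
numbers, would plug into sort_with in exactly the same way.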

judging from your question, you can't possibly have seen sorting
routines in other languages.


> Or should the sorting be on the length of the components?  

why not?

> Or their
> names?  

why not?

> Or should sort(myList) sort each component of myList?  

that's a design decision.  you can always have a parameter like
recursive=TRUE/FALSE, no?  so you could sort or not both between and
within lists.  what's the problem, again?
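
a rough sketch of what such a parameter could look like.  sort_list and
its recursive argument are assumptions made up for this illustration,
not an existing r interface, and using the ordinary sort() on atomic
components is just one possible choice.

```r
# hypothetical sketch; sort_list() is not part of base r
sort_list <- function(x, cmp, recursive = FALSE) {
  if (recursive) {
    # sort *within* components first: nested lists recursively,
    # atomic vectors with the ordinary sort()
    x <- lapply(x, function(el) {
      if (is.list(el)) sort_list(el, cmp, recursive = TRUE)
      else if (is.atomic(el)) sort(el)
      else el
    })
  }
  # then sort *between* components, using the comparator
  # (insertion sort on an index vector, chosen only for brevity)
  idx <- seq_along(x)
  for (i in seq_along(x)[-1]) {
    j <- i
    while (j > 1L && cmp(x[[idx[j - 1L]]], x[[idx[j]]]) > 0) {
      idx[c(j - 1L, j)] <- idx[c(j, j - 1L)]
      j <- j - 1L
    }
  }
  x[idx]
}

by_length <- function(a, b) sign(length(a) - length(b))
x <- list(c(3, 1, 2), c(9, 8), 5)

sort_list(x, by_length)                    # 5; 9 8; 3 1 2
sort_list(x, by_length, recursive = TRUE)  # 5; 8 9; 1 2 3
```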

> But for
> that case we have already lapply(myList, sort).
>   

so?

>   
>> as to "it could only work in very specific circumstances" -- no, it
>> would work for any list whatsoever, provided the user has a correctly
>> implemented comparator.  for example, i'd like to sort a list of
>> vectors by the vectors' length -- is this a very exotic idea?
>>     
>
> No, if that is what you want.  And I guess it is one way of sorting a
> list.  The question is what should be the default way?  
>   

one possible answer is: none.  (i have already given this answer
previously, if you read carefully it's still there).  sort.list *should*
demand an additional comparator argument.  at least, it should demand it
if the argument to be sorted is a list, rather than a non-list vector
(if you still need to use sort.list on non-lists).
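
as a sketch of that rule, under the assumption that a wrapper is
acceptable for illustration:  strict_sort is a hypothetical name, and it
returns the sorted object rather than a permutation.  non-list vectors
take the usual route through order();  lists are refused unless a
comparator comes with them.

```r
# hypothetical wrapper, not an existing r function
strict_sort <- function(x, cmp, ...) {
  # non-list vectors: no comparator needed, use base order()
  if (!is.list(x)) return(x[order(x, ...)])
  # lists: there is no default ordering, so insist on a comparator
  if (missing(cmp)) stop("sorting a list requires a comparator")
  idx <- seq_along(x)
  for (i in seq_along(x)[-1]) {  # insertion sort, for brevity only
    j <- i
    while (j > 1L && cmp(x[[idx[j - 1L]]], x[[idx[j]]]) > 0) {
      idx[c(j - 1L, j)] <- idx[c(j, j - 1L)]
      j <- j - 1L
    }
  }
  x[idx]
}

by_length <- function(a, b) sign(length(a) - length(b))

strict_sort(c(3, 1, 2))                    # 1 2 3
strict_sort(list(c(1, 2), 3), by_length)   # 3; 1 2
# strict_sort(list(1, 2))                  # error: comparator required
```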

>   
>>> BTW, as I mentioned once before, you might want to consider losing
>>> these chips on your shoulders.
>>>   
>>>       
>> berwin, it's been a tradition on this list to discourage people from
>> commenting on the design and implementation of r whenever they think
>> it's wrong.  
>>     
>
> I am not aware of any such tradition and I subscribed to R-help on 15
> April 1998.  
>
> The point is rather that by commenting only one will not achieve much,
> in particular if the comments look more like complaints and the same
> comments are done again and again (along with dragging up previous
> comments or comments received on previous comments).
>   

again and again because you seem to be immune to critique.  open your
mind, and it will suffice to complain just once.  besides, i am certainly
*not* just complaining.  i am providing concrete arguments, examples,
and suggestions.  you're being unreasonably unfair.

> R is open source.  Check out the svn version, fix what you consider
> needs fixing, submit a patch, convince R core that the patch fixes a
> real problem/is an improvement/does not break too much.  Then you have
> a better chance in achieving something.  
>   

no, berwin.  this is a serious bug in thinking.  people should be
allowed -- *encouraged* -- to discuss the design *before* they even
attempt to write patches.  writing one patch which will never be
considered -- well, never responded to -- is about enough to stop people
from sending patches.  maybe that's what you want, anyway -- the fewer
incoming patches the more you're entitled to think your product is just
great.


> Alternatively, if it turns out that something that bugs you cannot be
> changed without breaking too much existing code, start from scratch
> with a better design.  Apparently the GAP project
> (http://www.gap-system.org/) is doing something like this, as
> someone closely associated with that project once told me.  While
> developing a version of GAP they collect information on how to improve
> the design, data structures &c; then, at some point, they start to
> write the next version from scratch.
>   

can't see it online.


>   
>   
>>>> scary!  it's much preferred to confuse new users.
>>>>         
>>> I usually learn a lot when I get confused about some issues/concept.
>>> Confusion forces one to sit down, think deeply and, thus, gain some
>>> understanding.  So I am not so much concerned with new users being
>>> confused.  It is, of course, a problem if the new user never comes
>>> out of his or her confusion.
>>>       
>> the problem, is, r users have to learn lots [...]
>>     
>
> Indeed, and I guess in this age of instant gratification that that is a
> real bummer for new users.
>   

why be rude to your users?  "I am not so much concerned with new users
being confused." is an explanation -- but maybe you really should?

vQ


