[R] two questions for R beginner

David Winsemius dwinsemius at comcast.net
Wed Mar 3 03:53:53 CET 2010


On Mar 2, 2010, at 8:01 AM, Paul Hiemstra wrote:

> Brandon Zicha wrote:
>> Hey Paul,
> Hey Brandon, (adding R-help in the cc)
>
> I agree with you that the documentation of R could be better,  
> especially with more examples in code showing not only the common  
> cases, but also more esoteric cases. It would be great if everyone  
> invested a lot of time to write awesome documentation, but this is  
> not the case. I just objected to the tone (I thought :)) I spotted.  
> Some more comments are inline:
>>
>> Accepting the main point of my post - that the often VERY  
>> incomplete help files appended to packages can be a major stumbling  
>> block for getting up and running in R - I take your point.  I  
>> probably went a bit too far with my language there.
>>
>> I would point out though that a great many parts of research (like  
>> writing a bibliography - or searching for citations of any kind  
>> usually) aren't much fun, but are an important part of research  
>> related work.  Likewise, complete documentation (by which I hardly  
>> mean a paper - looking at STATA help files as a minimum would be a  
>> good start) is part of programming.  I agree that one needs to  
>> employ some level of judgement, otherwise you will get a help file  
>> that says "First turn on the computer... then click the 'R'  
>> Icon...."  But, I have myself created one or two STATA functions  
>> that I have put up for public use - so I know how not fun, but  
>> necessary complete documentation is.  Further, I didn't say that  
>> writing documentation doesn't take time.  Everything takes time. My  
>> point was that relative to actually creating the application -  
>> writing more complete documentation takes very little time. If one  
>> invests the time to do the 'fun' stuff of writing a new package for  
>> R, it seems reasonable that taking the (proportionately) little  
>> time to write a nicer help file would be the most 'professional'  
>> thing to do.  But, this could be my illusion that all researchers  
>> see themselves as professionals - rather than an anarchic egoistic  
>> enclave of independent self-interested paper producers.
> This is what scientists get judged upon, not on how much software  
> they publish and how good their documentation is. Furthermore, it is  
> quite hard for a hardcore R programmer to judge what people find hard  
> about their software.
>> I am notorious for assuming greater standards as an acceptable  
>> 'norm' than my community at large :-)  Furthermore, you are  
>> absolutely right that my standards are apparently even too high for  
>> many commercial applications!  R help is sometimes downright good!
>>
>> So, if I accept that I am a demanding S.O.B. and tone down my  
>> thoughts of proper documentation and professionalism and adopt the  
>> (probably more) reasonable perspective you do at the end of  "well,  
>> this is the world we live in... and come on it's free" I totally  
>> agree that I probably went too far!  But, better yet, I think that  
>> this observation you make suggests a solution: Perhaps R could use  
>> a more integrated and organized open source help system. I can  
>> think of a few possibilities - the easiest being a wiki version of  
>> R help.  This way users could add useful information to help files  
>> - such as more examples, tricks, tips, and known problems.  This  
>> would take advantage of the open source, free, user-community  
>> centered aspects of R, and permit those with an interest in helping  
>> beginners to post notes for beginners - on the help files.  I know  
>> that if such a wiki existed I would have posted the example  
>> of constrained optimization I did recently.  It wouldn't be too  
>> difficult to add a function wikihelp(X) that would open the wiki  
>> help page rather than the standard help documentation.  Currently,  
>> help on any given command is scattered all over help fora all about  
>> the web.  A central, indexed, and easily referenced help system  
>> might be a solution.  Heck, such a system could go a step further  
>> and link R-help listserv archives by command thus centralizing and  
>> integrating the open-source user-built information resource of the  
>> listserv into help().  How many e-mails to this listserv begin with  
>> 'I just spent a few hours cruising the help forums related to 'X'  
>> and couldn't find an answer.'
> Sounds like a good addition, allowing people to add to the  
> documentation as they see fit. There is of course the R wiki, but  
> this is not widely used and not firmly embedded into R itself. But  
> how would we keep the system you propose manageable and prevent  
> it from becoming an enormous mess? Maybe some kind of moderation?
>>
>> I note that STATA has all their help files for the latest version  
>> of STATA available on the web  
>> (http://www.stata.com/help.cgi?contents).  How difficult would a  
>> similar system - only with R, editable and with links to  
>> supplementary information - be to set up?


I cannot comment on how difficult it would be to set up, but I must  
disagree that it does not exist for R. The default for RSiteSearch is  
Jon Baron's search utility. It appears to have been relatively  
recently modified so that it searches functions and not r-help but in  
that form it addresses your expectations quite well. I suspect that  
the cognoscenti could offer other search strategies that would be  
equally effective.

http://finzi.psych.upenn.edu/search.html
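
For illustration, a minimal sketch of launching that search from the R
prompt. It assumes an internet connection and a configured browser, and
the query string is only an example, not something taken from this thread:

```r
# Query the site search, restricted to function help pages
# (the query text is illustrative only).
RSiteSearch("constrained optimization", restrict = "functions")

# A purely local alternative: search the help pages of installed packages.
help.search("optimization")
```

The first call opens the results in a browser; the second needs no
network access but only sees packages already installed.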

(I will also comment that when I was referred to the Stata  
documentation, my small taste left me thinking R help files were far  
superior to what I found for Stata.)

-- 
David.


>> I can't imagine it would be horribly expensive in terms of set up  
>> costs.
> A problem is that there is no company that markets R that could set  
> this up, the community is much looser, much more open source.  
> Probably the R core team would be the closest thing we have.
>>
>> What do you think?
>>
>> Best,
>>
>> Brandon Z
>>
>>
>> On Mar 2, 2010, at 1:16 PM, Paul Hiemstra wrote:
>>
>>> Brandon Zicha wrote:
>>>>>>> What were your biggest misconceptions or
>>>>>>> stumbling blocks to getting up and running
>>>>>>> with R?
>>>>
>>>> Easy.  In terms of materials I have been unable to find good books  
>>>> that introduce users to R from the perspective of someone  
>>>> familiar only with packages like SPSS or STATA, or not familiar  
>>>> with statistics packages at all.  Even introduction texts use  
>>>> jargon without introducing it.
>>>>
>>>> I think that R-help files should be more thorough than they are,  
>>>> and contain more examples.  I thought that STATA help files were  
>>>> sparse!  The notion that 'R is a user community and thus they do  
>>>> this in their spare time' is no excuse for those creating new  
>>>> tools for R not developing complete help files.  It doesn't take  
>>>> that much time relative to actually creating the new function.
>>> Hi Brandon,
>>>
>>> I would disagree with your point that documentation doesn't take  
>>> much time. Writing documentation that is suitable for both the  
>>> advanced user (being a reference, and thus preferably short) and  
>>> the beginning user (being sort of a tutorial, and thus  
>>> preferably longer) is quite a challenge, comparable to writing a  
>>> good paper. Apart from the fact that it takes quite a while, it is  
>>> also not much fun. Often people develop packages for their own  
>>> research and put the software online so others can benefit; they  
>>> don't need the documentation themselves and don't get paid to  
>>> write the documentation.
>>>
>>> So saying 'it's no excuse' really goes too far in my view. R is  
>>> free; you did not pay several thousand euros entitling you to  
>>> good support. Even the support is free through the  
>>> mailing list. You can get a paid version of R at Revolution  
>>> Computing. Then you can call them if there are problems. I'm not  
>>> meaning to offend anybody, but I didn't agree with "is no excuse  
>>> for those creating new tools for R not developing complete help  
>>> files".  Partly the strength of R is in the open source, but  
>>> sometimes, as with documentation, this can bite you. But I think  
>>> the R docs aren't that bad; I've seen proprietary software that does  
>>> a worse job than R.
>>>
>>> my 2euro on the subject :),
>>>
>>> Cheers,
>>> Paul
>>>>
>>>> In terms of actual R use - creating, using, and manipulating data  
>>>> are the biggest frustration for those of the 'spreadsheet  
>>>> generation'.  I get the impression that one needs to not merely  
>>>> understand, but be fully fluent in the jargon of matrix  
>>>> mathematics to even know what is going on half the time.  I find  
>>>> myself - even now - using 'rules of thumb' that 'seemed to work'  
>>>> rather than fully understanding what I am doing.  It is  
>>>> particularly discouraging when many of those 'intro books'  
>>>> suggest using something besides R for data manipulation - how  
>>>> clumsy is that!?
>>>>
>>>> I find the actual programming syntax itself is the easiest part  
>>>> to master.  It is certainly more flexible - but without a  
>>>> particularly significant increase in complexity - than trying to  
>>>> write script in SPSS and STATA.
>>>>
>>>> Brandon Zicha
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>> -- 
>>> Drs. Paul Hiemstra
>
> -- 
> Drs. Paul Hiemstra


