[R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

Henrik Bengtsson hb at biostat.ucsf.edu
Thu May 19 04:17:42 CEST 2011


On Wed, May 18, 2011 at 6:28 PM, David Scott <d.scott at auckland.ac.nz> wrote:
>  Another style guide is at:
> http://www1.maths.lth.se/help/R/RCC/
>
> Listed as a first draft and dated 2005, but still worth a read. Has some
> references also.

That URL is obsolete (I need to have it removed) - a more recent/stable
URL is [5] below.

LIST OF CONVENTIONS/STYLES FOR R:

[1] R coding standards in the R Internals manual
  http://www.cran.r-project.org/doc/manuals/R-ints.html#R-coding-standards

[2] Bioconductor coding standards
  http://wiki.fhcrc.org/bioc/Coding_Standards

[3] Google R style
  http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html

[4] R style guide by Hadley Wickham (based on [3])
  http://had.co.nz/stat405/resources/r-style-guide.html

[5] R Coding Conventions (RCC) - a draft by Henrik Bengtsson
  http://aroma-project.org/developers/RCC

[6] The Aroma R Coding Conventions (Aroma RCC) by Henrik Bengtsson
(based on [5])
  http://aroma-project.org/developers/AromaRCC

Note that different coding styles are often driven by different
objectives, which is why it makes little sense to debate certain
items. For example, a convention that favors portability to another
language may rule out S4 for that reason alone.

/Henrik

>
> I think I recall Hadley having a style guide which he requested his students
> followed, but I didn't like it too much (sorry Hadley) .
>
> I am with Bill that style guides should be consulted and their
> recommendations considered, but it is personal preference as to which rules
> one accepts. I don't find it objectionable if someone has written in a style
> I don't particularly like, but it is objectionable if no thought has been
> given to programming style.
>
> David Scott
>
>
> On 19/05/11 10:26, Bill.Venables at csiro.au wrote:
>>
>> Hi Bert,
>>
>> I think people should know about the Google Style Guide for R because, as
>> I said, it represents a thoughtful contribution to the debate.  Most of its
>> advice is very good (meaning I agree with it!) but some is a bit too much
>> (for example, the blanket advice never to use S4 classes and methods -
>> that's just resisting progress, in my view).  The advice on using <- for the
>> (normal) assignment operator rather than = is also good advice (according
>> to me), but people who have to program in both C and R about equally often
>> may find it a bit tedious.  We can argue over that one.
>>
>> I suggest it has a place in the R FAQ but with a suitable warning that
>> this is just one view, albeit a thoughtful one.  I don't think it need be
>> included in the posting guide, though.  It would take away some of the fun.
>>  :-)
>>
>> Bill Venables.
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:gunter.berton at gene.com]
>> Sent: Wednesday, 18 May 2011 11:47 PM
>> To: Venables, Bill (CMIS, Dutton Park)
>> Cc: r-help at r-project.org
>> Subject: R Style Guide -- Was Post-hoc tests in MASS using glm.nb
>>
>> Thanks Bill. Do you and others think that a link to this guide (or
>> another) should be included in the Posting Guide and/or R FAQ?
>>
>> -- Bert
>>
>> On Tue, May 17, 2011 at 4:07 PM, <Bill.Venables at csiro.au> wrote:
>>>
>>> Amen to all of that, Bert.  Nicely put.  The Google style guide (not
>>> perfect, but a thoughtful contribution on these kinds of issues) has
>>> avoiding attach() as its very first line.  See
>>> http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html
>>>
>>> I would add, though, that not enough people seem yet to be aware of
>>> within(...), a companion of with(...) in a way, but used for modifying data
>>> frames or other kinds of list objects.  It should be seen as a more flexible
>>> replacement for transform() (well, almost).
>>>
>>> The difference between with() and within() is as follows:
>>>
>>> with(data, expr, ...)
>>>
>>> allows you to evaluate 'expr' with 'data' providing the primary source
>>> for variables, and returns *the evaluated expression* as the result.  By
>>> contrast
>>>
>>> within(data, expr, ...)
>>>
>>> again uses 'data' as the primary source for variables when evaluating
>>> 'expr', but now 'expr' is used to modify the variables in 'data' and returns
>>> *the modified data set* as the result.
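
A minimal sketch of that difference, using a made-up data frame (the
names df, height, weight and bmi are hypothetical and not from the
thread):

```r
# Hypothetical toy data, for illustration only.
df <- data.frame(height = c(1.6, 1.7, 1.8), weight = c(60, 72, 81))

# with(): evaluates the expression and returns its value,
# here a plain numeric vector.
bmi <- with(df, weight / height^2)

# within(): evaluates the expression and returns a modified copy
# of the data frame, here with a new column 'bmi'.
df2 <- within(df, {
  bmi <- weight / height^2
})

is.numeric(bmi)          # the evaluated expression
is.data.frame(df2)       # the modified data set
"bmi" %in% names(df)     # df itself is left untouched
```

Neither call changes df; the result has to be assigned to be kept.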
>>>
>>> I use this a lot in the data preparation phase of a project, especially,
>>> which is usually the longest, trickiest, most important, but least discussed
>>> aspect of any data analysis project.
>>>
>>> Here is a simple example using within() for something you cannot do in
>>> one step with transform():
>>>
>>> polyData <- within(data.frame(x = runif(500)), {
>>>   x2 <- x^2
>>>   x3 <- x * x2
>>>   b <- runif(4)                       # temporary coefficients
>>>   eta <- cbind(1, x, x2, x3) %*% b    # 500 x 1 matrix
>>>   y <- eta + rnorm(x, sd = 0.5)
>>>   rm(b)                               # keep b out of the result
>>> })
>>>
>>> check:
>>>
>>> > str(polyData)
>>>
>>> 'data.frame':   500 obs. of  5 variables:
>>>  $ x  : num  0.5185 0.185 0.5566 0.2467 0.0178 ...
>>>  $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
>>>  $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
>>>  $ x3 : num  1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
>>>  $ x2 : num  0.268811 0.034224 0.309802 0.060844 0.000315 ...
>>>
>>> Bill Venables.
>>>
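
One way to see why the polyData example above needs within():
transform() evaluates all of its arguments against the original data
frame, so a column created in the same call is not visible to later
arguments. The sketch below uses a hypothetical data frame d and
assumes that no object called x2 exists anywhere on the search path.

```r
d <- data.frame(x = c(1, 2, 3))   # hypothetical toy data

# transform() in one step: x3 refers to x2, which does not exist yet
# when the arguments are evaluated, so the call fails.
one_step <- try(transform(d, x2 = x^2, x3 = x * x2), silent = TRUE)
inherits(one_step, "try-error")

# Two nested transform() calls work, as does a single within() call.
two_step <- transform(transform(d, x2 = x^2), x3 = x * x2)
in_one <- within(d, {
  x2 <- x^2
  x3 <- x * x2
})
```

If an unrelated x2 did exist in the calling environment, the one-step
transform() call would pick it up silently instead of failing, which
is arguably worse.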
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>> On Behalf Of Bert Gunter
>>> Sent: Wednesday, 18 May 2011 12:08 AM
>>> To: Peter Ehlers
>>> Cc: R list
>>> Subject: Re: [R] Post-hoc tests in MASS using glm.nb
>>>
>>> Folks:
>>>
>>>> Only if the user hasn't yet been introduced to the with() function,
>>>> which is linked to on the ?attach page.
>>>>
>>>> Note also this sentence from the ?attach page:
>>>>  ".... attach can lead to confusion."
>>>>
>>>> I can't remember the last time I needed attach().
>>>>
>>>> Peter Ehlers
>>>
>>> Yes. But perhaps it might be useful to flesh this out with a bit of
>>> commentary. To this end, I invite others to correct or clarify the
>>> following.
>>>
>>> The potential "confusion" comes from requiring R to search for the
>>> data. There is a rigorous process by which this is done, of course,
>>> but it requires that the runtime environment be consistent with that
>>> process, and the programmer who wrote the code may not have control
>>> over that environment. The usual example is that one has an object
>>> named, say, "a" in the formula and in the attached data and another
>>> "a" also in the global environment. Then the wrong "a" would be found.
>>> The same thing can happen if another data set gets attached in a
>>> position before the one of interest. (Like Peter, I haven't used
>>> attach() in so long that I don't know whether any warning messages are
>>> issued in such cases).
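
The masking described above can be reproduced in a few lines. This is
only a sketch: the objects a and dat are hypothetical names chosen to
mirror the example in the text.

```r
a <- c(100, 200, 300)               # an unrelated 'a' in the workspace
dat <- data.frame(a = c(1, 2, 3))   # the 'a' we actually want

attach(dat)                 # dat goes to position 2 of the search path
found_attached <- mean(a)   # the workspace 'a' is found first
detach(dat)

found_with <- with(dat, mean(a))    # 'dat' is searched first
```

On the question of warnings: when run at top level, current versions
of R print a message at attach() time saying that the object is
masked by .GlobalEnv (controlled by the warn.conflicts argument), but
nothing is reported later when the wrong "a" is actually used.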
>>>
>>> Using the "data = " argument when available or the with() function
>>> when not avoids this potential confusion and tightly couples the data
>>> to be analyzed with the analysis.
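
As a rough illustration of both options (dat, x and y are
hypothetical; lm() and cor() are used only because one has a data
argument and the other does not):

```r
dat <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

# Modelling functions such as lm() take the data explicitly.
fit <- lm(y ~ x, data = dat)

# cor() has no 'data' argument, so with() supplies the variables.
r <- with(dat, cor(x, y))
```

In both calls the variables are looked up in dat before anything
else, so the result does not depend on what happens to be attached or
lying around in the workspace.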
>>>
>>> I hope this clarifies the previous posters' comments.
>>>
>>> Cheers,
>>> Bert
>>>
>>>> [... non-germane material snipped ...]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>> --
>>> "Men by nature long to get on to the ultimate truths, and will often
>>> be impatient with elementary studies or fight shy of them. If it were
>>> possible to reach the ultimate truths without the elementary studies
>>> usually prefixed to them, these would not be preparatory studies but
>>> superfluous diversions."
>>>
>>> -- Maimonides (1135-1204)
>>>
>>> Bert Gunter
>>> Genentech Nonclinical Biostatistics
>>>
>>>
>>
>>
>
>
> --
> _________________________________________________________________
> David Scott     Department of Statistics
>                The University of Auckland, PB 92019
>                Auckland 1142,    NEW ZEALAND
> Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
> Email:  d.scott at auckland.ac.nz,  Fax: +64 9 373 7018
>
>


