[R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

Steve_Friedman at nps.gov Steve_Friedman at nps.gov
Wed May 18 16:06:57 CEST 2011


This is the first time I've seen an R Style Guide.  I will admit that I
haven't looked for one previously, but nevertheless I still haven't seen
one. My code style simply evolved (perhaps, chugged along) by reading posts
from other users who post to the r-help community.

I regularly program with a colleague who is a Java software development
specialist, hacking together code that we both develop.  Since his coding
style differs substantially from mine and from the conventions described
for R, we end up modifying my code to follow his convention.  For example,
he typically likes to name variables in this form: "variable_", which the
guide frowns on.

I think this guide will be very helpful.  First, it will help me become
more proficient and conventional in R style.  Second, he will see why R
users do things the way they do.

Thank you for posting the link to the guide.

Steve

Steve Friedman Ph. D.
Ecologist  / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax     (305) 224 - 4147


Bert Gunter <gunter.berton at gene.com>
Sent by: r-help-bounces at r-project.org
05/18/2011 09:47 AM

To: Bill.Venables at csiro.au
cc: r-help at r-project.org
Subject: [R] R Style Guide -- Was Post-hoc tests in MASS using glm.nb

Thanks, Bill. Do you and others think that a link to this guide (or
another) should be included in the Posting Guide and/or R FAQ?

-- Bert

On Tue, May 17, 2011 at 4:07 PM,  <Bill.Venables at csiro.au> wrote:
> Amen to all of that, Bert.  Nicely put.  The Google style guide (not
> perfect, but a thoughtful contribution on these kinds of issues) has
> avoiding attach() as its very first line.  See
> http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html
>
> I would add, though, that not enough people seem yet to be aware of
> within(...), a companion of with(...) in a way, but used for modifying data
> frames or other kinds of list objects.  It should be seen as a more
> flexible replacement for transform() (well, almost).
>
> The difference between with() and within() is as follows:
>
> with(data, expr, ...)
>
> allows you to evaluate 'expr' with 'data' providing the primary source
> for variables, and returns *the evaluated expression* as the result.  By
> contrast,
>
> within(data, expr, ...)
>
> again uses 'data' as the primary source for variables when evaluating
> 'expr', but now 'expr' is used to modify the variables in 'data', and
> *the modified data set* is returned as the result.
>
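A minimal sketch of that difference, assuming a made-up two-column data
frame (the names df, height, weight and bmi are illustrative only and do
not come from the thread):

```r
# Hypothetical toy data, not taken from the original posts
df <- data.frame(height = c(1.6, 1.8), weight = c(60, 81))

# with(): evaluates the expression using df's columns and returns the
# value of that expression (here a plain numeric vector)
bmi <- with(df, weight / height^2)

# within(): evaluates the expression using df's columns and returns a
# modified copy of df with the new variable added as a column
df2 <- within(df, bmi <- weight / height^2)
```

Here bmi is a numeric vector, df2 is a data frame with a bmi column, and
df itself is left unchanged.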
> I use this a lot, especially in the data preparation phase of a project,
> which is usually the longest, trickiest, most important, but least
> discussed aspect of any data analysis project.
>
> Here is a simple example using within() for something you cannot do in
> one step with transform():
>
> polyData <- within(data.frame(x = runif(500)), {
>  x2 <- x^2
>  x3 <- x*x2
>  b <- runif(4)
>  eta <- cbind(1,x,x2,x3) %*% b
>  y <- eta + rnorm(x, sd = 0.5)
>  rm(b)
> })
>
> check:
>
>> str(polyData)
> 'data.frame':   500 obs. of  5 variables:
>  $ x  : num  0.5185 0.185 0.5566 0.2467 0.0178 ...
>  $ y  : num [1:500, 1] 1.343 0.888 0.583 0.187 0.855 ...
>  $ eta: num [1:500, 1] 1.258 0.788 1.331 0.856 0.63 ...
>  $ x3 : num  1.39e-01 6.33e-03 1.72e-01 1.50e-02 5.60e-06 ...
>  $ x2 : num  0.268811 0.034224 0.309802 0.060844 0.000315 ...
>>
>
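To illustrate the "cannot do in one step" point, here is a small sketch
that tries the same kind of dependent assignment with transform(). The
names d, x_sq and x_cu are invented for this illustration, and the seed is
arbitrary:

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
d <- data.frame(x = runif(5))

# transform() evaluates all of its arguments against the original data
# frame, so x_cu cannot see x_sq, which is created in the same call.
# The error is caught and its message kept for inspection.
one_step <- tryCatch(
  transform(d, x_sq = x^2, x_cu = x * x_sq),
  error = function(e) conditionMessage(e)
)

# within() runs the statements in order, so later lines can use
# variables created by earlier ones
ok <- within(d, {
  x_sq <- x^2
  x_cu <- x * x_sq
})
```

The transform() call only fails as long as no object called x_sq exists
elsewhere on the search path. Two nested transform() calls would also
work, which is presumably why the original says "in one step".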
> Bill Venables.
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter
> Sent: Wednesday, 18 May 2011 12:08 AM
> To: Peter Ehlers
> Cc: R list
> Subject: Re: [R] Post-hoc tests in MASS using glm.nb
>
> Folks:
>
>> Only if the user hasn't yet been introduced to the with() function,
>> which is linked to on the ?attach page.
>>
>> Note also this sentence from the ?attach page:
>>  ".... attach can lead to confusion."
>>
>> I can't remember the last time I needed attach().
>>
>> Peter Ehlers
>
> Yes. But perhaps it might be useful to flesh this out with a bit of
> commentary. To this end, I invite others to correct or clarify the
> following.
>
> The potential "confusion" comes from requiring R to search for the
> data. There is a rigorous process by which this is done, of course,
> but it requires that the runtime environment be consistent with that
> process, and the programmer who wrote the code may not have control
> over that environment. The usual example is that one has an object
> named, say, "a" in the formula and in the attached data and another
> "a" also in the global environment. Then the wrong "a" would be found.
> The same thing can happen if another data set gets attached in a
> position before the one of interest. (Like Peter, I haven't used
> attach() in so long that I don't know whether any warning messages are
> issued in such cases.)
>
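The masking scenario described above can be reproduced with a short
sketch. The objects dat and a are invented for the illustration. By
default attach() prints a notice when an attached name is masked by the
workspace; that notice is switched off here only to keep the demo quiet:

```r
dat <- data.frame(a = c(1, 2, 3))  # the "a" we intend to analyse
a <- c(10, 20, 30)                 # a second "a" in the workspace

# Attached data frames sit behind the workspace on the search path
attach(dat, warn.conflicts = FALSE)
found <- mean(a)   # the workspace "a" is found first, not dat$a
detach(dat)        # always undo attach() when done
```

In this sketch found is 20, the mean of the workspace vector, rather than
2, the mean of the column in dat.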
> Using the "data = " argument when available or the with() function
> when not avoids this potential confusion and tightly couples the data
> to be analyzed with the analysis.
>
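A brief sketch of both alternatives, using the built-in cars data set
(columns speed and dist). The choice of lm() and cor() is only an example:

```r
# Modelling functions such as lm() take a data argument, so the
# variables in the formula are looked up in 'cars' first
fit <- lm(dist ~ speed, data = cars)

# cor() has no data argument, so with() supplies the data instead
r <- with(cars, cor(speed, dist))
```

In both calls the data set is named explicitly at the point of use, so
nothing depends on what happens to be attached or defined elsewhere.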
> I hope this clarifies the previous posters' comments.
>
> Cheers,
> Bert
>
>>
>> [... non-germane material snipped ...]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> "Men by nature long to get on to the ultimate truths, and will often
> be impatient with elementary studies or fight shy of them. If it were
> possible to reach the ultimate truths without the elementary studies
> usually prefixed to them, these would not be preparatory studies but
> superfluous diversions."
>
> -- Maimonides (1135-1204)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics



Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml
