[R] Equivalent to Stata egen

David Winsemius dwinsemius at comcast.net
Thu Apr 16 19:39:44 CEST 2009


Terse is OK by me as long as I get told what goes in (allowable data  
types, argument names and effects) and what comes out. What seemed to  
be lacking in that Stata doc for egen was a description of the purpose
or behavior, and I could find no description of the values produced.
Perhaps it is because Stata has an approach that everything is a  
rectangular array? Is everything assumed to create a new column of  
data as in SAS?

At any rate it looked to this casual non-user, reading that document,  
that egen creates a new variable aligned with its argument variables  
by applying various functions within groupings. That is pretty much  
what ave does. "ave" is not restricted to mean as a functional  
argument. As I said it was a guess.
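
To illustrate what I mean by "aligned with its argument variables", here
is a minimal sketch. The data frame and its column names (grp, score) are
made up for the example and are not from Peter's data:

```r
# Hypothetical data: one grouping variable and one numeric variable
df <- data.frame(
  grp   = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 10, 20, 30)
)

# Group mean repeated on every row of its group (mean is ave's default FUN)
df$grp_mean <- ave(df$score, df$grp)

# Another function can be supplied, but FUN must be named because the
# grouping variables are collected through "..."
df$grp_max <- ave(df$score, df$grp, FUN = max)

print(df)
```

The result has the same length as the input, so it can be assigned
straight back as a new column, which is the behavior I was guessing egen
has.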

The texts I used to get up to speed in R are several downloaded from  
the Contributed documents (including anything written by Venables),  
V&R MASS v 2, Harrell's RMS, Sarkar's Lattice, Chambers&Hastie SMiS  
and reading a lot of Q&A on this list.

-- 
David Winsemius
On Apr 16, 2009, at 11:57 AM, Stas Kolenikov wrote:

> http://www.stata.com/help.cgi?egen -- it creates new variables dealing
> with some special relatively non-standard tasks that don't boil down
> to one-line arithmetic expressions. For that reason, there will be
> no equivalent to -egen- in general, as it has so many functions that
> are so different. -rowtotal- is of course just a shorthand for sum(),
> except for treatment of missing values (ifelse(is.na(x), 0, x)). But
> -anycount- is a moderately complicated double cycle over variables and
> list of values (40 lines of underlying Stata code, including parsing
> and labeling the resulting variables)... which will probably become a
> triple R cycle including the cycle over observations, although the
> latter can probably be avoided.
>
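
For these two particular calls, a vectorized sketch may be enough, with
no explicit loop over observations. The variable names come from Peter's
question; the data values are invented, and I am assuming that rowtotal
treats missing values as 0 and that anycount does not count them:

```r
# Hypothetical VAS scores at the six time points named in the question
vas <- data.frame(
  t0vas   = c(0, 5, NA),
  t30vas  = c(1, 11, 2),
  t60vas  = c(2, NA, 3),
  t120vas = c(3, 7, 12),
  t240vas = c(4, 8, NA),
  t360vas = c(5, 9, 4)
)
cols <- c("t0vas", "t30vas", "t60vas", "t120vas", "t240vas", "t360vas")

# rowtotal analogue: row sums, with NA contributing nothing to the total
vas$temp2 <- rowSums(vas[cols], na.rm = TRUE)

# anycount analogue: per row, how many variables hold a value in 0:10.
# %in% drops the matrix dimensions, so they are restored before summing;
# NA %in% 0:10 is FALSE, so missing values are not counted.
m <- as.matrix(vas[cols])
vas$temp <- rowSums(matrix(m %in% 0:10, nrow = nrow(m)))

print(vas[c("temp", "temp2")])
```

This covers only the two egen functions in the question, not egen in
general.
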
> Yes, R documentation looks extremely terse to me as a regular Stata
> user. I am used to seeing the concepts explained well, even in the
> help files, and certainly more so in the shelved books. As every
> option and every part of the syntax is given at least three to five
> sentences, and the most common uses are exemplified, I can usually
> figure out how to run a particular task relatively quickly. (The data
> management tricks, which is what Peter was asking about above, are
> probably an exception: you either know them, or you don't. In this
> example, I don't know the corresponding R tricks, although I can
> probably brute force the solution if I needed to.) The fraction of
> commands in R that I personally have been coming across that are
> comparably well documented is about a quarter. For the others, it is
> either guesswork+CRANning+googling around or "Forget it, I'll just go
> back to Stata to do it" after a few futile attempts. Maybe I just don't
> know where to look for the good stuff, but it is certainly outside R
> as a package+its documentation.
>
> On 4/15/09, David Winsemius <dwinsemius at comcast.net> wrote:
>> Peter Kraglund Jacobsen <peter <at> kraglundjacobsen.dk> writes:
>>
>>>
>>> What are the R equivalents to the Stata command egen?
>>>
>>> egen temp = anycount(t0vas t30vas t60vas t120vas t240vas t360vas),
>>> values(0,1,2,3,4,5,6,7,8,9,10)
>>> egen temp2 = rowtotal(t0vas t30vas t60vas t120vas t240vas t360vas)
>>>
>>
>>
>> And people call R documentation cryptic! As far as I can tell the
>> corresponding function would be ave, but that is only a guess since
>> there really is not much help regarding egen's purpose from the
>> voluminous Stata documentation.
>>
>>
>> --
>> David Winsemius
>>> ______________________________________________
>>> R-help <at> r-project.org mailing list
>>
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>
>
> -- 
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



