[R] simple generation of artificial data with defined features

drflxms drflxms at googlemail.com
Sun Aug 24 11:31:29 CEST 2008


Hi Christoph,

perfect! Your code worked out of the box (copy and paste ;-). I had
expected at least some lines of code, but this is really easy!
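
For the archive, the whole sequence can be sketched in one piece. This is
only a minimal sketch under a few assumptions that are not in the original
mails: as.table(matrix(...)) stands in for the attr() calls quoted below,
seq_len() is used to build the identifier, and the renaming of Var2 to
"party", the column selection, and the row-name reset are added purely to
match the desired layout.

```r
# Vote counts in thousands, as in the quoted message
# (assumption: built with as.table(matrix()) rather than attr() calls)
election.2005 <- as.table(matrix(
  c(16194, 13136, 3494, 3838, 4648, 4118),
  nrow = 1,
  dimnames = list("votes", c("spd", "cdu", "csu", "gruene", "fdp", "pds"))
))

# Long format: one row per party, with columns Var1, Var2 and Freq
el.dt <- as.data.frame(election.2005)

# Repeat each row Freq times and drop the Freq column (the last one)
el.dt.exp <- el.dt[rep(seq_len(nrow(el.dt)), el.dt$Freq), -ncol(el.dt)]

# Running identifier and constant election year
el.dt.exp$voter.id <- seq_len(nrow(el.dt.exp))
el.dt.exp$election.year <- 2005

# Assumption: rename Var2 to "party" and keep only the requested columns
names(el.dt.exp)[names(el.dt.exp) == "Var2"] <- "party"
el.dt.exp <- el.dt.exp[, c("voter.id", "party", "election.year")]
rownames(el.dt.exp) <- NULL  # replace 1, 1.1, 1.2, ... with plain row numbers

head(el.dt.exp)
```

With the row names reset, voter.id and the row numbers coincide; leave out
the rownames() line to keep the 1, 1.1, 1.2, ... labels shown in the quoted
output.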

So once you get used to the command line, this is much more flexible (and
comfortable!) than all these coloured windows. I can't tell you how happy
I am that I seem to be getting away from this terrible SPSS license
hassle. Taking into account that these are my first weeks with R, that
I am learning (My)SQL in parallel (using RMySQL), and that I did not know
much (not to say nothing) about statistics before (like nearly all
medical doctors...), I wouldn't even say the learning curve is too steep.

Anyway, thank you for your quick and efficient help for a newbie!

One of the reasons for my delayed answer is some trouble I experienced
in the following steps. I'll write a concluding e-mail to all about that
soon.
Greetings from Munich,

Felix

Christoph Meyer schrieb:
> Hi,
>
> to add voter.id and election.year to your data frame you could try:
>
> el.dt.exp$voter.id=seq_len(nrow(el.dt.exp))
>
> el.dt.exp$election.year=2005
>
> Cheers,
>
> Christoph Meyer
>
>
> ***************************************************************
> Dr. Christoph Meyer
> Institute of Experimental Ecology
> University of Ulm
> Albert-Einstein-Allee 11
> D-89069 Ulm
> Germany
> Phone:  ++49-(0)731-502-2675
> Fax:    ++49-(0)731-502-2683
> Mobile: ++49-(0)1577-156-7049
> E-mail: christoph.meyer at uni-ulm.de
> http://www.uni-ulm.de/index.php?id=7885
> ***************************************************************
>
> Saturday, August 23, 2008, 1:25:05 PM, you wrote:
>
>   
>> Dear Mr. Christos Hatzis,
>>     
>
>   
>> thank you so much for your answer, which is in my eyes just brilliant! I
>> followed it step by step (great and detailed explanation) and nearly
>> everything is fine, except for a problem at the very end that I haven't
>> found a solution for yet (despite playing around quite a lot...).
>> Please let me explain:
>>     
>
>   
>>> # cut off the last 3 digits, because my laptop can't handle millions of rows...
>>> election.2005 <- c(16194,13136,3494,3838,4648,4118)
>>> attr(election.2005, "class") <- "table"
>>> attr(election.2005, "dim") <- c(1,6)
>>> attr(election.2005, "dimnames") <- list(c("votes"), c("spd", "cdu", "csu", "gruene", "fdp", "pds"))
>>     
>>> head(election.2005)
>>         spd   cdu  csu gruene  fdp  pds
>> votes 16194 13136 3494   3838 4648 4118
>>     
>>> el.dt <- as.data.frame(election.2005)
>>> el.dt.exp <- el.dt[rep(1:nrow(el.dt), el.dt$Freq), -ncol(el.dt)]
>>> dim(el.dt.exp)
>> [1] 45428     2
>>     
>>> head(el.dt.exp)
>>      Var1 Var2
>> 1   votes  spd
>> 1.1 votes  spd
>> 1.2 votes  spd
>> 1.3 votes  spd
>> 1.4 votes  spd
>> 1.5 votes  spd
>>     
>
>   
>> My problem now is that I need either an auto-incrementing
>> identifier instead of "votes" in Var1, or a way to access the row
>> numbering by a column name (e.g. Var0). In addition, I need a third
>> variable for the year of the election (2005, which is the same for all
>> rows but is needed later on). So this is what it should look like:
>>     
>
>   
>>     voter.id party election.year
>> 1          1   spd          2005
>> 1.1        2   spd          2005
>> 1.2        3   spd          2005
>> 1.3        4   spd          2005
>> 1.4        5   spd          2005
>> 1.5        6   spd          2005
>>     
>
> ...



More information about the R-help mailing list