[R] create a dummy variables for companies with complete history.

David L Carlson dcarlson at tamu.edu
Wed Jun 24 22:36:59 CEST 2015


You may want to consider another way of getting your answer that takes advantage of some of R's features:

> # Make some example data
> cods <- LETTERS[1:10] # Ten companies
> yrs <- 2010:2014 # 5 years
> set.seed(42) # Set random seed so we all get the same values
> # Chances of revenue for a given year are 95%
> rev <- round(rbinom(50, 1, .95)*runif(50, 25, 50), 2)
> z <- data.frame(expand.grid(year=yrs, cod=cods)[, 2:1], rev)
> # Remove years with missing (0) revenue
> z <- z[z$rev > 1, ]
> str(z)
'data.frame':   45 obs. of  3 variables:
 $ cod : Factor w/ 10 levels "A","B","C","D",..: 1 1 1 1 1 2 2 2 2 2 ...
 $ year: int  2010 2011 2012 2013 2014 2010 2011 2012 2013 2014 ...
 $ rev : num  33.3 33.7 35 44.6 26 ...
> 
> # Construct the dummy variable
> tbl <- xtabs(~cod+year, z)
> tbl
   year
cod 2010 2011 2012 2013 2014
  A    1    1    1    1    1
  B    1    1    1    1    1
  C    1    1    1    1    1
  D    1    0    1    1    1
  E    1    1    0    1    1
  F    1    1    1    1    1
  G    1    1    1    1    1
  H    1    1    1    1    1
  I    1    1    1    0    1
  J    0    1    1    0    1
> dummy <- as.integer(apply(tbl, 1, all))
> dummy
 [1] 1 1 1 0 0 1 1 1 0 0
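
If the goal is then to keep only the companies with a complete history, one possible follow-up is to copy the per-company dummy back onto the rows and subset. This is only a sketch on a tiny made-up data frame; the names `complete` and `panel` are illustrative and do not come from the original post.

```r
# Tiny made-up example: A has all three years, B is missing 2011
z <- data.frame(cod  = c("A", "A", "A", "B", "B"),
                year = c(2010, 2011, 2012, 2010, 2012),
                rev  = c(30.1, 28.4, 41.0, 35.5, 27.2))

tbl <- xtabs(~ cod + year, z)                 # counts per company-year
dummy <- as.integer(apply(tbl > 0, 1, all))   # 1 = every year present
names(dummy) <- rownames(tbl)

z$d1 <- dummy[as.character(z$cod)]            # copy the company flag to each row
panel <- z[z$d1 == 1, ]                       # rows of complete-history companies
complete <- names(dummy)[dummy == 1]          # codes of those companies
```

Indexing the named vector by company code avoids a separate merge step, at least for a simple flag like this one.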

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Dewey
Sent: Wednesday, June 24, 2015 2:12 PM
To: giacomo begnis; r-help at r-project.org
Subject: Re: [R] create a dummy variables for companies with complete history.

Comments below

On 24/06/2015 19:26, giacomo begnis wrote:
> Hi, I have a dataset (728 obs) containing three variables: the code of a company, year, and revenue. Some companies have a complete history of 5 years; others do not (for instance, observations for only three or four years). I would like to identify the companies with a complete history using a dummy variable. I have written the following program, but there is something wrong because the dummy variable that I have created is always equal to zero. Can somebody help me? Thanks, gm
>
> z<-read.table(file="c:/Rp/cddat.txt", sep="", header=T)
> attach(z)
> n<-length(z$cod)  # number of observations in the dataset
>

Could also use nrow(z)

> d1<-numeric(n)   # dummy variable
>
> for (i in 5:n)  {
>     if (z$cod[i]==z$cod[i-4])             # cod is the code of a company
>               { d1[i]<=1} else { d1[i]<=0}          # d1=1 for a company with complete history, d1=0 if the history is not complete
> }
> d1

Did you really type <=, which means less than or equal to? If so, try
replacing it with <- and see what happens.
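
To see the difference in isolation, here is a small sketch on a made-up vector (not the poster's data; `cmp` and `unchanged` are names introduced only for illustration). `d1[i] <= 1` merely compares and discards the result, so d1 keeps its initial zeros, whereas `<-` actually stores the value.

```r
d1 <- numeric(3)            # starts as 0 0 0

cmp <- d1[2] <= 1           # comparison only: cmp is TRUE, d1 is untouched
unchanged <- all(d1 == 0)   # TRUE, nothing was assigned

d1[2] <- 1                  # assignment: the second element becomes 1
```

Even with `<-`, the loop as posted would, assuming the rows are sorted by company and no company has more than five rows, set d1 to 1 only on the fifth row of each complete company rather than on all of its rows.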

> When I run the program d1 is always equal to zero. Why?
> Once I have created the dummy variable, I use subset to obtain the codes of the companies with a complete history, and finally with a merge I determine a panel of companies with a complete history. But how do I determine d1 correctly? My best regards, gm
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Michael
http://www.dewey.myzen.co.uk/home.html
