[R] Computing growth rate

Brijesh Mishra brijeshkmishra at gmail.com
Thu Dec 15 04:40:51 CET 2016


Hi,

I am trying to calculate growth rates (say, of sales, though they are to
be computed for many variables) in a panel data set. The problem is that
I have missing data for many firms in many years. To illustrate, I have
created this short data frame (the original df is much bigger):

df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))

# this gives me
co_code1 fyear1 sales1
1      1100   1990   1000
2      1100   1991   1100
3      1100   1992   1200
4      1100   1993   1300
5      1100   1994   1400
6      1100   1995   1500
7      1100   1996   1600
8      1200   1990   1000
9      1200   1991   1100
10     1200   1992   1200
11     1200   1993   1300
12     1200   1994   1400
13     1200   1995   1500
14     1200   1996   1600
15     1300   1990   1000
16     1300   1991   1100
17     1300   1992   1200
18     1300   1993   1300
19     1300   1994   1400
20     1300   1995   1500
21     1300   1996   1600

# I am now removing a couple of rows
df1<-df1[-c(5, 8), ]
# the result is
   co_code1 fyear1 sales1
1      1100   1990   1000
2      1100   1991   1100
3      1100   1992   1200
4      1100   1993   1300
6      1100   1995   1500
7      1100   1996   1600
9      1200   1991   1100
10     1200   1992   1200
11     1200   1993   1300
12     1200   1994   1400
13     1200   1995   1500
14     1200   1996   1600
15     1300   1990   1000
16     1300   1991   1100
17     1300   1992   1200
18     1300   1993   1300
19     1300   1994   1400
20     1300   1995   1500
21     1300   1996   1600
# so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
# removed. If I try,
library(plyr)  # provides ddply()
d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)

# this gives a wrong result for the year 1995 of co_code1 1100 (as shown
# below): diff() works on consecutive rows, so the 1995 growth rate is
# computed against 1993 and spans two years instead of one.

   co_code1 fyear1 sales1    growth
1      1100   1990   1000        NA
2      1100   1991   1100 10.000000
3      1100   1992   1200  9.090909
4      1100   1993   1300  8.333333
5      1100   1995   1500 15.384615
6      1100   1996   1600  6.666667
7      1200   1991   1100        NA
8      1200   1992   1200  9.090909
9      1200   1993   1300  8.333333
10     1200   1994   1400  7.692308
11     1200   1995   1500  7.142857
12     1200   1996   1600  6.666667
13     1300   1990   1000        NA
14     1300   1991   1100 10.000000
15     1300   1992   1200  9.090909
16     1300   1993   1300  8.333333
17     1300   1994   1400  7.692308
18     1300   1995   1500  7.142857
19     1300   1996   1600  6.666667
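
The 15.38 figure in row 5 can be checked by hand. The following is a small sketch of that arithmetic only; the variable names two_year and one_year are made up for illustration, and the 1994 value of 1400 is taken from the full data frame shown before the rows were removed.

```r
# Growth reported for co_code1 1100 in 1995: computed against the 1993 row
two_year <- (1500 / 1300 - 1) * 100

# Growth that a true one-year comparison would give, using the 1994 value
# (1400) that was present before row 5 was dropped
one_year <- (1500 / 1400 - 1) * 100

round(c(two_year = two_year, one_year = one_year), 6)
```

The first number matches the growth column above, which suggests the formula itself is fine and only the choice of comparison row is off.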
# I thought of applying the formula only where the increment of fyear1
# within a co_code1 is exactly 1, like this:

d<-ddply(df1,
         "co_code1",
         transform,
         if(diff(fyear1)==1){
           growth=(exp(diff(log(df1$sales1)))-1)*100
         } else{
           growth=NA
         })

But this doesn't work. I am getting the following warning:

In if (diff(fyear1) == 1) { :
  the condition has length > 1 and only the first element will be used
(repeated a few times).

# I have searched for a solution, but somehow couldn't get one. Hope
that some kind soul will guide me here.

Regards,

Brijesh K Mishra
Indian Institute of Management, Indore
India
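
A possible way around the problem described above, offered as a sketch and not as a definitive answer: if() expects a single TRUE or FALSE, while diff(fyear1) == 1 is a whole vector, which is what the message complains about. The vectorised ifelse() tests every element instead. The example below uses base R only; the helper name growth_one_year is hypothetical (it is not from the post or from any package), and it assumes that a growth rate should be NA whenever the previous available row for the same firm is not exactly one fiscal year earlier.

```r
# Rebuild the example data from the post, with the same two rows removed
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Hypothetical helper: percentage growth, set to NA unless the previous row
# of the same firm is exactly one fiscal year earlier
growth_one_year <- function(year, value) {
  rate <- c(NA, (value[-1] / value[-length(value)] - 1) * 100)
  gap  <- c(NA, diff(year))
  ifelse(gap == 1, rate, NA)  # element-wise test, unlike if ()
}

# Sort by firm and year, then apply the helper within each firm
df1 <- df1[order(df1$co_code1, df1$fyear1), ]
pieces <- lapply(split(df1, df1$co_code1), function(d) {
  d$growth <- growth_one_year(d$fyear1, d$sales1)
  d
})
d <- do.call(rbind, pieces)
rownames(d) <- NULL
d
```

With this, the 1995 row of co_code1 1100 gets NA in place of the two-year figure, and the first available year of each firm stays NA as before. The same ifelse() expression could probably be passed as the growth argument of the original ddply(..., transform, ...) call, with sales1 and fyear1 in place of value and year, though that variant has not been checked here.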


