[R] Computing growth rate

David Stevens david.stevens at usu.edu
Thu Dec 15 15:32:15 CET 2016


Berend - Unless you need the change in sales year by year, you might 
consider looking at each company's sales over the years and using 
regression or another type of trend analysis to get an overall trend. 
Or, if not, simply divide diff(sales1) by diff(fyear1) for each company 
so that you at least get the average yearly change across the missing years.
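
A rough sketch of both suggestions in base R, assuming the column names from the example data frame quoted below (co_code1, fyear1, sales1). The helper name per_year_change and the result names res and trend_slope are made up for illustration; this is one possible way to do it, not the only one.

```r
# Example data from the quoted message, with rows 5 and 8 removed
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Average change in sales per year between consecutive observations,
# computed within one company; a gap of k years divides the change by k.
# (per_year_change is a hypothetical helper name.)
per_year_change <- function(d) {
  d <- d[order(d$fyear1), ]
  d$change_per_year <- c(NA, diff(d$sales1) / diff(d$fyear1))
  d
}
res <- do.call(rbind, lapply(split(df1, df1$co_code1), per_year_change))

# Overall linear trend per company: slope of sales1 regressed on fyear1
trend_slope <- sapply(split(df1, df1$co_code1), function(d) {
  unname(coef(lm(sales1 ~ fyear1, data = d))[2])
})
```

Note that change_per_year is an absolute change in sales units per year, not a percentage growth rate, and the regression slope summarizes each company with a single number rather than one value per year.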

David


On 12/15/2016 7:18 AM, Berend Hasselman wrote:
>> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to calculate growth rate (say, sales, though it is to be
>> computed for many variables) in a panel data set. Problem is that I
>> have missing data for many firms for many years. To put it simply, I
>> have created this short dataframe (original df is much bigger)
>>
>> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>>
>> # this gives me
>>    co_code1 fyear1 sales1
>> 1      1100   1990   1000
>> 2      1100   1991   1100
>> 3      1100   1992   1200
>> 4      1100   1993   1300
>> 5      1100   1994   1400
>> 6      1100   1995   1500
>> 7      1100   1996   1600
>> 8      1200   1990   1000
>> 9      1200   1991   1100
>> 10     1200   1992   1200
>> 11     1200   1993   1300
>> 12     1200   1994   1400
>> 13     1200   1995   1500
>> 14     1200   1996   1600
>> 15     1300   1990   1000
>> 16     1300   1991   1100
>> 17     1300   1992   1200
>> 18     1300   1993   1300
>> 19     1300   1994   1400
>> 20     1300   1995   1500
>> 21     1300   1996   1600
>>
>> # I am now removing a couple of rows
>> df1<-df1[-c(5, 8), ]
>> # the result is
>>    co_code1 fyear1 sales1
>> 1      1100   1990   1000
>> 2      1100   1991   1100
>> 3      1100   1992   1200
>> 4      1100   1993   1300
>> 6      1100   1995   1500
>> 7      1100   1996   1600
>> 9      1200   1991   1100
>> 10     1200   1992   1200
>> 11     1200   1993   1300
>> 12     1200   1994   1400
>> 13     1200   1995   1500
>> 14     1200   1996   1600
>> 15     1300   1990   1000
>> 16     1300   1991   1100
>> 17     1300   1992   1200
>> 18     1300   1993   1300
>> 19     1300   1994   1400
>> 20     1300   1995   1500
>> 21     1300   1996   1600
>> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>> removed. If I try,
>> library(plyr)  # ddply() comes from the plyr package
>> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>>
>> # this apparently gives wrong results for the year 1995 (as shown
>> below) as growth rates are computed considering yearly increment.
>>
>>    co_code1 fyear1 sales1    growth
>> 1      1100   1990   1000        NA
>> 2      1100   1991   1100 10.000000
>> 3      1100   1992   1200  9.090909
>> 4      1100   1993   1300  8.333333
>> 5      1100   1995   1500 15.384615
>> 6      1100   1996   1600  6.666667
>> 7      1200   1991   1100        NA
>> 8      1200   1992   1200  9.090909
>> 9      1200   1993   1300  8.333333
>> 10     1200   1994   1400  7.692308
>> 11     1200   1995   1500  7.142857
>> 12     1200   1996   1600  6.666667
>> 13     1300   1990   1000        NA
>> 14     1300   1991   1100 10.000000
>> 15     1300   1992   1200  9.090909
>> 16     1300   1993   1300  8.333333
>> 17     1300   1994   1400  7.692308
>> 18     1300   1995   1500  7.142857
>> 19     1300   1996   1600  6.666667
>> # I thought of using the formula only when the increment of fyear1 is
>> only 1 while in a co_code1, by using this formula
>>
>> d<-ddply(df1,
>>          "co_code1",
>>          transform,
>>          if(diff(fyear1)==1){
>>            growth=(exp(diff(log(df1$sales1)))-1)*100
>>          } else{
>>            growth=NA
>>          })
>>
>> But this doesn't work. I am getting the following warning.
>>
>> In if (diff(fyear1) == 1) { :
>>   the condition has length > 1 and only the first element will be used
>> (repeated a few times).
>>
>> # I have searched for a solution, but somehow couldn't get one. Hope
>> that some kind soul will guide me here.
>>
> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>           ifelse(diff(df1$fyear1)==1,
>                      (exp(diff(log(df1$sales1)))-1)*100,
>                      NA
>                      )
>          )
>
> and display df1. From your request I cannot determine if this is what you want.
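
The if() attempt above fails because if() expects a single TRUE or FALSE, while diff(fyear1) == 1 is a whole vector; ifelse() works element by element. (In R 4.2.0 and later, a condition of length greater than one in if() is an error rather than a warning.) The whole-data-frame version relies on the year difference never being exactly 1 where one company ends and the next begins. A minimal sketch that also checks the company code explicitly, assuming the rows are sorted by co_code1 and then fyear1; the names same_co, next_year and pct are illustrative only:

```r
# Example data from the original message, with rows 5 and 8 removed
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Make sure rows are ordered by company, then by year
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

same_co   <- diff(df1$co_code1) == 0   # previous row is the same company
next_year <- diff(df1$fyear1) == 1     # previous row is the preceding year
pct       <- (exp(diff(log(df1$sales1))) - 1) * 100

# Keep the growth rate only where both conditions hold; otherwise NA
df1$growth <- c(NA, ifelse(same_co & next_year, pct, NA))
```

With this check, a company whose first year happens to follow the last year of the previous company still gets NA for its first row.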
>
> regards,
>
> Berend Hasselman

-- 
David K Stevens, P.E., Ph.D.
Professor and Head, Environmental Engineering
Civil and Environmental Engineering
Utah Water Research Laboratory
8200 Old Main Hill
Logan, UT  84322-8200
435 797 3229 - voice
435 797 1363 - fax
david.stevens at usu.edu


