# [R] keep average values and delete duplicate rows

arun smartpink111 at yahoo.com
Sun Oct 28 07:20:08 CET 2012

```
Hi,

It is a bit unclear what you want.  The example dataset has no democracy_index, but the result you describe includes it.  Regarding the median calculation, I am assuming you want the median for each country.  I added one more country (China) with made-up data.

Maybe this helps:
dat1<-read.table(text="
Country  log_GDP   yr
USA      9.27824   1950
USA      9.38968   1955
USA      9.415136  1960
USA      9.594625  1965
USA      9.70207   1970
USA      9.800418  1975
USA      9.96813   1980
USA      10.07001  1985
USA      10.18331  1990
USA      10.25446  1995
USA      10.4131   2000
China    7.5       1950
China    7.32      1955
China    7.33      1960
China    7.6       1965
China    7.8       1970
China    8.0       1975
China    8.2       1980
China    8.3       1985
China    8.5       1990
China    8.6       1995
China    8.7       2000
",sep="",header=TRUE,stringsAsFactors=FALSE)
# mean and median of log_GDP for each country
dat2<-with(dat1,aggregate(log_GDP,by=list(Country=Country),mean))
colnames(dat2)[2]<-"Mean"
dat3<-with(dat1,aggregate(log_GDP,by=list(Country=Country),median))
colnames(dat3)[2]<-"Median"
dat4<-merge(dat3,dat2)
# label a country high income when its mean exceeds its own median, low income otherwise
dat4$HighIncome<-ifelse(dat4$Mean>dat4$Median,as.character(dat4$Country),NA)
dat4$LowIncome<-ifelse(dat4$Mean>dat4$Median,NA,as.character(dat4$Country))
dat5<-dat4[,-2]  # drop the Median column
dat5
#  Country     Mean HighIncome LowIncome
#2   China 7.986364       <NA>     China
#3     USA 9.824471        USA      <NA>

res<-merge(dat1,dat5)
#  Country  log_GDP   yr     Mean HighIncome LowIncome
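
# Alternative sketch, under assumptions: if the boundary should instead be the
# median of the country means (as the question below describes), something
# like this may work.  The names gdp, cm, cutoff, Group, summ and tab are made
# up for this illustration, and the democracy index row is left out because
# the example data does not contain it.
gdp <- data.frame(
  Country = rep(c("USA", "China"), each = 11),
  log_GDP = c(9.27824, 9.38968, 9.415136, 9.594625, 9.70207, 9.800418,
              9.96813, 10.07001, 10.18331, 10.25446, 10.4131,
              7.5, 7.32, 7.33, 7.6, 7.8, 8.0, 8.2, 8.3, 8.5, 8.6, 8.7),
  stringsAsFactors = FALSE)
# one row per country: mean log GDP and number of observations
cm <- aggregate(log_GDP ~ Country, data = gdp, FUN = mean)
colnames(cm)[2] <- "Mean"
cm$Obs <- as.vector(table(gdp$Country)[cm$Country])
# high income: country mean above the median of all country means
cutoff <- median(cm$Mean)
cm$Group <- ifelse(cm$Mean > cutoff, "High", "Low")
# per-group summary: unweighted mean of the country means, observations, countries
summ <- function(d) c(log_GDP = mean(d$Mean), Obs = sum(d$Obs), Countries = nrow(d))
tab <- cbind(All  = summ(cm),
             High = summ(cm[cm$Group == "High", ]),
             Low  = summ(cm[cm$Group == "Low", ]))
tab
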
A.K.

----- Original Message -----
From: fuckecon <iamstanhu at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Sunday, October 28, 2012 12:16 AM
Subject: [R] keep average values and delete duplicate rows

Hello experts,

I am sorry that my subject line is confusing, because I am confused as nuts.
Let me take a shot at explaining what I am trying to do.

I have a data set of log GDP, education, democracy index, and a whole bunch
of other variables for every country from 1950 to 2000. Each country accounts
for 10 observations, with each observation representing the mean GDP for a
5-year interval.

Example:

Country  log GDP yr
USA       9.27824    1950
USA       9.38968    1955
USA       9.415136  1960
USA       9.594625  1965
USA       9.70207    1970
USA       9.800418  1975
USA       9.96813    1980
USA       10.07001  1985
USA       10.18331  1990
USA       10.25446  1995
USA       10.4131    2000

For log GDP:

I want to create a new object in R with one line for each country and the
average log GDP over the 10 five-year-interval observations. With that subset
I then want to create a table with 3 columns and 4 rows.

(I have no idea how to write the codes to create the new object. Friend said

Columns
1) All countries
2) High income countries
3) Low income countries

Rows
1) Democracy index
2) Log GDP
3) Obs
4) Countries

To create the high- and low-income columns, I am using the median as the
boundary (i.e. high income for gdp > median of the mean for each country,
low income for gdp <= median of the mean for each country).

I hope someone can understand what I am writing here and help me out with
it.

Thanks so much!

