[R] keep average values and delete duplicate rows

fuckecon iamstanhu at gmail.com
Sun Oct 28 09:10:42 CET 2012


Hi Arun,

Thanks for replying.

Sorry I didn't list it, but I do have a democracy index in my dataset.

The full set includes these columns:

Country Code (3 letter abbreviation)
Country
Education
Freedom House democracy index
log population
log real gdp
nominal savings
Polity democracy index
year (5-year intervals: 1950, 1955, 1960, ...)
sample (not sure what it's for yet)
world income instrument

This is a panel data set for some 200 countries with 10 observations each.
Each country has its own missing data in various columns.

I imported the CSV file into an R object called Table1.

Here are the few lines I have written so far:

# Importing the data from a CSV file into R

# read.csv() already returns a data frame, so no data.frame() call is needed
Table1 <- read.csv("5YearPanel.csv")
Table1

# Deleting Netherlands data from Table1 and naming the new table deDutch

deDutch <- subset(Table1, country!="Netherlands")
deDutch

What I am trying to do next is to clean the data in R as follows:

1) Take the average value of each column for each country.
2) Store these values in a new object.
3) Based on the median income, divide the countries into a subset called
high income (i.e. > median) and a subset called low income (i.e. <= median).
4) Once I get it cleaned, I believe I can start running regressions with the
data.
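
Steps 1 to 3 could be sketched roughly as below. This is only a minimal
illustration on a made-up toy data frame, not the real 5YearPanel.csv: the
column names (country, year, lrgdp, education) and all values are
hypothetical, and it assumes that log real GDP is the income measure and that
missing values should simply be skipped when averaging.

```r
# Toy stand-in for deDutch; column names and values are hypothetical.
toy <- data.frame(
  country   = rep(c("A", "B", "C", "D"), each = 3),
  year      = rep(c(1950, 1955, 1960), times = 4),
  lrgdp     = c(7.0, 7.2, NA,  8.0, 8.4, 8.6,  9.0, NA, 9.4,  6.0, 6.2, 6.4),
  education = c(2.0, 2.5, 3.0,  4.0, NA, 5.0,  6.0, 6.5, 7.0,  1.0, 1.5, NA)
)

# Steps 1 and 2: one row per country with the mean of each numeric column.
# na.rm = TRUE is passed on to mean(), so NAs are skipped column by column.
# The year column is left out because averaging it is not meaningful.
avg <- aggregate(toy[c("lrgdp", "education")],
                 by = list(country = toy$country),
                 FUN = mean, na.rm = TRUE)

# Step 3: split the countries at the median of the averaged income measure.
med         <- median(avg$lrgdp, na.rm = TRUE)
high_income <- subset(avg, lrgdp > med)
low_income  <- subset(avg, lrgdp <= med)

high_income
low_income
```

Because the result has one row per country, the duplicate country rows
disappear as a side effect of the averaging. A country whose income column is
entirely missing would get NaN as its mean and end up in neither subset.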

I'll look at your comments and try things out first.

Thank you!










