[R] Create a time-series from cross-sectional data that has each year as a separate column

David Winsemius dwinsemius at comcast.net
Mon Sep 13 14:15:29 CEST 2010


On Sep 13, 2010, at 2:31 AM, Gabriel Bergin wrote:

> Hi,
>
> I have a dataset from ILO, originally in csv-format, that I have read into
> R. It is cross-sectional time-series data, so I have a bunch of variables
> and dummy variables that I need to extract data from for the entire time
> period. However, the years are separated by columns instead of rows, as is
> usually the case in R. This is what it looks like:
>
>> str(laborstafinMFBA)
> 'data.frame': 152 obs. of  39 variables:
> $ COUNTRY                : Factor w/ 164 levels "Albania","Algeria",..: 2 4 7 8 9 10 11 11 12 13 ...
> $ CODE.COUNTRY           : Factor w/ 163 levels "AE","AG","AI",..: 44 7 8 5 12 11 10 10 13 23 ...
> $ SOURCE                 : Factor w/ 7 levels "Administrative reports",..: 3 3 3 3 3 3 3 3 3 3 ...
> $ CODE.SOURCE            : Factor w/ 7 levels "A","B","BA","CA",..: 3 3 3 3 3 3 3 3 3 3 ...
> etc..
> $ D1990                  : num  NA NA NA NA NA ...
> $ D1991                  : num  NA NA 101 NA NA ...
> $ D1992                  : num  NA 38.4 111.2 NA NA ...
> $ D1993                  : num  NA NA 94.4 NA NA ...
> $ D1994                  : num  NA NA 133.69 NA 1.42 ...
> $ D1995                  : num  NA NA 121 NA NA ...
> $ D1996                  : num  NA NA 176 NA NA ...
> $ D1997                  : num  NA NA 195.31 NA 1.51 ...
> $ D1998                  : num  NA NA 202 NA NA ...
> $ D1999                  : num  NA NA 201 NA NA ...
> $ D2000                  : num  NA NA 207 NA NA ...
> $ D2001                  : num  68.1 NA 198.3 NA NA ...
> $ D2002                  : num  NA NA 186 NA NA ...
> $ D2003                  : num  67.6 NA 148.8 NA NA ...
> $ D2004                  : num  68.8 NA 143.7 NA NA ...
> $ D2005                  : num  NA NA 163 NA NA ...
> $ D2006                  : num  NA NA 189 NA NA ...
> $ D2007                  : num  NA NA NA 14 1.91 ...
>
> How do I transform this into something that I can make a time-series of?

Some places to look:

?reshape
the reshape package (not the same as the base function "reshape")
the sqldf package
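
As a rough sketch of the ?reshape route, the following uses base R only. The data frame `wide` is an invented stand-in for laborstafinMFBA with made-up values, and the names `year_cols`, `long`, `value` and `year` are my own choices, not anything from the original data. It assumes the pair COUNTRY/CODE.SOURCE identifies a row, which may not hold in the real file.

```r
# Toy stand-in for the poster's data frame; all values are invented.
wide <- data.frame(
  COUNTRY     = c("Algeria", "Argentina"),
  CODE.SOURCE = c("BA", "BA"),
  D1990       = c(NA, 101.0),
  D1991       = c(38.4, 111.2),
  D1992       = c(NA, 94.4)
)

# Pick out the year columns by their "D" + four digits naming pattern.
year_cols <- grep("^D[0-9]{4}$", names(wide), value = TRUE)

# Stack the year columns into one "value" column, with "year" as the time index.
long <- reshape(
  wide,
  direction = "long",
  varying   = year_cols,
  v.names   = "value",
  timevar   = "year",
  times     = as.integer(sub("^D", "", year_cols)),
  idvar     = c("COUNTRY", "CODE.SOURCE")  # assumed to identify a row uniquely
)

# Tidy up: drop the generated row names and sort by country, then year.
rownames(long) <- NULL
long <- long[order(long$COUNTRY, long$year), ]
```

The result has one row per country, source and year, which is the layout most modelling and plotting functions expect.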

Using t() on just the D<year> columns might also be a possibility, but
it might not give the desired results for the association of COUNTRY
with SOURCE.
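
A minimal sketch of the t() idea, again on invented data. It assumes the rows have already been narrowed down so that each country appears once, and that the D<year> columns are contiguous annual values in ascending order; `wide`, `m` and `series` are illustrative names only.

```r
# Toy data with one row per country; all values are invented.
wide <- data.frame(
  COUNTRY = c("Algeria", "Argentina"),
  D1990   = c(NA, 101.0),
  D1991   = c(38.4, 111.2),
  D1992   = c(NA, 94.4)
)

year_cols <- grep("^D[0-9]{4}$", names(wide), value = TRUE)

# Transpose so that rows are years and columns are countries.
m <- t(as.matrix(wide[, year_cols]))
colnames(m) <- as.character(wide$COUNTRY)

# Build an annual multivariate time series starting at the first year column.
# This relies on the assumption that no years are missing between columns.
start_year <- as.integer(sub("^D", "", year_cols[1]))
series <- ts(m, start = start_year, frequency = 1)
```

Note that the transposed matrix carries only the numbers and the country labels; SOURCE and the other descriptive columns are left behind, which is the drawback mentioned above.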

It looks as though there is a mixture of data types, so you might want
to pull out each type of SOURCE separately before transforming, or go
to a database-oriented solution.
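
One way to pull the sources apart before transforming is split(). This is only a sketch on invented data; `by_source`, `ba` and `dups` are hypothetical names. The str() output above shows country level 11 twice among the first rows while CODE.SOURCE is 3 throughout, so in the real data other columns probably have to be part of the key as well.

```r
# Toy data in which one country is reported by two sources; values are invented.
wide <- data.frame(
  COUNTRY     = c("Algeria", "Argentina", "Argentina"),
  CODE.SOURCE = c("BA", "BA", "CA"),
  D1990       = c(NA, 101.0, 99.0),
  D1991       = c(38.4, 111.2, 105.5)
)

# One data frame per source code.
by_source <- split(wide, wide$CODE.SOURCE)
ba <- by_source[["BA"]]

# For each source, anyDuplicated() gives 0 when every country appears once.
# A non-zero entry means that subset needs a finer key before reshaping.
dups <- sapply(by_source, function(d) anyDuplicated(d$COUNTRY))
```

Each element of `by_source` can then be reshaped or transposed on its own.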


>
> Sincerely,
> Gabriel Bergin
> gabriel at bergin.se

David Winsemius, MD
West Hartford, CT


