[R] Dataset Transformation

Gabor Grothendieck ggrothendieck at gmail.com
Mon Oct 11 16:08:16 CEST 2010


On Mon, Oct 11, 2010 at 9:35 AM, Santosh Srinivas
<santosh.srinivas at gmail.com> wrote:
> Repost .. since the previous msg had problems
>
> I need to transpose the following input dataset  into an output dataset like
> below
>
> Input
> Date            TICKER          Price
> 11/10/2010              A               0.991642
> 11/10/2010              B               0.475023
> 11/10/2010              C               0.218642
> 11/10/2010              D               0.365135
> 12/10/2010              A               0.687873
> 12/10/2010              B               0.47006
> 12/10/2010              C               0.533542
> 12/10/2010              D               0.812439
> 13/10/2010              A               0.210848
> 13/10/2010              B               0.699799
> 13/10/2010              C               0.546003
> 13/10/2010              D               0.152316
>
> Output needed
>
> Date    A       B       C       D
> 11/10/2010      0.991642        0.475023        0.218642        0.365135
> 12/10/2010      0.687873        0.47006 0.533542        0.812439
> 13/10/2010      0.210848        0.699799        0.546003        0.152316
>
> I tried using the aggregate function but not quite getting the method.
>


1. Try this:

Lines <- " Date            TICKER          Price
11/10/2010              A               0.991642
11/10/2010              B               0.475023
11/10/2010              C               0.218642
11/10/2010              D               0.365135
12/10/2010              A               0.687873
12/10/2010              B               0.47006
12/10/2010              C               0.533542
12/10/2010              D               0.812439
13/10/2010              A               0.210848
13/10/2010              B               0.699799
13/10/2010              C               0.546003
13/10/2010              D               0.152316"

DF <- read.table(textConnection(Lines), header = TRUE)
DF$Date <- as.Date(DF$Date,"%d/%m/%Y")
DFout <- reshape(DF, dir = "wide", timevar = "TICKER", idvar = "Date")
names(DFout) <- sub("Price.", "", names(DFout))


2. or using read.zoo in the zoo package we can read it in and reshape
it all at once:

library(zoo)
z <- read.zoo(textConnection(Lines), header = TRUE,
		split = 2, format = "%d/%m/%Y")

At this point z is a zoo object in wide format:

> z
                  A        B        C        D
2010-10-11 0.991642 0.475023 0.218642 0.365135
2010-10-12 0.687873 0.470060 0.533542 0.812439
2010-10-13 0.210848 0.699799 0.546003 0.152316

Since this is a multivariate time series you might want to just leave
it as a zoo object since you then get all of the facilities of zoo,
e.g.

plot(z) # multi-panel
plot(z, screen = 1) # all in one panel

but if you want it as a data frame then convert it like this:

> data.frame(Index = index(z), coredata(z))
       Index        A        B        C        D
1 2010-10-11 0.991642 0.475023 0.218642 0.365135
2 2010-10-12 0.687873 0.470060 0.533542 0.812439
3 2010-10-13 0.210848 0.699799 0.546003 0.152316

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



More information about the R-help mailing list