[R] Data frame "pivoting"

Patrick Hausmann patrick.hausmann at uni-bremen.de
Thu May 6 12:05:13 CEST 2010


Hi Angelo,

try

x <- structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
"A3", "A3", "A3", "A3", "A3"), YEAR = c(2007, 2007, 2007, 2008,
2008, 2007, 2008, 2007, 2007, 2008, 2008, 2008), PROPERTY = c("P1",
"P2", "P3", "P1", "P2", "P5", "P6", "P1", "P3", "P1", "P2", "P6"
), VALUE = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25)), .Names = c("ID",
"YEAR", "PROPERTY", "VALUE"), row.names = c(NA, 12L), class = "data.frame")

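# package reshape
library(reshape)
# VALUE is the only measured variable; melt() puts it in a column named "value"
xm <- melt(x, id.vars = c("ID", "YEAR", "PROPERTY"))

# with cast (reshape): a 3-d array, ID x YEAR x PROPERTY
cast(xm, ID ~ YEAR ~ PROPERTY)

# the same result printed as a flat table
ftable(cast(xm, ID ~ YEAR ~ PROPERTY))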
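
The three-sided formula above gives a 3-d array rather than the one-row-per-ID-and-YEAR table asked for in the original post. As a rough sketch of one alternative, assuming only base R (stats::reshape) and the same example data rebuilt with data.frame(), something like the following should produce that layout. The object name `wide` is just for illustration.

```r
# Same example data as above, rebuilt so the snippet runs on its own
x <- data.frame(
  ID       = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
               "A3", "A3", "A3", "A3", "A3"),
  YEAR     = c(2007, 2007, 2007, 2008, 2008, 2007, 2008,
               2007, 2007, 2008, 2008, 2008),
  PROPERTY = c("P1", "P2", "P3", "P1", "P2", "P5", "P6",
               "P1", "P3", "P1", "P2", "P6"),
  VALUE    = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25),
  stringsAsFactors = FALSE
)

# One row per ID/YEAR pair, one VALUE.<property> column per PROPERTY
wide <- reshape(x,
                idvar     = c("ID", "YEAR"),
                timevar   = "PROPERTY",
                v.names   = "VALUE",
                direction = "wide")
wide
```

Columns come out named VALUE.P1, VALUE.P2 and so on, and ID/YEAR/PROPERTY combinations that never occur are NA. There is no P4 column because no row in the data has that property.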

# with xtabs - beware: combinations with no data show up as 0, not NA
xtabs(value ~ ID + YEAR + PROPERTY, data = xm)

ftable( xtabs(value ~ ID + YEAR + PROPERTY, data = xm) )

ftable(addmargins(xtabs(value ~ ID + YEAR + PROPERTY, data = xm)))
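
To make the 0-versus-NA point concrete, here is a small base R sketch, assuming the same example data (rebuilt here so it runs on its own). It compares xtabs() with tapply() on a combination that does not exist in the data (A2, 2007, P1). The names `tab` and `arr` are only illustrative.

```r
# Same example data as above
x <- data.frame(
  ID       = c("A1", "A1", "A1", "A1", "A1", "A2", "A2",
               "A3", "A3", "A3", "A3", "A3"),
  YEAR     = c(2007, 2007, 2007, 2008, 2008, 2007, 2008,
               2007, 2007, 2008, 2008, 2008),
  PROPERTY = c("P1", "P2", "P3", "P1", "P2", "P5", "P6",
               "P1", "P3", "P1", "P2", "P6"),
  VALUE    = c(1, 2, 3, 10, 20, 50, 20, 1, 30, 10, 4, 25),
  stringsAsFactors = FALSE
)

# xtabs sums VALUE within each cell; empty cells are filled with 0
tab <- xtabs(VALUE ~ ID + YEAR + PROPERTY, data = x)

# tapply also sums within each cell, but leaves empty cells as NA
arr <- tapply(x$VALUE,
              list(ID = x$ID, YEAR = x$YEAR, PROPERTY = x$PROPERTY),
              sum)

tab["A2", "2007", "P1"]   # no such observation: 0
arr["A2", "2007", "P1"]   # no such observation: NA
```

If a true zero and "not measured" need to be told apart, the tapply() version is probably the safer of the two.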

HTH
Patrick

On 06.05.2010 09:06, ANGELO.LINARDI at bancaditalia.it wrote:
>
> Dear R experts,
>
> I am trying to solve this problem, related to the possibility of
> changing the shape of a data frame using a "pivoting-like" function.
> I have a dataframe df of observations as follows:
>
> ID   VALIDITY YEAR   PROPERTY   PROPERTY VALUE
> A1   2007            P1         V1
> A1   2007            P2         V2
> A1   2007            P3         V3
> A1   2008            P1         V10
> A1   2008            P2         V20
> A2   2007            P5         V50
> A2   2008            P6         V20
> A3   2007            P1         V1
> A3   2007            P3         V30
> A3   2008            P1         V10
> A3   2008            P2         V4
> A3   2008            P6         V25
>
> (You can imagine that these data are collected every year from a sample of
> people, with several "measures": weight, number of children, income, ...
> Some properties may be missing for some IDs.)
> I have to obtain a data frame like this:
>
>
> ID   VALIDITY YEAR   P1    P2    P3    P4    P5    P6
> A1   2007            V1    V2    V3    -     -     -
> A1   2008            V10   V20   -     -     -     -
> A2   2007            -     -     -     -     V50   -
> A2   2008            -     -     -     -     -     V20
> A3   2007            V1    -     V30   -     -     -
> A3   2008            V10   V4    -     -     -     V25
>
>
> I started using the operator "by" obtaining the different "slices" of
> data:
>
> by(df,df$PROPERTY,list)
>
> but then ?
>
> I also tried using tapply:
>
> tapply(df$CID,df$PROPERTY,list)
>
> obtaining a list but I am not able to go on.
>
> Can you help me ?
>
> Thank you in advance
>
> Angelo Linardi