[R] how to create a pivot table in r?

Marna Wagley m@rn@@w@g|ey @end|ng |rom gm@||@com
Fri Feb 14 18:38:33 CET 2020


Dear Peter and Rui,
There are many dates (value) in some of the cells, If we want to chose only
one date (either oldest or newest) from that cell, how can we make a table
with that condition?
For example,
Using the following code;
M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list,
simplify=F))
data.frame(M)
  site1 site4 site5 site7
id1 NULL NULL NULL 6/13/13
id2 NULL NULL NULL 07/03/14,  05/17/14
id4 NULL 5/8/14 NULL NULL
id5 NULL NULL 6/13/14 NULL
id6 05/30/14, 06/28/13 NULL NULL NULL
id7 NULL NULL 6/25/13 NULL

If we want to take only one value (oldest date) if two or more dates/values
(classes) are found in the certain cells. For example would like to get the
following table (given below).
  site1 site4 site5 site7
id1 NULL NULL NULL 6/13/13
id2 NULL NULL NULL 07/03/14
id4 NULL 5/8/14 NULL NULL
id5 NULL NULL 6/13/14 NULL
id6 6/28/13 NULL NULL NULL
id7 NULL NULL 6/25/13 NULL

here is the code and an example data

daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L
), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class = "factor"),
    ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label =
c("site1",
    "site4", "site5", "site7"), class = "factor"), ObsDate =
structure(c(4L,
    8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14",
    "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13",
    "07/03/14"), class = "factor")), .Names = c("id", "ObsSite",
"ObsDate"), class = "data.frame", row.names = c(NA, -8L))
daT
daT$date <- mdy(daT$ObsDate)
tmp <- split(daT, daT$id)
head(tmp)
M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list,
simplify=F))
data.frame(M)
M[["id2", "site7"]]
# but I wanted to get the following table
daT2<-structure(list(X = structure(1:6, .Label = c("id1", "id2", "id4",
"id5", "id6", "id7"), class = "factor"), site1 = structure(c(2L,
2L, 2L, 2L, 1L, 2L), .Label = c("6/28/13", "NULL"), class = "factor"),
    site4 = structure(c(2L, 2L, 1L, 2L, 2L, 2L), .Label = c("5/8/14",
    "NULL"), class = "factor"), site5 = structure(c(3L, 3L, 3L,
    1L, 3L, 2L), .Label = c("6/13/14", "6/25/13", "NULL"), class =
"factor"),
    site7 = structure(c(2L, 1L, 3L, 3L, 3L, 3L), .Label = c("5/17/14",
    "6/13/13", "NULL"), class = "factor")), .Names = c("X", "site1",
"site4", "site5", "site7"), class = "data.frame", row.names = c(NA,
-6L))
daT2
Thank you

====


On Thu, Feb 6, 2020 at 8:48 AM Marna Wagley <marna.wagley using gmail.com> wrote:

> Thank You Profs. Dalgaard and Barradas for the code, both codes worked
> perfectly for the data and I am going to use it in my big data set.
> Thanks once again.
>
>
> On Thu, Feb 6, 2020 at 2:02 AM peter dalgaard <pdalgd using gmail.com> wrote:
>
>> There is also
>>
>> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite),
>> function(x)format(list(x))))
>>     site1                site4      site5      site7
>> id1 NA                   NA         NA         "06/13/13"
>> id2 NA                   NA         NA         "07/03/14, 05/17/14"
>> id4 NA                   "05/08/14" NA         NA
>> id5 NA                   NA         "06/13/14" NA
>> id6 "05/30/14, 06/28/13" NA         NA         NA
>> id7 NA                   NA         "06/25/13" NA
>>
>> ...with the added bonus that if you leave out the format() business, you
>> get a data structure that doesn't print as nicely, but can be used for
>> further computations:
>>
>> > with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list,
>> simplify=FALSE))
>>     site1  site4  site5  site7
>> id1 NULL   NULL   NULL   List,1
>> id2 NULL   NULL   NULL   List,1
>> id4 NULL   List,1 NULL   NULL
>> id5 NULL   NULL   List,1 NULL
>> id6 List,1 NULL   NULL   NULL
>> id7 NULL   NULL   List,1 NULL
>> > M <- with(daT, tapply(as.character(ObsDate), list(id, ObsSite), list,
>> simplify=FALSE))
>> > M[["id2", "site7"]]
>> [[1]]
>> [1] "07/03/14" "05/17/14"
>>
>> -pd
>>
>> > On 6 Feb 2020, at 01:37 , Marna Wagley <marna.wagley using gmail.com> wrote:
>> >
>> > daT<-structure(list(id = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 6L
>> > ), .Label = c("id1", "id2", "id4", "id5", "id6", "id7"), class =
>> "factor"),
>> >    ObsSite = structure(c(4L, 4L, 4L, 2L, 3L, 1L, 1L, 3L), .Label =
>> > c("site1",
>> >    "site4", "site5", "site7"), class = "factor"), ObsDate =
>> > structure(c(4L,
>> >    8L, 2L, 1L, 5L, 3L, 7L, 6L), .Label = c("05/08/14", "05/17/14",
>> >    "05/30/14", "06/13/13", "06/13/14", "06/25/13", "06/28/13",
>> >    "07/03/14"), class = "factor")), .Names = c("id", "ObsSite",
>> > "ObsDate"), class = "data.frame", row.names = c(NA, -8L))
>> > daT
>> > daT$date <- mdy(daT$ObsDate)
>>
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>

	[[alternative HTML version deleted]]



More information about the R-help mailing list