[R] aggregate dataframe by multiple factors

Karim Mezhoud kmezhoud at gmail.com
Fri Nov 18 08:27:25 CET 2016


Dear all,

the dat  has missing values NA,

    first.Name   Name Department  DCE   DP       date
5      Auction Videos        YME 0.57 0.56 2013-09-30
18       Amish  Wives        TAS 0.59 0.56 2013-09-30
34     Ancient Nation        QLH 0.54 0.58 2013-09-30
53     Auction Videos        YME   NA   NA 2013-12-28
66       Amish  Wives        TAS   NA   NA 2013-12-28
82     Ancient Nation        QLH 0.28 0.29 2013-12-28
102    Auction Videos        YME 0.57 0.56 2014-03-30
115      Amish  Wives        TAS 0.59 0.56 2014-03-30
131    Ancient Nation        QLH 0.54 0.58 2014-03-30
150    Auction Videos        YME   NA   NA 2014-06-28
163      Amish  Wives        TAS   NA   NA 2014-06-28
179    Ancient Nation        QLH 0.28 0.29 2014-06-28


agg <- as.data.frame(aggregate(dat[ , c("DCE","DP")], by=
list(dat$first.Name, dat$Name, dat$Department) , "sort"))


agg has list of value. I would separate value in different columns.

  Group.1 Group.2 Group.3                    DCE                     DP
1 Ancient  Nation     QLH 0.28, 0.28, 0.54, 0.54 0.29, 0.29, 0.58, 0.58
2   Amish   Wives     TAS             0.59, 0.59             0.56, 0.56
3 Auction  Videos     YME             0.57, 0.57             0.56, 0.56

The  goal:

Group.1 Group.2 Group.3  DCE.1 DCE.2 DCE.3  DCE.4  DP.1  DP.2  DP.3  DP.4
1 Ancient  Nation     QLH    0.28     0.28    0.54     0.54     0.29, 0.29,
0.58, 0.58
2   Amish   Wives     TAS        NA     NA     0.59, 0.59           NA
NA  0.56, 0.56
3 Auction  Videos     YME         NA   NA      0.57, 0.57             NA
NA  0.56, 0.56



dat <- structure(list(first.Name = structure(c(3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L), .Label = c("Amish", "Ancient", "Auction",
"Ax", "Bachelorette", "Basketball", "BBQ", "Cake", "Celebrity",
"Chef", "Clean", "Colonial", "Comedy", "Comic", "Crocodile",
"Dog", "Empire", "Extreme", "Farm", "Half Pint", "Hollywood",
"House", "Ice Road", "Jersey", "Justice", "Love", "Mega", "Model",
"Modern", "Mountain", "Mystery", "Myth", "New York", "Paradise",
"Pioneer", "Queer", "Restaurant", "Road", "Royal", "Spouse",
"Star", "Storage", "Survival", "The Great American", "Tool",
"Treasure", "Wedding", "Wife"), class = "factor"), Name = structure(c(43L,
47L, 29L, 43L, 47L, 29L, 43L, 47L, 29L, 43L, 47L, 29L), .Label =
c("Aliens",
"Behavior", "Casino", "Casting Call", "Challenge", "Contest",
"Crashers", "Crew", "Dad", "Dancing", "Date", "Disasters", "Dynasty",
"Family", "Garage", "Greenlight", "Gypsies", "Haul", "Hot Rod",
"Inventor", "Jail", "Job", "Justice", "Marvels", "Master", "Mates",
"Model", "Moms", "Nation", "Ninja", "Patrol", "People", "Pitmasters",
"Queens", "Rescue", "Rivals", "Room", "Rooms", "Rules", "Star",
"Stars", "Superhero", "Videos", "VIP", "Wars", "Wishes", "Wives",
"Wrangler"), class = "factor"), Department = structure(c(8L,
6L, 2L, 8L, 6L, 2L, 8L, 6L, 2L, 8L, 6L, 2L), .Label = c("HXW",
"QLH", "RAR", "RYC", "SYI", "TAS", "VUV", "YME"), class = "factor"),
    DCE = c(0.57, 0.59, 0.54, NA, NA, 0.28, 0.57, 0.59, 0.54,
    NA, NA, 0.28), DP = c(0.56, 0.56, 0.58, NA, NA, 0.29, 0.56,
    0.56, 0.58, NA, NA, 0.29), date = structure(c(15978, 15978,
    15978, 16067, 16067, 16067, 16159, 16159, 16159, 16249, 16249,
    16249), class = "Date")), description = "", row.names = c(5L,
18L, 34L, 53L, 66L, 82L, 102L, 115L, 131L, 150L, 163L, 179L), class =
"data.frame", .Names = c("first.Name",
"Name", "Department", "DCE", "DP", "date"))

	[[alternative HTML version deleted]]



More information about the R-help mailing list