[R] Reshape question.

rkevinburton at charter.net
Wed Mar 11 22:09:37 CET 2009


Thank you for your reply. I will try this. The first few rows of the .dat file look like this:

Year,DayOfYear,Sku,Quantity,CatId,Category,SubCategory
2009,1,100051,1,10113,"MEN","Historical men's"
2009,1,100130,1,10638,"ACCESSORIES & MAKEUP","ALL Kids Accessories"
2009,1,100916,1,10222,"WOMEN","TV & Movies Women"
2009,1,101241,1,10897,"HOLIDAY","Colonial (Presidents)"
2009,1,101252,1,10640,"ACCESSORIES & MAKEUP","Finishing Touches"
2009,1,101298,1,10865,"HOLIDAY","Easter"
2009,1,101613,1,10410,"GIRLS","Classic Girls"
2009,1,101645,1,10320,"BOYS","Superheroes Boys"
2009,1,101648,1,10320,"BOYS","Superheroes Boys"
2009,1,101718,1,10897,"HOLIDAY","Colonial (Presidents)"
2009,1,101719,1,10897,"HOLIDAY","Colonial (Presidents)"
2009,1,101751,1,10420,"GIRLS","Superheroes Girls"
2009,1,102125,1,10638,"ACCESSORIES & MAKEUP","ALL Kids Accessories"
2009,1,102174,1,10897,"HOLIDAY","Colonial (Presidents)"
2009,1,102558,1,10636,"ACCESSORIES & MAKEUP","Armor/Weapons/Guns"
2009,1,102582,1,10636,"ACCESSORIES & MAKEUP","Armor/Weapons/Guns"
2009,1,102717,1,10862,"HOLIDAY","Christmas"
2009,1,104705,1,10518,"PLUS","Plus Women"
2009,1,104745,6,10748,"HATS, WIGS & MASKS","Wigs - Men's"
2009,1,104745,1,10748,"HATS, WIGS & MASKS","Wigs - Men's"
2009,1,104751,1,10310,"BOYS","Classic Boys"
2009,1,105238,1,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"
2009,1,105352,10,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"
2009,1,107420,10,10744,"HATS, WIGS & MASKS","Masks - Miscellaneous"
2009,1,107420,1,10744,"HATS, WIGS & MASKS","Masks - Miscellaneous"
2009,1,107479,1,10743,"HATS, WIGS & MASKS","Masks - Famous"
2009,1,107479,1,10743,"HATS, WIGS & MASKS","Masks - Famous"
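
As reproducible toy data, a handful of these rows can be read straight from text. This is only a sketch in base R; the name `toy` is made up for illustration and is not part of the original script. The quoted fields matter because some Category values contain commas.

```r
# A few rows copied from the listing above; 'toy' is an illustrative name.
toy <- read.csv(text = c(
  'Year,DayOfYear,Sku,Quantity,CatId,Category,SubCategory',
  '2009,1,101645,1,10320,"BOYS","Superheroes Boys"',
  '2009,1,101648,1,10320,"BOYS","Superheroes Boys"',
  '2009,1,104751,1,10310,"BOYS","Classic Boys"',
  '2009,1,105238,1,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"',
  '2009,1,105352,10,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"',
  '2009,1,107479,1,10743,"HATS, WIGS & MASKS","Masks - Famous"'
), header = TRUE)

# Quoted fields keep "HATS, WIGS & MASKS" in a single column.
str(toy)
```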

If your proposed solution works, I am confused as to why I was originally able to specify:

c2009 <- cast(m2009, DayOfYear ~ variable | Category, sum)
t2009 <- cast(m2009, DayOfYear ~ variable, sum)

Is combining these into one 'cast' better (as in faster)?
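
To make the comparison concrete, here is a rough sketch on toy data, assuming the 'reshape' package (not 'reshape2') is installed. The names `toy`, `m`, `combined`, `by_cat` and `totals` are made up for illustration; the rows are copied from the listing above.

```r
library(reshape)  # the original 'reshape' package

# Toy rows taken from the .dat listing; stringsAsFactors = TRUE mirrors
# the read.csv default at the time of the original post.
toy <- read.csv(text = c(
  'Year,DayOfYear,Sku,Quantity,CatId,Category,SubCategory',
  '2009,1,101645,1,10320,"BOYS","Superheroes Boys"',
  '2009,1,101648,1,10320,"BOYS","Superheroes Boys"',
  '2009,1,104751,1,10310,"BOYS","Classic Boys"',
  '2009,1,105238,1,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"',
  '2009,1,105352,10,10742,"HATS, WIGS & MASKS","Hats-Miscellaneous"',
  '2009,1,107479,1,10743,"HATS, WIGS & MASKS","Masks - Famous"'
), header = TRUE, stringsAsFactors = TRUE)

# Same melt as in the original script, with the argument names spelled out.
m <- melt(toy,
          id.vars = c("DayOfYear", "Category", "SubCategory", "Sku"),
          measure.vars = "Quantity", na.rm = TRUE)

# Suggested form: one data frame, one row per Category/SubCategory/DayOfYear.
combined <- cast(m, Category + SubCategory + DayOfYear ~ variable, sum)

# Original form: '|' splits the result into one piece per Category.
by_cat <- cast(m, DayOfYear ~ variable | Category, sum)

# Overall daily totals, as in the original t2009 step.
totals <- cast(m, DayOfYear ~ variable, sum)

print(combined)
print(totals)
```

The two forms carry the same sums; the difference is the shape of the result (one data frame versus a list split by Category). Whether the single cast is faster is not shown by this sketch; wrapping each call in system.time() on the real data would be one way to check.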

Thanks again.

Kevin

---- Tal Galili <tal.galili at gmail.com> wrote: 
> how about:
> c2009 <- cast(m2009, Category + SubCategory + DayOfYear ~ variable, sum)
> ?
> 
> P.S.: toy data would be nice to have :)
> 
> On Wed, Mar 11, 2009 at 9:47 PM, <rkevinburton at charter.net> wrote:
> 
> > This hopefully is trivial. I am trying to reshape the data using the
> > reshape package.
> >
> > First I read in the data:
> >
> > a2009 <- read.csv("Total2009.dat", header = TRUE)
> >
> > Then I trim it so that it only contains the columns that I am interested
> > in:
> >
> > m2009 <- melt(a2009, id.var=c("DayOfYear","Category","SubCategory","Sku"),
> > measure.var=c("Quantity"), na.rm=TRUE)
> >
> > Then I start to formulate the data that I will process:
> >
> > c2009 <- cast(m2009, DayOfYear ~ variable | Category, sum)
> >
> > Finally I aggregate the data:
> >
> > t2009 <- cast(m2009, DayOfYear ~ variable, sum)
> >
> > My question is on the third step above (repeated here)
> >
> > c2009 <- cast(m2009, DayOfYear ~ variable | Category, sum)
> >
> > This gets the data associated with a unique 'Category' name. I want to get
> > the data grouped by 'Category' and 'SubCategory'. The 'SubCategory' is not
> > unique, but the combination of 'Category' and 'SubCategory' forms a unique pair.
> > What would be the formula that would give me the data grouped by Category
> > AND SubCategory? Would it be as simple as:
> >
> > c2009 <- cast(m2009, DayOfYear ~ variable | Category & SubCategory, sum)
> >
> > ?
> >
> > Thank you for your suggestions.
> >
> > Kevin