[R] Reshaping data with xtabs giving me 'extra' data

Henrique Dallazuanna wwwhsd at gmail.com
Wed Jan 20 18:22:34 CET 2010


You can use this instead; invisible() simply returns each score
unchanged, rather than counting rows the way length() does:

with(do.call(rbind, df.list), tapply(Score, list(Date, Show, Time), invisible))
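
For anyone who wants to run this end to end, here is a minimal sketch that
rebuilds the example data from the original post and applies the same
tapply call. The make_df helper is hypothetical (it is not in the thread)
and only shortens the setup; identity is used in place of invisible, which
should behave the same way here.

```r
# Rebuild the example data from the original post.
dates <- as.Date(c("2010-01-19", "2010-01-20"))
times <- c("09:30:00", "11:30:00", "13:30:00", "15:30:00")
shows <- c("Red Dwarf", "Being Human", "Doctor Who")

# Hypothetical helper (not in the thread): one data frame per Date/Time slot.
make_df <- function(d, t) {
  data.frame(Date = dates[d], Time = times[t], Show = shows, Score = 1:3)
}

# Same seven slots as df1..df7; 2010-01-19 at 13:30:00 is deliberately absent.
df.list <- list(make_df(1, 1), make_df(1, 2), make_df(1, 4),
                make_df(2, 1), make_df(2, 2), make_df(2, 3), make_df(2, 4))

# Each Date/Show/Time cell holds at most one row, so identity() returns the
# score itself, and cells with no rows are left as NA.
x <- with(do.call(rbind, df.list),
          tapply(Score, list(Date, Show, Time), identity))

x[, , "13:30:00"]
```

If a cell could contain more than one row, identity would return several
values for that cell and tapply would give back a list instead of a plain
array; in that case a summary such as sum or mean is probably what you want.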

On Wed, Jan 20, 2010 at 3:02 PM, Tony B <tony.breyal at googlemail.com> wrote:
> Thank you for taking the time to reply, Henrique. Although your
> solution does replace the zeros with NAs (which I prefer), it
> unfortunately reduces all of the other scores to just '1', since
> length() counts the rows in each cell instead of returning the score:
>
>> x <- with(do.call(rbind, df.list), tapply(Score, list(Date, Show, Time), length))
>> x[,,"13:30:00"]
>           Being Human Doctor Who Red Dwarf
> 2010-01-19          NA         NA        NA
> 2010-01-20           1          1         1
>
> I think this is the first time I've used tapply, and after a
> little searching, I was able to adapt your suggestion to get the
> desired result:
>
>> x <- with(do.call(rbind, df.list), tapply(Score, list(Date, Show, Time), function(x) { x } ))
>> x[,,"13:30:00"]
>           Being Human Doctor Who Red Dwarf
> 2010-01-19          NA         NA        NA
> 2010-01-20           2          3         1
>
> Cheers!
> Tony Breyal
>
>
>
>
> On 20 Jan, 16:37, Henrique Dallazuanna <www... at gmail.com> wrote:
>> Try with tapply:
>>
>>  with(do.call(rbind, df.list), tapply(Score, list(Date, Time, Show), length))
>>
>>
>>
>> On Wed, Jan 20, 2010 at 10:20 AM, Tony B <tony.bre... at googlemail.com> wrote:
>> > Dear all,
>>
>> > Let's say I have several data frames as follows:
>>
>> >> set.seed(42)
>> >> dates <- as.Date(c("2010-01-19", "2010-01-20"))
>> >> times <- c("09:30:00", "11:30:00", "13:30:00", "15:30:00")
>> >> shows <- c("Red Dwarf", "Being Human", "Doctor Who")
>>
>> >> df1 <- data.frame(Date = dates[1], Time = times[1], Show = shows, Score = 1:3)
>> >> df2 <- data.frame(Date = dates[1], Time = times[2], Show = shows, Score = 1:3)
>> >> df3 <- data.frame(Date = dates[1], Time = times[4], Show = shows, Score = 1:3)
>> >> df4 <- data.frame(Date = dates[2], Time = times[1], Show = shows, Score = 1:3)
>> >> df5 <- data.frame(Date = dates[2], Time = times[2], Show = shows, Score = 1:3)
>> >> df6 <- data.frame(Date = dates[2], Time = times[3], Show = shows, Score = 1:3)
>> >> df7 <- data.frame(Date = dates[2], Time = times[4], Show = shows, Score = 1:3)
>> >> df7
>> >        Date     Time        Show Score
>> > 1 2010-01-20 15:30:00   Red Dwarf     1
>> > 2 2010-01-20 15:30:00 Being Human     2
>> > 3 2010-01-20 15:30:00  Doctor Who     3
>>
>> > I would like to somehow reshape the data into a different format:
>>
>> >> df.list <- list(df1, df2, df3, df4, df5, df6, df7)
>> >> my.df <- Reduce(function(x, y) merge(x, y, all = TRUE), df.list, accumulate = FALSE)
>> >> my.xtab <- xtabs(as.numeric(Score) ~ Date + Show + Time, data = my.df)
>>
>> > This is where my problem occurs. In Time = 13:30:00, there is now data
>> > for "2010-01-19" which was not in any of my original data frames
>> > above:
>>
>> >> # I do not want the zeros below
>> >> my.xtab[,,"13:30:00"]
>> >            Show
>> > Date         Being Human Doctor Who Red Dwarf
>> >  2010-01-19           0          0         0
>> >  2010-01-20           2          3         1
>>
>> > Perhaps I am missing something in the way I call the xtabs function?
>>
>> > Thank you kindly for your time,
>> > Tony Breyal
>>
>> > OS: Windows XP 64bit
>> >> sessionInfo()
>> > R version 2.10.0 (2009-10-26)
>> > i386-pc-mingw32
>>
>> > locale:
>> > [1] LC_COLLATE=English_United States.1252
>> >     LC_CTYPE=English_United States.1252
>> >     LC_MONETARY=English_United States.1252
>> >     LC_NUMERIC=C
>> >     LC_TIME=English_United States.1252
>>
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> > loaded via a namespace (and not attached):
>> > [1] tools_2.10.0
>>
>> > ______________________________________________
>> > R-h... at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
>>
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


