[R] Tricky (?) conversion from data.frame to matrix where not all pairs exist

Dennis Murphy djmuser at gmail.com
Wed Jun 22 00:51:37 CEST 2011


Ahhh...you want a matrix. xtabs() returns a table object rather than a
plain matrix, so try this instead:

library(reshape)
as.matrix(cast(df, year ~ block, fill = 0))
     a b c
2000 1 0 5
2001 2 4 6
2002 3 0 0
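
If adding a package is not an option, base R can get to a plain matrix as well. The following is only a sketch, assuming each year/block pair occurs at most once in df (as in the example data); the names m1, m2, years and blocks are made up for illustration. Note that the two options differ when a pair is duplicated: xtabs() sums the values, while the indexed assignment keeps the last one.

```r
# Example data from the original question
df <- data.frame(year  = c(2000, 2001, 2002, 2001, 2000, 2001),
                 block = c("a", "a", "a", "b", "c", "c"),
                 value = 1:6)

# Option 1: xtabs() sums 'value' per cell; unclass() drops the
# "xtabs"/"table" classes, leaving an ordinary matrix
m1 <- unclass(xtabs(value ~ year + block, data = df))
attr(m1, "call") <- NULL   # remove the call that xtabs() stores

# Option 2: start from a zero matrix and fill it with a two-column
# (row, column) index matrix instead of nested for loops
years  <- sort(unique(df$year))
blocks <- sort(unique(as.character(df$block)))
m2 <- matrix(0, nrow = length(years), ncol = length(blocks),
             dimnames = list(as.character(years), blocks))
m2[cbind(match(df$year, years),
         match(as.character(df$block), blocks))] <- df$value

m1
m2
```

Both results are 3 x 3 matrices with the years as row names and the blocks as column names; m1 additionally keeps "year" and "block" as names of its dimnames.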

Hopefully this is more helpful...
Dennis

On Tue, Jun 21, 2011 at 3:35 PM, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> xtabs(value ~ year + block, data = df)
>      block
> year   a b c
>  2000 1 0 5
>  2001 2 4 6
>  2002 3 0 0
>
> HTH,
> Dennis
>
> On Tue, Jun 21, 2011 at 3:13 PM, Marius Hofert <m_hofert at web.de> wrote:
>> Dear expeRts,
>>
>> In the minimal example below, I have a data.frame containing three "blocks" of years
>> (the years are subsets of 2000 to 2002). For each year and block a certain "value" is given.
>> I would like to create a matrix that has row names given by all years ("2000", "2001", "2002"),
>> and column names given by all blocks ("a", "b", "c"); the entries are then given by the
>> corresponding value or zero if no year-block combination exists.
>>
>> What's a short way to achieve this?
>>
>> Of course one can set up a matrix and use for loops (see below)... but that's not nice.
>> The problem is that the years are not running from 2000 to 2002 for all three "blocks"
>> (the second block only has year 2001, the third one has only 2000 and 2001).
>> In principle, table() nicely solves such a problem (see below) and fills in zeros.
>> This is what I would like in the end, but all non-zero entries should be given by df$value,
>> not (as table() does) by their counts.
>>
>> Cheers,
>>
>> Marius
>>
>> (df <- data.frame(year=c(2000, 2001, 2002, 2001, 2000, 2001),
>>                  block=c("a","a","a","b","c","c"), value=1:6))
>> table(df[,1:2]) # complements the years and fills in 0
>>
>> year <- c(2000, 2001, 2002)
>> block <- c("a", "b", "c")
>> res <- matrix(0, nrow=3, ncol=3, dimnames=list(year, block))
>> for(i in 1:3){ # year
>>    for(j in 1:3){ # block
>>        for(k in 1:nrow(df)){
>>            if(df[k,"year"]==year[i] && df[k,"block"]==block[j]) res[i,j] <- df[k,"value"]
>>        }
>>    }
>> }
>> res # does the job; but seems complicated
>>
>


