[R] Creating matrix from long table in database (pivoting)

Phil Spector spector at stat.berkeley.edu
Tue Mar 2 20:01:22 CET 2010


Jan -
     Here's one way:

> tbl = data.frame(id=c(1,1,1,1,1,2,2,2,2,2),
                  text=c('this','is','the','first','row','this','is','the','second','row'))

> xtabs(~id+text,tbl)
   text
id  first is row second the this
   1     1  1   1      0   1    1
   2     0  1   1      1   1    1
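Note that xtabs() counts occurrences, so a term that appears more than
once in a document gives a cell larger than 1. Here is a minimal sketch
of capping the counts at 1 to get a strictly binary matrix. The data
frame `dup` and the names `counts` and `m` are made up for illustration
and are not part of the original question:

```r
# Hypothetical data: 'row' appears twice in document 1
dup <- data.frame(id   = c(1, 1, 1, 2),
                  text = c('row', 'row', 'first', 'row'))

# xtabs() tabulates occurrences, so cell (1, 'row') holds a count of 2
counts <- xtabs(~ id + text, dup)

# Turn every positive count into 1; dim and dimnames are carried over
m <- (counts > 0) * 1L
m
```

If each term really occurs at most once per document, as assumed in the
question, this step changes nothing and can be skipped.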

It's a bit tricky to automatically get the column headings to
be in the order you want.  Setting the factor levels in order of
first appearance comes close:

> tbl$text = factor(tbl$text,levels=tbl$text[!duplicated(tbl$text)])
> xtabs(~id+text,tbl)
   text
id  this is the first row second
   1    1  1   1     1   1      0
   2    1  1   1     0   1      1
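If the desired column order is known in advance, one option is to pass
it explicitly as the factor levels. The sketch below assumes the order
shown in the original question; `wanted`, `m` and `sm` are illustrative
names. It also shows the sparse=TRUE argument of xtabs(), which may be
worth trying for a table with tens of thousands of rows and columns,
where a dense result might not fit in memory. That part assumes the
Matrix package is installed.

```r
tbl <- data.frame(id   = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                  text = c('this', 'is', 'the', 'first', 'row',
                           'this', 'is', 'the', 'second', 'row'))

# Assumed column order, copied from the layout asked for in the question
wanted <- c('this', 'is', 'the', 'first', 'second', 'row')
tbl$text <- factor(tbl$text, levels = wanted)

# Dense table with columns in the order given by 'wanted'
m <- xtabs(~ id + text, tbl)
m

# Sparse variant for large inputs; skipped if Matrix is not available
if (requireNamespace('Matrix', quietly = TRUE)) {
  sm <- xtabs(~ id + text, tbl, sparse = TRUE)
  print(dim(sm))
}
```

Any term missing from `wanted` would become NA in the factor and be
dropped from the table, so the vector has to list every term.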

Hope this helps.
 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu



On Tue, 2 Mar 2010, Jan Hornych wrote:

> Hi all,
>
> I have a very long table in a database; simplified, it has only
> two columns (id, text). id is the row, and text is the column.
> Technically, text is a term and id is the document.
> Simplifying further and assuming each term occurs at most once
> per document, I should be able to convert this into a binary matrix.
> The table looks like this...
>
> ID Text
> ------------
> 1 this
> 1 is
> 1 the
> 1 first
> 1 row
> 2 this
> 2 is
> 2 the
> 2 second
> 2 row
> ...
>
>
> in R I would like to have it as
>
> id  this is the first second row
> --------------------------------
> 1      1  1   1     1      0   1
> 2      1  1   1     0      1   1
>
> It would be simpler for me to do this transformation in R, as I guess the
> language is handier than SQL. The table in R has a few tens of thousands of
> columns and rows. I know how to read the data from the database, but
> I am just unsure whether a suitable transformation is available.
>
> Thank you
> Jan


