[R] Pointer to covariates?

Göran Broström gb at stat.umu.se
Thu Feb 21 09:37:09 CET 2002


On Wed, 20 Feb 2002, Gabor Grothendieck wrote:

> In the first line, use the dist function, found in library mva,
> to get the distance between each pair of rows.   From this
> calculate an incidence matrix for which element i,j is true if 
> row i in dat equals row j in dat (and false elsewhere).
> 
> In the second line, for each row calculate the indices of 
> the matching rows and take the minimum of those as the key.
> 
> incid <- as.matrix(dist(dat[,-1],method="max"))==0
> keys <- unlist(lapply(apply(incid,1,which),min))

Thank you very much! This is very fast, much faster than my attempts
so far, but it has two drawbacks:

1. It gives pointers to first occurrences in the _original_ data frame,
not to rows of the 'unique' version.

2. The first step results in a _huge_ matrix 'incid', too huge for my 
applications.
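
One way to avoid both drawbacks might be to skip the pairwise comparison
entirely: collapse each covariate row into a single string and look it up
among the collapsed rows of the 'unique' data frame with match(). The sketch
below uses the three-row example from the original question. The helper
row_id() is a hypothetical name introduced here for illustration, and the
approach assumes that pasting the covariates to character does not merge
distinct values (paste() keeps about 15 significant digits for numerics).

```r
# Example data from the original question: response in column 1,
# covariates in the remaining columns.
dat <- data.frame(y = c(1, 2, 3), x1 = c(1, 0, 1), x2 = c(0, 1, 0))

# Unique covariate combinations, in order of first occurrence.
# drop = FALSE keeps a data frame even with a single covariate.
covar <- unique(dat[, -1, drop = FALSE])
y <- dat[, 1]

# Hypothetical helper: one string per row. The "\r" separator is assumed
# not to occur in the data; unname() keeps column names from being
# mistaken for arguments of paste().
row_id <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# For each observation, the row number in 'covar' holding its covariates.
keys <- match(row_id(dat[, -1, drop = FALSE]), row_id(covar))
```

Memory use here grows with the number of rows rather than with its square,
since only one string per row is stored. For the example, covar[keys, ]
reproduces the covariate columns of dat row by row.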

However, this is a promising first attempt, and I will try to refine
the idea. Again, thanks!
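
A possible refinement that keeps the distance-based idea but addresses the
first drawback: renumber the first-occurrence indices so that they point into
the 'unique' version. This is only a sketch on made-up data (the five rows
below are an assumption, not from the thread). It still builds the full
n-by-n matrix, so it does nothing about the second drawback. In current
versions of R, dist() lives in the stats package, so library(mva) is no
longer needed. The name 'first' is introduced here for illustration.

```r
# Made-up example data: rows 1 and 3 share covariates, as do rows 2 and 4.
dat <- data.frame(y = 1:5, x1 = c(1, 0, 1, 0, 2), x2 = c(0, 1, 0, 1, 2))

# TRUE where row i and row j have identical covariates
# (maximum-norm distance of exactly zero).
incid <- as.matrix(dist(dat[, -1], method = "maximum")) == 0

# Index of the first matching row in the original data frame.
# Taking which(r)[1] inside apply() always yields one value per row,
# whatever the number of matches.
first <- apply(incid, 1, function(r) which(r)[1])

# Renumber: position of each first occurrence among the distinct ones.
keys <- match(first, unique(first))

# The distinct first occurrences, in order, give the 'unique' version.
covar <- dat[unique(first), -1, drop = FALSE]
```

Because unique() keeps values in order of first appearance, row k of 'covar'
corresponds to key k, which is the same ordering that unique() applied to the
covariate columns would give.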

Göran

> 
> --- Göran Broström <gb at stat.umu.se> wrote:
> >I have a dataframe 'dat' with one response and some covariates. Many 
> >observations  (rows), but only a few unique combinations of 
> >the covariates. Let's say that the response is in column 1, and 
> >the covariates in columns 2:k.
> >
> >I want to do 
> >
> >> covar <- unique.data.frame(dat[, 2:k])
> >> y <- dat[, 1]
> >> keys <- ??????
> >
> >where 'keys' should be a vector of length length(y) and contain the
> >row numbers in 'covar', where the response will find its covariates.
> >
> >Example:
> >
> >> dat
> >  y x1 x2
> >1 1  1  0
> >2 2  0  1
> >3 3  1  0
> >
> >> unique.data.frame(dat[, 2:3])
> >  x1 x2
> >1  1  0
> >2  0  1
> >
> >> keys
> >1  1
> >2  2
> >3  1
> >
> >But how do I get 'keys'?
> >-- 
> > Göran Broström                      tel: +46 90 786 5223
> > professor                           fax: +46 90 786 6614
> > Department of Statistics            http://www.stat.umu.se/egna/gb/
> > Umeå University
> > SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
> >
> >-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> >r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> >Send "info", "help", or "[un]subscribe"
> >(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 

-- 
 Göran Broström                      tel: +46 90 786 5223
 professor                           fax: +46 90 786 6614
 Department of Statistics            http://www.stat.umu.se/egna/gb/
 Umeå University
 SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se
