[R] Removing duplicated rows within a matrix, with missing data as wildcards

hpages at fhcrc.org
Fri Mar 9 09:13:40 CET 2007


Quoting Petr Pikal <petr.pikal at precheza.cz>:

> Hi
> 
> it's a bit tricky, but
> 
> dup<-apply(x, 2, duplicated) # which are duplicated
> isna<-apply(x, 2, is.na) # which are NA
> check<-dup|isna # which are either
> 
> and here is your result
> 
> x[rowSums(check)!=3,]
>      [,1] [,2] [,3]
> [1,]    1    3    2
> [2,]    2    1    3
> [3,]    3    2   NA

Hi,

The above doesn't work, and x doesn't even need to contain NAs to show it:

  > x <- matrix(c(2,2,1,3,2,3), ncol=2, byrow=TRUE)
  > x
       [,1] [,2]
  [1,]    2    2
  [2,]    1    3
  [3,]    2    3

  > dup <- apply(x, 2, duplicated)
  > x[rowSums(dup)!=2 ,]
       [,1] [,2]
  [1,]    2    2
  [2,]    1    3

Look at 'dup':

  > dup
        [,1]  [,2]
  [1,] FALSE FALSE
  [2,] FALSE FALSE
  [3,]  TRUE  TRUE

Yes, each element in the last row duplicates an earlier value in its own
column, but this doesn't mean that the row as a whole is a duplicate.
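
What seems to be needed instead is a comparison of whole rows in which an
NA matches anything. Below is a minimal sketch of that idea, not a definitive
solution: rows_match() and wildcard_duplicated() are names made up for
illustration, and the sketch assumes that a row counts as a duplicate when it
matches an earlier row that was itself kept. Since wildcard matching isn't
transitive, the result can depend on the order of the rows.

```r
# Hypothetical helper (not from the thread): TRUE if rows a and b agree in
# every column where both are non-missing, i.e. an NA matches anything.
rows_match <- function(a, b) {
  all(is.na(a) | is.na(b) | a == b)
}

# Hypothetical helper: flag each row that matches an earlier, kept row.
wildcard_duplicated <- function(x) {
  dup <- logical(nrow(x))
  for (i in seq_len(nrow(x))) {
    # only compare against earlier rows that were not themselves dropped
    for (j in which(!dup[seq_len(i - 1)])) {
      if (rows_match(x[i, ], x[j, ])) {
        dup[i] <- TRUE
        break
      }
    }
  }
  dup
}

# The matrix from the original question
x <- matrix(1:3, 5, 3)
x[4, 2] <- NA
x[3, 3] <- NA

# Rows 4 and 5 are flagged, so rows 1, 2 and 3 remain
x[!wildcard_duplicated(x), , drop = FALSE]
```

On the matrix from the original question this keeps rows 1, 2 and 3, and on
the 3 x 2 matrix above it keeps all three rows, since no row there matches
another one as a whole.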

Cheers,
H.


> 
> 
> Regards
> Petr
> 
> 
> 
> 
> On 8 Mar 2007 at 10:14, stacey thompson wrote:
> 
> Date sent:      	Thu, 8 Mar 2007 10:14:37 -0500
> From:           	"stacey thompson" <stacey.lee.thompson at gmail.com>
> To:             	r-help at stat.math.ethz.ch
> Subject:        	[R] Removing duplicated rows within a matrix,
> 	with missing data as wildcards
> 
> > I'd like to remove duplicated rows within a matrix, with missing data
> > being treated as wildcards.
> > 
> > For example
> > 
> > > x <- matrix((1:3), 5, 3)
> > > x[4,2] = NA
> > > x[3,3] = NA
> > > x
> > 
> >      [,1] [,2] [,3]
> > [1,]    1    3    2
> > [2,]    2    1    3
> > [3,]    3    2   NA
> > [4,]    1   NA    2
> > [5,]    2    1    3
> > 
> > I would like to obtain
> > 
> >       [,1] [,2] [,3]
> > [1,]    1    3    2
> > [2,]    2    1    3
> > [3,]    3    2   NA
> > 
> > From the R-help archives, I learned about unique(x) and
> > duplicated(x).
> > However, unique(x) returns
> > 
> > > unique(x)
> > 
> >      [,1] [,2] [,3]
> > [1,]    1    3    2
> > [2,]    2    1    3
> > [3,]    3    2   NA
> > [4,]    1   NA    2
> > 
> > and duplicated(x) gives
> > 
> > > duplicated(x)
> > 
> > [1] FALSE FALSE FALSE FALSE  TRUE
> > 
> > I have tried various na.action settings, but with unique(x) I get
> > errors at best.
> > 
> > e.g.
> > > unique(x, na.omit(x))
> > 
> > Error: argument 'incomparables != FALSE' is not used (yet)
> > 
> > How might I tackle this?
> > 
> > Thanks,
> > 
> > -stacey
> > 
> > -- 
> > -stacey lee thompson-
> > Stagiaire post-doctorale
> > Institut de recherche en biologie végétale
> > Université de Montréal
> > 4101 Sherbrooke Est
> > Montréal, Québec H1X 2B2 Canada
> > stacey.thompson at umontreal.ca
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html and provide commented,
> > minimal, self-contained, reproducible code.
> 
> Petr Pikal
> petr.pikal at precheza.cz
> 


