[BioC] remove NA from named character vector

Iain Gallagher iaingallagher at btopenworld.com
Fri Jul 22 13:28:39 CEST 2011


Hi Axel

I'm sure I knew that! Leaky brain!

Thanks

i
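
For anyone finding this thread in the archive, here is a minimal sketch of the point made in the quoted reply below. It uses a toy named character vector; the names a, b, c and the variables cmp, cleaned and omitted are made up for illustration and are not part of the original data.

```r
# Toy stand-in for 'egs': a named character vector with one NA.
egs <- c(a = "617660", b = NA, c = "100138951")

# Comparing with NA never yields TRUE or FALSE, only NA,
# so which() finds nothing to report.
cmp <- egs == NA
print(cmp)
print(which(cmp))

# is.na() is the test that actually detects missing values.
print(which(is.na(egs)))

# Two ways to drop the NA entries:
cleaned <- egs[!is.na(egs)]   # plain subsetting, names are kept
omitted <- na.omit(egs)       # same values, plus an "na.action" attribute
print(cleaned)
print(omitted)
```

Subsetting with !is.na() returns a plain named vector, whereas na.omit() also records which positions were dropped in an attribute, which may or may not matter for later steps.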

--- On Fri, 22/7/11, axel.klenk at actelion.com <axel.klenk at actelion.com> wrote:

> From: axel.klenk at actelion.com <axel.klenk at actelion.com>
> Subject: Re: [BioC] remove NA from named character vector
> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> Cc: "bioconductor" <bioconductor at stat.math.ethz.ch>, bioconductor-bounces at r-project.org
> Date: Friday, 22 July, 2011, 12:11
> Hi Iain,
> 
> you cannot test for NA using the == operator, because any comparison
> with NA is itself NA; you'll have to use is.na(), e.g.
> 
> which(is.na(egs))
> 
> or, if you just want to get rid of them:
> 
> na.omit(egs)
> 
> HTH,
> 
>  - axel
> 
> 
> Axel Klenk
> Research Informatician
> Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil / Switzerland
> 
> 
> 
> 
> From: Iain Gallagher <iaingallagher at btopenworld.com>
> To: bioconductor <bioconductor at stat.math.ethz.ch>
> Date: 22.07.2011 13:03
> Subject: [BioC] remove NA from named character vector
> Sent by: bioconductor-bounces at r-project.org
> 
> 
> 
> Hi List
> 
> This is likely a trivial problem but it's annoying me. I am mapping from
> Bos taurus ensembl ids to symbols. I can do this in biomaRt but use of the
> org.Bt.eg.db package means I'm not tied to an internet connection.
> 
> A toy example:
> 
> library(org.Bt.eg.db)
> ens <- c('ENSBTAG00000004218', 'ENSBTAG00000004270',
>          'ENSBTAG00000004578', 'ENSBTAG00000004608')
> egs <- unlist(mget(ens, revmap(org.Bt.egENSEMBL), ifnotfound=NA))
> 
> egs
> 
> ENSBTAG00000004218 ENSBTAG00000004270 ENSBTAG00000004578 ENSBTAG00000004608
>           "617660"           "407106"                 NA        "100138951"
> 
> 
> # a named character vector with one NA
> 
> #now get symbols
> syms <- unlist(mget(egs, org.Bt.egSYMBOL, ifnotfound=NA))
> 
> #throws an error - fair enough - need to drop the NA
> 
> which(egs == NA)
> 
> #gives named integer(0) - hmm
> class(egs)
> #gives [1] "character" - so I'm quite confused now.
> 
> NA %in% egs
> #gives [1] TRUE
> 
> 
> How do I identify which entries in 'egs' are NA so I can remove them? It's
> trivial here but the dataset I'm working with is in the thousands.
> 
> Thanks
> 
> iain
> 
> > sessionInfo()
> R version 2.13.1 (2011-07-08)
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> locale:
>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
>  [7] LC_PAPER=en_GB.utf8       LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
> [1] org.Bt.eg.db_2.5.0   RSQLite_0.9-4        DBI_0.2-5
> [4] AnnotationDbi_1.14.1 Biobase_2.10.0
> 
>
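
Putting the question and the answer together, the two-step lookup could be sketched roughly as follows. To keep the example runnable without org.Bt.eg.db installed, a plain named vector (symbol_map, with invented symbols GENE_A to GENE_C) stands in for the org.Bt.egSYMBOL map; those names and values are assumptions for illustration only. With the real package, the filtered vector would presumably be passed to mget() in the same way as in the original post.

```r
# Entrez IDs keyed by Ensembl ID, as in the toy example from the question.
egs <- c(ENSBTAG00000004218 = "617660",
         ENSBTAG00000004270 = "407106",
         ENSBTAG00000004578 = NA,
         ENSBTAG00000004608 = "100138951")

# Hypothetical stand-in for the Entrez-to-symbol map (symbols are invented).
symbol_map <- c("617660"    = "GENE_A",
                "407106"    = "GENE_B",
                "100138951" = "GENE_C")

# Drop the NA keys before the second lookup.
keep   <- !is.na(egs)
egs_ok <- egs[keep]

# Look up symbols for the remaining keys, labelled by Ensembl ID.
syms <- setNames(unname(symbol_map[egs_ok]), names(egs_ok))
print(syms)

# Optionally keep the result aligned with the original input,
# leaving NA where no Entrez ID was found.
all_syms <- setNames(rep(NA_character_, length(egs)), names(egs))
all_syms[keep] <- syms
print(all_syms)
```

Keeping the logical index (keep) around makes it easy to map the results back onto the full list of input IDs when the dataset runs to thousands of entries.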


