[R] how to have 'match' ignore no-matches

David Winsemius dwinsemius at comcast.net
Mon Oct 5 23:47:54 CEST 2009


On Oct 5, 2009, at 4:47 PM, Jill Hollenbach wrote:

>
> Let me clarify:
> I'm using this--
>
> dfnew <- sapply(df, function(df) lookuptable[match(df, lookuptable[, 1]), 2])

It seems a very bad idea to use the same name in your functions as  
that of the dataframe argument you might be passing. You end up with  
two different objects (in different environments) both with the name  
"df". The R interpreter of course can handle keeping those two objects  
separate, but my concern is for the poor "wetware" interpreters  
including you out here in R-help-land.
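
As a minimal sketch of that renaming (the sample data here are my own guess at what you have, assuming character codes; "column" is just an illustrative argument name, and "lookuptable" and "df" follow your post):

```r
# Assumed sample data: character codes, not integers or factors
lookuptable <- data.frame(code   = c("0101", "0201", "0301", "0401"),
                          allele = c("01:01", "02:01", "03:01", "04:01"),
                          stringsAsFactors = FALSE)
df <- data.frame(a = c("0101", "0201", "0101"),
                 b = c("0301", "0401", "0502"),
                 stringsAsFactors = FALSE)

# Same logic as the posted call, but the function argument is named
# "column" so it no longer shadows the data frame "df"
dfnew <- sapply(df, function(column)
  lookuptable[match(column, lookuptable[, 1]), 2])
```

The behaviour is unchanged (unmatched codes still come back as NA); only the naming is easier on the reader.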

>
>> lookup
> 0101   01:01
> 0201   02:01
> 0301   03:01
> 0401   04:01

These are not cut-and-pastable. (And I cannot figure out what data  
type you expect them to be. They are not displayed in a form that I  
would expect to see at the console from either a matrix or a  
dataframe.) Use the dput function to show an ASCII interpretable form  
that can be unambiguously assigned to a variable.

lookup <- read.table(textConnection("
  0101   01:01
  0201   02:01
  0301   03:01
  0401   04:01") )


 > lookup   #as a dataframe would be displayed
    V1    V2
1 101 01:01
2 201 02:01
3 301 03:01
4 401 04:01
 > str(lookup)
'data.frame':	4 obs. of  2 variables:
  $ V1: int  101 201 301 401
  $ V2: Factor w/ 4 levels "01:01","02:01",..: 1 2 3 4

(It is impossible to tell whether your first column is integer or  
character, or even whether you are thinking of these as columns at all,  
given how you later indicate you want your output.)

 > dput(lookup)
structure(list(V1 = c(101L, 201L, 301L, 401L), V2 =  
structure(1:4, .Label = c("01:01",
"02:01", "03:01", "04:01"), class = "factor")), .Names = c("V1",
"V2"), class = "data.frame", row.names = c(NA, -4L))

Easy and completely unambiguous to type "lookup <-" and then paste in  
the output of dput.
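
A small sketch of that round trip, assuming a character version of the table (the object names "lookup_chr", "txt" and "lookup_copy" are only illustrative). Instead of pasting by hand, it captures the dput text and evaluates it, which is what pasting at the console amounts to:

```r
# Assumed example object: a two-column character data frame
lookup_chr <- data.frame(V1 = c("0101", "0201"),
                         V2 = c("01:01", "02:01"),
                         stringsAsFactors = FALSE)

# dput() prints an R expression that rebuilds the object
dput(lookup_chr)

# Capture that text and evaluate it, as a reader would by pasting it
txt <- paste(capture.output(dput(lookup_chr)), collapse = "\n")
lookup_copy <- eval(parse(text = txt))

# The rebuilt object should match the original, column types included
all.equal(lookup_chr, lookup_copy)
```

Note that the leading zeros survive here only because the columns are character; that is exactly the kind of detail dput makes visible.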

>
>> df
> 0101   0301
> 0201   0401
> 0101   0502
>
>> dfnew
> 01:01   03:01
> 02:01   04:01
> 01:01   NA
>
> but what I want is:
>> dfnew2
> 01:01   03:01
> 02:01   04:01
> 01:01   0502
>
> thanks again,
> Jill
>
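
One possible way to get that result, offered as a sketch rather than a definitive answer: keep the original value wherever match() returns NA. This assumes both the lookup table and the data are character (not integer or factor), which I cannot confirm from what was posted; the names "translate", "dat", "code" and "allele" are mine.

```r
# Assumed data: character columns throughout
lookup <- data.frame(code   = c("0101", "0201", "0301", "0401"),
                     allele = c("01:01", "02:01", "03:01", "04:01"),
                     stringsAsFactors = FALSE)
dat <- data.frame(a = c("0101", "0201", "0101"),
                  b = c("0301", "0401", "0502"),
                  stringsAsFactors = FALSE)

# Translate one column; fall back to the original value when no match
translate <- function(x, table) {
  idx <- match(x, table$code)               # NA where x is not in the table
  ifelse(is.na(idx), x, table$allele[idx])  # keep x itself for the NAs
}

# Apply to every column and reassemble as a data frame
dat_new <- as.data.frame(lapply(dat, translate, table = lookup),
                         stringsAsFactors = FALSE)
dat_new
```

If the columns are factors or integers, the fallback would mix types, so they would need converting with as.character() first.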
>
>
>
> Jill Hollenbach wrote:
>>
>> Hi all,
>> I think this is a very basic question, but I'm new to this so  
>> please bear
>> with me.
>>
>> I'm using match to translate elements of a data frame using a lookup
>> table. If the content of a particular cell is not found in the lookup
>> table, the function returns NA. I'm wondering how I can just ignore  
>> those
>> cells, and return the original contents if no match is found in the  
>> lookup
>> table.
>>
>> Many thanks in advance, this site has been extremely helpful for me  
>> so
>> far,
>> Jill
>>
>> Jill Hollenbach, PhD, MPH
>>    Assistant Staff Scientist
>>    Center for Genetics
>>    Children's Hospital Oakland Research Institute
>>    jhollenbach at chori.org
>>
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



