[R] vector operation using regexpr?

markleeds at verizon.net markleeds at verizon.net
Thu Aug 21 07:09:52 CEST 2008


Hi John: I didn't realize that was your problem. You can make it work
for any number of rows by wrapping it in lapply, as below. I'm sorry
for the misunderstanding. I'll send this to the list as well, since I
guess my last solution was kind of bad now that I understand what you
want.

DF <- data.frame(col1=c("L","T"), col2=c("MAIL","KITE"), col3=c("PLOY","SIX"),
                 stringsAsFactors=FALSE)
print(DF)

newcol <- lapply(seq_len(nrow(DF)), function(.row) {
   # position of the col1 character within col2, or -1 if it is absent
   index <- regexpr(DF[.row,1], DF[.row,2])
   # NULL when there is no match, otherwise the col3 character at that position
   if ( index == -1 ) NULL else substr(DF[.row,3], index, index)
})

print(newcol)

# The line below is for when you only want to keep the matches that
# were found and drop the NULLs
newcol <- newcol[!sapply(newcol,is.null)]
print(newcol)
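
For what it's worth, here is a sketch of a more vectorized variant. It is not from the earlier messages in this thread: the extra third row (where the letter is missing), the newcol column name, and the use of fixed = TRUE and stringsAsFactors = FALSE are my own assumptions. regexpr only takes a single pattern, so mapply is used to pair each col1 entry with the matching col2 entry. After that, substr is vectorized over its start and stop arguments and does the rest in one call.

```r
# Sketch only: the third row is a hypothetical case where the letter is absent.
DF <- data.frame(col1 = c("L", "T", "Z"),
                 col2 = c("MAIL", "KITE", "MAIL"),
                 col3 = c("PLOY", "SIX", "PLOY"),
                 stringsAsFactors = FALSE)

# Pair each pattern in col1 with the corresponding string in col2.
# fixed = TRUE treats the pattern as a literal character, not a regex.
# as.integer() drops the match.length attribute that regexpr() attaches.
idx <- mapply(function(p, s) as.integer(regexpr(p, s, fixed = TRUE)),
              DF$col1, DF$col2, USE.NAMES = FALSE)

# regexpr() reports "not found" as -1; turn that into NA so that
# substr() returns NA for those rows instead of an empty string.
idx[idx == -1] <- NA

# substr() is vectorized over x, start and stop.
DF$newcol <- substr(DF$col3, idx, idx)
print(DF$newcol)   # "Y" "X" NA
```

Rows with no match come back as NA rather than NULL, so the result stays the same length as the data frame and can be stored as a column.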





On Thu, Aug 21, 2008 at 12:25 AM, John Christie wrote:

> The problem with the grep family of commands is that they either test 
> a string against a list of strings or test a list of strings against a 
> string.  But they cannot do both simultaneously.  Your example only 
> works if there is only one row.
>
> On Aug 21, 2008, at 12:30 AM, markleeds at verizon.net wrote:
>
>> John: The code below handles the case where L is not there, but it's
>> too ugly, so I'm not even going to send this to the list. There
>> should be a better way of doing it, but I'm still learning too (I
>> guess one can consider me a senior newbie!), so I don't know it.
>> Good luck.
>>
>> DF <- data.frame(col1="Y",col2="MAIL",col3="PLOY")
>> result <- NULL
>> if ( regexpr(DF$col1,DF$col2) != -1 ) result <-
>>    substr(DF$col3,regexpr(DF$col1,DF$col2),regexpr(DF$col1,DF$col2))
>> print(result)
>>
>>
>>
>> On Wed, Aug 20, 2008 at 11:21 PM, markleeds at verizon.net wrote:
>>
>>> Hi: I think you want regexpr, so the code below does what you want,
>>> but it doesn't handle the case when L isn't in the second column.
>>> I'm still trying to figure that out, but don't count on it.
>>> Hopefully someone else will reply with that piece.
>>>
>>> DF <- data.frame(col1="L",col2="MAIL",col3="PLOY")
>>> print(DF)
>>> index <- regexpr(DF$col1,DF$col2)
>>> result <- substr(DF$col3,index,index)
>>>
>>>
>>>
>>> On Wed, Aug 20, 2008 at  3:26 PM, John Christie wrote:
>>>
>>>> Hi,
>>>>
>>>> Here's my problem... I have a data frame with three columns
>>>> containing strings.  The first column is a single character. I
>>>> want to get the index of that character in the second column and
>>>> use it to extract the item from the third column.  I can do this
>>>> using a scalar method, but I'm not finding a vector method.  An
>>>> example is below.
>>>>
>>>> col1      col2      col3
>>>> 'L'         'MAIL '   'PLOY'
>>>>
>>>> What I want to do with the above is find the index of col1 in col2 
>>>> (4) and then use it to extract the character from col3 ('Y').  I 
>>>> could do the last part if I could get the index in a vector 
>>>> fashion.
>>>>
>>>> So, the shorter question is, how do I get the index of the letter 
>>>> in col1 as it is found in col2?
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>


