[R] text matching and substitution

baptiste auguie ba208 at exeter.ac.uk
Sat Mar 28 18:43:35 CET 2009


yet another attempt,

> colours <- as.character(paste(letters,colours(),"stuff",LETTERS))
> target <- c("red","blue","green","gray")

> library(reshape)  # provides melt()
> matches <- melt(sapply(target, grep, x=colours, simplify=FALSE))
> colours[matches$value] <- as.character(matches$L1)


(probably a worse idea than a straight for loop, though)
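
A further option, sketched here under stated assumptions rather than taken from the thread: join the targets into one alternation pattern, locate the first match in each string with regexpr(), and label everything without a match "other". The helper name simplify_labels and its default argument are hypothetical; regexpr(), regmatches() and match() are base R. Note that this picks the target occurring leftmost in each string, whereas the for loop keeps the last entry of target that matches, so the two can disagree on strings containing more than one target.

```r
# Hypothetical helper, not from the thread: label each string with the
# first target it contains (ignoring case), or `default` if none matches.
# Assumes the targets contain no regular-expression metacharacters.
simplify_labels <- function(x, target, default = "other") {
  pattern <- paste(target, collapse = "|")       # e.g. "red|blue|green|gray"
  m <- regexpr(pattern, x, ignore.case = TRUE)   # -1 where nothing matches
  hit <- !is.na(m) & m > 0
  found <- regmatches(x, m)                      # matched substrings only
  out <- rep(default, length(x))
  # Map e.g. "RED" back to the spelling used in `target`
  out[hit] <- target[match(tolower(found), tolower(target))]
  out
}

x <- c("xxxxx xx xx xxxxx  red xx xxx xx",
       "xx xxx xxx xx  blue xx xx xx xx x",
       "x xx xxxxxxxx xx xx xx xxxx red",
       "red xx xx xx xx xx",
       "xx xx xx xx xx xx",
       "xx x x x x xxxx")
target <- c("red", "blue", "green", "gray")
simplify_labels(x, target)
# [1] "red"   "blue"  "red"   "red"   "other" "other"
```

This makes one pass over the data instead of one per target; whether that is actually faster on a large dataset would need timing.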

baptiste

On 28 Mar 2009, at 17:08, Stephan Kolassa wrote:

> Hi Simeon,
>
> I played around a little with Vectorize and mapply, but I couldn't make
> it work :-( So, my best guess would be a simple loop like this:
>
> result <- as.character(paste(letters,colours(),"stuff",LETTERS))
> target <- c("red","blue","green","gray")
> for ( new.color in target ) { result[grep(new.color,result)] <- new.color }
>
> Best of luck,
> Stephan
>
>
> simeon duckworth wrote:
>> stephan
>>
>> sorry for not being clear - but that's exactly what i want.
>>
>> i'd like to replace every complex string that contains "red" with just
>> "red", and then so on with "blue", "yellow" etc
>>
>> my data is of the form
>>
>> "xxxxx xx xx xxxxx  red xx xxx xx"
>> "xx xxx xxx xx  blue xx xx xx xx x"
>> "x xx xxxxxxxx xx xx xx xxxx red"
>> "red xx xx xx xx xx"
>> "xx xx xx xx xx xx"
>> "xx x x x x xxxx"
>>
>> which i'd like to replace with
>> "red"
>> "blue"
>> "red"
>> "red"
>> "other"
>> "other"
>>
>> thanks
>>
>>
>> On Sat, Mar 28, 2009 at 2:38 PM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
>>
>>> Hi Simeon,
>>>
>>> I'm slightly unclear on what exactly you are trying to achieve... Are you
>>> trying to replace every entry of colours which *contains* "red" by "red",
>>> dropping the rest of the entry? And same with "blue"?
>>>
>>> A short example "before & after" would be helpful...
>>>
>>> Best,
>>> Stephan
>>>
>>>
>>> simeon duckworth wrote:
>>>
>>>> thanks stephan.  i'd been trying to make gsub work, but couldn't make it
>>>> replace the whole expression.  so i'd resorted to trying to loop with grep,
>>>> but with two problems.  firstly, i can't seem to make the loop 'remember'
>>>> the substitutions it makes (see below).  secondly, it feels like this is a
>>>> really inefficient way of doing something quite simple anyhow.
>>>>
>>>> colours <- as.character(paste(letters,colours(),"stuff",LETTERS))
>>>> target <- c("red","blue","green","gray")
>>>> new.colour <-colours
>>>> for (i in length(target)) {
>>>>   x <- target[i]
>>>>   new.colour[grep((x),new.colour)] <- x
>>>>   return(new.colour)
>>>>   }
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Mar 28, 2009 at 9:45 AM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
>>>>
>>>>> Hi Simeon,
>>>>>
>>>>> ?gsub
>>>>>
>>>>> HTH,
>>>>> Stephan
>>>>>
>>>>> simeon duckworth wrote:
>>>>>
>>>>>> I am trying to simplify a text variable by matching and replacing it with
>>>>>> a string in another vector
>>>>>>
>>>>>> so for example in
>>>>>> colours <- paste(letters,colours(),"stuff",LETTERS)
>>>>>>
>>>>>> find and replace with ("red","blue","green","gray","yellow","other") -
>>>>>> irrespective of case
>>>>>>
>>>>>> it's a large dataset, so i'd like to be able to do this as efficiently as
>>>>>> possible.
>>>>>>
>>>>>> thanks for any help
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible  
>>>>>> code.

_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag



