[R] remove Punctuation characters

Filipe Almeida milheiros at gmail.com
Wed May 10 10:42:09 CEST 2006


Thanks a lot!!

Filipe Almeida

Marc Schwartz (via MN) wrote:
> On Tue, 2006-05-09 at 16:50 +0100, Filipe Almeida wrote:
>   
>> Hi,
>>
>> I want to remove all punctuation characters in a string. I was trying to use
>> a regular expression, but it doesn't work.
>> Here is a sample of what I want:
>>
>> str <- 'ABD - remove de punct, and dot characters.'
>> str <- gsub('[:punct:]','',str)
>> str
>> "ABD remove de punct and dot characters"
>>
>> Is there any function that does this kind of thing?
>>
>> Thanks to all.
>>
>> Filipe Almeida
>>     
>
> You almost have it.  Just need to double the brackets:
>
> > str
> [1] "ABD - remove de punct, and dot characters."
>
> > gsub("[[:punct:]]", "", str)
> [1] "ABD  remove de punct and dot characters"
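A small sketch of why the doubled brackets matter, and how to get the single-spaced result asked for in the question. The variable names (txt, no_punct, clean, single) are only illustrative. With R's default regex engine, the single-bracket pattern is typically read as an ordinary bracket list of the characters ":", "p", "u", "n", "c" and "t", so it strips letters instead of punctuation; other engines (for example with perl = TRUE) may behave differently, so treat that part as an assumption. The whitespace-collapsing step is an addition, not part of the reply above.

```r
# Input string from the original question.
txt <- "ABD - remove de punct, and dot characters."

# Single brackets: typically treated as a plain list of the characters
# : p u n c t, so this does not target punctuation at all.
single <- gsub("[:punct:]", "", txt)
print(single)

# Double brackets: the outer [] delimits the bracket list,
# the inner [:punct:] names the POSIX character class.
no_punct <- gsub("[[:punct:]]", "", txt)
print(no_punct)   # "ABD  remove de punct and dot characters"

# Removing the "-" leaves two adjacent spaces; collapse each run of
# whitespace to one space to match the output the question asked for.
clean <- gsub("[[:space:]]+", " ", no_punct)
print(clean)      # "ABD remove de punct and dot characters"
```

The example uses txt rather than str as the variable name, since str is also the name of a commonly used R function.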
>
>
> Note the following in ?regex:
>
> For example, [[:alnum:]] means [0-9A-Za-z], except the latter depends
> upon the locale and the character encoding, whereas the former is
> independent of locale and character set. (Note that the brackets in
> these class names are part of the symbolic names, and must be included
> in addition to the brackets delimiting the bracket list.) Most
> metacharacters lose their special meaning inside lists. To include a
> literal ], place it first in the list. Similarly, to include a literal
> ^, place it anywhere but first. Finally, to include a literal -, place
> it first or last. (Only these and \ remain special inside character
> classes.)
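The placement rules quoted from ?regex can be tried out directly. This is a minimal sketch using a made-up sample string; the variable names are hypothetical, and the patterns rely on R's default regex engine.

```r
# Hypothetical sample string containing the three awkward characters.
x <- "a]b^c-d"

# A literal ] must come first in the list.
no_bracket <- gsub("[]]", "", x)
print(no_bracket)   # "ab^c-d"

# A literal ^ goes anywhere but first (first would negate the list).
# Here the list contains "b" and "^".
no_caret <- gsub("[b^]", "", x)
print(no_caret)     # "a]c-d"

# A literal - goes first or last; in a one-character list it is both.
no_hyphen <- gsub("[-]", "", x)
print(no_hyphen)    # "a]b^cd"

# All three rules at once: ] first, ^ in the middle, - last.
none <- gsub("[]^-]", "", x)
print(none)         # "abcd"
```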
>
> HTH,
>
> Marc Schwartz
>
