[R] Replace Text but not from within a word
jdnewmil at dcn.davis.ca.us
Tue Feb 28 15:36:59 CET 2017
For tasks like this, you will probably want to import the data as character data rather than as a factor, e.g.
dat <- read.csv( "myfile.csv", header=FALSE, as.is=TRUE )
You can check what you have with the str() function.
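As a minimal sketch of the point above (the inline data is a hypothetical stand-in for "myfile.csv", using the company names quoted further down), this shows what str() reports for a character import and how a column that was already read as a factor can be converted:

```r
# Hypothetical stand-in for the contents of "myfile.csv": one name per line, no header.
csv_lines <- c("BOEING CO", "ENGMANTAYLOR CO", "SAGINAW COUNTY INC")

# as.is = TRUE keeps strings as character vectors instead of converting them to factors.
dat <- read.csv(text = csv_lines, header = FALSE, as.is = TRUE)
str(dat)  # V1 should be listed as chr, not Factor

# If the column has already been imported as a factor, convert it explicitly.
dat_factor <- read.csv(text = csv_lines, header = FALSE, stringsAsFactors = TRUE)
dat_factor$V1 <- as.character(dat_factor$V1)
is.character(dat_factor$V1)
```

Working on character data means gsub() returns plain strings directly, with no factor levels to carry along.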
On February 28, 2017 5:19:40 AM PST, Marc Schwartz <marc_schwartz at me.com> wrote:
>> On Feb 28, 2017, at 3:38 AM, Harshal Athawale
><pgcim15.harshal at spjimr.org> wrote:
>> I am new to R.
>> I have a file. This file contains the names of companies.
>> 'data.frame': 494 obs. of 1 variable:
>> $ V1: Factor w/ 470 levels "3-d engineering corp",..: 293 134 339 359
>> 399 122 447 398 384 ...
>> Problem: I would like to remove "CO" (as it is the most frequent). I
>> would like "CO" to be removed from BOEING CO --> BOEING but not from
>> *CO*UNTY INC.
>>> text = c("BOEING CO","ENGMANTAYLOR CO","SAGINAW COUNTY INC")
>>> gsub(x = text, pattern = "CO", replacement = "")
>> [1] "BOEING "          "ENGMANTAYLOR "    "SAGINAW UNTY INC"
>> Thanks in advance.
>> - Sam
>See ?regex and ?grep for some details and examples on how to construct
>the expression used for matching, as well as some of the references
>In this case, you want to use something along the lines of:
>> gsub(" CO$", "", text)
> "BOEING" "ENGMANTAYLOR" "SAGINAW COUNTY INC"
>where the "CO" is preceded by a space and followed by the "$", which is
>a special character that indicates the end of the string to be matched.
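The anchored pattern above only removes "CO" at the very end of a name. As a hedged alternative that is not from the thread, a word-boundary pattern removes "CO" wherever it stands as a whole word while still leaving "COUNTY" alone. The fourth name below is made up for illustration, and perl = TRUE is passed only to make the regex flavour explicit:

```r
# The first three names are from the question; "CO OP BANK" is a hypothetical extra case.
text <- c("BOEING CO", "ENGMANTAYLOR CO", "SAGINAW COUNTY INC", "CO OP BANK")

# Anchored pattern from the reply: removes " CO" only at the end of the string.
trailing <- gsub(" CO$", "", text)

# \\b matches a word boundary, so "CO" is removed only when it is a whole word.
cleaned <- gsub("\\bCO\\b", "", text, perl = TRUE)

# Collapse repeated spaces and trim the ends left behind by the removal.
cleaned <- trimws(gsub(" +", " ", cleaned))
print(cleaned)
# [1] "BOEING"             "ENGMANTAYLOR"       "SAGINAW COUNTY INC" "OP BANK"
```

The str() output in the question suggests the real names are lower case, in which case ignore.case = TRUE would probably be needed in the gsub() calls.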