[R] puzzle using gsub (and encodings maybe)

Adrian Dragulescu adrian_d at eskimo.com
Wed Oct 14 19:41:50 CEST 2009



> charToRaw(x)
  [1] 4e 45 57 20 59 4f 52 4b 20 ad 4e 45 57 20 45 4e 47 4c 41 4e 44
> charToRaw(y)
  [1] 4e 45 57 20 59 4f 52 4b 20 2d 4e 45 57 20 45 4e 47 4c 41 4e 44
>

So they are different: the tenth byte is ad in x but 2d (an ordinary hyphen) in y.
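
To make the comparison above easy to repeat, here is a minimal sketch that rebuilds both strings from the bytes printed by charToRaw() and finds where they differ. The names bx, by and pos are only illustrative, and the sketch assumes both strings have the same number of bytes, as they do here.

```r
# Rebuild x from the bytes printed by charToRaw(x) above
x <- rawToChar(as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20,
                        0xad, 0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c,
                        0x41, 0x4e, 0x44)))
y <- "NEW YORK -NEW ENGLAND"   # typed in by hand, plain ASCII

bx <- charToRaw(x)
by <- charToRaw(y)

# Positions where the two byte sequences disagree
pos <- which(bx != by)
pos
bx[pos]   # the byte found in x at that position
by[pos]   # the byte found in y at that position
```

Byte ad is the soft hyphen in Latin-1 and Windows-1252, which would explain why it prints like a hyphen (or not at all) yet does not match the "-" in the pattern.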

Adrian

I use R 2.8.1 on WinXP
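
Given the bytes above, one possible workaround is to match the ad byte itself rather than a typed hyphen. This is only a sketch: it assumes the stray byte really is a single 0xad (not part of a multi-byte sequence), and the names pat and fixed_x are made up for illustration. The pattern is built from raw bytes and matched with fixed = TRUE and useBytes = TRUE so the result does not depend on the locale.

```r
# Rebuild x from the bytes printed by charToRaw(x) above
x <- rawToChar(as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20,
                        0xad, 0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c,
                        0x41, 0x4e, 0x44)))

# Pattern: a space (0x20) followed by the 0xad byte
pat <- rawToChar(as.raw(c(0x20, 0xad)))

# Literal, byte-wise replacement of " <ad>" with an ordinary hyphen
fixed_x <- gsub(pat, "-", x, fixed = TRUE, useBytes = TRUE)
charToRaw(fixed_x)   # should now contain 2d and no ad
```

If the file is known to be Latin-1, converting it with iconv() before the substitution may be a cleaner route, but that depends on how pdftotext wrote the text.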


On Wed, 14 Oct 2009, Duncan Murdoch wrote:

> On 10/14/2009 1:30 PM, Adrian Dragulescu wrote:
>> Hello,
>> 
>> Below is some output that shows my issue.
>> 
>> I have a variable x that I read from a file (more on this below)
>> 
>>> x
>> [1] "NEW YORK NEW ENGLAND"
>>> gsub(" -", "-", x)            # this does not work!
>> [1] "NEW YORK NEW ENGLAND"
>
> It looks as though it worked, presumably because something got lost in your 
> email.
>
> Could you post charToRaw(x) so we can see what's in x?
>
> Duncan Murdoch
>
>>> Encoding(x)                   # is x in a special encoding? no
>> [1] "unknown"
>>> y = "NEW YORK -NEW ENGLAND"   # I type in variable y
>>> gsub(" -", "-", y)            # and gsub works as expected
>> [1] "NEW YORK-NEW ENGLAND"
>>> 
>> 
>> I'm sure the problem has to do with the way I read the variable x.  But 
>> even if I change the encoding for x to ASCII, I still cannot do the sub.
>> I get x by reading a pdf file with pdftotext so you will not be able to 
>> replicate my issue.
>> 
>> Thanks for any suggestions,
>> Adrian
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>



