[R] How to substitute special characters within a data frame?

Yingfu Xie Yingfu.Xie at sekon.slu.se
Fri Aug 15 15:57:05 CEST 2008


Thanks to Prof. Ripley and Henrique: gsub does the job. In addition, a call such as gsub("\\\\345", "aa", x), where x is the relevant column of the data frame, replaces all such characters in that column.
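
For illustration, a minimal sketch of applying the replacement to a whole column. The data frame `df`, its column `name`, and the lookup table `repl` are hypothetical stand-ins for the imported data. `fixed = TRUE` is used so that the pattern needs only one level of backslash escaping; it is an alternative to the regular-expression form, not the only way to write it.

```r
# Hypothetical data frame standing in for the imported data.
# In R source, "\\345" is the four characters \345 (a literal backslash).
df <- data.frame(name = c("H\\345rkan", "Bj\\366rn"),
                 n = 1:2,
                 stringsAsFactors = FALSE)

# Assumed lookup table: literal escape -> ASCII replacement.
repl <- c("\\345" = "aa", "\\366" = "o")

for (esc in names(repl)) {
  # fixed = TRUE matches the pattern as a plain string, not as a regex
  df$name <- gsub(esc, repl[[esc]], df$name, fixed = TRUE)
}

print(df$name)
```

If the column was imported as a factor, gsub returns a character vector, so the column becomes character after the assignment.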

By the way, I am using Windows Vista and R 2.6.1, in Sweden. As for the \\345 instead of \345: because of problems such as an incomplete final line and missing data, I first imported the data into S-PLUS using its import utility, then dumped it out and restored it in R.

Thank you,
Yingfu

-----Original Message-----
From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
Sent: Friday, August 15, 2008 12:44 PM
To: Yingfu Xie
Cc: r-help at r-project.org
Subject: Re: [R] How to substitute special characters within a data frame?

You've not told us the 'at a minimum' information requested in the posting
guide.  What OS?  What locale? And how did you 'import'?

But here's a guess.  If you change \\345 to \345, it should render
correctly in a Latin-1 locale:

> "H\345rkan"
[1] "Hårkan"

If this is a UTF-8 locale, convert it

> iconv("H\345rkan", "latin1")
[1] "Hårkan"

and if you have an unsuitable locale, e.g. a Chinese one

> iconv("H\345rkan", "latin1", "ASCII//TRANSLIT")
[1] "Harkan"

or

> gsub("\\\\345", "aa", "H\\345rkan")
[1] "Haarkan"
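
The first suggestion above, changing the literal \\345 into the character itself, can also be done programmatically. A rough sketch, assuming every escape is a backslash followed by three octal digits in the Latin-1 range (000 to 377). The helper name `unescape_octal` is made up for this example, and `regmatches<-` needs a considerably newer R than 2.6.1.

```r
# Hypothetical helper: turn literal "\ooo" octal escapes into characters.
unescape_octal <- function(x) {
  # Locate a literal backslash followed by three octal digits (000-377).
  m <- gregexpr("\\\\[0-3][0-7]{2}", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(esc) {
    # Drop the backslash and read the digits as a base-8 number.
    codes <- strtoi(substring(esc, 2), base = 8L)
    # Latin-1 codes coincide with Unicode code points 0-255,
    # so each code can be turned into a UTF-8 character directly.
    vapply(codes, intToUtf8, character(1))
  })
  x
}

fixed <- unescape_octal(c("H\\345rkan", "Bj\\366rn", "plain"))
print(fixed)
```

The result is UTF-8 encoded; in a non-UTF-8 locale it may still need iconv, as in the examples above, before it displays correctly.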


On Fri, 15 Aug 2008, Yingfu Xie wrote:

> Hello all,
>
> I have a data frame in R, imported from an excel file in Swedish. The
> original file contains several columns that have special characters,
> such as \?{a}, \?{o}, and so on. After import such special characters
> are represented in the data frame by "\\345", "\\366" etc (don't ask me
> why). For example, a word "H?rkan" becomes ''H\\345rkan".

That's odd: the quotes do not match.

We do need to ask you 'why', as we have nothing reproducible here.

> Now my question is if it is possible to substitute those "H\\345rkan" by
> "Haarkan" or simply "Harkan" in R, ideally by finding those "\\345" and
> then replacing.
>
> Thanks in advance,
> Yingfu
>
>       [[alternative HTML version deleted]]

Please don't (as the posting guide asked).  Properly encoded plain text
has a chance of working.


--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


