[R] ReadLines question

Roger Bivand Roger.Bivand at nhh.no
Sat Oct 21 22:03:31 CEST 2006


On Sat, 21 Oct 2006, Jonathan Greenberg wrote:

> That looks to me like an infinity sign (I have no idea why that is part of
> the header of this file, but it is there).  How do I modify the encoding to
> read this in? 

The problem is the degree sign. Under linux:

$ file tmp/Marlette_lake_snotel.csv
tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators

so the conversion to multibyte is probably happening on your reading 
platform. Reading the file into R 2.4.0 on Windows with a Norwegian 1252 
setting (see Sys.getlocale()), I see the degree sign.
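
As a rough sketch of what declaring the file's encoding can look like, the 
lines below build a small stand-in file (a temporary file, not the real 
Marlette_lake_snotel.csv) containing the single byte 0xB0, which is the 
degree sign in ISO-8859-1, and convert what is read to UTF-8 with iconv(). 
That the file is Latin-1 is an assumption based on the `file` output above.

```r
# Stand-in for the header line; 0xB0 is the degree sign in ISO-8859-1.
tmp <- tempfile(fileext = ".csv")
bytes <- c(charToRaw('#,"Time, GMT-07:00","Temp, '),
           as.raw(0xB0),
           charToRaw('F"\n'))
writeBin(bytes, tmp)

# Read the line as-is, then state explicitly which encoding it came from.
line <- readLines(tmp, n = 1)
converted <- iconv(line, from = "latin1", to = "UTF-8")

Sys.getlocale("LC_CTYPE")   # the locale R is reading under
Encoding(converted)         # the converted string is marked as UTF-8
```

How the converted string prints still depends on the locale of the session.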

If you know the column names anyway, skip the header lines and 
insert the names yourself. Alternatively, filter the non-ASCII character 
out before reading; it looks predictably like a degree sign. In any case, 
the character is not very practical in a column name.
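
Both workarounds can be sketched roughly as follows. The file is a 
cut-down stand-in written to a temporary path, the column names are made 
up for illustration, and skip = 2 assumes the header occupies exactly two 
physical lines, so adjust it to the real file.

```r
# Stand-in file: a title line, a header line with byte 0xB0, two data rows.
tmp <- tempfile(fileext = ".csv")
bytes <- c(
  charToRaw('Plot Title: tahoe met validation ,,\n#,"Time, GMT-07:00","Temp, '),
  as.raw(0xB0),
  charToRaw('F"\n34,10/1/2005 0:00,49.937\n35,10/1/2005 0:30,47.266\n')
)
writeBin(bytes, tmp)

# Option 1: skip the header lines and supply the column names yourself.
met <- read.csv(tmp, skip = 2, header = FALSE,
                col.names = c("obs", "time_gmt_minus7", "temp_f"))

# Option 2: read the raw bytes, keep only 7-bit ASCII, then split into lines.
raw_bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
ascii_text <- rawToChar(raw_bytes[as.integer(raw_bytes) < 128L])
header_line <- strsplit(ascii_text, "\n", fixed = TRUE)[[1]][2]
```

Option 2 simply drops the offending byte, so the header field becomes 
"Temp, F".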

Roger

> 
> --j
> 
> 
> On 10/21/06 4:33 AM, "Roger Bivand" <Roger.Bivand at nhh.no> wrote:
> 
> > On Sat, 21 Oct 2006, Jonathan Greenberg wrote:
> > 
> >> I'm getting the following error:
> >> 
> >>> headerinfo=readLines(met_station_file,n=8)
> >>> headerinfo
> >> [1] "Plot Title: tahoe met validation ,,,,,,,"
> >> [2]Error: invalid multibyte string
> >             ^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> >> 
> >> met_station_file's first 8 lines are as follows:
> >> 
> >> Plot Title: tahoe met validation ,,,,,,,
> >> #,"Time, GMT-07:00","Temp, ƒF",Coupler Attached,Host Connected,Coupler
> >                              ^^^
> > 
> > or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a
> > multibyte string. Is there a mismatch between the encoding (see ?locales)
> > of the file and the machine into which you are reading?
> > 
> >> Detached,Stopped,End Of File
> >> 34,10/1/2005 0:00,49.937,,,,,
> >> 35,10/1/2005 0:30,47.266,,,,,
> >> 36,10/1/2005 1:00,47.446,,,,,
> >> 37,10/1/2005 1:30,47.982,,,,,
> >> 38,10/1/2005 2:00,48.517,,,,,
> >> 39,10/1/2005 2:30,49.228,,,,,
> >> 
> >> Why am I getting this error?  Are those quotation marks causing the hiccup?
> >> If so, how do I get around this programmatically?
> >> 
> >> --j
> >> 
> >> 
> 
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


