[R] cannot read iso639 table

Sam Steingold sds at gnu.org
Thu Sep 13 22:17:47 CEST 2012


> * William Dunlap <jqhaync at gvopb.pbz> [2012-09-13 19:50:21 +0000]:
>
> On Windows with R-2.15.1 in a 1252 locale, I had to read (and toss) out
> the initial 3 bytes (the byte-order mark?) to make things work:
>
>   > socket <- url("http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt",open="r",encoding="utf-8")
>   > readChar(socket, nchars=3, useBytes=TRUE)
>   [1] ""

confirmed - the first 3 bytes are "\357\273\277"
(octal for EF BB BF, i.e. the UTF-8 byte-order mark)
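
For anyone who wants to repeat that check, here is a minimal sketch. The helper name has_utf8_bom is made up for illustration, and a small temporary file stands in for the downloaded table so that nothing depends on the network:

```r
# Hypothetical helper (not from this thread): TRUE if a raw vector
# starts with the UTF-8 byte-order mark EF BB BF (octal 357 273 277).
has_utf8_bom <- function(bytes) {
  bom <- as.raw(c(0xEF, 0xBB, 0xBF))
  length(bytes) >= 3 && identical(bytes[1:3], bom)
}

# Stand-in for the real file: BOM followed by one record.
tmp <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)),
           charToRaw("aar||aa|Afar|afar\n")),
         tmp)

# Read the first three bytes as raw, without any decoding.
head_bytes <- readBin(tmp, what = "raw", n = 3)
has_utf8_bom(head_bytes)  # TRUE
```

Reading with readBin() avoids any re-encoding, so the bytes are seen exactly as stored.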

>   > d <- read.table(socket, quote="", sep="|", stringsAsFactors=FALSE)
>   > dim(d)
>   [1] 485   5
>   > head(d)
>      V1 V2 V3             V4      V5
>   1 aar    aa           Afar    afar
>   2 abk    ab      Abkhazian abkhaze
>   3 ace             Achinese    aceh
>   4 ach                Acoli   acoli
>   5 ada              Adangme adangme
>   6 ady       Adyghe; Adygei  adyghé

alas, this is all I get:

Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  invalid input found on input connection 'http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt'

  a3bibliographic a3terminologic a2        english  french
1             aar             NA aa           Afar    afar
2             abk             NA ab      Abkhazian abkhaze
3             ace             NA          Achinese    aceh
4             ach             NA             Acoli   acoli
5             ada             NA           Adangme adangme
6             ady             NA    Adyghe; Adygei   adygh

note that the first non-ASCII character terminates the input.

so, I still cannot read the data from the URL.

I can read the file though - with quote="" (thanks Peter!) -
except that the first record is "\357\273\277aar".
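
One possible way around the stray "\357\273\277" in the first record is to read the file as raw bytes, drop the mark if it is there, and only then parse. This is a sketch, not something tested on R-2.15.1; read_bom_table is a hypothetical name, and the two-record temporary file is made-up stand-in data in the same layout:

```r
# Hypothetical helper: read a "|"-style table from a UTF-8 file,
# dropping a leading byte-order mark if one is present.
read_bom_table <- function(path, ...) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  bom <- as.raw(c(0xEF, 0xBB, 0xBF))
  if (length(bytes) >= 3 && identical(bytes[1:3], bom)) {
    bytes <- bytes[-(1:3)]          # discard the 3 BOM bytes
  }
  txt <- rawToChar(bytes)
  Encoding(txt) <- "UTF-8"          # declare the bytes as UTF-8
  read.table(text = txt, ...)
}

# Stand-in file: BOM plus two records, one with a non-ASCII character.
tmp <- tempfile(fileext = ".txt")
recs <- enc2utf8(c("aar||aa|Afar|afar",
                   "ady|||Adyghe; Adygei|adygh\u00e9"))
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)),
           charToRaw(paste0(recs, "\n", collapse = ""))),
         tmp)

d <- read_bom_table(tmp, sep = "|", quote = "", stringsAsFactors = FALSE)
d$V1[1]  # "aar", with no BOM in front
```

A local copy of the real table could presumably be fetched first with download.file(..., mode = "wb"); that step is an assumption and is not exercised here.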


-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://thereligionofpeace.com
http://mideasttruth.com http://iris.org.il http://jihadwatch.org
The only thing worse than X Windows: (X Windows) - X



