[R] Problem importing square character

Duncan Murdoch murdoch.duncan at gmail.com
Fri Sep 10 16:35:06 CEST 2010


  On 10/09/2010 10:03 AM, Marcelo Estácio wrote:
>
>
> Dear,
>
> When I try to execute the following command, R doesn't read all the lines (it reads only 57658 lines, although the file has 814125 lines):
>
>
>
> dados2<-read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",header=FALSE,sep="^",colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"),quote="",comment.char="",skip=1,fill=TRUE)
>
> If I exclude "fill=TRUE", R gives the message
>
>
>
> Warning message:
> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>    número de itens não é múltiplo do número de colunas (number of items is not a multiple of the number of columns)
>
>
>
> I identified that the problem is the following line of my data (line 57659 of my file):
>
>
>
> 13850074571^01/01/1940^00000000000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE
>
>
> As you can see, my data contain a "square" character (I don't know if you can see it here, but it looks like a white square). It seems that R interprets this character as the end of the file.
>
> I opened my data in Notepad and copied the character. When I paste it into R, R tries to close and asks whether I want to save my work. What is happening?

That symbol is how some systems display the hex 1A character (Ctrl-Z),
which in DOS marked the end of a file.  Judging by the pathname, you're
working on Windows, which has inherited that behaviour for files read
in text mode.
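
To check whether a file really contains hex 1A bytes, and on which
lines, something like the following sketch may help.  The sample file
and the names path, bytes, pos and line_no are made up for
illustration; substitute the real pathname.

```r
# Hypothetical sample file: the second line contains a hex 1A byte.
path <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a^1\nb"), as.raw(0x1A), charToRaw("c^2\nd^3\n")), path)

# Read the whole file as raw bytes; a raw read does not stop at hex 1A.
bytes <- readBin(path, what = "raw", n = file.size(path))

# 1-based byte offsets of every hex 1A in the file.
pos <- which(bytes == as.raw(0x1A))

# Line number of each hit: newlines seen before it, plus one.
line_no <- cumsum(bytes == as.raw(0x0A))[pos] + 1L
data.frame(offset = pos, line = line_no)
```

This reads the entire file into memory, which should be acceptable for
a file of a few hundred thousand lines.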

The best way to get around it would be to correct those bad characters:  
they are almost certainly errors in the data file.  If you want to keep 
them, then you could try reading the file in binary mode rather than 
text mode.  You do this using

# Open the file in binary mode so that hex 1A is not treated as end of file
con <- file("filename", open = "rb")
read.table(con, header = FALSE, ...)
close(con)
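
As a small self-contained illustration of the same approach, the sketch
below writes a made-up three-line file with a hex 1A byte in the second
line and reads it back through a binary-mode connection.  The sample
data and the names path, con and dat are assumptions for the example,
not taken from the original file.

```r
# Hypothetical sample file: the second line contains a hex 1A byte.
path <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a^1\nb"), as.raw(0x1A), charToRaw("c^2\nd^3\n")), path)

# Open in binary mode and hand the open connection to read.table().
con <- file(path, open = "rb")
dat <- read.table(con, header = FALSE, sep = "^", quote = "",
                  comment.char = "", colClasses = "character")
close(con)

# All three lines should be present; the 1A byte stays inside the field.
nrow(dat)
```

Because the connection is already open when it is passed in,
read.table() does not reopen it in text mode, and it is left to the
caller to close it.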

You could also try reading it on a different OS; I don't think Linux 
cares about 1A characters.
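
If correcting the bad characters is the preferred route, one possible
sketch is to rewrite the file with every hex 1A byte replaced.  The
helper name strip_sub and the choice of a space as the replacement are
assumptions; the right replacement depends on what the data should
have contained.

```r
# strip_sub() is a hypothetical helper, not part of R.
# It copies path_in to path_out, replacing each hex 1A byte.
strip_sub <- function(path_in, path_out, replacement = as.raw(0x20)) {
  bytes <- readBin(path_in, what = "raw", n = file.size(path_in))
  bad <- bytes == as.raw(0x1A)
  bytes[bad] <- replacement          # assumption: a space is acceptable
  writeBin(bytes, path_out)
  invisible(sum(bad))                # number of bytes replaced
}

# Usage on a made-up sample file.
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a^1\nb"), as.raw(0x1A), charToRaw("c^2\n")), src)
n_fixed <- strip_sub(src, dst)
cleaned <- readBin(dst, what = "raw", n = file.size(dst))
```

The cleaned copy can then be read with read.table() in the usual way,
and the original file is left untouched.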

Duncan Murdoch

>
>
> Thanks very much.
>
> Marcelo Estácio