[R] End of line marker?

Duncan Murdoch murdoch at stats.uwo.ca
Fri Mar 5 05:55:30 CET 2010


On 04/03/2010 11:40 PM, David Winsemius wrote:
> On Mar 4, 2010, at 10:58 PM, Duncan Murdoch wrote:
> 
>> On 04/03/2010 10:32 PM, David Winsemius wrote:
>>> On Mar 4, 2010, at 9:47 PM, jonas garcia wrote:
>>>> When I opened the file with a hex-editor, the problematic character
>>>> turned out to be “1a”
>>>> I am attaching a sample DAT file with 3 lines (the second line is the
>>>> one with the undesirable character).
>>>>
>>>> The furthest I could get was through readBin:
>>>>
>>>>> tmp<- readBin("new.dat", what = "raw", n=100000000)
>>>>   [1] 30 32 3a 33 35 3a 33 32 2c 20 34 34 30 33 2c 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36 2c 31
>>>>  [33] 35 35 2e 39 2c 30 30 2e 37 36 2c 31 31 35 36 0d 0a 30 32 3a 33 35 3a 33 35 2c 20 34 34 33 32 2c
>>>>  [65] 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36 2c 31 35 35 2e 38 2c 1a 30 2e 38 31 2c 31 31 35 37
>>>>  [97] 0d 0a 30 32 3a 33 35 3a 33 39 2c 20 34 34 36 37 2c 20 33 37 2e 31 31 34 2c 2d 32 30 2e 38 33 36
>>>> [129] 2c 31 35 35 2e 38 2c 30 30 2e 38 31 2c 31 31 35 38
>>>>
>>>>
>>>>> tmp[87]
>>>> [1] 1a
>>> I got a different "interpretation" of that character when I let R look
>>> at it. And I cannot figure out why \032 should be causing problems??? :
>> Hex 1a and octal 032 both correspond to Ctrl-Z, which is the MSDOS  
>> EOF marker.  I forget whether R's text reading routines pay  
>> attention to that, or whether it's the C runtime, but it makes sense  
>> that it would cause problems on Windows.
>>
>> Duncan Murdoch
> 
> Thanks. I was interpreting \032 as decimal, so couldn't figure out why  
> it should equal 0x1A. You've explained the basis (or base) of my  
> confusion.
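
For anyone else tripped up by the bases, here is a quick check that can be pasted at the R prompt. It is only a sketch of the arithmetic and does not depend on the file in question; each expression should evaluate to TRUE.

```r
# Three spellings of the same byte value (Ctrl-Z).
0x1a == 26                                  # hexadecimal literal
strtoi("032", base = 8L) == 26              # octal digits, parsed explicitly as base 8
identical(charToRaw("\032"), as.raw(0x1a))  # "\032" inside a string is an octal escape

# Read as decimal instead, 32 would be the space character, not Ctrl-Z.
identical(as.raw(32), charToRaw(" "))
```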

By the way, here's one way to remove the bad char.  Read it using 
readBin as above, then

tmp <- tmp[tmp != 0x1a]

to remove the bad chars, or

tmp[tmp == 0x1a] <- charToRaw(" ")

to replace them with spaces.  Then write the tmp vector out to a file 
with writeBin.
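
Putting those steps together, a minimal sketch might look like the following. The helper name strip_ctrl_z and the temporary file names are made up for illustration, and the three-line sample is a stand-in for the real DAT file, so adjust to suit.

```r
# Hypothetical helper: copy `infile` to `outfile`, dropping every 0x1a byte.
strip_ctrl_z <- function(infile, outfile) {
  # Read the whole file as raw bytes; file.info() supplies the byte count.
  bytes <- readBin(infile, what = "raw", n = file.info(infile)$size)
  bytes <- bytes[bytes != as.raw(0x1a)]
  writeBin(bytes, outfile)
  invisible(length(bytes))
}

# Build a small stand-in file: three CSV lines with CRLF endings and a
# stray Ctrl-Z in the second line.
infile  <- tempfile(fileext = ".dat")
outfile <- tempfile(fileext = ".dat")
sample_bytes <- c(charToRaw("1,2,3\r\n4,"), as.raw(0x1a),
                  charToRaw("5,6\r\n7,8,9\r\n"))
writeBin(sample_bytes, infile)

strip_ctrl_z(infile, outfile)

# The cleaned copy can then go through the usual text-reading routines.
dat <- read.csv(outfile, header = FALSE)
```

Working on the raw bytes avoids the text-mode reading path entirely, so the Ctrl-Z never gets a chance to be treated as end of file.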

Duncan Murdoch


