[R] Strange problem with reading a pipe delimited file

Duncan Murdoch murdoch.duncan at gmail.com
Sat Nov 17 22:27:03 CET 2012


On 12-11-17 4:18 PM, Brian Feeny wrote:
> I am trying to read in a pipe delimited file whose rows have a varying number of columns. Here is my sample data:
>
> A|B|C|D
> A|B|C|D|E|F
> A|B|C|D|E
> A|B|C|D|E|F|G|H|I
> A|B|C|D
> A|B|C|D|E|F|G|H|I|J
>
> You can see line 6 has 10 columns.  Yet I can't explain why R reads it like this:
>
>> test <- read.delim("mypaths4.txt", sep="|", quote=NULL, header=F, colClasses="character")
>> test
>    V1 V2 V3 V4 V5 V6 V7 V8 V9
> 1  A  B  C  D
> 2  A  B  C  D  E  F
> 3  A  B  C  D  E
> 4  A  B  C  D  E  F  G  H  I
> 5  A  B  C  D
> 6  A  B  C  D  E  F  G  H  I
> 7  J
>
> You can see it moved "J" to row 7. I don't understand why it is not left in position [6, 10].
>
> Stranger still to me, I remove line 1, so my data file contains:
>
> A|B|C|D|E|F
> A|B|C|D|E
> A|B|C|D|E|F|G|H|I
> A|B|C|D
> A|B|C|D|E|F|G|H|I|J
>
> and I get a totally different result:
>
>> test <- read.delim("mypaths5.txt", sep="|", quote=NULL, header=F, colClasses="character")
>> test
>    V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
> 1  A  B  C  D  E  F
> 2  A  B  C  D  E
> 3  A  B  C  D  E  F  G  H  I
> 4  A  B  C  D
> 5  A  B  C  D  E  F  G  H  I   J
>
> What am I doing that changes the fate of that final "J"?  This is just a basic ASCII text file, pipe delimited as shown.

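I would suggest reading the help file: read.delim only looks at the 
first 5 lines to determine the number of columns if you don't specify 
col.names.  In your first file the first five lines have at most 9 
fields, so the 10th field of line 6 wraps onto a new row; in your 
second file the 10-field line is among the first five.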
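
One possible workaround, as a minimal sketch: scan the whole file with 
count.fields() to find the widest line, then pass col.names of that 
length so read.delim allocates enough columns.  The temporary file and 
the names path and n_col are only for illustration; the data are the 
six sample lines from the question.

```r
# Write the six-line sample from the question to a temporary file
# (illustrative stand-in for "mypaths4.txt").
path <- tempfile(fileext = ".txt")
writeLines(c("A|B|C|D",
             "A|B|C|D|E|F",
             "A|B|C|D|E",
             "A|B|C|D|E|F|G|H|I",
             "A|B|C|D",
             "A|B|C|D|E|F|G|H|I|J"), path)

# count.fields() reads every line, so it also sees the 10-field sixth line.
n_col <- max(count.fields(path, sep = "|", quote = ""))

# Supplying col.names of that length fixes the number of columns up front,
# instead of letting read.delim guess from the first five lines.
test <- read.delim(path, sep = "|", quote = "", header = FALSE,
                   colClasses = "character",
                   col.names = paste0("V", seq_len(n_col)))

dim(test)     # 6 rows, 10 columns
test[6, 10]   # "J"
```

If the maximum number of fields is already known, col.names can be 
given directly and the count.fields() pass skipped.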

Duncan Murdoch



