[R] Reading a tab delimited file of varying length using read.table
ligges at statistik.tu-dortmund.de
Mon Jan 18 00:43:05 CET 2016
I'll take a look at how to fix it tomorrow; your proposal is very welcome.
On 18.01.2016 00:01, Rolf Turner wrote:
> On 18/01/16 10:48, Uwe Ligges wrote:
>> This is not a tab delimited file (as you apparently assume given the
>> code), but a fixed width format, hence I'd try:
>> url <- "http://data.princeton.edu/wws509/datasets/divorce.dat"
>> widths <- c(9, 13, 10, 8, 10, 6)
>> f5 <- read.fwf(url, widths = widths, skip = 1, strip.white = TRUE)
>> names(f5) <- as.character(unlist(read.fwf(url, widths = widths,
>> strip.white=TRUE, n=1)))
>> Not sure why reading it simply with header=TRUE does not work, but no
>> time to investigate this now.
> Dear Uwe,
> I have fiddled around a bit and the situation seems to me to be of the
> nature of a bug in read.fwf. It would seem that in order for
> header=TRUE to work, the entries of the header need to be separated by
> the sep delimiter which defaults to "\t". In the case in question the
> entries are separated by blanks, so presumably the header gets read in
> as a single entity, rather than 6 such, leading to a mismatch between
> the length of the header and the number of columns.
> It seems that the specified widths get ignored when the header line is
> dealt with.
> It also seems that if one specifies sep="" then the header gets read
> correctly but then strings of blanks get interpreted as field separators
> throughout and then blanks within the fields result in the
> wrong number of columns.
> I think that the code of read.fwf is easy enough to fix; a slight
> adjustment will make the header get treated the same way as the body of
> the file.
> I don't see any problems/drawbacks with so-doing, and experimenting with
> my modified function resulted in the divorce data being read in with
> header=TRUE with no problems.
> If this mod is made, I see no reason to keep the "sep" argument in
> read.fwf --- except maybe for backward compatibility issues, and I don't
> think there would be any since it never worked properly anyhow.
> P. S. I can send you my modified version of read.fwf off-list if this
> would be of any use to you.
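
For readers without access to the URL, here is a minimal self-contained sketch of the workaround quoted above: read the header line and the body separately with the same widths, then attach the names. The file contents, column names and widths are made up for illustration and are not taken from the divorce data.

```r
# Write a small fixed-width file to a temporary path (made-up example data).
tmp <- tempfile(fileext = ".dat")
writeLines(c(
  "id   name      score",
  "1    Ann Lee   10   ",
  "2    Bob       7    "
), tmp)

# Assumed column widths for this toy file: 5 + 10 + 5 characters.
widths <- c(5, 10, 5)

# Read only the first line, split by the same widths as the body.
hdr <- read.fwf(tmp, widths = widths, n = 1, strip.white = TRUE,
                stringsAsFactors = FALSE)

# Read the body, skipping the header line.
dat <- read.fwf(tmp, widths = widths, skip = 1, strip.white = TRUE,
                stringsAsFactors = FALSE)

# Use the header fields as column names.
names(dat) <- as.character(unlist(hdr))
```

Because the header is split by width rather than by sep, a blank inside a field (such as "Ann Lee") does not change the number of columns.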