[R] Data separated by spaces, getting data into R using field lengths

Duncan Murdoch murdoch at stats.uwo.ca
Tue Sep 8 14:15:02 CEST 2009


On 9/8/2009 8:07 AM, Lauri Nikkinen wrote:
> Thanks, I tried it but I got
> 
>> varlength <- c(2, 2, 18, 5, 18)
>> read.fwf("c:temppi.txt", widths=varlength)
>   V1 V2                 V3    V4   V5
> 1 DF 12  This is an exampl e 1 T  his
> 2 DF 12  This is an 1232 T his i    s
> 3 DF 14  This is 12334 Thi s is   an
> 4 DF 15  This 23 This is a n exa mple
> 
> Which is not the way I want it.  

It looks as though that's because you don't have fixed width data.
" This is an example" is 19 chars, including the leading space.  You told
R it was 18.  " This is an " is only 12 characters.
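
The counts above can be checked with nchar().  This is only a quick
sketch; the variable name third_field is made up for illustration and the
strings are copied by hand from the sample data:

```r
# Third "field" of the first two sample lines, including the leading space.
# (third_field is a hypothetical name, not part of the original post.)
third_field <- c(" This is an example", " This is an ")

# Number of characters in each string; a true fixed-width field
# would have the same width on every line.
widths <- nchar(third_field)
print(widths)  # 19 12
```

Since the widths differ from line to line (and neither is 18), no single
widths= vector can describe this file.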

I would say you have two fixed width fields, and three varying fields, 
with no delimiters.  If the middle one of the three always contains 
digits and the others don't, you can probably extract them using sub(), 
but you can't use any of the read.* functions to do this:  your format 
is too strange.
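
A minimal sketch of the sub() approach, under some loud assumptions: the
first two fields are exactly 2 characters each, the third field never
contains a digit, the fourth field is all digits, and single spaces
separate the varying fields.  The names lines, pattern and fields are
made up for illustration:

```r
# Sample data from the original post, one string per line.
lines <- c("DF12 This is an example 1 This",
           "DF12 This is an 1232 This is",
           "DF14 This is 12334 This is an",
           "DF15 This 23 This is an example")

# ASSUMPTION: field 3 has no digits and field 4 is digits only, so the
# run of digits marks the boundary between the two text fields.
pattern <- "^(.{2})(.{2}) ([^0-9]*) ([0-9]+) (.*)$"

# Pull out each capture group with sub() and assemble a data frame.
fields <- data.frame(
  V1 = sub(pattern, "\\1", lines),
  V2 = as.integer(sub(pattern, "\\2", lines)),
  V3 = sub(pattern, "\\3", lines),
  V4 = as.integer(sub(pattern, "\\4", lines)),
  V5 = sub(pattern, "\\5", lines),
  stringsAsFactors = FALSE
)
print(fields)
```

If a text field can itself contain digits, this pattern will split the
line in the wrong place, and there is no reliable way to recover the
fields without a real delimiter.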

Duncan Murdoch

> 
> structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class
> = "factor"),
>     V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
>     1L), .Label = c(" This 23 This is a", " This is 12334 Thi",
>     " This is an 1232 T", " This is an exampl"), class = "factor"),
>     V4 = structure(c(1L, 2L, 4L, 3L), .Label = c("e 1 T", "his i",
>     "n exa", "s is "), class = "factor"), V5 = structure(c(2L,
>     4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"), class =
> "factor")), .Names = c("V1",
> "V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
> -4L))
> 
> Any ideas?
> -L
> 
> 2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
>> On 9/8/2009 7:53 AM, Lauri Nikkinen wrote:
>>>
>>> I have a text file similar to this (separated by spaces):
>>>
>>> x <- "DF12 This is an example 1 This
>>> DF12 This is an 1232 This is
>>> DF14 This is 12334 This is an
>>> DF15 This 23 This is an example
>>> "
>>>
>>> and I know the field lengths of each variable (there are 5 variables in
>>> this data set), which are:
>>>
>>> varlength <- c(2, 2, 18, 5, 18)
>>>
>>> How can I import this kind of data into R, using the varlength
>>> variable as a field separator indicator?
>>
>> See ?read.fwf.
>>
>> Duncan Murdoch
>>



