[R] Data separated by spaces, getting data into R using field lengths

Ben Bolker bolker at ufl.edu
Tue Sep 8 14:17:30 CEST 2009



  I don't think you described your problem precisely.
You implied that you wanted the field lengths to be
(2,2,18,5,18) -- which is what you got with read.fwf -- 
but it looks like what you meant is something more like:

field 1: first two characters
field 2: characters 3-4
field 3: all alphabetic characters up to the next numeric value
   (not more than 18)
field 4: all numeric values up to the next whitespace
   (not more than 5)
field 5: all alphabetic characters to end of line 
   (not more than 18)

  Is that correct?  (I.e., perhaps your field lengths
were MAXIMUM lengths?)

  At the moment all I can think of is to use read.fwf
with field widths 2, 2, 41 (41 = 18 + 5 + 18) and as.is=TRUE
(to keep the last field as character), then use some combination
of gsub, grep, strsplit and paste to pull apart the last three fields ...
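
  A minimal sketch of that idea, using the four sample lines from the
original question. It assumes that field 3 never contains a digit and
that field 4 is the first run of digits after it; the names `con`,
`raw`, `pattern`, `parts` and `result` are made up for illustration,
and trimws() needs R >= 3.2.0. Treat it as one possible approach, not
a general solution.

```r
# Sample lines from the original question
x <- c("DF12 This is an example 1 This",
       "DF12 This is an 1232 This is",
       "DF14 This is 12334 This is an",
       "DF15 This 23 This is an example")

# Step 1: split off the two fixed-width fields and keep the remainder
# (at most 18 + 5 + 18 = 41 characters) as a single character column.
con <- textConnection(x)
raw <- read.fwf(con, widths = c(2, 2, 41), as.is = TRUE)
close(con)

# Step 2: pull the remainder apart with a regular expression:
#   group 1 = everything before the first digit  (field 3)
#   group 2 = the first run of digits            (field 4)
#   group 3 = whatever follows                   (field 5)
pattern <- "^([^0-9]*)([0-9]+)(.*)$"
m <- regmatches(raw$V3, regexec(pattern, raw$V3))

# Each element of m is c(full match, group 1, group 2, group 3).
# A line that does not match yields character(0), which this sketch
# does not handle.
parts <- do.call(rbind, m)

result <- data.frame(
  V1 = raw$V1,
  V2 = raw$V2,
  V3 = trimws(parts[, 2]),      # drop the surrounding spaces
  V4 = as.integer(parts[, 3]),
  V5 = trimws(parts[, 4]),
  stringsAsFactors = FALSE
)
print(result)
```

  For a real file, the textConnection would be replaced by the file
name. Lines where the text field itself contains digits would need a
stricter pattern.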


Lauri Nikkinen wrote:
> 
> Thanks, I tried it but I got
> 
>> varlength <- c(2, 2, 18, 5, 18)
>> read.fwf("c:temppi.txt", widths=varlength)
>   V1 V2                 V3    V4   V5
> 1 DF 12  This is an exampl e 1 T  his
> 2 DF 12  This is an 1232 T his i    s
> 3 DF 14  This is 12334 Thi s is   an
> 4 DF 15  This 23 This is a n exa mple
> 
> Which is not the way I want it.
> 
> structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class
> = "factor"),
>     V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
>     1L), .Label = c(" This 23 This is a", " This is 12334 Thi",
>     " This is an 1232 T", " This is an exampl"), class = "factor"),
>     V4 = structure(c(1L, 2L, 4L, 3L), .Label = c("e 1 T", "his i",
>     "n exa", "s is "), class = "factor"), V5 = structure(c(2L,
>     4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"), class =
> "factor")), .Names = c("V1",
> "V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
> -4L))
> 
> Any ideas?
> -L
> 
> 2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
>> On 9/8/2009 7:53 AM, Lauri Nikkinen wrote:
>>>
>>> I have a text file similar to this (separated by spaces):
>>>
>>> x <- "DF12 This is an example 1 This
>>> DF12 This is an 1232 This is
>>> DF14 This is 12334 This is an
>>> DF15 This 23 This is an example
>>> "
>>>
>>> and I know the field lengths of each variable (there are 5 variables in
>>> this data set), which are:
>>>
>>> varlength <- c(2, 2, 18, 5, 18)
>>>
>>> How can I import this kind of data into R, using the varlength
>>> variable as a field separator indicator?
>>
>> See ?read.fwf.
>>
>> Duncan Murdoch
>>
