[R] extracting data using strings as delimiters

(Ted Harding) Ted.Harding at manchester.ac.uk
Wed Sep 26 00:20:03 CEST 2007


On 25-Sep-07 20:39:11, lucy b wrote:
> Dear List,
> 
> I have an ascii text file with data I'd like to extract. Example:
> 
> Year Built:  1873 Gross Building Area:  578 sq ft
> Total Rooms:  6 Living Area:  578 sq ft
> 
> There is a lot of data I'd like to ignore in each record, so I'm
> hoping there is a way to use strings as delimiters to get the data I
> want (e.g. tell R to take data between "Built:" and "Gross" -
> incidentally, not always numeric). I think an ugly way would be to
> start at the end of each record and use a substitution expression to
> chip away at it, but I'm afraid it will take forever to run. Is there
> a way to use strings as delimiters in an expression?
> 
> Thanks in advance for ideas.
> 
> LB

The scope of what you're trying to achieve is not clear,
though on the basis of your examples above you'd have to
use a different separator pattern for each type of line.

For your first example, a simple method is on the lines of

gsub(".*Built:" , "",
     "Year Built:  1873 Gross Building Area:  578 sq ft")
[1] "  1873 Gross Building Area:  578 sq ft"

and then just take the first white-space-delimited field
from the result.
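
As a minimal sketch of that second step (the variable names here are
illustrative, not from the original post), the remainder can be trimmed
and split on runs of white space. Since the value you want is not always
numeric and might itself contain spaces, the sketch also shows a variant
that captures everything between "Built:" and "Gross" in a single sub()
call. That variant assumes both delimiter strings occur on the line; if
they do not, sub() returns its input unchanged.

```r
line <- "Year Built:  1873 Gross Building Area:  578 sq ft"

# Step 1 (as above): drop everything up to and including "Built:".
rest <- sub(".*Built:", "", line)

# Step 2: trim leading/trailing white space, split on runs of white
# space, and keep the first field.
first_field <- strsplit(trimws(rest), "[[:space:]]+")[[1]][1]

# Variant: capture whatever lies between the two delimiter strings,
# ignoring surrounding white space. perl = TRUE enables the lazy
# quantifier (.*?), so the capture stops at the first "Gross".
# Assumption: both delimiters are present; otherwise sub() returns
# the input line unchanged.
between_value <- sub(".*Built:\\s*(.*?)\\s*Gross.*", "\\1", line,
                     perl = TRUE)

print(first_field)
print(between_value)
```

For this example line both approaches give "1873". The capture-group
form would need a different pair of delimiter strings for each type of
line, as noted above.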

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 25-Sep-07                                       Time: 23:20:01
------------------------------ XFMail ------------------------------


