[R] Tools for data preparation?

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Fri Nov 19 09:56:47 CET 2004


On 19-Nov-04 David Mitchell wrote:
> Hello list,
> 
> I'm regularly in the position where I have to do a lot of data
> manipulation, in order to get the data I have into a format R
> is happy with.  This manipulation would generally be in one of
> two forms:
> - getting data from e.g. text log files into a tabular format
> - extracting sensible sample data from a very large data set
> (i.e. too large for R to handle)
> 
> In general, I use Perl or Python to do the task; I'm curious
> as to what others use when they hit the same problem.

I generally use 'awk', with help from 'sed' when needed.
This is along the same lines as your choice, though lighter-weight
and less powerful (but I have never had a case that needed more).

Since the sort of task you describe is basically done line by line
(and what counts as a "line" is pretty flexible in 'awk', since the
record separator can be changed), it can be done straightforwardly;
but processing that depends on more than the current line is also
possible.
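
To illustrate the point about a flexible notion of "line", here is a
minimal sketch in Python (one of the languages the original poster
mentions), not in 'awk' itself. It treats each blank-line-separated
block as one record and turns it into one tab-separated row. The
"host"/"load" field names, the sample text and the helper name
iter_records are all made up for the illustration.

```python
import io


def iter_records(stream):
    """Yield each blank-line-separated block as one record (a list of lines)."""
    record = []
    for raw in stream:
        line = raw.rstrip("\n")
        if line.strip():
            # Non-blank line: it belongs to the current record.
            record.append(line)
        elif record:
            # Blank line: the current record is complete.
            yield record
            record = []
    if record:
        # Last record when the input does not end with a blank line.
        yield record


# Hypothetical log excerpt: two records of "key: value" lines.
sample = io.StringIO("host: alpha\nload: 0.7\n\nhost: beta\nload: 1.2\n")

rows = []
for rec in iter_records(sample):
    fields = dict(item.split(": ", 1) for item in rec)
    rows.append((fields["host"], fields["load"]))
    print(fields["host"], fields["load"], sep="\t")
```

The resulting tab-separated rows are the kind of tabular text that R
can read directly.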

E.g. it is easy to extract a line from the input, or to apply a
certain transformation to the fields in a line, if and only if it was
preceded by a line satisfying a certain condition, and so on.
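
The "preceded by a line satisfying a condition" pattern could be
sketched roughly as follows, again in Python rather than 'awk'. The
function name lines_after_match, the "START" marker and the sample
log lines are assumptions chosen for the illustration; the idea is
simply to carry one flag from each line to the next.

```python
def lines_after_match(lines, condition, transform=lambda fields: fields):
    """Return transformed lines that directly follow a line matching `condition`.

    Each emitted line is split on whitespace, passed to `transform`,
    and re-joined with single spaces.
    """
    armed = False  # True when the previous line satisfied the condition
    out = []
    for line in lines:
        if armed:
            out.append(" ".join(transform(line.split())))
        # Remember whether *this* line arms the next one.
        armed = condition(line)
    return out


# Hypothetical log: keep only the line right after each "START" marker,
# and from it keep only the first and third fields.
log = [
    "START job1",
    "12 0.5 ok",
    "13 0.6 ok",
    "START job2",
    "20 0.9 fail",
]

result = lines_after_match(
    log,
    condition=lambda line: line.startswith("START"),
    transform=lambda fields: [fields[0], fields[2]],
)
for row in result:
    print(row)
```

Because only a single flag is kept between lines, the input can be
read as a stream and never has to fit in memory, which matters for
the "too large for R" case.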

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 19-Nov-04                                       Time: 08:56:47
------------------------------ XFMail ------------------------------



