[R] need to find (and distinguish types of) carriage returns in a file that is scanned using scan
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Aug 19 17:16:00 CEST 2006
On Sat, 19 Aug 2006, Quicke, Donald L J wrote:
> Hope this is not too trivial
> I am reading a large file using scan.
> In one part of this file there is a chunk of text within which i need to
> know the positions of line breaks. But scan seems only
> An example of the file is:
> a 0 1 0
> bftt 020
> cftt T 1 R
> a 0 1 2 1 2
> b 0 1 2 2 2
> c 0 10 00
> so precisely i need in the scanned file in R to know where each carriage
> return is in the file so that i can then identify the text strings (i.e.
> a, bftt, cftt, a, b, c ) that immediately follow the carriage return
Sounds like a job for readLines.
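A minimal sketch of the readLines approach, assuming the chunk looks like the example quoted above. The temporary file and the names `path`, `fields`, `labels` and `n_fields` are illustrative, not taken from the original post.

```r
# Hypothetical sample data, copied from the example in the question.
txt <- c("a 0 1 0",
         "bftt 020",
         "cftt T 1 R",
         "a 0 1 2 1 2",
         "b 0 1 2 2 2",
         "c 0 10 00")
path <- tempfile(fileext = ".txt")   # stand-in for the real data file
writeLines(txt, path)

# readLines() returns one character string per line, so the line breaks
# are implicit in the element boundaries of the result.
lines <- readLines(path)

# Split each line on runs of whitespace; the first field is the string
# that immediately follows the preceding line break.
fields   <- strsplit(trimws(lines), "[[:space:]]+")
labels   <- vapply(fields, `[`, character(1), 1)
n_fields <- lengths(fields)          # number of elements on each line

print(labels)
print(n_fields)
```

Because each line is split separately, the number of elements per line need not be known in advance.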
> On a subsidiary matter, it would be very helpful if i could distinguish
> between Unix, Dos, and Mac carriage returns in the data file
AFAIK there is only one type of carriage return character (ASCII code Ctrl-M).
If you mean between CRLF, LF and perhaps CR line endings, you need to read
the files as raw bytes since R's text mode regards all three as equally a
line ending. But that can perfectly well be done using binary-mode connections.
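One way to do this, as a rough sketch: read the file as raw bytes with readBin and classify each line ending by looking at neighbouring bytes. The sample file with mixed endings and all variable names are made up for illustration, and the code assumes a non-empty file.

```r
# Write a small hypothetical file with mixed line endings in binary mode,
# so that nothing is translated on output.
path <- tempfile()
writeBin(charToRaw("a 0 1 0\r\nbftt 020\ncftt T 1 R\ra 0 1 2\n"), path)

# Read the whole file as raw bytes.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)

cr <- as.raw(0x0d)   # carriage return, Ctrl-M
lf <- as.raw(0x0a)   # line feed, Ctrl-J

is_cr <- bytes == cr
is_lf <- bytes == lf
n     <- length(bytes)   # assumed > 0

# Shifted copies let each byte be compared with its neighbour.
next_is_lf <- c(is_lf[-1], FALSE)
prev_is_cr <- c(FALSE, is_cr[-n])

crlf_pos <- which(is_cr & next_is_lf)    # CR of each CRLF pair (DOS)
cr_pos   <- which(is_cr & !next_is_lf)   # lone CR (classic Mac)
lf_pos   <- which(is_lf & !prev_is_cr)   # lone LF (Unix)

print(list(crlf = crlf_pos, cr = cr_pos, lf = lf_pos))
```

The positions are byte offsets (1-based) into the file, so the text following each line ending starts at the next offset, or two bytes further on for a CRLF pair.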
> i should note also, that the input file contains much other stuff and is
> not in the form of a table that can be read using read.table or other
> read version. Nor do i know beforehand how many elements there are in
> each line
Sounds like a job for connections ...
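A sketch of what that could look like, assuming a made-up layout (one line of numbers, a three-line text chunk, then more numbers). The real file's layout is not given in the post, so the line counts here are placeholders.

```r
# Hypothetical file: a numeric line, a three-line text chunk, more numbers.
path <- tempfile()
writeLines(c("10 20 30",
             "a 0 1 0",
             "bftt 020",
             "cftt T 1 R",
             "7 8 9"), path)

# Open the connection once; each read continues from where the last stopped.
con <- file(path, open = "r")

head_nums <- scan(con, nlines = 1, quiet = TRUE)   # numeric part via scan
chunk     <- readLines(con, n = 3)                 # text chunk, one string per line
tail_nums <- scan(con, quiet = TRUE)               # rest of the file via scan

close(con)

print(chunk)
```

Keeping the connection open is what allows scan and readLines to be mixed on one file without re-reading it from the start.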
> [[alternative HTML version deleted]]
> R-help at stat.math.ethz.ch mailing list
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
PLEASE do as we ask.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595