[R] Embedded carriage returns in text document

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Nov 13 23:42:39 CET 2006


Dennis Fisher <fisher at plessthan.com> writes:

> Colleagues,
> 
> I am using R 2.4.0 on both a Mac (10.4.8) and Linux (RedHat 9).  To  
> read data from an Excel spreadsheet, I do "save as"  in Excel, then  
> select the "Text (tab-delimited)" format.  The resulting file uses a  
> tab separator and I can usually read the file using read.delim.
> 
> Sometimes, the header row contains embedded carriage returns.  When I  
> view the file, these carriage returns appear as  "^M".
> 
> Now the problem:
> When I read.delim these files, they do not read correctly.  Sometimes  
> I get error messages; sometimes only the first line is read.   
> Interestingly, invoking the option skip=1 (or a larger N) does not  
> appear to bypass the problem.
> 
> I can solve the problem by manually deleting these carriage returns  
> either in the original Excel file or the .txt version.   However,  
> this is not an ideal solution.
> 
> Does anyone have a work-around within R?

Hmm,... I suppose that CR messes with what R or the system thinks is
the line-end character in this particular file.

My first idea would be to read from a pipe() which executed 

sed 's/\r//' myfile.dat

or something in that vein. Beware of quoting and of differences
between sed versions.
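
To make the idea above concrete, here is a minimal sketch of the CR-stripping
step on its own. It is not taken from the original thread: the file names
sample_cr.txt and sample_clean.txt are made up for illustration, and tr is
used in place of sed because the \r escape in a sed pattern is a GNU
extension that other sed versions may not accept.

```shell
# Build a small tab-delimited sample whose header cell contains a stray CR
# (displayed as ^M in many editors). File names are hypothetical.
printf 'id\tdose\r (mg)\n1\t10\n2\t20\n' > sample_cr.txt

# Delete every CR byte. tr interprets the \r escape itself, so this does not
# depend on which sed is installed.
tr -d '\r' < sample_cr.txt > sample_clean.txt

# GNU sed equivalent (assumption: GNU sed is available). The g flag removes
# every CR on a line rather than only the first:
#   sed 's/\r//g' sample_cr.txt > sample_clean.txt

# Count the lines that survive the clean-up.
line_count=$(wc -l < sample_clean.txt | tr -d ' ')
```

From R, the same filter could presumably be wired in with something like
read.delim(pipe("tr -d '\\r' < myfile.dat")), where the doubled backslash
is needed so that R passes \r through to the shell. If the file turns out to
use CR alone as its line terminator, deleting the CRs would join all rows
into one; in that case translating them with tr '\r' '\n' is the safer
variant.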



-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


