[R] scan() Bug

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 26 17:14:12 CET 2004


The \zzz notation is octal (just like C)!  So "\12" is octal 12, which is
decimal 10, and "\10" is decimal 8.  I presume you want ASCII character 10,
that is LF, not 8 (BS), although using \n would be much easier to remember.
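
A quick check at the R prompt illustrates the octal reading. This is only a
minimal sketch: the variable names lf and bs are made up for illustration,
and utf8ToInt() and strtoi() are used here simply as convenient ways to
inspect character codes.

```r
# "\12" is an octal escape: 1*8 + 2 = 10, i.e. LF, the same character as "\n".
lf <- "\12"
# "\10" is octal 10 = decimal 8, i.e. BS (backspace), not a newline.
bs <- "\10"

utf8ToInt(lf)            # 10
utf8ToInt(bs)            # 8
identical(lf, "\n")      # TRUE

# strtoi() with base 8 performs the octal-to-decimal conversion explicitly.
strtoi("12", base = 8L)  # 10
strtoi("10", base = 8L)  # 8
```

So sep = "\12" and sep = "\n" should write the same byte, and nothing is being
shifted by 2 on output.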

On Mon, 26 Jan 2004, Greg Riddick wrote:

> Thanks for your suggestions on dealing with binary files, Prof Ripley.
> 
> I ended up using this method:
> 
> 
> PDF = file("file.pdf","a+b")
> PDFlines = readLines(PDF)
> .
> .
> .
> (Extract Some Information From PDFlines and create some objects to add back
> to the PDF file)
> .
> .
> .
> writeLines(newobjects, PDF, sep = "\12")
> close(PDF)
> 
> 
> So I opened the file as binary in read/append mode.
> Works fine now...though I have noticed that the sep character that actually
> gets written to the file is 2 less than the value specified.
> So I wanted \10 and needed to specify \12 to get it. Am I doing something
> wrong here?
> 
> I'm working on an R package to add annotations (hyperlinks, popups etc.) to
> PDF files that I should release in about 2 weeks.  It should be useful,
> especially to the bioinformatics people who use R. Incidentally, the
> uncompressed PDF files that I have seen R produce are actually just plain
> text files: human-readable ASCII characters delimited by CR or CR/LF.  They
> are binary only in the sense that a cross-reference table at the end of the
> file records byte offsets of individual objects in the file. So insertions
> and deletions cannot be made without updating the cross-reference table.
> 
> 
> 
> 
> ----- Original Message ----- 
> From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
> To: "Greg Riddick" <gr3k at virginia.edu>
> Cc: <r-help at stat.math.ethz.ch>
> Sent: Thursday, January 22, 2004 4:52 PM
> Subject: Re: [R] scan() Bug?
> 
> 
> > On Thu, 22 Jan 2004, Greg Riddick wrote:
> >
> > > I'm reading a file into a list by:
> > > PDF = scan("file",what="character",sep="\10")
> > >
> > > "\10" is the newline character in this file, also tried "\n" originally
> > >
> > > On lines that are ended by "\13\10", both are dropped from the list entry
> > > I want scan to keep the "\13" in the list entry.
> > >
> > > Is this a bug or just a strange feature?
> >
> > Not a strange feature, but the documented behaviour (and useful, too).
> > You have opened the file in text mode.  If you want to keep CRs, open and
> > read in binary mode.
> >
> > -- 
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >
> >
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



