[R] Regex matching that gives byte offset?

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Nov 2 13:41:45 CET 2009


On Mon, 2 Nov 2009, Johannes Graumann wrote:

> Hmmm ... that should do it, thanks. But how would one use this on a file
> without reading it into memory completely?

?file, ?readLines, ?readBin

will tell you about connections.
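
As a rough sketch of what that could look like (not a tested recipe for your files): open the file as a connection, read it a block of lines at a time, and keep a running count of the bytes consumed so far. The helper name `byte_offsets` and its `chunk_lines` argument are made up for illustration, and the arithmetic assumes LF line endings and at most one match of interest per line.

```r
# Hypothetical helper: 0-based byte offsets (from the start of the file) of
# the first match of `pattern` on each line, reading the file in chunks.
# Assumptions: "\n" line endings, no embedded NULs, one match per line.
byte_offsets <- function(path, pattern, chunk_lines = 10000L, fixed = FALSE) {
  con <- file(path, open = "rb")        # binary mode: no newline translation
  on.exit(close(con))
  offsets <- numeric(0)                 # doubles, so offsets beyond 2^31 fit
  line_start <- 0                       # byte offset where the next line starts
  repeat {
    lines <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    # useBytes = TRUE: positions are counted in bytes (1-based within a line)
    pos <- regexpr(pattern, lines, fixed = fixed, useBytes = TRUE)
    len <- nchar(lines, type = "bytes") + 1      # +1 for the "\n" terminator
    starts <- line_start + c(0, cumsum(len)[-length(len)])
    hit <- pos > 0
    offsets <- c(offsets, starts[hit] + pos[hit] - 1)
    line_start <- line_start + sum(len)
  }
  offsets
}
```

Only `chunk_lines` lines are held in memory at any time. An offset found this way can later be used with seek() on a connection opened in "rb" mode, followed by readBin(), to jump straight to the record; see ?seek for caveats on some platforms. Files with CRLF line endings would need the `+ 1` adjusted, since readLines() strips both terminator bytes.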

> Joh
>
>
> On Wednesday 28 October 2009 16:29:00 Prof Brian Ripley wrote:
>> Do you mean like regexpr() (on the same help page)?
>>
>> Depending on your locale, you might actually prefer the character
>> offset: if you want to match in a MBCS and have byte offsets you will
>> need to work a bit harder if useBytes=TRUE is not sufficient for you.
>>
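
To illustrate the difference between character and byte offsets mentioned above, here is a small sketch. The example string is invented, and the first result assumes the string is handled as UTF-8 (as it is for a UTF-8 marked string).

```r
# "\u00ef" (i with diaeresis) is one character but two bytes in UTF-8
x <- enc2utf8("na\u00efve <tag>")

char_pos <- regexpr("<tag>", x, fixed = TRUE)                   # in characters
byte_pos <- regexpr("<tag>", x, fixed = TRUE, useBytes = TRUE)  # in bytes

as.integer(char_pos)   # 7: "<" is the 7th character
as.integer(byte_pos)   # 8: the two-byte character shifts the byte position
```

For indexing into a file with seek(), the byte position is the one that matters.
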
>> On Wed, 28 Oct 2009, Johannes Graumann wrote:
>>> Hi,
>>>
>>> Is there any way of doing 'grep' or something like it on the content of
>>> a text file and extracting the byte position of the match in the file?
>>> I'm facing the need to access rather largish (>600MB) XML files and would
>>> like to be able to index them ...
>>>
>>> Thanks for any help or flogging,
>>>
>>> Joh
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html and provide commented,
>>> minimal, self-contained, reproducible code.
>>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595