[R] End of File for binary files

Duncan Murdoch murdoch at stats.uwo.ca
Sun Jan 24 20:54:53 CET 2010


On 24/01/2010 12:15 PM, Henrik Bengtsson wrote:
> On Sun, Jan 24, 2010 at 3:56 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>> On 23/01/2010 10:40 PM, rn00b wrote:
>>> I am using readBin to continuously read characters from the binary file.
>>> I'm trying to figure out how many characters are in the file. What I would
>>> like to do is something like
>>> (while! EOF)
>>> {
>>> charRead <-.Internal(readBin(con,"character",1L,NA,TRUE,swap))
>>> i++
>>> }
>>>
>>> I'm not clear on how to determine the EOF condition in this case.
>> You should not be calling .Internal.  It's for internal use, subject to
>> change, etc.
>>
>> Using readBin(...)  you can detect EOF by reading fewer than n items when
>> you ask for n.
> 
> Is this safe?  Is EOF the only case where readBin() returns fewer
> elements than you requested?  Does it depend on the type of connection
> you are reading from?  Does it depend on OS?
> 
> The help("readBin") does not say much about this, but it says:
> 
> "If readBin(what = character()) is used incorrectly on a file which
> does not contain C-style character strings, warnings (usually many)
> are given. From a file or connection, the input will be broken into
> pieces of length 10000 with any final part being discarded."
> 
> which seems to suggest that you (at least in special cases) can get
> fewer items than requested without hitting EOF.  EOF behavior might be
> documented elsewhere in R.

I read the passage above to say I might get more than n strings from a 
file containing n of them if they are too long; I don't see it saying 
that I would ever get fewer than n if there are n properly terminated 
strings remaining in the file.

EOF is not the only case in general where you'd get fewer than n strings 
(e.g. a non-blocking connection will only return what's in the buffer), 
but in the usual case of reading from a file, it should be safe.
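
As a rough sketch of that "fewer than n" test for an ordinary file: the helper name count_strings and the chunk size are made up for illustration, and it assumes the file holds properly NUL-terminated strings and is opened as a blocking file connection.

```r
# Hypothetical helper (not part of R): count the NUL-terminated strings
# in a binary file by reading them in chunks with readBin().
count_strings <- function(path, chunk = 1000L) {
  con <- file(path, open = "rb")   # blocking, binary-mode file connection
  on.exit(close(con))
  total <- 0L
  repeat {
    got <- readBin(con, what = "character", n = chunk)
    total <- total + length(got)
    # Fewer items than requested: for a plain file, treat this as EOF.
    if (length(got) < chunk) break
  }
  total
}

# Usage: write three strings to a temporary file, then count them.
tmp <- tempfile()
writeBin(c("alpha", "beta", "gamma"), tmp)
n_strings <- count_strings(tmp, chunk = 2L)
print(n_strings)
unlink(tmp)
```

Reading in chunks rather than one string at a time keeps the number of readBin() calls small. On a non-blocking connection the short-read test would not be a reliable EOF signal, for the reason given above.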

If you are worried about some particular case, check the source code. 
It's always the final arbiter.

Duncan Murdoch


> 
> /Henrik
> 
>> So the loop would be something like
>>
>> i <- 0  # number of strings read so far
>> while (length(charRead <- readBin(con, "character")) > 0) {
>>   i <- i + 1
>> }
>>
>> Duncan Murdoch
>>
