[Rd] binary file access [was "RFC: System and time support"...]

Duncan Murdoch dmurdoch@pair.com
Tue, 25 Jul 2000 12:06:58 GMT

On Tue, 25 Jul 2000 10:33:17 +0100 (BST), you wrote:

>>     Duncan> I think it would be really useful, and have put together a
>>     Duncan> prototype (still for Windows only) that's on my web site at
>> 	                        ==============
>>     Duncan> <http://www.stats.uwo.ca/faculty/murdoch/software/Rstreams.zip>
>> I first thought "great! .." when you announced this a while ago, but
>> "Windows only" & relying on Delphi, i.e. proprietary software,
>> stopped me from even having a look, sorry.
>> We are committed primarily to the POSIX "clarification" of ANSI C and freely
>> available tools.
>I don't think that translating/re-writing it is a problem, but I thought
>Duncan was planning to do this.  If not I will have a go.

I am, but it won't be quick.  I now have a C compiler installed, but I
have very little experience writing in C.  I'll probably ask for help
later in cleaning up whatever I write.  

>> An aside :
>>   Your binary files are read into/from "character", right?
>[Doesn't look like it to me!]

Right, as you quoted I try to read and write the native R types.  It
seemed to me that doing type conversions was a lot easier externally
than it would be internally.

>> Coming back to your package:
>> Is it worth/fast enough to port this to POSIX C?

I imagine it would be really easy for someone fluent in both Delphi &
C to port it.  I'm fluent in Delphi but not C (which is why I wrote it
in Delphi in the first place), so it'll take a while longer for me.

>> Have you ever compared it to the (very general) approach taken by Sv4 ?
>> That would be something worth following at least in parts, I think.

No, I'm not familiar with that.

>although some of that is Windows-specific (10 bytes = 80 bits = extended
>format, I presume).

Yes, but that's Intel-specific, not Windows-specific.  I think it
would be useful to have on any platform: the idea is that this code
will allow you to read binary files produced by someone else. For
example, I do include code to handle byte-order switching, and use it
in the demo routine (readsfile) to be able to read binary S objects,
whether produced on big or little endian machines.
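
To illustrate the byte-order switching, here is a minimal C sketch.
It is not the Rstreams code: the names swap_bytes,
host_is_little_endian and read_int32 are made up for this example,
and it assumes the file's byte order is known in advance.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the n bytes starting at p, in place. */
void swap_bytes(void *p, size_t n)
{
    unsigned char *b = p;
    for (size_t i = 0; i < n / 2; i++) {
        unsigned char t = b[i];
        b[i] = b[n - 1 - i];
        b[n - 1 - i] = t;
    }
}

/* Return 1 if this machine stores the low-order byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Decode a 4-byte integer from src, where the caller states the byte
   order used by the machine that wrote the file (hypothetical helper). */
int32_t read_int32(const unsigned char *src, int file_is_little_endian)
{
    int32_t v;
    memcpy(&v, src, sizeof v);
    if (!!file_is_little_endian != host_is_little_endian())
        swap_bytes(&v, sizeof v);
    return v;
}
```

The same swap_bytes call works for 2-, 4- and 8-byte values, so one
routine covers integers and doubles alike.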

If the data you're reading were produced on an Intel platform, they
might include the extended types. Are there conversion routines
available in the libraries that R already uses for machines that don't
support extended as a native type?
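
For what it's worth, such a conversion can be written portably.  The
sketch below assumes the 10-byte Intel layout as stored on disk
(little-endian: a 64-bit significand with an explicit integer bit,
then a 15-bit exponent biased by 16383 and a sign bit).  The name
extended_to_double is made up for this example, and precision beyond
what a double holds is rounded away.

```c
#include <math.h>
#include <stdint.h>

/* Convert a 10-byte little-endian x87 extended value to a double.
   Values outside the range of a double become infinity or zero. */
double extended_to_double(const unsigned char b[10])
{
    uint64_t mant = 0;
    int exponent = ((b[9] & 0x7F) << 8) | b[8];
    int negative = (b[9] & 0x80) != 0;
    double result;

    /* Bytes 0..7 hold the significand, low-order byte first. */
    for (int i = 7; i >= 0; i--)
        mant = (mant << 8) | b[i];

    if (exponent == 0x7FFF) {
        /* All-ones exponent: infinity if the fraction bits are zero. */
        result = ((mant << 1) == 0) ? INFINITY : NAN;
    } else {
        if (exponent == 0)
            exponent = 1;   /* denormals use the minimum exponent */
        /* value = significand * 2^(exponent - bias - 63) */
        result = ldexp((double)mant, exponent - 16383 - 63);
    }
    return negative ? -result : result;
}
```

Because it only looks at the bytes, this runs on machines that have
no native extended type.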

>I suspect the best way forward is to get a general
>and Windows) contributed package working and on CRAN, and then think about
>merging it into base if it looks worthwhile.  (There is a lot of very
>useful stuff not in base, and the point of my original posting was that
>those are things which need to be internal and OS-specific.)

Sounds good to me.

>BTW, I think something like inttostr (but not that name) and its converse
>would be useful in base.  
>  Converts an integer to a string representation in base 2 to 36.
>My memory says S had a function called something like oddometer, but
>I can't find it.

I don't object to a name change, but I think "odometer" is a pretty
bad choice.
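
Whatever it ends up being called, the conversion itself (an integer
to a string in base 2 to 36, and its converse) is small.  A minimal C
sketch follows; int_to_base and base_to_int are placeholder names for
this example only, and the converse simply wraps the standard strtol.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Write value in the given base (2 to 36) into buf as a string.
   Returns 0 on success, -1 for a bad base or a buffer that is too small. */
int int_to_base(long value, int base, char *buf, size_t buflen)
{
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    char tmp[sizeof(long) * CHAR_BIT + 2];   /* digits plus sign */
    unsigned long u;
    size_t n = 0;

    if (base < 2 || base > 36)
        return -1;

    /* Work on the magnitude so that LONG_MIN is handled too. */
    u = value < 0 ? 0UL - (unsigned long)value : (unsigned long)value;

    /* Digits come out lowest first, so collect them reversed. */
    do {
        tmp[n++] = digits[u % (unsigned long)base];
        u /= (unsigned long)base;
    } while (u != 0);
    if (value < 0)
        tmp[n++] = '-';

    if (n + 1 > buflen)
        return -1;
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return 0;
}

/* The converse: parse a string in the given base via strtol. */
long base_to_int(const char *s, int base)
{
    return strtol(s, NULL, base);
}
```

With these, 255 in base 2 gives "11111111", and base_to_int("zz", 36)
gives 1295.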

>Another comment: The R code uses _, F and T and is seriously lacking in
>spaces. One way to get standard formatting is to set
>options(keep.source=FALSE) and then read in and dump the code.

Thanks, I'll do that.

>How much support is there for adding a `raw' (byte-stream) type?

I haven't had any need for such a thing, but the streams code was
written with the intention that it could be extended to handle streams
of bytes from sources other than files.  I think that extensibility
will likely be hidden when I translate to C:  since C is not an OO
language, it doesn't really support the concept of an abstract
stream, as far as I know.
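
That said, one idiom that C programmers commonly use to approximate
an abstract stream is a struct holding a function pointer plus
private state.  The sketch below is only an illustration of that
idiom, not a design for the package; every name in it (stream,
membuf, file_read, mem_read, stream_read) is made up.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* An "abstract" byte stream: callers only see the read function. */
typedef struct stream {
    size_t (*read)(struct stream *s, void *buf, size_t n);
    void *state;                 /* private to each kind of stream */
} stream;

/* File-backed stream: state is a FILE * opened by the caller. */
size_t file_read(stream *s, void *buf, size_t n)
{
    return fread(buf, 1, n, (FILE *)s->state);
}

/* Memory-backed stream: state is a membuf. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} membuf;

size_t mem_read(stream *s, void *buf, size_t n)
{
    membuf *m = s->state;
    size_t left = m->len - m->pos;
    if (n > left)
        n = left;                /* short read at end of data */
    memcpy(buf, m->data + m->pos, n);
    m->pos += n;
    return n;
}

/* Code that reads from a stream never needs to know which kind it has. */
size_t stream_read(stream *s, void *buf, size_t n)
{
    return s->read(s, buf, n);
}
```

A reader built on stream_read would then work unchanged whether the
bytes come from a file or from memory.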
