[R] Accelerating binRead

Philippe de Rochambeau phiroc at free.fr
Sat Sep 17 20:45:52 CEST 2016


Hi Jim,
this is exactly the answer I was looking for. Many thanks. I didn’t know R had a pack function, as in Perl.
To answer your earlier question, I am trying to update legacy code that reads a binary file of unknown size over a network, slices it up into rows, each containing an integer, an integer, a long, a short, a float and a float, and stuffs the rows into a matrix.
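For what it is worth, here is a minimal sketch of how such rows might be decoded in one pass with base R only. It assumes 26-byte little-endian records (two 4-byte integers, an 8-byte long stored as a low and a high 4-byte word, a 2-byte short and two 4-byte floats) and that all the bytes are already in memory as a raw vector. The name decode_records, the column names and the test bytes are made up for illustration.

```r
# Sketch: decode fixed-width 26-byte little-endian records from a raw vector.
# Assumed layout: int32, int32, int64 (low word, high word), int16, float32, float32.
decode_records <- function(bytes, rec_size = 26L) {
  n <- length(bytes) %/% rec_size                 # number of complete records
  # One record per column, so rows from:to hold the same field of every record
  m <- matrix(bytes[seq_len(n * rec_size)], nrow = rec_size)
  # Decode one field for all records with a single readBin call
  field <- function(from, to, what, size) {
    readBin(as.vector(m[from:to, , drop = FALSE]), what, n = n, size = size,
            endian = "little")
  }
  lo <- field(9, 12, "integer", 4L)
  hi <- field(13, 16, "integer", 4L)
  lo <- ifelse(lo < 0, lo + 2^32, lo)             # reinterpret the low word as unsigned
  cbind(periodIndex = field(1, 4, "integer", 4L),
        eventId     = field(5, 8, "integer", 4L),
        eventDate   = (hi * 2^32 + lo) / 1000,
        repNum      = field(17, 18, "integer", 2L),
        exp         = field(19, 22, "numeric", 4L),
        loss        = field(23, 26, "numeric", 4L))
}

# Made-up test data: two identical records built with writeBin
rec <- c(writeBin(c(1L, 42L, 1000L, 0L), raw(), size = 4, endian = "little"),
         writeBin(3L, raw(), size = 2, endian = "little"),
         writeBin(c(1.5, 2.25), raw(), size = 4, endian = "little"))
PLT <- decode_records(c(rec, rec))
```

The raw vector itself could be collected from the connection in chunks with readBin(con, "raw", n = ...) until a zero-length result comes back; the decoding then needs only a handful of readBin calls instead of several per row.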
Best regards,
Philippe

> On 17 Sep 2016, at 20:38, jim holtman <jholtman at gmail.com> wrote:
> 
> Here is an example of how to do it:
> 
> x <- 1:10  # integer values
> xf <- seq(1.0, 1.9, by = 0.1)  # 10 floating-point values, to match the 10 "d" fields decoded below
> 
> setwd("d:/temp")
> 
> # create file to write to
> output <- file('integer.bin', 'wb')
> writeBin(x, output)  # write integer
> writeBin(xf, output)  # write reals
> close(output)
> 
> 
> library(pack)
> library(readr)
> 
> # read all the data at once
> allbin <- read_file_raw('integer.bin')
> 
> # decode the data into a list
> (result <- unpack("V V V V V V V V V V d d d d d d d d d d", allbin))
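The same layout can apparently also be decoded without the pack package, because readBin accepts a raw vector in place of a connection. A minimal sketch, assuming ten 4-byte little-endian integers followed by ten 8-byte doubles; the bytes are built in memory here rather than read from integer.bin:

```r
# Sketch: decode ten int32 values followed by ten doubles from a raw vector,
# using only base R. The data mirror the example above but live in memory.
x  <- 1:10
xf <- seq(1, 1.9, by = 0.1)
allbin <- c(writeBin(x, raw(), endian = "little"),    # 10 * 4 = 40 bytes
            writeBin(xf, raw(), endian = "little"))   # 10 * 8 = 80 bytes

ints <- readBin(allbin[1:40], "integer", n = 10, size = 4, endian = "little")
dbls <- readBin(allbin[41:120], "numeric", n = 10, size = 8, endian = "little")
```

Each readBin call decodes a whole run of same-typed values at once, so the number of calls depends on the number of fields, not on the number of values.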
> 
> 
> 
> 
> Jim Holtman
> Data Munger Guru
>  
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
> 
> On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenismail at gmail.com> wrote:
> I noticed the same issue but didn't care much :)
> 
> On Sat, Sep 17, 2016, 18:01 jim holtman <jholtman at gmail.com> wrote:
> Your example was not reproducible.  Also how do you "break" out of the
> "while" loop?
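On leaving the loop: readBin returns a vector shorter than requested when the data run out, so a zero-length result can serve as the end-of-data test. A minimal sketch under that assumption; read_until_eof is a hypothetical helper, and a rawConnection stands in for the network file:

```r
# Sketch: read 4-byte integers one at a time until the connection is exhausted.
read_until_eof <- function(con) {
  values <- integer(0)
  repeat {
    v <- readBin(con, "integer", n = 1L, size = 4L, endian = "little")
    if (length(v) == 0L) break   # nothing left to read: leave the loop
    values <- c(values, v)       # growing a vector like this is for illustration only
  }
  values
}

# Stand-in for the real connection: three integers held in memory
con <- rawConnection(writeBin(1:3, raw(), endian = "little"))
out <- read_until_eof(con)
close(con)
```

In the original function the same test would go right after the first readBin of each row, before any of the values are used.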
> 
> 
> 
> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phiroc at free.fr> wrote:
> 
> > Hello,
> > the following function, which stores numeric values extracted from a
> > binary file in an R matrix, is very slow, especially when the file
> > is several MB in size.
> > Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
> > latter, how do you call readBin from Rcpp (I’m a total Rcpp
> > newbie)?
> > Many thanks.
> > Best regards,
> > phiroc
> >
> >
> > -------------
> >
> > # inputPath is something like http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
> >
> > PLTreader <- function(inputPath){
> >         URL <- file(inputPath, "rb")
> >         on.exit(close(URL)) # close the connection however the function exits
> >         PLT <- matrix(nrow=0, ncol=6)
> >         compteurDePrints = 0
> >         compteurDeLignes <- 0
> >         maxiPrints = 5
> >         displayData <- FALSE
> >         while (TRUE) {
> >                 periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >                 eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >                 # readBin uses signed=FALSE only for sizes 1 and 2, so the two
> >                 # halves of the 8-byte long are read as signed 4-byte ints
> >                 dword1 <- readBin(URL, integer(), size=4, n=1, endian="little") # low word
> >                 dword2 <- readBin(URL, integer(), size=4, n=1, endian="little") # high word
> >                 if (dword1 < 0) {
> >                         dword1 = dword1 + 2^32 # reinterpret the low word as unsigned
> >                 }
> >                 eventDate = (dword2*2^32 + dword1)/1000
> >                 repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
> >                 exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
> >                 loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
> >                 PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
> >         } # end while
> >         return(PLT)
> > }
> >
> > ----------------
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 

