[R] Accelerating binRead

Ismail SEZEN sezenismail at gmail.com
Sat Sep 17 16:10:05 CEST 2016


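I suspect that rbind is responsible: each call copies the whole matrix, so the loop gets slower as the matrix grows. Use a list and append each record to it instead of calling rbind. At the end, combine the elements of the list with do.call("rbind", list).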
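
Something along these lines might work. It is only a sketch: read_records() and the temporary demo file are made up for illustration, and the 26-byte record layout is assumed from the function quoted below. Each record goes into a list, and rbind runs once at the end.

```r
# Sketch only: read_records() is a hypothetical name, and the record layout
# (int, int, two dwords, short, two floats) is assumed from the quoted PLTreader.
read_records <- function(con) {
  rows <- list()
  i <- 0L
  repeat {
    periodIndex <- readBin(con, integer(), size = 4, n = 1, endian = "little")
    if (length(periodIndex) == 0) break          # nothing left to read
    eventId <- readBin(con, integer(), size = 4, n = 1, endian = "little")
    dwords  <- readBin(con, integer(), size = 4, n = 2, endian = "little")
    lo <- dwords[1]
    if (lo < 0) lo <- lo + 2^32                  # reinterpret the low dword as unsigned
    eventDate <- (dwords[2] * 2^32 + lo) / 1000
    repNum <- readBin(con, integer(), size = 2, n = 1, endian = "little")
    vals   <- readBin(con, numeric(), size = 4, n = 2, endian = "little")  # exp, loss
    i <- i + 1L
    rows[[i]] <- c(periodIndex, eventId, eventDate, repNum, vals)  # append to the list
  }
  do.call("rbind", rows)                         # a single rbind over all records
}

# Demo on a small made-up file with three records in the assumed layout.
tmp <- tempfile(fileext = ".bin")
out <- file(tmp, "wb")
for (k in 1:3) {
  writeBin(c(k, 100L + k), out, size = 4, endian = "little")   # periodIndex, eventId
  writeBin(c(1000L * k, 0L), out, size = 4, endian = "little") # dword1, dword2
  writeBin(k, out, size = 2, endian = "little")                # repNum
  writeBin(c(1.5, 2.5) * k, out, size = 4, endian = "little")  # exp, loss
}
close(out)

con <- file(tmp, "rb")
PLT <- read_records(con)
close(con)
print(PLT)
```

If the file is large, reading several values per readBin call (n greater than 1), as done here for the two dwords and the two floats, should also cut down the number of calls.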

> On 17 Sep 2016, at 15:05, Philippe de Rochambeau <phiroc at free.fr> wrote:
> 
> Hello,
> the following function, which stores numeric values extracted from a binary file in an R matrix, is very slow, especially when the file is several MB in size.
> Should I rewrite the function in inline C, or in C/C++ using Rcpp? In the latter case, how do you call readBin from Rcpp (I'm a total Rcpp newbie)?
> Many thanks.
> Best regards,
> phiroc
> 
> 
> -------------
> 
> # inputPath is something like http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
> 
> PLTreader <- function(inputPath){
> 	URL <- file(inputPath, "rb")
> 	PLT <- matrix(nrow=0, ncol=6)
> 	compteurDePrints = 0
> 	compteurDeLignes <- 0
> 	maxiPrints = 5
> 	displayData <- FALSE
> 	while (TRUE) {
> 		periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> 		if (length(periodIndex) == 0) break # end of file: readBin returned nothing
> 		eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> 		dword1 <- readBin(URL, integer(), size=4, n=1, endian="little") # int (signed=FALSE only applies to sizes 1 and 2)
> 		dword2 <- readBin(URL, integer(), size=4, n=1, endian="little") # int
> 		if (dword1 < 0) {
> 			dword1 <- dword1 + 2^32 # reinterpret the signed value as unsigned
> 		}
> 		eventDate = (dword2*2^32 + dword1)/1000
> 		repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
> 		exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
> 		loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
> 		PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
> 	} # end while
> 	close(URL) # must come before return(); code after return() never runs
> 	return(PLT)
> }
> 
> ----------------


