[R] ERROR : cannot allocate vector of size (in MB & GB)

Akkara, Antony (GE Energy, Non-GE) Antony.Akkara at ge.com
Tue Aug 7 13:05:24 CEST 2012


How is it possible to split a .csv file by size (in kilobytes)?


-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: Tuesday, July 24, 2012 11:30 PM
To: Akkara, Antony (GE Energy, Non-GE)
Cc: r-help at r-project.org
Subject: Re: [R] ERROR : cannot allocate vector of size (in MB & GB)

try this:

input <- file("yourLargeCSV", "r")
fileNo <- 1
repeat{
    myLines <- readLines(input, n=100000) # 100K lines / file
    if (length(myLines) == 0) break
    writeLines(myLines, sprintf("output%03d.csv", fileNo))
    fileNo <- fileNo + 1
}
close(input)
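
The loop above splits by line count and leaves the header line only in the first output file. For splitting by size instead, the following is a minimal sketch under some assumptions: the function name split_csv_by_size and its arguments (prefix, max_kb, lines_per_read) are hypothetical and not from this thread, sizes are estimated as the byte length of each line plus one newline, and no quoted field contains an embedded line break. It repeats the header in every piece so each piece can be read with read.csv on its own.

```r
# Hypothetical helper (not from the thread): split a CSV into pieces of at
# most roughly max_kb kilobytes each, repeating the header in every piece.
split_csv_by_size <- function(path, prefix = "output", max_kb = 1024,
                              lines_per_read = 10000) {
    max_bytes <- max_kb * 1024
    input <- file(path, "r")
    on.exit(close(input), add = TRUE)

    header <- readLines(input, n = 1)
    if (length(header) == 0) return(character(0))    # empty input file
    header_bytes <- nchar(header, type = "bytes") + 1 # +1 for the newline

    files <- character(0)   # names of the pieces written so far
    out <- NULL             # connection to the piece being written
    written <- 0            # estimated bytes in the current piece

    repeat {
        myLines <- readLines(input, n = lines_per_read)
        if (length(myLines) == 0) break
        while (length(myLines) > 0) {
            if (is.null(out)) {
                # start a new piece and write the header first
                name <- sprintf("%s%03d.csv", prefix, length(files) + 1)
                out <- file(name, "w")
                writeLines(header, out)
                files <- c(files, name)
                written <- header_bytes
                fresh <- TRUE
            }
            sizes <- nchar(myLines, type = "bytes") + 1
            # how many of the pending lines still fit in this piece
            k <- sum(cumsum(sizes) <= max_bytes - written)
            if (k == 0 && fresh) k <- 1   # oversized line: write it anyway
            if (k > 0) {
                writeLines(myLines[seq_len(k)], out)
                written <- written + sum(sizes[seq_len(k)])
                myLines <- myLines[-seq_len(k)]
                fresh <- FALSE
            }
            if (length(myLines) > 0) {    # current piece is full
                close(out)
                out <- NULL
            }
        }
    }
    if (!is.null(out)) close(out)
    files
}
```

A call such as split_csv_by_size("yourLargeCSV", max_kb = 500) would then write output001.csv, output002.csv, and so on into the working directory. On Windows each line is written with a two-byte line ending, so the pieces may come out slightly larger than the estimate.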


On Tue, Jul 24, 2012 at 9:45 AM, Rantony <antony.akkara at ge.com> wrote:
> Hi,
>
> In R, I need to load a huge .csv file. Its size is 200 MB [sometimes it
> may be more than 1 GB].
> When I tried to load it into a variable it took too much time, and
> after that, when I do cbind by groups, I get an error like this:
>
> " Error: cannot allocate vector of size 82.4 Mb "
>
> My requirement is to split the data from the huge .csv file into a number
> of small csv files.
> Here I will give the number of lines to 'split by' as input.
>
> Below I give my code
> -------------------------------
>                 SplitLargeCSVToMany <- function(DataMatrix,Destination,NoOfLineToGroup)
>                 {
>                         test <- data.frame(read.csv(DataMatrix))
>
>                         # create groups No.of rows
>                         group <- rep(1:NROW(test), each=NoOfLineToGroup)
>                         new.test <- cbind(test, group=group)
>                         new.test2 <- new.test
>                         new.test2[,ncol(new.test2)] <- NULL
>
>                         # now get indices to write out
>                         indices <- split(seq(nrow(test)), new.test[, 'group'])
>
>                         # now write out the files
>                         for (i in names(indices))
>                         {
>                         write.csv(new.test2[indices[[i]],], file=paste(Destination,"data.", i, ".csv", sep=""),row.names=FALSE)
>                         }
>                 }
>
> -----------------------------------------------------
> My system Configuration is,
> Intel Core2 Duo
> speed : 3GHz
> 2 GB RAM
> OS: Windows-XP [ServicePack-3]
> ---------------------------------------------------
>
> Any hope to solve this issue ?
>
> Thanks in advance,
> Antony.
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/ERROR-cannot-allocate-vector-of-size-in-MB-GB-tp4637597.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
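
A note on the quoted function: rep(1:NROW(test), each=NoOfLineToGroup) returns NROW(test) * NoOfLineToGroup elements, not one group label per row, so the vector passed to cbind is far longer than the data. That may well contribute to the allocation error, although the thread does not confirm it. The sketch below uses small stand-in values (n and k are illustrative, not from the thread) to show the length mismatch and one way to build a label per row.

```r
n <- 10   # stand-in for NROW(test)
k <- 4    # stand-in for NoOfLineToGroup

# As written in the quoted code: every row index is repeated k times.
too_long <- rep(1:n, each = k)
length(too_long)   # 40, i.e. n * k

# One label per row: rows 1..k get 1, rows k+1..2k get 2, and so on.
group <- ceiling(seq_len(n) / k)
group              # 1 1 1 1 2 2 2 2 3 3
```

This still requires the whole file to be read into memory first, so for very large files the readLines loop above remains the lighter approach.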



--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
