[R] long to wide on larger data set

jim holtman jholtman at gmail.com
Mon Jul 12 13:54:29 CEST 2010


What configuration are you running on (OS, memory, etc.)?  What
does your object consist of?  Is it numeric, factors, etc.?  Provide a
'str' of it.  If it is numeric, then the size of the object is
probably about 1.7GB (53,860,858 rows x 4 columns x 8 bytes).  To do
the long-to-wide reshape you will probably need at least that much
additional memory to hold the copy, if not more.  That would be
impossible on a 32-bit version of R.
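
As a rough check on the size estimate above, the arithmetic can be
sketched as follows.  It assumes every value is stored as an 8-byte
double, which would not hold for the character columns read in the
original post, and 'small' is a hypothetical stand-in for the real
object, used only to show what str() and object.size() report.

```r
# Back-of-the-envelope size of a 53860858 x 4 all-numeric (double) data frame.
# Assumption: 8 bytes per value, ignoring per-object overhead.
n_rows <- 53860858
n_cols <- 4
bytes_per_double <- 8

est_bytes <- n_rows * n_cols * bytes_per_double
est_gb  <- est_bytes / 1e9    # decimal gigabytes
est_gib <- est_bytes / 2^30   # binary gibibytes

cat(sprintf("%.2f GB (%.2f GiB)\n", est_gb, est_gib))

# On the real object, str() shows the column types and object.size() the
# actual footprint.  'small' is a hypothetical toy object for illustration.
small <- data.frame(V1 = c("rs1", "rs2"), V2 = c("cv1", "cv1"),
                    V3 = c("A", "C"), V4 = c("A", "B"),
                    stringsAsFactors = FALSE)
str(small)
print(object.size(small), units = "Kb")
```

Character columns generally take more room than this numeric estimate
suggests, so the figure is better read as a lower bound.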

On Mon, Jul 12, 2010 at 1:25 AM, Juliet Hannah <juliet.hannah at gmail.com> wrote:
> I have a data set that has 4 columns and 53860858 rows. I was able to
> read this into R with:
>
> cc <- rep("character",4)
> myData <- read.table("myData.csv",header=FALSE,skip=1,colClasses=cc,nrow=53860858,sep=",")
>
>
> I need to reshape this data from long to wide. On a small data set the
> following lines work. But on the real data set, it didn't finish even
> when I took a sample of two (rows in new data). I didn't receive an
> error. I just stopped it because it was taking too long. Any
> suggestions for improvements? Thanks.
>
> # start example
> # i have commented out the write.table statement below
>
> testData <- read.table(textConnection("rs9999853,cv0084,A,A
> rs999986,cv0084,C,B
> rs9999883,cv0084,E,F
> rs9999853,cv0085,G,H
> rs999986,cv0085,I,J
> rs9999883,cv0085,K,L"),header=FALSE,sep=",")
> closeAllConnections()
>
> mysamples <- unique(testData$V2)
>
> for (one_ind in mysamples) {
>   one_sample <- testData[testData$V2==one_ind,]
>   mywide <- reshape(one_sample, timevar = "V1", idvar = "V2",
>                     direction = "wide")
> #   write.table(mywide, file = "newdata.txt", append = TRUE,
> #               row.names = FALSE, col.names = FALSE, quote = FALSE)
> }
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
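
One possible way to avoid subsetting the full data frame once per
sample is to compute each record's row and column position with
match() and fill a matrix in a single step.  This is only a sketch
and not the method used in the quoted code: it assumes each
(sample, marker) pair occurs at most once, it builds one matrix per
value column rather than the interleaved columns that reshape()
produces, and the names 'long', 'wide_V3' and 'wide_V4' are made up
for illustration.  It still needs enough memory to hold the wide
matrices.

```r
# Toy data in the same long layout as the quoted example:
# marker (V1), sample (V2), and two value columns (V3, V4).
long <- data.frame(
  V1 = c("rs9999853", "rs999986", "rs9999883",
         "rs9999853", "rs999986", "rs9999883"),
  V2 = c("cv0084", "cv0084", "cv0084", "cv0085", "cv0085", "cv0085"),
  V3 = c("A", "C", "E", "G", "I", "K"),
  V4 = c("A", "B", "F", "H", "J", "L"),
  stringsAsFactors = FALSE
)

samples <- unique(long$V2)
markers <- unique(long$V1)

# Row (sample) and column (marker) position of every long-format record.
idx <- cbind(match(long$V2, samples), match(long$V1, markers))

# One character matrix per value column, filled by matrix indexing.
wide_V3 <- matrix(NA_character_, nrow = length(samples),
                  ncol = length(markers),
                  dimnames = list(samples, markers))
wide_V4 <- wide_V3
wide_V3[idx] <- long$V3
wide_V4[idx] <- long$V4

print(wide_V3)
print(wide_V4)
```

Any (sample, marker) pair missing from the long data stays NA in the
result.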



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


