[R] How to load a big txt file

Charles C. Berry cberry at tajo.ucsd.edu
Thu Jun 7 05:39:32 CEST 2007


On Wed, 6 Jun 2007, Charles C. Berry wrote:

>
> Alex,
>
> See
>
> 	R Data Import/Export Version 2.5.0 (2007-04-23)
>
> search for 'large' or 'scan'.
>
> Usually, taking care with the arguments
>
> 	nlines, what, quote, comment.char
>
> should be enough to get scan() to cooperate.
>
> You will need around 1GB RAM to store the result, so if you are working on a

Oops. 23800*49*8 == 9329600 bytes, which is more like 0.01GB than 1GB, I guess.
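
To spell that out in R (a rough check only, assuming every one of the 49 columns is held as a double at 8 bytes per value, and ignoring object overhead and any temporary copies made while reading):

```r
# Rough size of a 23800 x 49 numeric (double) matrix.
rows <- 23800
cols <- 49
bytes_per_double <- 8   # assumption: all columns stored as doubles

bytes <- rows * cols * bytes_per_double
bytes            # 9329600
bytes / 1e9      # about 0.0093, i.e. roughly 0.01GB
```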


> machine with less, you will need to upgrade. Consider storing the result as a 
> numeric matrix.
>
> If any of those columns are long strings not needed in your computation, be 
> sure to skip over them. Read the 'Details' of the help page for scan() 
> carefully.
>
> Chuck
>
>
> On Thu, 7 Jun 2007, ssls sddd wrote:
>
>>  Dear list,
>>
>>  I need to read a big txt file (around 130Mb; 23800 rows and 49 columns)
>>  for downstream clustering analysis.
>>
>>  I first used "Tumor <- read.table("Tumor.txt",header = TRUE,sep = "\t")"
>>  but it took a long time and failed. However, it had no problem when I
>>  supplied just 3 columns of data.
>>
>>  Is there any way to load this big file?
>>
>>  Thanks for any suggestions!
>>
>>  Sincerely,
>>      Alex
>>
>> 
>
> Charles C. Berry                        (858) 534-2098
>                                         Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu	         UC San Diego
> http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
>
>
>
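
For the scan() advice quoted above, something along these lines may be a starting point. It is only a sketch: the layout of Tumor.txt is not given in the thread, so the header row, the single leading label column, and the small stand-in file written to tempfile() are all assumptions made for illustration.

```r
# Stand-in file with a hypothetical layout: header row, one label column,
# then numeric columns. The real Tumor.txt may well differ.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\ta\tb\tc",
             "g1\t1.5\t2\t3",
             "g2\t4\t5.5\t6"), tmp)

n_num <- 3   # number of numeric columns in the stand-in file

# 'what' as a list: a NULL component skips that field, 0 reads it as numeric.
what <- c(list(NULL), rep(list(0), n_num))

cols <- scan(tmp, what = what, sep = "\t",
             skip = 1,            # skip the header line
             nlines = 2,          # number of data lines to read
             quote = "", comment.char = "", quiet = TRUE)

# cbind() ignores the NULL left by the skipped field, so this gives
# a numeric matrix with one column per numeric field.
m <- do.call(cbind, cols)
dim(m)
```

For the real file one would presumably point scan() at "Tumor.txt", set nlines = 23800, and set n_num to however many numeric columns the file actually has.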



