[R] Large Dataset

Gabor Grothendieck ggrothendieck at gmail.com
Tue Jan 6 18:15:20 CET 2009


The sqldf R package can import a file into an SQLite database and
extract just the portion you need, so the whole file never has to be
loaded into R.  You basically need two statements: one to specify the
name and format of the file and one to specify what you want to
extract. See the home page at:
http://sqldf.googlecode.com
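
As a rough sketch of what that could look like for the summary statistics
mentioned below, the following uses read.csv.sql from sqldf, which combines
the file specification and the query in a single call. The file, the column
names (id, value) and the result column names are made up for illustration;
substitute your own. Only aggregates come back to R. The standard deviation
is derived from the count, mean and sum of squares, so it does not rely on
any particular SQL function being available.

```r
library(sqldf)

# Hypothetical stand-in for the large file: a tiny CSV with one numeric column.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:6, value = c(2, 4, 4, 4, 5, 9)),
          csv_path, row.names = FALSE, quote = FALSE)

# In the SQL, "file" refers to the file passed as the first argument.
query <- paste(
  "select count(*) as n,",
  "       avg(value) as mean_value,",
  "       min(value) as min_value,",
  "       max(value) as max_value,",
  "       sum(value * value) as sum_sq",
  "from file"
)

# The file is loaded into a temporary SQLite database and only the
# one-row result of the query is returned as a data frame.
stats <- read.csv.sql(csv_path, sql = query, header = TRUE, sep = ",")

# Sample standard deviation reconstructed from the aggregates.
stats$sd_value <- sqrt((stats$sum_sq - stats$n * stats$mean_value^2) /
                       (stats$n - 1))
print(stats)
```

The sum-of-squares formula can lose precision when the mean is large
relative to the spread, so treat the result as a first approximation and
check it against sd() on a sample if accuracy matters.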

On Tue, Jan 6, 2009 at 12:10 PM, Edwin Sendjaja <edwin7 at web.de> wrote:
> Hi Ben,
>
> Using colClasses doesn't improve the performance much.
>
> With the data, I will calculate the mean, min, max, and standard deviation.
>
> I have also failed to import the data into a MySQL database. I don't have much
> knowledge of MySQL.
>
> Edwin
>
>
>
>> Edwin Sendjaja <edwin7 <at> web.de> writes:
>> > Hi Simon,
>> >
>> > My RAM is only 3.2 GB (actually it should be 4 GB, but my motherboard
>> > doesn't support it).
>> >
>> > R uses almost all of my RAM and half of my swap. I think memory.limit will
>> > not solve my problem.  It seems that I need more RAM.
>> >
>> > Unfortunately, I can't buy more RAM.
>> >
>> > Why is R slow at reading big data sets?
>> >
>> > Edwin
>>
>>   Start with FAQ 7.28,
>> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-read_002etable_0028_0029-so-inefficient_003f
>>
>>   However, I think you're going to have much bigger problems
>> if you have a 3.1G data set and a total of 3.2G of RAM: what do
>> you expect to be able to do with this data set once you've read
>> it in?  Have you considered storing it in a database and accessing
>> just the bits you need at any one time?
>>
>>   Ben Bolker
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html and provide commented, minimal,
>> self-contained, reproducible code.



