[R] Faster Printing Alternatives to 'cat'

Gundala Viswanath gundalav at gmail.com
Thu Jan 8 14:26:32 CET 2009


Dear Jim and Henrik,

> What exactly is the problem you are trying to solve.
> Is it going to be read by some other program?

I simply want to print the data out. The output will later be
processed by other people (with Excel or other programming
languages) to suit their own purposes.

Typically the printout from the loop looks like this:

ATCGATCGATCGGGGGGGGGGGGGGGTTTGCGGG   10   11.992
CCCCCCCCGGGCCATCGGTCAGGGAATTGACGGAA   2      0.222
.....
up to ~16 million lines.
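
For reference, lines in this format can be built without calling cat()
once per element. The following is only a minimal sketch: the short
vectors foo, bar, and qux are made-up stand-ins for my real data, and
the output goes to a temporary file rather than my real output file.

```r
# Toy stand-ins for the real (much larger) vectors; values are illustrative only.
foo <- c("ATCGATCG", "CCCCGGGC")
bar <- c(10L, 2L)
qux <- c(11.992, 0.222)

# paste() is vectorized, so all lines are built in one call
# instead of one cat() call per element.
lines <- paste(foo, bar, qux, sep = "\t")

# Write all lines at once to a file (a temporary file here).
out_file <- tempfile(fileext = ".txt")
writeLines(lines, out_file)

# Read the file back to inspect the result.
readLines(out_file)
```

Note that this builds every output line in memory at once, so for
vectors as large as mine it would probably have to be applied to
pieces of the data rather than to the whole thing.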

> How much physical memory do you have on your machine?
6GB

> Is there paging occurring due to the size of the objects?
I don't quite understand what you mean by that.
Sorry for my lack of knowledge in R.

> Have you considered creating a structure with 10,000 of the variables
> each time through the loop and then writing them out?

I never thought about that. Can you be more specific about how this can be achieved?
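
If I understand the suggestion correctly, it might look something like
the sketch below. This is only my guess and has not been run on the
real data: the vectors are small made-up stand-ins, chunk_size and
out_file are names I chose for illustration, and the chunk size of
10,000 is taken from your question.

```r
# Toy stand-ins for the real vectors; values are illustrative only.
n   <- 25000L
foo <- rep(c("ATCGATCG", "CCCCGGGC"), length.out = n)
bar <- seq_len(n)
qux <- round(seq_len(n) / 7, 3)

chunk_size <- 10000L                      # number of lines written per pass
out_file   <- tempfile(fileext = ".txt")
con        <- file(out_file, open = "w")  # keep one connection open throughout

for (start in seq(1L, n, by = chunk_size)) {
  end <- min(start + chunk_size - 1L, n)
  idx <- start:end
  # Build one chunk of tab-separated lines and write it in a single call.
  writeLines(paste(foo[idx], bar[idx], qux[idx], sep = "\t"), con)
}
close(con)

# Count the lines written, as a quick check.
length(readLines(out_file))
```

Only one chunk of formatted lines exists in memory at a time, so the
extra memory needed should stay small compared with building a full
matrix for write().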

- Gundala Viswanath
Jakarta - Indonesia



On Thu, Jan 8, 2009 at 10:10 PM, jim holtman <jholtman at gmail.com> wrote:
> What exactly is the problem you are trying to solve.  What is going to
> be done with the data?  Is it going to be read by some other program?
> How much physical memory do you have on your machine?  Is there paging
> occurring due to the size of the objects?  Have you considered creating a
> structure with 10,000 of the variables each time through the loop and
> then writing them out?  A lot will depend on how much free memory you
> have.  I will also ask one of my favorite questions; "tell me what you
> want to do, not how you want to do it".
>
> On Thu, Jan 8, 2009 at 6:12 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
>> Dear all,
>>
>> I found that printing with 'cat' is very slow.
>>
>> For example, on my machine this snippet
>>
>> __BEGIN__
>>
>> # I need to resort to this type of loop
>> # because with write() I would need to create a matrix, which
>> # consumes too much memory. Note that the "foo", "bar", and "qux"
>> # objects are already very large (>2 GB).
>>
>> for ( s in 1:length(x) ) {
>>    cat(as.character(foo[s]),"\t",bar[s],"\t", qux[s],"\n")
>> }
>> __END__
>>
>> with "x" of size ~1.5 million, takes more than 10 hours to print
>> on my Linux machine with a 1994 MHz AMD processor.
>>
>> Are there any faster alternatives to "cat"?
>>
>>
>> - Gundala Viswanath
>> Jakarta - Indonesia
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



