[R] Construct All Possible Strings from 4 Bases (ATCG)

Ben Bolker bolker at ufl.edu
Thu Dec 18 02:16:14 CET 2008




Gundala Viswanath wrote:
> 
> Dear Ivar,
> 
> How can I extend the limit of "n" size?
> 
> When I tried this function with n>= 15, it fails:
> 
>> f <-  function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste,
>> collapse="")}
>> f(c("A", "T", "C", "G"), 15)
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   cannot allocate vector of length 1073741824
>> f(c("A", "T", "C", "G"), 30)
> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   invalid 'times' value
> In addition: Warning message:
> In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>   NAs introduced by coercion
> 
> 

Get more memory, or move to a 64-bit machine? For n = 15 the result
has 4^15 elements, which is more than can be allocated here:

> 4^15
[1] 1073741824
> obj <- rep(0,4^15)
Error: cannot allocate vector of length 1073741824

From help("Memory-limit"):

     There are also limits on individual objects.  On all versions of
     R, the maximum length (number of elements) of a vector is 2^31 - 1
     ~ 2*10^9, as lengths are stored as signed integers.  In addition,
     the storage space cannot exceed the address limit, and if you try
     to exceed that limit, the error message begins 'cannot allocate
     vector of length'. The number of characters in a character string
     is in theory only limited by the address space.
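
The arithmetic behind those limits can be checked without allocating
anything. This is only a rough sketch: it counts the 8 bytes per element
of a numeric vector such as rep(0, 4^15) and ignores all other overhead,
and the variable names are made up for illustration. The link to the
"NAs introduced by coercion" warning for n = 30 is my reading of the
error, not something the help page states.

```r
# Size of the full result for n = 15, with no allocation attempted.
n_strings <- 4^15                      # 1073741824 elements
gib_doubles <- n_strings * 8 / 2^30    # a numeric vector uses 8 bytes per element
gib_doubles                            # 8 (GiB), before any other overhead

# For n = 30 the element count no longer fits in R's integer type, which
# is presumably why the quoted call warns about NAs introduced by coercion.
.Machine$integer.max                   # 2147483647, i.e. 2^31 - 1
4^30 > .Machine$integer.max            # TRUE
```

Note that the quoted help text describes R as of this posting; R 3.0.0
and later support longer vectors on 64-bit builds, but the memory needed
still grows as 4^n.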

  What were you planning to do with these roughly 10^9 strings once you
had a vector of them?
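
If the strings are only needed one at a time (or in modest chunks), one
alternative is to compute the k-th string on demand instead of storing
all of them. The following is a minimal sketch, not a tested replacement
for the function above: kth_string is a hypothetical helper (not part of
base R), it assumes the same ordering as expand.grid (first position
varies fastest), and it assumes k is small enough to be represented
exactly as a double, i.e. roughly n <= 26 for four bases.

```r
# Return the k-th (1-based) string of length n over `bases`
# without building the full set of length(bases)^n strings.
# kth_string is a hypothetical helper name used for illustration.
kth_string <- function(k, n, bases = c("A", "T", "C", "G")) {
  b <- length(bases)
  idx <- k - 1                 # zero-based index
  digits <- integer(n)
  for (i in seq_len(n)) {
    digits[i] <- idx %% b      # base-b digit for position i
    idx <- idx %/% b
  }
  paste(bases[digits + 1], collapse = "")
}

kth_string(2, 2)               # "TA", the second element of f(bases, 2)
kth_string(1, 15)              # "AAAAAAAAAAAAAAA"
kth_string(4^15, 15)           # "GGGGGGGGGGGGGGG"
```

A loop over k (or over blocks of k values) can then process the strings
without ever holding more than one block in memory, at the cost of
recomputing each string when it is needed.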

see 

http://lucis.net/stuff/clarke/9billion_clarke.html

  cheers
   Ben Bolker


