[R] Construct All Possible Strings from 4 Bases (ATCG)

Gundala Viswanath gundalav at gmail.com
Thu Dec 18 01:55:05 CET 2008


Dear Ivar,

How can I raise the limit on "n"?

When I try this function with n >= 15, it fails:

> f <-  function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste, collapse="")}
> f(c("A", "T", "C", "G"), 15)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
  cannot allocate vector of length 1073741824
> f(c("A", "T", "C", "G"), 30)
Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
  invalid 'times' value
In addition: Warning message:
In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
  NAs introduced by coercion

- Gundala Viswanath
Jakarta - Indonesia
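
[Editorial sketch, not part of the original message.] The length in the
first error, 1073741824, is exactly 4^15, i.e. the number of strings being
requested, and 4^30 (about 1.15e18) is larger than .Machine$integer.max,
which is presumably why the second call reports NAs introduced by coercion.
One possible way around building the whole table is to compute only the
strings at given positions. The helper below is hypothetical (the name
kmer_at and its arguments are assumptions, not something proposed in this
thread); it treats the zero-based position as a base-4 number and follows
the ordering of expand.grid, where the first character varies fastest.
Indices are held as doubles, so results are only exact while 4^n stays
below 2^53, i.e. for n up to 26.

```r
# Hypothetical helper: return the strings at 1-based positions `idx` in the
# enumeration of all length-n strings over `bases`, without materialising
# all length(bases)^n of them.
kmer_at <- function(idx, bases = c("A", "T", "C", "G"), n) {
  k <- length(bases)
  stopifnot(all(idx >= 1), all(idx <= k^n))
  i <- idx - 1                      # zero-based position, kept as double
  out <- character(length(idx))
  for (pos in seq_len(n)) {
    # the lowest base-k digit selects the next character (first one varies fastest)
    out <- paste0(out, bases[i %% k + 1])
    i <- i %/% k
  }
  out
}

kmer_at(1:4, n = 2)            # "AA" "TA" "CA" "GA"
kmer_at(c(1, 4^15), n = 15)    # first and last of the 4^15 strings
```

With something like this, a long enumeration can be processed in chunks
(for example idx = 1:1e6, then 1e6 + 1:1e6, and so on) instead of being
held in memory all at once.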



On Wed, Dec 17, 2008 at 6:41 PM, Ivar Herfindal
<ivar.herfindal at bio.ntnu.no> wrote:
> To add on Robin Hankin's solution, if you want to generate the strings you
> can try:
> f <- function(bases, n){apply(expand.grid(rep(list(bases),n)), 1, paste, collapse="")}
> f(c("A", "T", "C", "G"), 2)
> f(c("A", "T", "C", "G"), 4)
>
> best
>
> Ivar
>
> Robin Hankin wrote:
>>
>> Gundala
>>
>> f <-  function(n){expand.grid(rep(list(seq_len(4)),n))}
>>
>>
>> HTH
>>
>> Robin
>>
>>
>> Gundala Viswanath wrote:
>>>
>>> Dear all,
>>>
>>> Is there an efficient way in R to construct all strings from the 4 bases
>>> (ATCG)?
>>> For strings of length L, there are 4 ^ L possible strings.
>>>
>>> E.g. with L = 2 we have AA, AT, AC, AG, ..., GC, GA, GT, GG, i.e.
>>> 4 ^ 2 = 16 strings;
>>> with L = 3 we have 4 ^ 3 = 64 strings.
>>>
>>>
>>> - Gundala Viswanath
>>> Jakarta - Indonesia
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>


