[R] How to merge sequences with new sequence insertion

arun smartpink111 at yahoo.com
Tue Nov 26 16:38:13 CET 2013





Hi,
Try:

Lines1 <- readLines(textConnection(">contig number 11
tttgctcggaggggatc
>contig number 23
gaaaacacttccttattatacaggtaaaccgtatttggat
>contig number 3
aaagctcggaggggatcccct")) 


seq1 <- "nnnnncattccattcattaattaattaatgaatgaatgnnnnn"
# Drop the header lines (those starting with ">") and join the sequences with seq1
concatenated_contig <- paste(Lines1[!grepl("^>", Lines1)], collapse = seq1)
concatenated_contig
#[1] "tttgctcggaggggatcnnnnncattccattcattaattaattaatgaatgaatgnnnnngaaaacacttccttattatacaggtaaaccgtatttggatnnnnncattccattcattaattaattaatgaatgaatgnnnnnaaagctcggaggggatcccct"
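
The one-liner above treats every non-header line as a separate contig, so it would also insert the spacer inside a contig whose sequence is wrapped over several lines, and it does not write the ">concatenated contig" header. A minimal sketch that may cover both cases follows; the function name concat_contigs, its arguments, and the output file are hypothetical, and it assumes every record starts with a ">" header line:

```r
# Hypothetical helper: join all FASTA records into one record, inserting
# `spacer` at each contig boundary. Wrapped (multi-line) sequences are glued
# back together first, so the spacer lands only between contigs.
concat_contigs <- function(lines, spacer, header = ">concatenated contig") {
  lines <- trimws(lines)                 # remove stray leading/trailing blanks
  lines <- lines[nzchar(lines)]          # drop empty lines
  is_header <- grepl("^>", lines)
  record_id <- cumsum(is_header)         # record index of every line, in file order
  # One string per contig, in the order the contigs appear in the input
  seqs <- tapply(lines[!is_header], record_id[!is_header], paste, collapse = "")
  c(header, paste(seqs, collapse = spacer))
}

# Same contigs as above, with the first one wrapped over two lines
fasta <- c(">contig number 11", "tttgctcgg", "aggggatc",
           ">contig number 23", "gaaaacacttccttattatacaggtaaaccgtatttggat",
           ">contig number 3", "aaagctcggaggggatcccct")
seq1 <- "nnnnncattccattcattaattaattaatgaatgaatgnnnnn"

out <- concat_contigs(fasta, seq1)

# Write the result; a temporary file stands in for the real output path
out_file <- tempfile(fileext = ".fasta")
writeLines(out, out_file)
```

For a real file, the input vector would come from readLines() on the path to the contig file, and out_file would be replaced by the desired output path.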
A.K.



Hi all, 

I have a sequence file with a huge number of contigs, such as (the contig number does not reflect the order): 

>contig number 11 
tttgctcggaggggatc 
>contig number 23 
gaaaacacttccttattatacaggtaaaccgtatttggat 
>contig number 3 
aaagctcggaggggatcccct 
... 
.. 

I want to concatenate the contigs such that the above order is 
preserved, and I also want to insert the sequence 
"nnnnncattccattcattaattaattaatgaatgaatgnnnnn" at each contig boundary 
(here there are two contig boundaries), so that the final output file 
looks as follows: 


>concatenated contig 
tttgctcggaggggatcnnnnncattccattcattaattaattaatgaatgaatgnnnnngaaaacacttccttattatacaggtaaaccgtatttggatnnnnncattccattcattaattaattaatgaatgaatgnnnnnaaagctcggaggggatcccct 

Any help in solving this problem would be highly appreciated. Thanks in advance.


