[R] About Creating a List by Parsing Text

Gundala Viswanath gundalav at gmail.com
Tue Aug 5 09:09:28 CEST 2008


Hi all,

I have the following data, which I want to parse and
store in a list:

__DATA__
> print(comp.ll)
   [1] "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "
   [2] "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 "
   [3] "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 "
   [4] "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 "
   [5] "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 "
   [6] "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 "
   [7] "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 "
   [8] "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 "
   [9] "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
__END__

I expect to get this kind of data structure:

> wanted_output

[['211952_at']]
$ll.list
[1] -970.692 -965.35 -963.669

[['213301_x_at']]
$ll.list
[1] -948.527 -947.275 -947.379

etc.

How can I achieve that?

I am stuck with the following construct:

__BEGIN__
comp.ll <- model_all[grep("Gene .* k=.*", model_all)]
print(comp.ll)

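# Anchor on the whole line (leading tab, trailing space) so sub() returns only the captures.
# The dot is written as \\. because "\." is not a valid escape in an R string;
# the hyphen sits last in the character class so it is read as a literal.
patt <- "^\\s*Gene  \\d+ ([\\w/-]+) [\\w-]+  k= (\\d)  LL= (-\\d+\\.\\d+)\\s*$"
nresk <- unlist(strsplit(sub(patt, "\\1 \\2 \\3", comp.ll, perl = TRUE), " "))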
__END__
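
For reference, here is a minimal sketch of one possible way to get from
comp.ll to the structure described above. It is not necessarily the best
approach. It assumes every line has exactly the layout shown in the sample
data, and it uses only the first six sample lines. The names probe, ll and
grouped are made up for illustration.

```r
# Sample input copied from the post (first six lines only).
comp.ll <- c(
  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 ",
  "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 ",
  "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 ",
  "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 ",
  "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 ",
  "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 "
)

# Capture group 1 is the probe ID (third field), group 2 is the LL value.
# Assumption: fields are separated by whitespace and contain none themselves.
patt <- "^\\s*Gene\\s+\\d+\\s+(\\S+)\\s+\\S+\\s+k=\\s*\\d+\\s+LL=\\s*(-?[0-9.]+)\\s*$"

probe <- sub(patt, "\\1", comp.ll, perl = TRUE)
ll    <- as.numeric(sub(patt, "\\2", comp.ll, perl = TRUE))

# Group the LL values by probe ID, keeping the order of first appearance.
grouped <- split(ll, factor(probe, levels = unique(probe)))

# Wrap each numeric vector in a list with an ll.list component.
wanted_output <- lapply(grouped, function(v) list(ll.list = v))

print(wanted_output[["211952_at"]]$ll.list)
```

With this layout, wanted_output[["213301_x_at"]]$ll.list gives the LL values
for that probe in the order k = 1, 2, 3, provided the input lines are already
in that order.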


- Gundala Viswanath
Jakarta - Indonesia


