[R] question

Wed Nov 19 13:20:56 CET 2003

Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
> > > My problem is that I have a .txt file with data separated by a
> > > sentence, for example:
> > > 
> > > 2.22 3.45
> > > 1.56 2.31
> > > pattern 1
> > > 4.67 7.91
> > > 3.34 2.15
> > > 5.32 3.88
> > > pattern 2
> > > ...
> > > 
> > > I do not know the number of lines where these separating
> > > sentences are located, because the number of lines in between
> > > them can be random. If it was fixed, I think I could use
> > > "read.table" using the option "skip", but in this case, I do
> > > not know how I could manage to do that automatically.
> > 
> > 
> > This is a job for sed. The following command will delete any line
> > not starting with a digit from "file.txt" and save the results in
> > "file2.txt":
> > 
> > cat file.txt | sed -e '/^$\|^[^0-9]/D' > file2.txt
> 
> Er, no, that wasn't the requirement. It's a job for awk or perl, e.g.
> 
> #!/usr/bin/perl -n
> if (/pattern 1/){
>     $copy = 1;
>     next;
> }
> if (/pattern 2/){
>     $copy = 0;
> }
> print if $copy;
> 
> or 
> 
> awk '/pattern 1/{copy=1;next};/pattern 2/{copy=0};copy==1' < file.txt > file2.txt

Peter, I cannot see your point. sed can remove any pattern from
a text file. Fuensanta's example seemed to show that the
separating sentences (pattern 1, 2, ...) sit on their own lines,
apart from the lines containing data, hence my approach. Another
one, closer to your awk example, would be:

sed -e 's/pattern 1\|pattern 2\|pattern xyz//g' < file.txt > file2.txt
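
To make the two readings of the question concrete, here is a minimal
sketch run on the sample data from the original post. The file names
(file.txt, file2.txt, block1.txt) and the temporary directory are
placeholders. The sed call is a variant that deletes the separator
lines outright with the d command, and it uses separate -e expressions
instead of \| so that it should not depend on GNU extensions.

```shell
# Sample input modelled on the example in the original question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file.txt" <<'EOF'
2.22 3.45
1.56 2.31
pattern 1
4.67 7.91
3.34 2.15
5.32 3.88
pattern 2
EOF

# Reading 1: drop every separator line and keep all the data lines.
sed -e '/pattern 1/d' -e '/pattern 2/d' -e '/pattern xyz/d' \
    < "$tmpdir/file.txt" > "$tmpdir/file2.txt"

# Reading 2: keep only the block between "pattern 1" and "pattern 2".
awk '/pattern 1/{copy=1; next} /pattern 2/{copy=0} copy' \
    < "$tmpdir/file.txt" > "$tmpdir/block1.txt"

cat "$tmpdir/file2.txt"
cat "$tmpdir/block1.txt"
```

On this sample, the first command keeps all five data lines, while the
second keeps only the three lines between the two separators.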

Or is this just a perl versus sed versus awk troll?

-- 
Philippe