[R] question

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Nov 19 12:29:19 CET 2003


Philippe Glaziou <glaziou at pasteur-kh.org> writes:

> Fuensanta Saura Igual <igual at mat.uji.es> wrote:
> > Does anyone know how I can read from a .txt file the lines that
> > are between two strings whose location is unknown?
> > 
> > My problem is that I have a .txt file with data separated by a
> > sentence, for example:
> > 
> > 2.22 3.45
> > 1.56 2.31
> > pattern 1
> > 4.67 7.91
> > 3.34 2.15
> > 5.32 3.88
> > pattern 2
> > ...
> > 
> > I do not know the number of lines where these separating
> > sentences are located, because the number of lines in between
> > them can be random. If it was fixed, I think I could use
> > "read.table" using the option "skip", but in this case, I do
> > not know how I could manage to do that automatically.
> 
> 
> This is a job for sed. The following command will delete any line
> not starting with a digit from "file.txt" and save the results in
> "file2.txt":
> 
> cat file.txt | sed -e '/^$\|^[^0-9]/D' > file2.txt

Er, no, that wasn't the requirement. It's a job for awk or perl, e.g.

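#!/usr/bin/perl -n
# -n wraps the script in a loop over the input lines.
if (/pattern 1/){
    $copy = 1;    # start marker seen: begin copying
    next;         # but do not print the marker line itself
}
if (/pattern 2/){
    $copy = 0;    # end marker seen: stop copying
}
print if $copy;   # print only lines between the two markers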

or 

awk '/pattern 1/{copy=1;next};/pattern 2/{copy=0};copy==1' < file.txt > file2.txt
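
To see what the awk version does on data shaped like the example in the
question, here is a minimal sketch. The function name "between" and the
trailing "9.99 9.99" line are made up for illustration, and the markers
are passed in as awk regular expressions rather than hard-coded:

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): print the lines strictly
# between a start marker and an end marker, read from standard input.
# $1 = start marker (awk regex), $2 = end marker (awk regex)
between() {
    awk -v start="$1" -v stop="$2" '
        $0 ~ start { copy = 1; next }  # start marker: begin copying, skip the marker line
        $0 ~ stop  { copy = 0 }        # end marker: stop copying
        copy                           # print the line while the flag is set
    '
}

# Sample data modelled on the question; the last line is an invented extra.
between 'pattern 1' 'pattern 2' <<'EOF'
2.22 3.45
1.56 2.31
pattern 1
4.67 7.91
3.34 2.15
5.32 3.88
pattern 2
9.99 9.99
EOF
# prints:
# 4.67 7.91
# 3.34 2.15
# 5.32 3.88
```

Redirecting the output to file2.txt should then give a file that
read.table("file2.txt") can read directly, with no need for "skip".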


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



