[R] How to scan df from a specific word?

Gabor Grothendieck ggrothendieck at gmail.com
Sat Oct 30 01:49:04 CEST 2010


On Fri, Oct 29, 2010 at 6:34 PM, M.Ribeiro <mresendeufv at yahoo.com.br> wrote:
>
> Hi R-helpers,
>
> I need to read some file with different lines (I don't know the number of
> lines to skip) and I would like to find a way to start reading the
> data.frame from the word "source".
>
> ex:
>
> djhsafk
> asdfhkjash
> shdfjkash
> asfhjkash         #those lines contain numbers and words, I want to skip
> them but they have different sizes
> asdfhjkash
> asdfhjksa
>
> source
> tret 2
> res 3
>

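Here is a one-line solution, although it does rely on the external
utility gawk.  If you are using Linux you probably have it on your
system already.  You can also get gawk for Windows, or if you download
Duncan's Rtools distribution it's included there too.  gawk.exe is
just a single file, so just make sure you put it somewhere on your
PATH.  The gawk program prints every line from the first one matching
"Source" onward and stops when it reaches a line matching "Analysis of
Variance"; read.table then reads what gawk printed.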

> read.table(pipe('gawk "/Analysis of Variance/ {exit}; /Source/ {i++}; i" myfile.dat'), header = TRUE, fill = TRUE)
               Source Model terms    Gamma Component Comp.SE X. C
1            Residual  8383  8367       NA        NA      NA NA
2     at(type,1).Nfam    62    62  10.1131   10.1131    1.81  0 P
3     at(type,2).Nfam    62    62  28.1153   28.1153    2.16  0 P
4            rep.iblk   768   768  63.2919   63.2919   10.94  0 P
5  at(type,1).Nfemale    44    44  29.9049   29.9049    2.93  0 P
6   at(type,1).Nclone  2689  2689 109.5600  109.5600   12.66  0 P
7  at(type,2).Nfemale    44    44  14.0305   14.0305    1.68  0 P
8            Variance     0     0 479.0400  479.0400   36.23  0 P
9            Variance     0     0 490.5800  490.5800   17.51  0 P
10           Variance     0     0 469.9320  469.9320   36.51  0 P
11           Variance     0     0 544.6540  544.6540   17.86  0 P
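
To see what the filter passes on to read.table, here is a minimal sketch
you can run in a shell.  The sample file contents are made up for
illustration and are not the poster's data.  It also assumes that plain
awk behaves like gawk for this program, which should hold since only
standard pattern-action rules are used.  Note that the patterns are
case-sensitive, so /Source/ would not match a line containing only
"source".

```shell
# Build a small sample file (hypothetical contents, for illustration only).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
junk line one
more junk 123

Source Model terms
Residual 8383 8367
Variance 0 0

Analysis of Variance
ignored 1 2
EOF

# The three rules are tried in order on every input line:
#   /Analysis of Variance/ {exit}   stop reading at the start of the next section
#   /Source/ {i++}                  i becomes nonzero once the header line is seen
#   i                               bare pattern: print the line while i is nonzero
out=$(awk '/Analysis of Variance/ {exit}; /Source/ {i++}; i' "$tmp")
printf '%s\n' "$out"
rm -f "$tmp"
```

Everything before the "Source" line is dropped no matter how many lines
there are, which is why the number of lines to skip does not have to be
known in advance.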

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


