[R] problem formatting data frames

John Fox jfox at mcmaster.ca
Thu Jul 18 01:19:25 CEST 2002

Dear Vlad,

You could solve this problem in R, but I suspect that it would be easier to 
pre-filter the files before reading them into data frames, using a tool 
such as grep. In particular, if all of the valid data are numeric and all 
of the offending lines have alphabetic characters, then something like the 
following should do the trick:

         grep -v '[a-zA-Z]' data.file > filtered.file

(You may have to adjust the regular expression to get exactly what you want.)
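
With thousands of files, the same filter can be wrapped in a shell loop. The following is only a minimal sketch under stated assumptions: the directory names raw and filtered, the .txt extension, and the sample file run1.txt are hypothetical and are created in a temporary directory purely for illustration. Note that a pattern matching any letter would also drop header rows and lines containing values such as NA or 1e-05, so the pattern may need tightening for real data.

```shell
#!/bin/sh
# Sketch: filter every file in a directory, writing cleaned copies elsewhere.
# The layout (raw/, filtered/, *.txt) is an assumption, not from the post.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/raw" "$workdir/filtered"

# Sample input: tab-separated numbers with one stray text row.
printf '1\t2\t3\nError: sensor timeout\n4\t5\t6\n' > "$workdir/raw/run1.txt"

for f in "$workdir"/raw/*.txt; do
    out="$workdir/filtered/$(basename "$f")"
    # -v keeps only the lines that do NOT contain a letter.
    # grep exits with status 1 when it selects no lines; treat that as
    # acceptable (empty output) but still stop on real errors (status 2).
    grep -v '[A-Za-z]' "$f" > "$out" || [ $? -eq 1 ]
done

# Show the cleaned sample file.
cat "$workdir/filtered/run1.txt"
```

The originals are left untouched, so the filtered copies can be checked before being read with read.table.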

As well, since read.table produces a data frame, you don't need to call 

I hope that this helps,

At 10:01 AM 7/17/2002 -0400, VBMorozov at lbl.gov wrote:

>  Dear R-guRus:
>I have a problem with the format of my data in R.
>Let's say I have a HUGE text table which consists of columns of
>numerical data, separated by tabs, but in some places rows of text
>(error messages, etc) are inserted in between rows of numerical data.
>Because the data file is so huge and because I have thousands of these
>files, it's impractical to try and go through these files manually and
>remove text rows - I'd like R to do it for me.
>The following command works:
>but instead of numerical data in my frame I get "factor" data, because
>of these text inserts. How do I filter them out??
>Thank you very much,
>r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
