[R] problem formatting data frames

Tony Plate tplate at blackmesacapital.com
Wed Jul 17 20:13:31 CEST 2002

If you're actually able to read the data files into R (I'd be surprised -- 
having messages interspersed between the rows of data should cause problems 
with numbers of columns), you could do something like the following:

 > x <- data.frame(a=c(1,"foo",2,1),b=c("bar","foo",3,3))  # create a "problem" data frame
 > x
     a   b
1   1 bar
2 foo foo
3   2   3
4   1   3
 > sapply(x, data.class)  # verify that it contains factor data
        a        b
"factor" "factor"
 > # convert it
 > x1 <- data.frame(lapply(x, function(col) if (is.factor(col)) as.numeric(levels(col))[as.numeric(col)] else col))
Warning messages:
1: NAs introduced by coercion
2: NAs introduced by coercion
 > x1
   a  b
1  1 NA
2 NA NA
3  2  3
4  1  3
 > sapply(x1, data.class)
         a         b
"numeric" "numeric"
 > x1[!apply(is.na(x1), 1, any), ]    # filter rows with any NA's in them
   a b
3 2 3
4 1 3
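
If the interspersed messages do stop read.table from parsing the file at all, another option is to drop the offending lines before parsing. The following is only a rough sketch, not something from the original question: read_numeric_rows is a made-up helper name, and it assumes tab-separated files with no header line, where every real data row has the same number of fields and each field is a plain number. (Also note that since R 4.0.0 data.frame() and read.table() no longer convert strings to factors by default, so in a current R the problem columns would show up as "character" rather than "factor".)

```r
# Hypothetical helper: keep only the lines whose fields are all numeric,
# then hand the surviving lines to read.table().
# Assumes no header line and a consistent number of fields per data row.
read_numeric_rows <- function(path, sep = "\t") {
  lines <- readLines(path)
  fields <- strsplit(lines, sep, fixed = TRUE)

  # A line is kept if it has at least one field and every field
  # converts to a number without producing NA.
  is_numeric_row <- vapply(fields, function(f) {
    length(f) > 0 && !anyNA(suppressWarnings(as.numeric(f)))
  }, logical(1))

  # Nothing usable in the file: return an empty data frame
  # instead of letting read.table() fail on empty input.
  if (!any(is_numeric_row)) return(data.frame())

  read.table(text = lines[is_numeric_row], sep = sep, header = FALSE)
}

# Small usage example with a temporary file
tmp <- tempfile()
writeLines(c("1\t2", "ERROR: sensor timeout", "3\t4.5"), tmp)
d <- read_numeric_rows(tmp)
sapply(d, is.numeric)
```

Because the text rows never reach read.table, the columns come back numeric and no factor conversion is needed. The catch is that a data row containing a literal NA or an empty field is dropped as well, which may or may not be what you want. For thousands of files, something like lapply(list.files(...), read_numeric_rows) should apply it to each one.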

At 10:01 AM 7/17/2002 -0400, you wrote:

>  Dear R-guRus:
>I have a problem with the format of my data in R.
>Let's say I have a HUGE text table which consists of columns of
>numerical data, separated by tabs, but in some places rows of text
>(error messages, etc) are inserted in between rows of numerical data.
>Because the data file is so huge and because I have thousands of these
>files, it's impractical to try and go through these files manually and
>remove text rows - I'd like R to do it for me.
>The following command works:
>but instead of numerical data in my frame I get "factor" data, because
>of these text inserts. How do I filter them out??
>Thank you very much,
>r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
>Send "info", "help", or "[un]subscribe"
>(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
