[R] remove NA in df results in NA, NA.1 ... rows

Thu Dec 13 14:03:00 CET 2012

Hi,

You could use either:
na.omit(df2) #this option was already suggested; see ?na.omit
#or
df2[complete.cases(df2),]
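
As a rough illustration of why the original attempt produced the NA, NA.1, ... rows and how the two row-wise options behave, here is a minimal sketch. The data frame `toy` and its values are made up for illustration; they are not the poster's data. The explanation in the comments reflects how logical indexing works in base R: `!is.na(df)` is a logical matrix with one entry per cell, not one entry per row.

```r
# Hypothetical toy data standing in for df2 (values are invented).
toy <- data.frame(X.PAD2 = c(1.5, 2.5, NA, NA),
                  Y.PAD2 = c(10, 20, NA, NA))

# The original attempt: !is.na(toy) is a 4 x 2 logical matrix. Used as a
# row index it is read as a logical vector of length 8, so the TRUE
# entries belonging to the second column point past the last row and
# come back as all-NA rows with made-up row names.
bad <- toy[!is.na(toy), ]
nrow(bad)

# complete.cases() returns one logical value per row, which is what a
# row index needs.
keep  <- complete.cases(toy)
fixed <- toy[keep, ]

# na.omit() drops the same rows and records them in an "na.action" attribute.
same <- na.omit(toy)
```

Both `fixed` and `same` keep the original row names of the complete rows, which may be useful if the row numbers carry meaning.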

#In this case, these should also work
sapply(df2,function(x) x[!is.na(x)])
#or
apply(df2,2,function(x) x[!is.na(x)]) #If the columns contain different numbers of NAs, the output will be a list whose elements differ in length.
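
The column-wise variants behave differently from the row-wise ones, which the following sketch tries to show. Again, `toy` is a hypothetical data frame with invented values. Note that these calls drop NAs within each column separately, so they return a matrix or a list rather than a data frame, and values are only kept side by side correctly when the NAs sit in the same rows of every column.

```r
# Hypothetical toy data: column b has one more NA than column a.
toy <- data.frame(a = c(1, 2, 3, NA),
                  b = c(4, NA, 6, NA))

# Different numbers of NAs per column: the results have different
# lengths, so sapply() cannot simplify and returns a list.
uneven <- sapply(toy, function(x) x[!is.na(x)])
is.list(uneven)

# Same number of NAs per column (rows 1, 3 and 4 only): the results
# have equal length and are simplified to a matrix.
even <- sapply(toy[c(1, 3, 4), ], function(x) x[!is.na(x)])
is.matrix(even)
```

If the result needs to stay a data frame with intact rows, the `complete.cases()` or `na.omit()` approach is probably the safer choice.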
A.K.

----- Original Message -----
From: "raphael.felber at art.admin.ch" <raphael.felber at art.admin.ch>
To: r-help at r-project.org
Cc: 
Sent: Thursday, December 13, 2012 3:20 AM
Subject: [R] remove NA in df results in NA, NA.1 ... rows

Good morning!

I have the following data frame (df):

    X.outer  Y.outer   X.PAD1   Y.PAD1   X.PAD2 Y.PAD2   X.PAD3 Y.PAD3   X.PAD4 Y.PAD4
73 574690.0 179740.0 574690.2 179740.0 574618.3 179650 574729.2 179674 574747.1 179598
74 574680.6 179737.0 574693.4 179740.0 574719.0 179688 574831.8 179699 574724.9 179673
75 574671.0 179734.0 574696.2 179740.0 574719.0 179688 574807.8 179787 574729.2 179674
76 574663.6 179736.0 574699.1 179734.0 574723.5 179678 574703.4 179760 574831.8 179699
77 574649.9 179734.0 574704.7 179724.0 574724.9 179673 574702.4 179755 574852.3 179626
78 574647.3 179742.0 574706.9 179719.0 574747.1 179598 574702.0 179754 574747.1 179598
79 574633.6 179739.0 574711.4 179710.0 574641.8 179570 574698.0 179747       NA     NA
80 574634.9 179732.0 574716.6 179698.0 574639.6 179573 574700.2 179738       NA     NA
81 574616.5 179728.6 574716.7 179695.0 574618.3 179650 574704.4 179729       NA     NA
82 574615.4 179731.0 574718.2 179690.0       NA     NA 574708.1 179724       NA     NA
83 574614.4 179733.6 574719.1 179688.0       NA     NA 574709.3 179720       NA     NA
...
44 574702.0 179754.0       NA       NA       NA     NA       NA     NA       NA     NA
45 574695.1 179751.0       NA       NA       NA     NA       NA     NA       NA     NA
46 574694.4 179752.0       NA       NA       NA     NA       NA     NA       NA     NA

Which I subset to

df2 <- df[,c("X.PAD2","Y.PAD2")]

df2

     X.PAD2 Y.PAD2
73 574618.3 179650
74 574719.0 179688
75 574719.0 179688
76 574723.5 179678
77 574724.9 179673
78 574747.1 179598
79 574641.8 179570
80 574639.6 179573
81 574618.3 179650
82       NA     NA
83       NA     NA
...
44       NA     NA
45       NA     NA
46       NA     NA

followed by removing the NA's using

df2 <- df2[!is.na(df2),]

If I now call df2, I get:

       X.PAD2 Y.PAD2
73   574618.3 179650
74   574719.0 179688
75   574719.0 179688
76   574723.5 179678
77   574724.9 179673
78   574747.1 179598
79   574641.8 179570
80   574639.6 179573
81   574618.3 179650
NA         NA     NA
NA.1       NA     NA
NA.2       NA     NA
NA.3       NA     NA
NA.4       NA     NA
NA.5       NA     NA
NA.6       NA     NA
NA.7       NA     NA
NA.8       NA     NA

It seems there are still NA's in my data frame. How can I get rid of them? What is the meaning of the rows numbered NA, NA.1 and so on?

Thanks for any hints.

Best regards

Raphael Felber
