[R] NA's when subset in a dataframe

Milan Bouchet-Valat nalimilan at club.fr
Thu May 3 23:33:48 CEST 2012


On Thursday 3 May 2012 at 07:37 -0700, agent dunham wrote:
> Dear community, 
> 
> I'm having this silly problem.
> 
> I've a linear model. After fitting it, I wanted to know which data had
> studentized residuals larger than 3, so I tried this: 
> 
> d1 <- cooks.distance(lmmodel)
> r <- sqrt(abs(rstandard(lmmodel)))
> rstu <- abs(rstudent(lmmodel))
> 
> a <- cbind( mydata, d1, r,rstu) 
> 
> alargerthan3 <-  a[rstu >3, ]
> 
> And suddenly  a[rstu >3, ]  has 17 rows, 7 of them are "new rows", where all
> the entries are NA's, even its rownames. 
> 
> Because of this I'm not sure of the dimension of    a[rstu >3, ]  (Do I only
> have 8 entries?)
> 
> Has this happened to anybody before? If so, why these extra NA rows? What's
> the problem? Is there any other way to know which data have studentized
> residuals larger than 3?
> 
> 
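> If I need to upload my data, just tell me.

A small reproducible example would have been better. Anyway, see page 88
of The R Inferno. The all-NA rows are what logical indexing produces when
the condition contains NA: for every NA in rstu > 3, R returns a row
filled with NA (row name included) instead of dropping it. So rstu must
contain some missing values.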

In your case, the simplest solutions are to do:
alargerthan3 <- a[which(rstu > 3),]
or
alargerthan3 <- subset(a, rstu > 3)
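
To see the difference on something small, here is a minimal sketch. The
data frame below is a made-up stand-in for your `a` (the `id` column and
the values are invented for illustration), with one missing studentized
residual:

```r
# Hypothetical toy data standing in for the poster's `a`;
# the third studentized residual is missing.
a <- data.frame(id = 1:5, rstu = c(0.5, 3.2, NA, 4.1, 1.0))

# Plain logical indexing: the NA in the condition yields an all-NA row.
bad <- a[a$rstu > 3, ]
nrow(bad)             # 3 rows, one of them entirely NA

# which() keeps only the positions that are TRUE, dropping NA and FALSE.
good_which <- a[which(a$rstu > 3), ]

# subset() treats NA in the condition as FALSE.
good_subset <- subset(a, rstu > 3)

# Number of missing values behind the extra rows.
sum(is.na(a$rstu))
```

Both `good_which` and `good_subset` contain only the two rows that really
exceed 3. Running `sum(is.na(rstu))` on your own vector should tell you
how many of the 17 rows are spurious.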


Cheers


