[R] Problems making subsets with [] or "subset"

Ista Zahn izahn at psych.rochester.edu
Fri Jan 28 19:22:02 CET 2011


Hi Mario,

On Fri, Jan 28, 2011 at 12:27 PM, gaiarrido <gaiarrido at usal.es> wrote:
>
> Hi,
> I'm trying to fit a model to find out which factors influence the
> intensity of an infection, but only in the individuals that have the
> infection. In my data I've got a variable called "prevalence" with 2 levels:
> 1.- Infected individual
> 0.- Non infected
>
> So what I'm trying to do is subset within the model call, like this:
> model<-aov(intensity~ageandsex*month*zone*year,subset=(prevalence=="1"))
> Is this a correct way to do it?

That should work, assuming you have attached the data frame containing
the variables in your model. A safer way is to use the data argument
to aov instead:

model<-aov(intensity~ageandsex*month*zone*year, data=X,
subset=(prevalence=="1"))
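
As a minimal sketch of why the data argument helps: the variables in the
formula and the expression given to subset are both looked up in the data
frame, so nothing has to be attached. The data frame below is hypothetical
(the values are invented and only one predictor, zone, is used to keep it
short); it is not your data.

```r
# Hypothetical toy data: column names follow the model above, values are invented
X <- data.frame(
  intensity  = c(0, 0, 0, 5, 9, 12, 3, 20, 7, 15),
  prevalence = factor(c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1)),
  zone       = factor(rep(c("a", "b"), 5))
)

# The formula and the subset expression are both evaluated inside X
fit_subset <- aov(intensity ~ zone, data = X, subset = (prevalence == "1"))

# The same model, fitted on rows selected beforehand with []
fit_rows <- aov(intensity ~ zone, data = X[X$prevalence == "1", ])

nobs(fit_subset)                             # rows actually used in the fit
all.equal(coef(fit_subset), coef(fit_rows))  # both routes give the same fit
```

Checking nobs() after fitting is a cheap way to confirm that the subset
really was applied.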


> I prefer to make the subsets with [], but it doesn't work. Why?
>
>> model<-aov(hemogregarinas~edadysexo*mes*zona*ano,[prevalencia=="1"])
> Error: unexpected '[' in
> "model<-aov(hemogregarinas~edadysexo*mes*zona*ano,["
>

Well, what is [prevalencia=="1"] indexing? Nothing, so it can't work:
the brackets need an object to index. You can index the rows of the
data frame instead:

model<-aov(hemogregarinas~edadysexo*mes*zona*ano, data=X[X$prevalencia=="1" , ])
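
To make the indexing point concrete, here is a small sketch with a
hypothetical data frame (names follow your post, values are invented). It
also shows one difference worth knowing about: when the condition is NA, []
returns a row of NAs, while subset() drops that row.

```r
# Hypothetical toy data frame: names follow the thread, values are invented
X <- data.frame(
  hemogregarinas = c(0, 4, 9, 0, 19, 2),
  prevalencia    = c("0", "1", "1", "0", "1", NA)
)

# "[" always needs an object on its left, here the data frame X.
# The condition names the column explicitly via X$, so no attach() is needed.
infected_idx <- X[X$prevalencia == "1", ]

# subset() evaluates the condition inside X and drops rows where it is NA
infected_sub <- subset(X, prevalencia == "1")

nrow(infected_idx)  # counts the extra all-NA row produced by the NA condition
nrow(infected_sub)  # counts only rows where the condition is TRUE
```

If your grouping variable has missing values, wrapping the condition in
which() before using it inside [] avoids the extra NA rows.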

> And finally, look at this; I don't know why I get different results:
>> summary(hemogregarinas[prevalencia=="1"])
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   1.00    5.00    9.00   15.94   19.00  173.00
>> summary(hemogregarinas,subset=(prevalencia=="1"))
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   0.00    4.00    9.00   14.92   18.00  173.00
>

Well, I'm not sure, but I suspect it has something to do with attaching
things and getting confused about where the data is actually coming
from. In this example, both calls give the same results:

dat <- data.frame(x=c(1,2,2,1,3,4,2,5,3,20),
                  y=c(2,1,3,3,2,5,6,4,5,20),
                  z=c(rep(1, 9),2))

attach(dat)
summary(x[z==1])
summary(subset(x, z==1))

So again, I'm not sure why you're getting something different. Try storing
your data in a data.frame and using the data arguments instead of
attaching it, and see if you get more sensible results.
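
One possible explanation, offered as a sketch rather than a diagnosis of
your data: the example above calls the function subset(), whereas your
second call passes subset= as an argument to summary(). The summary() method
for a plain numeric vector has no subset argument, so the extra argument is
swallowed by "..." and silently ignored, and you get the summary of the
whole vector. A minimum of 0.00 in your second output would be consistent
with the non-infected individuals still being included. Reusing the x and z
values from the example above:

```r
# Same x and z values as in the example above
dat <- data.frame(x = c(1, 2, 2, 1, 3, 4, 2, 5, 3, 20),
                  z = c(rep(1, 9), 2))

s_all   <- summary(dat$x)                          # whole vector
s_arg   <- summary(dat$x, subset = (dat$z == 1))   # 'subset' is ignored here
s_index <- summary(dat$x[dat$z == 1])              # rows really are selected

all.equal(as.numeric(s_all), as.numeric(s_arg))    # same as the unsubsetted summary
s_all[["Max."]]                                    # the z == 2 value is still included
s_index[["Max."]]                                  # the z == 2 value is excluded
```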

Best,
Ista

> Thanks very much
>
> -----
> Mario Garrido Escudero
> PhD student
> Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca.
> Agrícola
> Universidad de Salamanca



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


