[R] "Ghost" values after subsetting

Sarah Goslee sarah.goslee at gmail.com
Thu Jan 13 12:12:40 CET 2011


Hi Jacob,

You don't give us enough information to answer your question. Specifically,
what is the structure of your data frame? The output of
str(data)
would be helpful (and calling your data frame data is not usually wise, since
data is also the name of a built-in R function).

My guess is that Dags is actually a factor -- do you want it to be a factor? --
and so you are retaining all of the levels. You need to think about how you
are getting the data into R, whether you want a factor for that column, and
whether you should drop the unused levels.

Reading the help for factor may be enlightening.
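
Here is a minimal sketch of what I suspect is happening. The data frame dat
below is made up, since we haven't seen the real one, and I'm assuming Dags
was read in as a factor; only the column names Year and Dags come from your
message.

```r
# Hypothetical stand-in for the real data frame, which was not shown.
dat <- data.frame(
  Year = c(2008, 2008, 2010, 2010),
  Dags = factor(c("2008/04/12", "2008/04/13", "2010/03/11", "2010/04/27"))
)

# Subsetting the rows does not touch the factor's set of levels.
dat10 <- subset(dat, Year == 2010)
levels_before <- nlevels(dat10$Dags)   # still counts the 2008 levels
counts_before <- table(dat10$Dags)     # 2008 dates show up with zero counts

# Option 1: rebuild one factor so that only the levels in use remain.
dags_only <- factor(dat10$Dags)

# Option 2: drop unused levels from every factor column of the data frame.
dat10 <- droplevels(dat10)
counts_after <- table(dat10$Dags)      # only the 2010 dates remain

print(counts_before)
print(counts_after)
```

If Dags is really meant to be a date rather than a category, it may be better
to stop it becoming a factor in the first place when you read the data in.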

Sarah

On Thu, Jan 13, 2011 at 3:57 AM, Jacob Kasper <jacobkasper at gmail.com> wrote:
> I am using subset to select the data I want to use for my analysis and find
> that after I subset my data frame on one column I get ghost values in the
> other columns. Here is an example:
>
>> table(data$Dags)
>
>           2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19 2008/05/06
>       103        140         82        187        179        212         68
> 2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29 2009/04/17
>       184        308        120        227        250        150        259
> 2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09 2009/06/10
>       246        241        252        153        366        225         79
> 2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07 2010/05/08
>       126        278        297        135        285        286        275
> 2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20 2010/08/12
>       290         22        259        291        381         20        648
> 2010/08/16 2010/08/18
>        11          2
>
>> data10<-subset(data, data$Year==2010 & data$Recatpure1==1)
>
>> table(data10$Dags)
>
>           2008/04/12 2008/04/13 2008/04/16 2008/04/17 2008/04/19 2008/05/06
>         0          0          0          0          0          0          0
> 2008/05/07 2008/05/12 2008/05/15 2008/05/25 2008/05/28 2008/05/29 2009/04/17
>         0          0          0          0          0          0          0
> 2009/04/18 2009/04/20 2009/04/21 2009/05/04 2009/05/15 2009/06/09 2009/06/10
>         0          0          0          0          0          0          0
> 2009/06/24 2009/06/25 2009/06/26 2010/03/11 2010/04/27 2010/05/07 2010/05/08
>         0          0          0         23         38         20         29
> 2010/05/10 2010/05/11 2010/05/20 2010/05/21 2010/06/02 2010/07/20 2010/08/12
>        18          1         15         45         38          1          5
> 2010/08/16 2010/08/18
>         0          0
>
> How can I perform a subset so that these ghost values do not appear at all
> in my new table?
>


-- 
Sarah Goslee
http://www.functionaldiversity.org


