[R] How to create unique factor from two factors? + Bootstrap Q

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Nov 9 16:14:06 CET 2003


Well, it is one of those things

-- it works in R but not in S
-- it appears in the examples for help(":") but is not otherwise mentioned
   on the help page (why?)
-- it does not give a numerical list of combinations, as asked for
-- it does give unused levels, which in this application is disastrous.

so I at least do not find it `easier'.

> a <- factor(letters)[1:6]
> b <- factor(rep(letters[1:3], each=2))
> a:b
[1] a:a b:a c:b d:b e:c f:c
78 Levels: a:a a:b a:c b:a b:b b:c c:a c:b c:c d:a d:b d:c e:a e:b e:c ... 
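
A minimal sketch of one way the unused levels might be dropped and a
numerical list of combinations obtained. It relies on re-applying factor()
to a factor keeping only the levels that actually occur; the names ab,
ab_used and codes are illustrative and do not come from the thread.

```r
a <- factor(letters)[1:6]                 # 6 values, but all 26 levels are kept
b <- factor(rep(letters[1:3], each = 2))  # 3 levels
ab <- a:b                                 # interaction: 26 * 3 = 78 levels

ab_used <- factor(ab)                     # re-factoring keeps only levels that occur
codes <- as.integer(ab_used)              # one integer code per observed combination

print(nlevels(ab))
print(levels(ab_used))
print(codes)
```

Whether this ends up `easier' than the arithmetic on the codes is a matter
of taste; it does need the extra factor() call to be safe.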


On Sun, 9 Nov 2003 kjetil at entelnet.bo wrote:

> On 9 Nov 2003 at 13:29, Prof Brian Ripley wrote:
> 
> > Factor3 <- factor(unclass(Factor1) + nlevels(Factor1)*(unclass(Factor2)-1))
> > 
> 
> Cannot this be done even easier by calculating the interaction?
> 
> > a <- factor(rep(1:3,rep(3,3)))

a <- factor(rep(1:3, each=3)) is definitely easier!

> > b <- factor(rep(1:3,3))
> > ab <- a:b
> > ab
> [1] 1:1 1:2 1:3 2:1 2:2 2:3 3:1 3:2 3:3
> Levels: 1:1 1:2 1:3 2:1 2:2 2:3 3:1 3:2 3:3
> 
> Kjetil Halvorsen
> 
> > will give you the unique combinations, not labelled as you do but then I 
> > don't think you need that.
> > 
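
A small worked sketch of the arithmetic on the codes, using the Factor1 and
Factor2 columns from the table in the original question. The first code runs
from 1 to nlevels(Factor1) and the second term is a multiple of
nlevels(Factor1), so each pair of codes maps to a distinct number. The
variable name raw is illustrative only.

```r
# Factor1 and Factor2 as in the table of the original question
Factor1 <- factor(c(1,1,1,1,1,1, 3,3,3,3,3, 4,4,4,4))
Factor2 <- factor(c(1,2,2,2,4,4, 1,1,1,3,3, 1,1,2,2))

# unclass() exposes the integer codes behind each factor
raw <- unclass(Factor1) + nlevels(Factor1) * (unclass(Factor2) - 1)
Factor3 <- factor(raw)

print(as.vector(raw))      # one distinct number per combination
print(levels(Factor3))     # only the numbers that occur become levels
print(nlevels(Factor3))
```

The seven combinations come out distinct, but the level labels are the raw
numbers rather than the 1 to 7 asked for, which is the "not labelled as you
do" caveat.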
> > On Sun, 9 Nov 2003, Scott Norton wrote:
> > 
> > > This might be easy but I'm very new to R and this question doesn't seem to
> > > have any nice keywords that bring up relevant search results when I search
> > > the CRAN search engine.  Therefore, I'll plead (as I have in the recent
> > > past) Newbie status.
> > > 
> > >  
> > > 
> > > I have a data frame with two factors (Factor 1 and 2) which together specify
> > > another unique level.  I want to create a third factor in the data frame
> > > that captures this uniqueness.
> > > 
> > > For example, say I had a data frame, Df, with Factors 1 and 2.  I want to
> > > create Factor 3 and add it to my Df data frame.
> > > 
> > > i.e.
> > > 
> > > Df dataframe (Factor3 is the column I WANT TO CREATE):
> > > 
> > > Row#     Factor1     Factor2     Factor3     Data
> > > 1        1           1           1           23
> > > 2        1           2           2           43
> > > 3        1           2           2           19
> > > 4        1           2           2           11
> > > 5        1           4           3           3
> > > 6        1           4           3           13
> > > 7        3           1           4           52
> > > 8        3           1           4           12
> > > 9        3           1           4           9
> > > 10       3           3           5           21
> > > 11       3           3           5           43
> > > 12       4           1           6           32
> > > 13       4           1           6           18
> > > 14       4           2           7           52
> > > 15       4           2           7           21
> > > 
> > > 
> > >  
> > > 
> > > and of course, I'm trying to create Factor 3 without loops.
> > > 
> > >  
> > > 
> > > My end goal here (which I add because maybe I don't need to create Factor 3
> > > (although I'm still curious)), is to bootstrap "sample" Factor 3. I want to
> > > repeatedly grab, say, 3 levels of Factor 3, and take the mean of those
> > > levels (e.g. say in my first bootstrap sample, I grab levels 2,4, and 7 from
> > > Factor 3, then I want to take the mean of rows, 2,3,4,7,8,9,14,15).  Of
> > > course, each sample from Factor 3 for my bootstrap will most likely have a
> > > differing number of rows since my experiment is not balanced.  I'm not sure
> > > if this is an issue yet when I try to implement the "boot" function in R (I
> > > haven't gotten to that point yet).  
> > 
> > The boot package will easily do this for you.
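
A rough base-R sketch of the resampling scheme described in the question,
using the data from the table. This is not the boot package: it only mirrors
the "grab 3 levels, average their rows" description, and it samples levels
without replacement. The helper names mean_of_levels and one_draw, the seed
and the 1000 replicates are all assumptions made for illustration.

```r
Df <- data.frame(
  Factor1 = factor(c(1,1,1,1,1,1, 3,3,3,3,3, 4,4,4,4)),
  Factor2 = factor(c(1,2,2,2,4,4, 1,1,1,3,3, 1,1,2,2)),
  Data    = c(23,43,19,11,3,13, 52,12,9,21,43, 32,18,52,21)
)

# Combination code: interaction, unused levels dropped, then integer codes
Df$Factor3 <- as.integer(factor(Df$Factor1:Df$Factor2))

# Mean of all rows belonging to the chosen Factor3 levels (hypothetical helper)
mean_of_levels <- function(lv) mean(Df$Data[Df$Factor3 %in% lv])

# One resample: pick k of the observed levels at random (hypothetical helper)
one_draw <- function(k = 3) mean_of_levels(sample(unique(Df$Factor3), k))

set.seed(1)                          # arbitrary seed, for reproducibility only
draws <- replicate(1000, one_draw())

print(Df$Factor3)
print(mean_of_levels(c(2, 4, 7)))    # levels 2, 4, 7: rows 2,3,4,7,8,9,14,15
print(summary(draws))
```

The unequal number of rows per level causes no difficulty here, since each
draw simply averages whichever rows the chosen levels cover.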
> > 
> > -- 
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 
> 
> 




