[R] bug in interaction order when using drop?

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Aug 11 13:31:55 CEST 2006


On Thu, 10 Aug 2006, Petr Pikal wrote:

> Ooops, my first suggestion reorders the factor itself, but using
> 
> if (drop) factor(ans) else ans
> 
> in place of the whole drop construction should preserve the order of
> the levels without changing the values of the factor.

Even easier would be to return ans[,drop=drop].  It seems to me that there 
is an argument for expecting interaction(..., drop=TRUE) to give the same 
result as interaction(...)[,drop=TRUE], but little argument that any 
ordering is a *bug*.
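
A minimal sketch of what subsetting with drop=TRUE does, using made-up vectors `a` and `b` (illustrative assumptions, not the poster's data): the surviving levels keep the relative order they had in the full factor, and the observations are unchanged.

```r
# Illustrative vectors (assumptions, not taken from the original post)
a <- c("B", "A", "A")
b <- c("c", "a", "b")

full    <- interaction(a, b)      # factor with all 6 level combinations
dropped <- full[, drop = TRUE]    # drop unused levels after the fact

# Levels that actually occur, in the order they have in levels(full)
used <- levels(full)[levels(full) %in% as.character(full)]

# The dropped factor keeps exactly those levels, in that order
stopifnot(identical(levels(dropped), used))

# The observations themselves are unchanged
stopifnot(identical(as.character(dropped), as.character(full)))

levels(dropped)
```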

The order of the levels of a factor is arbitrary, and in fact the levels 
seem to me to be in a strange order, with the levels of the first factor 
varying fastest (reverse lexicographic order).

> levels(interaction(c("A", "A", "B"), letters[1:3]))
[1] "A.a" "B.a" "A.b" "B.b" "A.c" "B.c"

so the existing

> levels(interaction(c("A", "A", "B"), letters[1:3], drop=T))
[1] "A.a" "A.b" "B.c"

looks more sensible in this case.
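
When the combinations do not first appear in level order, the two approaches differ. The sketch below emulates the `if (drop)` branch quoted further down on a small made-up example; the helper name `drop_by_appearance` and the input vectors are hypothetical, chosen only for illustration.

```r
# Hypothetical helper mimicking the quoted `if (drop) { ... }` branch:
# levels end up in order of first appearance in the data.
drop_by_appearance <- function(x) {
  ans <- as.integer(x)             # integer codes of the factor
  lvs <- levels(x)
  f   <- unique(ans[!is.na(ans)])  # codes in order of first appearance
  ans <- match(ans, f)             # recode against that order
  factor(lvs[f][ans], levels = lvs[f])
}

# Illustrative data (an assumption, not from the original post)
x <- interaction(c("B", "A", "A"), c("c", "a", "b"))

levels(drop_by_appearance(x))  # "B.c" "A.a" "A.b"  (order of appearance)
levels(x[, drop = TRUE])       # "A.a" "A.b" "B.c"  (order within levels(x))
```

Both results hold the same observations; only the ordering of the levels differs.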

> 
> Petr
> 
> On 10 Aug 2006 at 16:32, Petr Pikal wrote:
> 
> From:           	"Petr Pikal" <petr.pikal at precheza.cz>
> To:             	r-help at stat.math.ethz.ch
> Date sent:      	Thu, 10 Aug 2006 16:32:54 +0200
> Priority:       	normal
> Subject:        	[R] bug in interaction order when using drop?
> 
> > Hallo all
> > 
> > > version
> >                _                                   
> > platform       i386-pc-mingw32                       
> > arch           i386                                  
> > os             mingw32                               
> > system         i386, mingw32                         
> > status         beta                                  
> > major          2                                   
> > minor          3.1                                   
> > year           2006                                  
> > month          05                                   
> > day            23                                   
> > svn rev        38179                                 
> > language       R                                   
> > version.string Version 2.3.1 beta (2006-05-23 r38179)
> > >
> > 
> > When I use interaction(...) without the drop=T argument, I get a
> > neatly organized factor with the "protiproud" and "souproud" levels
> > grouped together.
> > 
> > > levels(interaction(vykon, teplota, proudeni))
> >  [1] "3.750.protiproud"  "12.750.protiproud" "3.775.protiproud"
> >  [4] "12.775.protiproud" "3.800.protiproud"  "12.800.protiproud"
> >  [7] "3.825.protiproud"  "12.825.protiproud" "3.850.protiproud"
> > [10] "12.850.protiproud" "3.750.souproud"    "12.750.souproud"
> > [13] "3.775.souproud"    "12.775.souproud"   "3.800.souproud"
> > [16] "12.800.souproud"   "3.825.souproud"    "12.825.souproud"
> > [19] "3.850.souproud"    "12.850.souproud"
> > 
> > However when I use 
> > 
> > > levels(interaction(vykon, teplota, proudeni, drop=T))
> > [1] "3.775.protiproud"  "3.800.souproud"    "3.750.souproud"
> > [4] "12.850.souproud"   "12.825.protiproud"
> > 
> > everything is out of order. I know I can reorder any factor as I
> > wish, but it would be good to have the levels ordered the same way
> > as without drop.
> > 
> > Everything comes from unique in
> > 
> > if (drop) {
> >         f <- unique(ans[!is.na(ans)])
> >         ans <- match(ans, f)
> >         lvs <- lvs[f]
> > }
> > 
> > maybe it can be modified.
> > 
> > if (drop) {
> >         f <- unique(ans[!is.na(ans)])
> >         ord <- order(f)
> >         ans <- match(ans, f)
> >         lvs <- lvs[f[ord]]
> >         }
> > 
> > which seems to work, but I am not sure whether it causes problems
> > when there are NAs in the data.
> > 
> > Here is my data frame.
> > Thank you 
> > 
> > Petr Pikal
> > 
> > > dump("df", file=stdout()) 
> > df <-
> > structure(list(proudeni = structure(as.integer(c(1, 1, 1, 1, 
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
> > 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
> > 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
> > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 
> > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 
> > 1, 1, 1)), .Label = c("protiproud", "souproud"), class = "factor"), 
> >     vykon = as.integer(c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 
> >     12, 12, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 12, 12, 12, 
> >     12, 12, 12, 12, 12, 12, 12, 12, 12, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
> >     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 12, 12, 12, 12, 12, 12, 
> >     12, 12, 12, 12, 12, 12)), teplota = as.integer(c(775, 775, 
> >     775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 
> >     775, 775, 775, 775, 800, 800, 800, 800, 800, 800, 800, 800, 
> >     800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 750, 850, 
> >     850, 850, 850, 850, 850, 825, 825, 825, 825, 825, 825, 775, 
> >     775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 
> >     775, 775, 775, 775, 775, 800, 800, 800, 800, 800, 800, 800, 
> >     800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 750, 
> >     850, 850, 850, 850, 850, 850, 825, 825, 825, 825, 825, 825, 
> >     775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 
> >     775, 775, 775, 775, 775, 775, 800, 800, 800, 800, 800, 800, 
> >     800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 
> >     750, 850, 850, 850, 850, 850, 850, 825, 825, 825, 825, 825, 
> >     825, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 775, 
> >     775, 775, 775, 775, 775, 775, 775, 800, 800, 800, 800, 800, 
> >     800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 800, 
> >     800, 750, 850, 850, 850, 850, 850, 850, 825, 825, 825, 825, 
> >     825, 825))), .Names = c("proudeni", "vykon", "teplota"), 
> > row.names = c("1", 
> > "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
> > "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
> > "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> > "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
> > "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
> > "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
> > "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
> > "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
> > "91", "92", "93", "94", "95", "96", "97", "98", "99", "100", "101",
> > "102", "103", "104", "105", "106", "107", "108", "109", "110", "111",
> > "112", "113", "114", "115", "116", "117", "118", "119", "120", "121",
> > "122", "123", "124", "125", "126", "127", "128", "129", "130", "131",
> > "132", "133", "134", "135", "136", "137", "138", "139", "140", "141",
> > "142", "143", "144", "145", "146", "147", "148", "149", "150", "151",
> > "152", "153", "154", "155", "156", "157", "158", "159", "160", "161",
> > "162", "163", "164", "165", "166", "167", "168", "169", "170", "171",
> > "172", "173", "174", "175", "176", "177", "178", "179", "180", "181",
> > "182", "183", "184", "185", "186", "187", "188", "189", "190", "191",
> > "192", "193", "194", "195", "196"), class = "data.frame")
> > 
> > Petr Pikal
> > petr.pikal at precheza.cz
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html and provide commented,
> > minimal, self-contained, reproducible code.
> 
> Petr Pikal
> petr.pikal at precheza.cz
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


