[Rd] apply: new behaviour for factors in R-2.4.0

Christoph Buser buser at stat.math.ethz.ch
Thu Sep 28 16:32:45 CEST 2006


Dear Brian

Thank you for your answer and the comment you included on the
apply() help page.

1)

You are correct. My data.frame is coerced into a matrix in
apply().

2)

I agree that the new version of unlist() is better and works
correctly and that in array() (due to as.vector()) the factor
"ans" is coerced into a character matrix.

Nevertheless I disagree that this is a matter of "feature freeze"
relative to R version 2.3.1:

Since in R-2.3.1, unlist() on a list of factors returned an
integer vector, the result of apply was an integer matrix and
not a character matrix.
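
A minimal sketch of the chain described above, run on a current
version of R. The two toy factors are made up for illustration and
stand in for the per-column results collected inside apply(); the
R-2.3.1 behaviour itself cannot be reproduced this way.

```r
# Two small factors standing in for the results of as.factor() per column
f <- list(factor(c("A", "B")), factor(c("B", "C")))

# Since R-2.4.0, unlist() on a list of factors gives one factor whose
# levels are the union of the individual level sets
u <- unlist(f)
is.factor(u)   # TRUE
levels(u)      # "A" "B" "C"

# array() applies as.vector() to its data, and as.vector() on a factor
# yields the labels as character strings, not the integer codes
m <- array(u, dim = c(2, 2))
typeof(m)      # "character"
```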

Therefore my question is whether it would be desirable to return an
integer matrix by changing apply. One could include additional
code to handle the case where the output "should" be a factor
matrix and coerce it into an integer matrix.

Then the outcome would be consistent with R-2.3.1 without
changing anything in unlist() or array().
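
For comparison, a user-level sketch of obtaining integer codes without
touching apply() at all. It assumes that going column by column with
sapply() is acceptable, and uses x = 11:20 as suggested in the quoted
reply below. This is only an illustration, not the proposed change to
apply() itself.

```r
dat <- data.frame(x = 11:20, f1 = gl(2, 5, labels = c("A", "B")))

# sapply() walks over the columns of the data frame directly, so there is
# no coercion to a character matrix first; each column is turned into a
# factor and then reduced to its integer codes
codes <- sapply(dat, function(col) as.integer(as.factor(col)))

typeof(codes)       # "integer"
dim(codes)          # 10 2

# data.matrix() keeps numeric columns as they are and replaces factors
# by their codes, so the two results agree for f1 but not for x
dm <- data.matrix(dat)
all(codes[, "f1"] == dm[, "f1"])   # TRUE
all(codes[, "x"]  == dm[, "x"])    # FALSE: codes are 1:10, values are 11:20
```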


But in the end I am not sure if an integer matrix is better than
a character matrix or a factor matrix. I am not sure what output
is best if one uses as.factor in apply.


Regards,

Christoph

--------------------------------------------------------------
Christoph Buser <buser at stat.math.ethz.ch>
Seminar fuer Statistik, LEO C13
ETH Zurich	8092 Zurich	 SWITZERLAND
phone: x-41-44-632-4673		fax: 632-1228
http://stat.ethz.ch/~buser/
--------------------------------------------------------------

Prof Brian Ripley writes:
 > Christoph,
 > 
 > This is more complicated than your analysis.
 > 
 > 1) apply takes a matrix as an argument, not a data frame, and so first 
 > coerced 'dat' to a character matrix.
 > 
 > 2) unlist is working quite correctly.  The issue is array(), which 
 > contains as.vector(data).  Thus although the result could be a factor 
 > matrix, as.vector is coercing it to a character matrix.  It might be 
 > desirable to return a factor matrix, but we are not going to do that in 
 > feature freeze (if ever) and I really don't think it would be what you 
 > wanted.
 > 
 > Perhaps the help page should contain an explicit statement that the result 
 > will be coerced to a basic vector type by as.vector().
 > 
 > On Mon, 25 Sep 2006, Christoph Buser wrote:
 > 
 > > Dear R-core
 > >
 > > There is a different output for the apply function due to the
 > > change of unlist as mentioned in the R news.
 > >
 > > Applying as.factor() (or factor()) in
 > >
 > > str(dat <- data.frame(x = 1:10, f1 = gl(2,5,labels = c("A", "B"))))
 > > (d1 <- apply(dat,2,as.factor))
 > >
 > > now returns a character matrix, while in R-2.3.1 the same
 > > command resulted in an integer matrix that was consistent (up to
 > > the ordering of the factor levels) with data.matrix().
 > 
 > That's coincidence -- try x=11:20.
 > 
 > > The change is caused by the change in unlist(), which, for a
 > > list of factors, now returns a single factor instead of an
 > > integer vector. I am happy with this change, but:
 > >
 > > Is it desirable to change apply so that it does not return a
 > > character matrix in the example above or include a warning for
 > > such a case?
 > >
 > > Thank you very much for an answer.
 > >
 > > Regards,
 > >
 > > Christoph Buser
 > >
 > > --------------------------------------------------------------
 > > Christoph Buser <buser at stat.math.ethz.ch>
 > > Seminar fuer Statistik, LEO C13
 > > ETH Zurich	8092 Zurich	 SWITZERLAND
 > > phone: x-41-44-632-4673		fax: 632-1228
 > > http://stat.ethz.ch/~buser/
 > >
 > > ______________________________________________
 > > R-devel at r-project.org mailing list
 > > https://stat.ethz.ch/mailman/listinfo/r-devel
 > >
 > 
 > -- 
 > Brian D. Ripley,                  ripley at stats.ox.ac.uk
 > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 > University of Oxford,             Tel:  +44 1865 272861 (self)
 > 1 South Parks Road,                     +44 1865 272866 (PA)
 > Oxford OX1 3TG, UK                Fax:  +44 1865 272595



