[Rd] Error in unsplit() with tibbles

Mario Annau mario.annau using quantargo.com
Sat Nov 21 18:04:18 CET 2020


Cool - thank you Peter!

@Marc: This is really not a tidyverse vs. base-R debate, and I personally
think that the two should work together for the most part. The common
environment is still R. But just to give you the full picture, I also filed
a bug for tibbles (https://github.com/tidyverse/tibble/issues/829). With
these two fixes, I think that split/unsplit would work for tibbles, and users
(like me) would not have to care which "environment" they are working
in.
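
For illustration, here is a minimal sketch in base R (no tibble required) of
why the type of the NA subscript matters. The names `df`, `len`, `lgl_idx`
and `int_idx` are made up for this example and do not appear in unsplit().
It only shows that `rep(NA, len)` is a logical subscript while
`rep(NA_integer_, len)` is a positional one; a plain data.frame happens to
accept both when `len` exceeds the number of rows, which is the situation in
the mtcars example quoted below.

```r
# Illustrative data frame with 3 rows; `len` plays the role of length(f).
df  <- data.frame(a = 1:3, b = c("x", "y", "z"))
len <- 5L

# rep(NA, len) is logical, so `[` treats it as a logical subscript.
lgl_idx <- rep(NA, len)
# rep(NA_integer_, len) is integer, so `[` treats it as row positions.
int_idx <- rep(NA_integer_, len)

typeof(lgl_idx)
typeof(int_idx)

# A base data.frame tolerates the over-long logical subscript and
# returns `len` rows filled with NA in both cases.
by_lgl <- df[lgl_idx, , drop = FALSE]
by_int <- df[int_idx, , drop = FALSE]

nrow(by_lgl)
nrow(by_int)
```

The integer form states the intent ("give me `len` missing rows") without
relying on how a particular class handles logical subscripts of the wrong
length, which is the check the tibble error message quoted below refers to.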

Cheers,
Mario


On Sat, 21 Nov 2020 at 17:54, Peter Dalgaard <pdalgd using gmail.com> wrote:

> I get the sentiment, but this is really just bad coding (on my own part, I
> suspect), so we might as well just fix it...
>
> -pd
>
> > On 21 Nov 2020, at 17:42, Marc Schwartz via R-devel <r-devel using r-project.org> wrote:
> >
> >
> >> On Nov 21, 2020, at 10:55 AM, Mario Annau <mario.annau using gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> using the `unsplit()` function with tibbles currently leads to the
> >> following error:
> >>
> >>> library(tibble)
> >>> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
> >>> s <- split(mtcars_tb, mtcars_tb$gear)
> >>> unsplit(s, mtcars_tb$gear)
> >> Error: Must subset rows with a valid subscript vector.
> >> ℹ Logical subscripts must match the size of the indexed input.
> >> x Input has size 15 but subscript `rep(NA, len)` has size 32.
> >> Run `rlang::last_error()` to see where the error occurred.
> >>
> >> Tibble seems to (rightly) complain that a logical vector has been used
> >> for subsetting which does not have the same length as the data.frame
> >> (rows). Since `NA` is a logical value, the subscript should be changed
> >> to `NA_integer_` in `unsplit()`:
> >>
> >>> unsplit
> >> function (value, f, drop = FALSE)
> >> {
> >>   len <- length(if (is.list(f)) f[[1L]] else f)
> >>   if (is.data.frame(value[[1L]])) {
> >>       x <- value[[1L]][rep(NA_integer_, len), , drop = FALSE]
> >>       rownames(x) <- unsplit(lapply(value, rownames), f, drop = drop)
> >>   }
> >>   else x <- value[[1L]][rep(NA, len)]
> >>   split(x, f, drop = drop) <- value
> >>   x
> >> }
> >>
> >> Cheers,
> >> Mario
> >
> >
> > Hi,
> >
> > Perhaps I am missing something, but if you are using objects, like
> > tibbles, that are intended to be part of another environment, in this
> > case the tidyverse, why would you not use functions to manipulate these
> > objects that were specifically created in the other environment?
> >
> > I don't use the tidyverse, but it seems to me that to expect base R
> > functions to work with objects not created in base R is problematic,
> > even though, perhaps by coincidence, they may work without adverse
> > effects, as appears to be the case with split().
> >
> > In other words, you should not, in reality, have had an a priori
> > expectation that split() would work with a tibble either.
> >
> > Rather than modifying the base R functions, like unsplit(), as you are
> > suggesting, to be compatible with these third party objects, the burden
> > should either be on you to use relevant tidyverse functions, or on the
> > authors of the tidyverse to provide relevant class methods to provide
> > that functionality.
> >
> > Regards,
> >
> > Marc Schwartz
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com

-- 
Mario Annau
Founder and CEO
Quantargo

Tel: +43 1 348 44 55-11 | mario.annau using quantargo.com
www.quantargo.com
