[R] duplicated() on zero-column data frames returns empty vector

Mark Webster m@rkweb@ter204 @end|ng |rom y@hoo@co@uk
Fri Apr 5 10:40:52 CEST 2024


Hello Ivan, thanks for this.

> Part of the problem is that it's not obvious what should be a
> zero-column but non-zero-row data.frame mean.
> 
> On the one hand, your database relation use case is entirely valid. On
> the other hand, if data.frames are considered to be tables of data with
> row.names as their identifiers, then duplicated(d) should be returning
> logical(nrow(d)) for zero-column data.frames, since row.names are
> required to be unique. I'm sure that more interpretations can be
> devised, requiring some other behaviour for duplicated() and friends.

Do you mean that, because row names are unique, all the rows should be counted as non-duplicates? Yes, I can see the argument for that, thanks.

I must say I'm still puzzled as to what interpretation would motivate the current behaviour of returning logical(0), however.
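To make the two readings concrete, here is a small sketch. The variable names are only illustrative, and no output is shown for duplicated(d) itself, since the result may depend on the R version; the two vectors below are just the interpretations discussed in this thread, not anything R promises.

```r
# A data frame with three rows and no columns.
d <- data.frame(row.names = 1:3)
dim(d)  # 3 rows, 0 columns

# What duplicated() gives for it; this thread reports logical(0),
# i.e. a result whose length does not match nrow(d).
res <- duplicated(d)
length(res)

# Reading 1: row names identify the rows and are unique,
# so no row is a duplicate.
by_rownames <- logical(nrow(d))        # FALSE FALSE FALSE

# Reading 2 (relational): rows with no columns are all equal,
# so every row after the first is a duplicate.
by_relation <- seq_len(nrow(d)) > 1L   # FALSE TRUE TRUE
```

Either reading yields a vector of length nrow(d), which is what makes logical(0) hard to motivate.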

> Thankfully, duplicated() and anyDuplicated() are generic functions, and
> you can subclass your data frames to change their behaviour:
> > ...
Indeed, I'm already doing something along these lines!
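
For illustration only, a minimal sketch of what such a subclass could look like. The class name "relation_df" and the helper as_relation_df() are hypothetical, and the sketch assumes the relational reading, where all rows of a zero-column data frame are equal to each other; it is not the code elided in the quote above.

```r
# Hypothetical helper: tag a data frame with an extra S3 class.
as_relation_df <- function(x) {
  stopifnot(is.data.frame(x))
  class(x) <- unique(c("relation_df", class(x)))
  x
}

# Zero-column case: all rows are equal, so keep one and flag the rest.
# Anything else falls through to the data.frame method.
duplicated.relation_df <- function(x, incomparables = FALSE,
                                   fromLast = FALSE, ...) {
  if (length(x) == 0L) {
    n <- nrow(x)
    keep <- if (fromLast) n else 1L
    return(seq_len(n) != keep)
  }
  NextMethod()
}

# Index of the first duplicate found (0 if there is none),
# scanning from the end when fromLast = TRUE.
anyDuplicated.relation_df <- function(x, incomparables = FALSE,
                                      fromLast = FALSE, ...) {
  if (length(x) == 0L) {
    n <- nrow(x)
    if (n < 2L) return(0L)
    return(if (fromLast) n - 1L else 2L)
  }
  NextMethod()
}

d <- as_relation_df(data.frame(row.names = 1:3))
duplicated(d)     # FALSE  TRUE  TRUE
anyDuplicated(d)  # 2
```

Swapping the zero-column branch for logical(nrow(x)) would give the row-names reading instead.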
Best Regards,
Mark


