[Rd] inconsistency in treatment of USE.NAMES argument

Hervé Pagès hpages at fredhutch.org
Thu Feb 18 23:22:22 CET 2016


On 02/11/2016 07:02 AM, Michael Lawrence wrote:
> Changing the vapply() behavior makes sense in principle.

Sorry to disagree, but changing the behavior of sapply() so we end up
with consistent treatment of USE.NAMES across sapply(), vapply(),
and mapply() sounds much better *in principle*.

I understand sapply() predates vapply(), but the real question is:
how much existing code calls sapply(..., USE.NAMES=FALSE) on an object
with names() and expects the names to be preserved?
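
Code that cares about names one way or the other does not have to depend on
USE.NAMES at all. As a minimal sketch (the variable names are illustrative
only), the names can be attached or stripped explicitly, which should give
the same result whichever function ends up changing:

```r
x <- list(A = "a", B = 1:2)

# Names wanted: attach them explicitly instead of relying on USE.NAMES.
with_names <- stats::setNames(
  vapply(x, is.integer, logical(1), USE.NAMES = FALSE),
  names(x)
)

# Names not wanted: strip them explicitly, whatever sapply() decides to keep.
without_names <- unname(sapply(x, is.integer, USE.NAMES = FALSE))
```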

H.

> I analyzed
> the CRAN code base using the R parser and found 143 instances of
> calling vapply with USE.NAMES=FALSE. These would need to be inspected
> to understand the consequences of the change.
>
> For reference:
> /AzureML/R/datasets.R:226
> /BBmisc/R/toRangeStr.R:33
> /DBI/R/DBDriver.R:205
> /Kmisc/R/str_rev.R:37
> /Matrix/R/diagMatrix.R:98
> /MuMIn/R/utils-models.R:110
> /OpenMx/R/MxNamespace.R:702
> /OrthoPanels/R/opm.R:167
> /XML2R/R/utils.R:16
> /assertive.base/tests/testthat/test-utils.R:14
> /bigrquery/R/utils.r:13
> /bold/R/zzz.R:29
> /checkmate/R/checkList.r:56
> /coin/R/ExactDistributions.R:80
> /coin/R/ExactDistributions.R:97
> /coin/R/ExactDistributions.R:234
> /coin/R/SymmetryTests.R:217
> /copula/R/aux-acopula.R:950
> /covr/R/data_frame.R:13
> /covr/R/display_name.R:40
> /cplm/R/lme4_lmer.R:423
> /crunch/R/batches.R:71
> /crunch/R/batches.R:102
> /crunch/R/categorical-array.R:87
> /crunch/R/hide-variables.R:78
> /crunch/R/misc.R:68
> /crunch/R/share.R:11
> /crunch/R/shoji-catalog.R:39
> /crunch/R/show.R:88
> /crunch/R/subvariables.R:76
> /crunch/R/subvariables.R:95
> /dplR/R/common.interval.R:8
> /dplR/R/fill.internal.NA.R:47
> /dplR/R/helpers.R:3
> /dplyr/R/dataframe.R:49
> /dplyr/R/glimpse.R:38
> /dplyr/R/id.r:36
> /dplyr/R/tbl-cube.r:98
> /dplyr/R/utils.r:15
> /fulltext/R/chunks.R:352
> /fulltext/R/chunks.R:356
> /ggvis/R/transform.R:56
> /httr/R/oauth-token-utils.R:23
> /igraph/R/lazyeval.R:219
> /jsonlite/R/asJSON.data.frame.R:74
> /jsonlite/R/deparse_vector.R:26
> /jsonlite/R/simplifyDataFrame.R:14
> /jsonlite/R/unescape_unicode.R:10
> /knitr/R/utils.R:207
> /knitrBootstrap/R/knit_bootstrap.R:303
> /lazyeval/R/names.R:27
> /learningr/R/buggy_count.R:67
> /lintr/R/absolute_paths_linter.R:35
> /lintr/R/absolute_paths_linter.R:71
> /loo/R/helpers.R:13
> /loo/R/helpers.R:20
> /matconv/R/convEasySyntax.R:71
> /matconv/R/convFunctionCalls.R:11
> /matconv/R/utils.R:113
> /micropan/R/biostrings.R:80
> /micropan/R/biostrings.R:106
> /mime/R/mime.R:129
> /packrat/R/bundle.R:137
> /packrat/R/hooks.R:45
> /packrat/R/lockfile.R:56
> /pixiedust/R/sprinkle_colnames.R:66
> /plyr/R/id.r:38
> /polyCub/R/polyCub.SV.R:110
> /polyCub/R/polyCub.exact.Gauss.R:99
> /polyCub/R/polyCub.iso.R:130
> /polyCub/R/polyCub.iso.R:166
> /polyCub/R/polyCub.iso.R:168
> /pryr/R/dots.r:25
> /rappdirs/R/utils.r:39
> /rbison/R/bison.R:181
> /rcrossref/R/cr_ft_text.R:191
> /rcrossref/R/get_styles.R:16
> /rebus.base/R/internal.R:29
> /rerddap/R/info.R:73
> /rerddap/R/info.R:80
> /rerddap/R/search.R:37
> /rerddap/R/search_adv.R:64
> /reutils/R/ecitmatch.R:33
> /reutils/R/parse-docsum.R:53
> /reutils/R/utils.R:44
> /reutils/R/utils.R:51
> /reutils/R/utils.R:55
> /rgbif/R/occ_search.r:230
> /rgbif/R/zzz.r:120
> /rjstat/R/rjstat.R:75
> /rlist/R/internal.R:155
> /rnoaa/R/homr.R:140
> /rnoaa/R/storm_shp.R:42
> /roxygen2/R/template.R:27
> /rplos/R/fulltext.R:29
> /rplos/R/fulltext.R:82
> /shiny/inst/tests/test-bootstrap.r:12
> /shinyjs/inst/examples/demo/helpers.R:49
> /simcausal/R/network.R:319
> /sisal/R/sisalTable.R:806
> /sisal/R/sisalTable.R:829
> /sisal/R/sisalTable.R:853
> /sisal/R/sisalTable.R:915
> /sisal/R/sisalTable.R:924
> /sisal/R/sisalTable.R:933
> /stringdist/R/seqdist.R:130
> /stringdist/R/stringdist.R:286
> /surveillance/R/calibration_null.R:25
> /surveillance/R/calibration_null.R:185
> /surveillance/R/epidata.R:356
> /surveillance/R/epidata.R:362
> /surveillance/R/hhh4.R:311
> /surveillance/R/hhh4_methods.R:403
> /surveillance/R/hhh4_oneStepAhead.R:126
> /surveillance/R/hhh4_plot.R:419
> /surveillance/R/hhh4_plot.R:691
> /surveillance/R/hhh4_plot.R:744
> /surveillance/R/pit.R:29
> /surveillance/R/pit.R:45
> /surveillance/R/spatial_tools.R:261
> /surveillance/R/spatial_tools.R:264
> /surveillance/R/twinSIR_profile.R:229
> /surveillance/R/twinSIR_simulation.R:284
> /surveillance/R/twinSIR_simulation.R:288
> /surveillance/R/twinstim.R:212
> /surveillance/R/twinstim.R:553
> /surveillance/R/twinstim_epitest.R:188
> /surveillance/R/twinstim_helper.R:139
> /surveillance/R/twinstim_siaf.R:176
> /surveydata/R/questions.R:164
> /sweidnumbr/R/luhn_algo.R:75
> /sweidnumbr/R/oin.R:104
> /sweidnumbr/R/oin.R:134
> /sweidnumbr/R/pin.R:202
> /sweidnumbr/R/pin.R:440
> /taxize/R/gni_parse.R:29
> /taxize/inst/ignore/taxonclass.R:117
> /taxize/inst/ignore/taxonclass2.R:157
> /testthat/R/utils.r:13
> /textreuse/R/TextReuseCorpus.R:137
> /textreuse/R/minhash.R:46
> /textreuse/R/similarity.R:111
> /tidyr/R/id.R:17
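
The exact method behind the scan quoted above is not given in the thread. As a
rough sketch of how such a search could be done with the R parser, the
hypothetical helper below (find_vapply_no_names() is an invented name) walks
parsed code and collects vapply() calls that pass a literal
USE.NAMES = FALSE or F. It would miss calls built with do.call() or ones
that pass the flag through a variable.

```r
# Hypothetical helper (not from the thread): collect every vapply() call
# that passes a literal USE.NAMES = FALSE (or F).
find_vapply_no_names <- function(expr) {
  hits <- list()
  if (is.call(expr)) {
    fn <- expr[[1L]]
    is_vapply <- identical(fn, quote(vapply)) ||
      identical(fn, quote(base::vapply))
    no_names <- identical(as.list(expr)[["USE.NAMES"]], FALSE) ||
      identical(as.list(expr)[["USE.NAMES"]], quote(F))
    if (is_vapply && no_names) hits <- list(expr)
  }
  if (is.call(expr) || is.pairlist(expr) || is.expression(expr)) {
    for (i in seq_along(expr)) {
      # Test each element in place: binding an empty argument (as in x[, 1])
      # to a variable and then reading it would raise an error.
      if (is.call(expr[[i]]) || is.pairlist(expr[[i]])) {
        hits <- c(hits, find_vapply_no_names(expr[[i]]))
      }
    }
  }
  hits
}

# Made-up sample source, used only to exercise the helper.
src <- '
f <- function(x) vapply(x, length, integer(1), USE.NAMES = FALSE)
g <- function(x) vapply(x, length, integer(1))
h <- function(m) sapply(m[, 1], nchar, USE.NAMES = FALSE)
'
hits <- find_vapply_no_names(parse(text = src, keep.source = FALSE))
```

For a real package one would presumably call parse(file = ...) on each file
under R/ and report file names and line numbers from the source references;
that part is left out here.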
>
> On Tue, Feb 9, 2016 at 3:37 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>>>>      on Mon, 8 Feb 2016 10:48:50 -0800 writes:
>>
>>      > Hi,
>>      > Both vapply() and sapply() support the 'USE.NAMES' argument. According
>>      > to the man page:
>>
>>      > USE.NAMES: logical; if ‘TRUE’ and if ‘X’ is character, use ‘X’ as
>>      >    ‘names’ for the result unless it had names already.
>>
>>      > But if 'X' has names already and 'USE.NAMES' is FALSE, it's not clear
>>      > what will happen to the names. Are they going to propagate to the
>>      > result or not? Unfortunately, vapply() and sapply() give a different
>>      > answer:
>>
>>      >> vapply(list(A="a", B=1:2), is.integer, logical(1), USE.NAMES=FALSE)
>>      > [1] FALSE  TRUE
>>
>>      >> sapply(list(A="a", B=1:2), is.integer, USE.NAMES=FALSE)
>>      > A     B
>>      > FALSE  TRUE
>>
>> This is very unfortunate, and I was not aware of this.
>>
>> You know that sapply()  is an order of magnitude older than vapply()
>> and you probably don't know that lapply() is also somewhat older
>> than sapply() [but that part is pre-R (but S-) history ...]
>> which explains part:
>>
>> 1) lapply() does *not* have a  USE.NAMES  argument and it
>>     always keeps names when they are there in X.
>>
>> 2) sapply() has been designed as "s"implified l"apply" where
>>     in this case "simplified" also was to mean "user-friendly" /
>>     "simple to use".
>>     For that reason,
>>     a) sapply() also keeps names when they are there (as lapply).
>>     b) If USE.NAMES=TRUE (as by default), it also constructs names
>>        in cases where lapply() does not, i.e., in case of
>>        character X.
>>
>> 3) IIRC, the goals for vapply() had been  "like sapply", with two advantages:
>>     a. faster
>>     b. "error-checking" in the sense of ensuring consistent
>>         results of the single function calls.
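
The three points above can be illustrated with a small sketch. The input
vectors and variable names are made up for the example, and the behavior
shown is the one described in this thread:

```r
x_named <- c(first = "a", second = "bb")
x_plain <- c("a", "bb")

# 1) lapply() has no USE.NAMES argument and keeps names that are there.
kept <- names(lapply(x_named, nchar))
none <- names(lapply(x_plain, nchar))

# 2) sapply() additionally builds names from a character X by default.
built <- names(sapply(x_plain, nchar))
not_built <- names(sapply(x_plain, nchar, USE.NAMES = FALSE))

# 3) vapply() checks every result against FUN.VALUE, where sapply()
#    would silently coerce to a common type.
mixed <- list(1, "a")
coerced <- sapply(mixed, function(v) v)
checked <- tryCatch(
  vapply(mixed, function(v) v, numeric(1)),
  error = function(e) e
)
```

Here kept and built hold names, none and not_built are NULL, coerced is a
character vector, and checked is the error condition raised by vapply().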
>>
>>      > Wouldn't it make sense to have vapply() and sapply() treat the
>>      > 'USE.NAMES' argument consistently?
>>
>> Yes, but from what I wrote above, I believe  vapply() would have
>> to change.
>>
>> Martin
>>
>>
>>      > The behavior of vapply() seems
>>      > to make more sense to me. Note that it's consistent with what mapply()
>>      > does:
>>
>>      >> mapply(is.integer, list(A="a", B=1:2), USE.NAMES=FALSE)
>>      > [1] FALSE  TRUE
>>
>>      > If the behavior of sapply() cannot be changed, at least the man page
>>      > could clarify what happens when 'USE.NAMES' is FALSE, which is
>>      > different for each function.
>>
>>      > Thanks,
>>      > H.
>>
>>      > --
>>      > Hervé Pagès
>>
>>      > Program in Computational Biology
>>      > Division of Public Health Sciences
>>      > Fred Hutchinson Cancer Research Center
>>      > 1100 Fairview Ave. N, M1-B514
>>      > P.O. Box 19024
>>      > Seattle, WA 98109-1024
>>
>>      > E-mail: hpages at fredhutch.org
>>      > Phone:  (206) 667-5791
>>      > Fax:    (206) 667-1319
>>
>>      > ______________________________________________
>>      > R-devel at r-project.org mailing list
>>      > https://stat.ethz.ch/mailman/listinfo/r-devel
>>

