[Rd] Possible bug in formatC

Martin Maechler maechler at stat.math.ethz.ch
Mon Jun 3 17:04:27 CEST 2019

>>>>> Randy Cragun 
>>>>>     on Thu, 30 May 2019 00:26:38 -0400 writes:

    > I do not know if this is a bug or a case of improper
    > documentation. The documentation for formatC() implies
    > that the difference between the options format="f" and
    > format="g" is that with "g", scientific format is
    > sometimes used. There is another difference between them
    > that is not mentioned in the
    > documentation. drop0trailing=FALSE is ignored when format
    > is set to "g" unless flag contains "#" (this is the
    > documented behavior for format="fg").  For instance, the
    > first line below return " 2.5", whereas the second returns
    > the expected "2.50".

    > formatC(2.50, format="g", digits=3, drop0trailing=F)
    > formatC(2.50, format="g", digits=3, drop0trailing=F, flag="#")

Well, you have a point that this behavior is not documented in
detail (and I assume the text reference "Kernighan and Ritchie"
is less accessible to typical R users now than it was in 1995...)

However, formatC() has behaved this way, unchanged, for close to 20
years, so we will most probably not change the function's behavior.

Notice that drop0trailing=FALSE is really the default
(and format="g" is also the default for non-character, non-integer numbers).

By design, formatC(*) for such numbers defaults to
format="g", which drops trailing zeros most of the time
[whereas format = "f" does not, unless drop0trailing=TRUE is set].
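
A minimal sketch of that difference, using only arguments already
discussed in this thread (the variable names are hypothetical and only
there for illustration):

```r
x <- 2.5

# "g": digits means significant digits; trailing zeros are dropped
g_default <- formatC(x, format = "g", digits = 3)

# "f": digits means digits after the decimal point; trailing zeros are kept
f_default <- formatC(x, format = "f", digits = 3)

# "f" with drop0trailing = TRUE: trailing zeros are removed after formatting
f_dropped <- formatC(x, format = "f", digits = 3, drop0trailing = TRUE)

print(c(g = g_default, f = f_default, f_dropped = f_dropped))
```

Note that digits is interpreted differently by the two formats, so the
same digits value does not give strings of the same length.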

Lastly, note that 2.50 and 2.5 are exactly identical as R
numbers; so, your two examples above are identical to the much shorter

   formatC(2.5, digits=3)
   formatC(2.5, digits=3, flag="#")
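
As a small check of both statements, here is a sketch that first
confirms the two literals are the same number and then compares the
calls with and without the "#" flag (the names plain and hashed are
hypothetical):

```r
# 2.50 and 2.5 parse to the same double, so the trailing zero in the
# source code cannot influence the formatted result
stopifnot(identical(2.50, 2.5))

# default format for a double is "g"; digits = 3 significant digits
plain  <- formatC(2.5, digits = 3)

# the "#" flag asks the underlying C formatting to keep trailing zeros
hashed <- formatC(2.5, digits = 3, flag = "#")

print(c(plain = plain, hashed = hashed))
```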

If you want "extraneous" trailing zeros, the "f" format is your
friend most of the time anyway:

> t(sapply(1:8, function(D) formatC(c(2.5,pi), format="f", digits= D)))
     [,1]         [,2]        
[1,] "2.5"        "3.1"       
[2,] "2.50"       "3.14"      
[3,] "2.500"      "3.142"     
[4,] "2.5000"     "3.1416"    
[5,] "2.50000"    "3.14159"   
[6,] "2.500000"   "3.141593"  
[7,] "2.5000000"  "3.1415927" 
[8,] "2.50000000" "3.14159265"

I will add more information to the formatC() help
page, notably by not just mentioning but actually explaining most of the
'flag's that are typically available(*).

Thank you for raising the issue.

Martin Maechler
ETH Zurich and R Core Team

*) as formatC() interfaces to the OS C library, some of the
   availability and meaning of 'flags' is platform dependent.
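
   To illustrate the footnote: in C's printf() family, "#" is the
   "alternate form" flag, which for %g conversions keeps trailing
   zeros. A minimal sketch comparing formatC() with R's sprintf(),
   assuming (this is not shown in the thread) that both agree on your
   platform; the variable names are hypothetical:

```r
# formatC() with format "g", 3 significant digits, and the "#" flag
via_formatC <- formatC(2.5, format = "g", digits = 3, flag = "#")

# the corresponding C-style conversion specification in sprintf()
via_sprintf <- sprintf("%#.3g", 2.5)

print(c(formatC = via_formatC, sprintf = via_sprintf))
```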

    > ----------------------
    > sessionInfo():

    > R version 3.5.3 (2019-03-11) Platform:
    > x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >=
    > 8 x64 (build 9200)

    > Matrix products: default

    > locale: [1] LC_COLLATE=English_United States.1252
    > LC_CTYPE=English_United States.1252 [3]
    > LC_MONETARY=English_United States.1252 LC_NUMERIC=C

    > [5] LC_TIME=English_United States.1252

    > attached base packages: [1] stats graphics grDevices utils
    > datasets methods base

    > loaded via a namespace (and not attached): [1]
    > compiler_3.5.3 tools_3.5.3

