[Rd] prettyNum digits=0 not compatible with scientific notation

Robert McGehee rmcgehee @end|ng |rom w@||eyetr@d|ng@net
Fri Mar 22 16:14:49 CET 2019


Hi,
Thanks for this. To be clear, I did not intend to use scientific notation. I just happened to stumble upon this when using prettyNum on numbers large enough that R switched to scientific notation, and I noticed the problem. I made the artificial example only to show the problem with smaller numbers (and in case someone on this list has redefined options(scipen) in their .Rprofile, as I do).

Specifically, here's what I wanted to see:
> prettyNum(30000000.9, big.mark=",", digits=0)
[1] "30,000,001"
But got "%#4.0-1e" instead. I was intending to use digits=0 as a way of rounding at the same time as adding commas, and was not meaning to produce scientific notation with zero significant digits, which I agree probably doesn't make sense.

Also, even smaller numbers that don't normally trigger scientific notation cause the odd output, so maybe this isn't strictly a scientific notation problem? 
> options(scipen=0)
> 12345.6
[1] 12345.6 # No scientific notation here
> prettyNum(12345.6, digits=0)
[1] "%#4.0-1e"

My "fix" is just to add scientific=FALSE to my prettyNum calls as this seems to make the problem disappear for me in all cases. However, the odd output I encountered along the way seemed worthy of comment.
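
A minimal sketch of that kind of workaround, assuming the goal is simply to round to a whole number and insert commas. Rounding explicitly with round() sidesteps digits=0 altogether; the variable names are only for illustration.

```r
x <- 30000000.9

# Round first, then format without scientific notation and with commas.
rounded <- format(round(x), big.mark = ",", scientific = FALSE)

# formatC() with format = "f" treats digits as decimal places,
# not as significant digits, so digits = 0 is meaningful here.
fixed <- formatC(x, format = "f", digits = 0, big.mark = ",")

print(c(rounded, fixed))
# [1] "30,000,001" "30,000,001"
```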

HTH, Robert

-----Original Message-----
From: Martin Maechler [mailto:maechler using stat.math.ethz.ch] 
Sent: Friday, March 22, 2019 5:11 AM
To: Robert McGehee <rmcgehee using walleyetrading.net>
Cc: r-devel using r-project.org
Subject: Re: [Rd] prettyNum digits=0 not compatible with scientific notation

Thank you, Robert for raising this here !

>>>>> Robert McGehee 
>>>>>     on Thu, 21 Mar 2019 20:56:19 +0000 writes:

    > R developers,
    > Seems I get a bad result ("%#4.0-1e" in particular) when trying to use prettyNum digits=0 with scientific notation. I tried on both my Linux box and on an online R evaluator and saw the same problem, so it's not limited to my box at least. I see the problem in both R 3.5.3 and R 3.3.2.

    > options(scipen= -100)

The above is extreme: You're basically setting an option to
always see non-integer numbers in "scientific" format ..
But read below about what 'digits' means in this case.

    > prettyNum(1, digits=0)
    > [1] "%#4.0-1e"
    > prettyNum(2, digits=0)
    > [1] "%#4.0-1e"
    > prettyNum(1, digits=0, scientific=FALSE) # Works
    > [1] "1"
    > prettyNum(1:2, digits=0) # Works
    > [1] "1" "2"
    > prettyNum(c(1.1, 2.1), digits=0)
    > [1] "%#4.0-1e" "%#4.0-1e"
    > prettyNum(c(1.1, 2.1), digits=1) # Works
    > [1] "1e+00" "2e+00"

I'm the scapegoat / culprit /.. : I have worked on tweaking the
formatting of (non integer) numbers in R for a long time, on the
way introducing prettyNum(), also already long ago...
but then it's actually not prettyNum() but rather format() here :

Have you read its help page - *with* care?

If you do, you'd notice that 'digits' is not among the formal arguments
of prettyNum() *and* that prettyNum() passes all its  `...`  to format().
And if you read  ?format [which then you should to understand 'digits' !]
you see

  > digits: how many significant digits are to be used for numeric and
  >      complex ‘x’.  The default, NULL, uses ‘getOption("digits")’.
  >      This is a suggestion: enough decimal places will be used so that
  >      the smallest (in magnitude) number has this many significant 
  >      digits, and also to satisfy ‘nsmall’.

  > 	  [.....]

So, for the real numbers you use in your example, digits are
*significant* digits as in  'options(digits= *)' or
'print(.., digits= *)'

------ Excursion about 'integer' and format()ing ------------
-- and you now may also understand why   prettyNum(1:2, digits=0)  works
 as you expect: integer formatting behaves differently   ....
 but I acknowledge that the  ?format   help page does not say so
 explicitly yet:  it mentions 'real and complex' numbers for the
 'scientific' argument,  and 'numeric and complex' numbers for
 the 'digits' argument.
 If you replace 'numeric' by 'real', what this entails (by logic) is
 that 'integer' numbers are *not* affected by 'digits' nor 'scientific'.

 But there's yet more subtlety: 'integer' here means class/type "integer"
 and not just an integer number, i.e., the difference of  1L and 1 :

  > str(1)
   num 1
  > str(1L)
   int 1
  > str(1:3)
   int [1:3] 1 2 3
  > str(1:3 + 0)
   num [1:3] 1 2 3
  > 
------ end of Excursion{'integer' ..} -------------------------------

Back to real numbers : 'digits' are used as *significant* digits
(and are only a suggestion: format() and prettyNum() ensure
 a common width for their resulting strings so printing will be
nicely aligned), see e.g.

   > format(3.14, scientific=FALSE)
   [1] "3.14"
   > format(3.14*c(1, 1e-7),   scientific=FALSE)
   [1] "3.140000000" "0.000000314"
   > 

So back to your examples : What would you mean with
``0 significant digits'' ? 
It really does not make sense to show *no* significant digits ..

Intuitively, without spending enough time, I'd say that the bug
in format.default() -- which is called by prettyNum() in
this case -- is that it does *not* treat
'digits = 0'  as 'digits = 1'  in this case.  

NB:  digits = 0 has been valid in    options(digits = 0)  etc,
 "forever" I think, possibly inherited from S,  but I don't
 really know and I wonder if we should not  make it invalid eventually
 requiring at least 1.
So we could make it raise a *warning* (and set it to '1') for  now.
What do others think? 
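
The behaviour proposed above could be sketched in user code as a small wrapper. Note that format_min_digits is a hypothetical helper name, not part of R, and the warning text is made up for illustration; this is not how format.default() itself is implemented.

```r
# Hypothetical wrapper: treat digits < 1 as digits = 1, with a warning.
format_min_digits <- function(x, digits = NULL, ...) {
  if (!is.null(digits) && digits < 1) {
    warning("'digits' must be at least 1; using digits = 1")
    digits <- 1
  }
  format(x, digits = digits, ...)
}

# The warning is suppressed here only to keep the output short.
res <- suppressWarnings(format_min_digits(2^30, digits = 0))
print(res)
```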

Anyone with S(-PLUS) or Terr or .. who can report if  digits=0
is allowed there and what it does in a simple situation like

  > format(as.integer(2^30), digits=0) # digits not used at all
  [1] "1073741824"
  > format(2^30, digits=0)
  [1] "%#4.0-1e"
  > 


Last but not least:  If you really want to use exponential
format, you should typically use  sprintf() or formatC()  where
you can tweak to get what you want.
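
As a minimal sketch of that suggestion (the value and the precision are chosen only for illustration): in both calls the precision is the number of digits after the decimal point of the mantissa, so it can be set explicitly rather than being derived from significant digits.

```r
x <- 12345.6

# C-style format string: two digits after the decimal point.
s1 <- sprintf("%.2e", x)

# formatC() with format = "e": digits is again digits after the point.
s2 <- formatC(x, format = "e", digits = 2)

print(c(s1, s2))
# [1] "1.23e+04" "1.23e+04"
```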

