[Rd] problem with display of complex number

Thu Jul 19 11:27:19 CEST 2018

TL;DR:  It's more complicated and needs more discussion
        (which I start below)

> Hi,
> > 1e10+5i
> [1] 1e+10+0e+00i
> > Im(1e10+5i)
> [1] 5
> 
> maybe little better...
> 
> --- R-3.5.1.orig/src/main/complex.c    2018-03-26 07:02:25.000000000 +0900
> +++ R-3.5.1/src/main/complex.c    2018-07-10 12:50:42.523874767 +0900
> @@ -381,6 +381,7 @@
>      r->i = fround(pow10 * x->i, digits)/pow10;
>      } else {
>      digits = (double)(dig);
> +    if(digits < 1) digits=1; /* a little better */
>      r->r = fround(x->r, digits);
>      r->i = fround(x->i, digits);
>      }
> 
> 
> -- 
> Best Regards,
> -- 
> Eiji NAKAMA <nakama (a) ki.rim.or.jp>
> "\u4e2d\u9593\u6804\u6cbb"  <nakama (a) ki.rim.or.jp>

Thanks a lot, Eiji!
Yes, your proposed change does "prevent the worst" in this
case; however, the issue is more complicated.

I do think the current display {i.e., format(), as.character(), ..}
of such complex numbers is far from optimal, not only in the example
you show above but also more generally, when the real and
imaginary parts of the complex vector(s) are of different magnitude.

If only the Mod(z) of the complex number z was of importance, one
could argue that indeed it makes sense to use the same format
for the real and imaginary parts; this is however not always the
case, and rather the Re(z) and/or Im(z) are of interest in
themselves.  For such cases, the current formatting of complex
numbers is unfortunate also in my view.

I've checked other systems:
  1) octave, matlab, python
  2) mathematica, maple

and they all format Re() and Im() separately, and I think we
should do so as well.
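
The Python case is easy to reproduce.  The snippet below uses only
Python's built-in complex type (nothing R-specific) and merely
illustrates what "separately" means here: each part is formatted on
its own, so the small imaginary part is not lost.

```python
z = complex(1e10, 5)      # the example from the report, 1e10+5i in R notation

# repr() formats the real and imaginary parts independently of each other
print(repr(z))

# an explicit 'g' format spec with 7 significant digits is also applied per part
print(format(z, ".7g"))
```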

However, we have "known" about this for a long time.
In R/tests/reg-tests-2.Rout.save, line 5641 ff, we have had
(from svn c39481 | 2006-09-22, log 'tweak printing of complex numbers'):

    > ## printing of complex numbers of very different magnitudes
    > 1e100  + 1e44i
    [1] 1e+100+0e+00i
    > 1e100 + pi*1i*10^(c(-100,0,1,40,100))
    [1] 1e+100+ 0.000000e+00i 1e+100+ 0.000000e+00i 1e+100+ 0.000000e+00i
    [4] 1e+100+ 0.000000e+00i 1e+100+3.141593e+100i
    > ## first was silly, second not rounded correctly in 2.2.0 - 2.3.1
    > ## We don't get them lining up, but that is a printf issue
    > ## that only happens for very large complex nos.
    > 
    > 
    > ### end of tests added in 2.4.0 ###

and with your change the output of the above example also
changes.

I now found more history:
The fundamental change is indeed from 2005, for R 2.2.0:

------------------------------------------------------------------------
r35253 | ripley | 2005-08-11 18:34:24 +0200 (Thu, 11 Aug 2005) | 2 lines
Changed paths:
   M NEWS
   M src/main/complex.c
   M src/main/format.c
   M tests/arith.R
   M tests/arith.Rout.save
   M tests/complex.R
   M tests/complex.Rout.save
   M tests/print-tests.Rout.save

enhance printing of complex numbers to use pairs together.
---------------------------------------------------------------

where the NEWS entry (now in  <R>/doc/NEWS.2 )  has been 

    o	The printing of complex numbers has changed, handling numbers
	as a whole rather than in two parts.  So both real and
	imaginary parts are shown to the same accuracy, with the
	'digits' parameter referring to the accuracy of the larger
	component, and both components are shown in fixed or
	scientific notation (unless one is entirely zero when it is
	always shown in fixed notation).

There was a bug in the implementation, about which Robin Hankin
wrote about a year later in a thread on R-devel:

  https://stat.ethz.ch/pipermail/r-devel/2006-September/042792.html

where Brian Ripley described the behavior above (not the buggy one)
as "by design", and Robin then also found and mentioned the above
NEWS entry.  Notably, the above regression test was added after the
bug fix, also in Sept. 2006.

Apart from that NEWS entry (now basically only visible in the
source code), the only documentation of this behavior is the
one sentence in the  ?signif  help page

  "Complex numbers are rounded to retain the specified number
   of digits in the larger of the components"

And whereas this does make (some) sense for
    signif(<complex>, digits),

it seems wrong to me (and you) for format() and print(), and
I think we should reconsider using it when format()ing complex
numbers.

Note that your proposed change does change the z_prec_r()
function which in fact *is* used for signif(<complex>), and so
your change also does affect signif(<complex>) which I think is
more than I'd want here.

For instance, it would change the following (regression test in
tests/complex.R )

> signif(1.678932e80+0i, 5)
[1] 1.678932e+80+0i
>

instead of really showing 5 digits for the real part as it does
now :

[1] 1.6789e+80+0i
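
To make the interaction concrete, here is a rough Python model of the
rounding rule as it can be read from the quoted diff and the ?signif
sentence.  It is a sketch under assumptions, not the C code in
complex.c: the function name and the `clamp` argument are made up for
illustration, Python's round() stands in for fround(), and non-finite
values are not handled.

```python
import math

def round_to_larger_component(z, digits, clamp=False):
    """Round both parts of z to `digits` significant digits of the larger part."""
    m = max(abs(z.real), abs(z.imag))
    if m == 0.0:
        return z
    # number of decimal places implied by the magnitude of the larger component
    places = digits - math.floor(math.log10(m)) - 1
    if clamp and places < 1:
        places = 1  # models the proposed 'if(digits < 1) digits=1' guard
    return complex(round(z.real, places), round(z.imag, places))

# display case: at 7 digits the imaginary part 5 is rounded away ...
print(round_to_larger_component(complex(1e10, 5), 7))
# ... and survives with the guard
print(round_to_larger_component(complex(1e10, 5), 7, clamp=True))

# signif-like case: without the guard the real part is rounded to 5 digits,
# with the guard it is left unrounded
print(round_to_larger_component(complex(1.678932e80, 0), 5))
print(round_to_larger_component(complex(1.678932e80, 0), 5, clamp=True))
```

In this model the same guard that rescues the imaginary part in the
display case also disables the rounding in the signif() case, because
both go through the one shared rounding step.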

---------------------

I'm proposing to reconsider and change such that

1) print() {and auto-print} and format() of complex numbers should
   treat Re() and Im() separately, and identically to how doubles
   are print()ed / format()ted, notably applying 'digits' to each
   part separately.

2) We can and probably should keep signif(<complex>)'s  behavior
   as current,  as that has been documented to behave as it does
   for almost 13 years.
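
As a minimal sketch of what 1) could look like, again in Python rather
than R or C: the helper name below is hypothetical, and the C-style
"%g" conversion is only a stand-in, since R's formatting of doubles
chooses between fixed and scientific notation by its own rules.

```python
import math

def format_parts_separately(z, digits=7):
    """Format Re(z) and Im(z) independently, each to `digits` significant digits."""
    re = format(z.real, f".{digits}g")
    im = format(abs(z.imag), f".{digits}g")
    # copysign also catches a negative-zero imaginary part
    sign = "-" if math.copysign(1.0, z.imag) < 0 else "+"
    return f"{re}{sign}{im}i"

for z in (complex(1e10, 5), complex(1e100, 1e44), complex(1.5, -2e-3)):
    print(format_parts_separately(z))
```

Here 'digits' applies to each part on its own, so neither part can
push the other out of the displayed precision.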

Martin