[R] unique and precision of long integers

Thomas Lumley tlumley at u.washington.edu
Mon May 14 17:41:15 CEST 2001


On Mon, 14 May 2001, Michael Herron wrote:

>
> Hello.
>
> I have a dataset with about 500,000 observations, most of which are
> not unique.  The first 10 observations look like
>
> 901000000000100000010100101011002
> 901101101110100000010100101011002
> 901000000000100000010100000001002
> 901000000000100000010101001011002
> 901000000000100000010101010011002
> 901000000000100000010100110101002
> 901000000000100000010100101011002
> 900000000000100000010010101011002
> 901000000000100000010100101101002
> 901000000000100000010100101011002
>
> Each digit reflects a separate field, but above all spaces are
> removed.
>
> I read in the data with scan(), and then use unique() to get the
> unique observations.  But, when I print these elements to a file I
> lose precision.  For instance, let x be a vector of the first 10
> observations from the dataset:
>
> > write (x,file="output",ncol=1)
>
> more output
>
> 9.01e+32
> 9.011011e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9.01e+32
> 9e+32
> 9.01e+32
> 9.01e+32
>
> Is there a way to get all the digits back?
>
> > write (format(x,digits=22),file="output",ncol=1)
>
> does not do it, and I cannot seem to increase digits >22.
>

You can't store numbers to more than the precision provided by your
compiler/hardware. R holds these values as double-precision floating
point, so there are probably only about 16 accurate digits no matter how
many digits R prints.
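
A small sketch of the effect, assuming IEEE 754 doubles (the usual case,
but an assumption here). It takes two of the observations from the question
as strings, converts them to numeric, and checks what survives. The
variable names are only for illustration.

```r
# Two distinct observations from the question, kept as strings first.
s1 <- "901000000000100000010100101011002"
s2 <- "901000000000100000010100101101002"

# Converting to numeric stores each value as a double.
x1 <- as.numeric(s1)
x2 <- as.numeric(s2)

# Bits in the significand of a double (53 on IEEE 754 hardware),
# which corresponds to roughly 15-16 decimal digits.
print(.Machine$double.digits)

# Printing the double back with no decimals does not reproduce the input.
back <- sprintf("%.0f", x1)
print(back == s1)

# At this magnitude neighbouring doubles are far more than 1 apart,
# so adding 1 does not change the stored value.
print(x1 + 1 == x1)

# The two observations differ only in digits well below that spacing,
# so they are very likely to compare equal once stored as doubles.
print(x1 == x2)
```

This is why unique() on the numeric vector can merge observations that
are different in the file.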

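In order to unique() them you can read them as strings, for example with
scan(what=""), so that every digit is kept exactly as it appears in the
file. unique() works on character vectors, and the strings can be written
back out unchanged.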
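
A minimal sketch of that approach, using the ten observations from the
question. The temporary files stand in for the real data file and the
"output" file, and the variable names are made up for the example.

```r
# The ten observations quoted in the question, one per line.
obs <- c("901000000000100000010100101011002",
         "901101101110100000010100101011002",
         "901000000000100000010100000001002",
         "901000000000100000010101001011002",
         "901000000000100000010101010011002",
         "901000000000100000010100110101002",
         "901000000000100000010100101011002",
         "900000000000100000010010101011002",
         "901000000000100000010100101101002",
         "901000000000100000010100101011002")

# Stand-in for the real data file.
infile <- tempfile()
writeLines(obs, infile)

# what = character() makes scan() return strings instead of numbers,
# so no digits are rounded away.
x <- scan(infile, what = character(), quiet = TRUE)

# unique() compares the strings character by character.
u <- unique(x)

# Write the unique observations back out, one per line, all digits intact.
outfile <- tempfile()
writeLines(u, outfile)
print(u)
```

Since each digit is a separate field, substr() can pull individual fields
out of the strings later if they are needed as numbers.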

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
