[R] Controlling number of numbers before R rewrites to "+e18" etc

jim holtman jholtman at gmail.com
Sat Oct 23 03:56:23 CEST 2010


Your best bet is to make sure that you read the IDs in as characters.
If they are being read in as floating point numbers, then there are
only about 15 significant digits of accuracy, so if you have IDs of
18-22 digits, you will lose the trailing digits.  So if you are using
read.table, then look at the colClasses argument to see how to do this.
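
Something along these lines might work. This is only a sketch, since I
have not seen your data: the file names, the column names and the two
sample IDs are made up, and I am assuming the ID is the first column
and is followed by one numeric column.

```r
# Hypothetical input file with two 20-digit IDs and a numeric column.
infile <- tempfile(fileext = ".csv")
writeLines(c("id,value",
             "12345678901234567890,1.5",
             "98765432109876543210,2.5"), infile)

# Default read: the IDs are converted to double, so the digits beyond
# roughly the 15th are no longer reliable.
as_num <- read.csv(infile)
class(as_num$id)

# Read the first column as character and the second as numeric.
as_chr <- read.csv(infile, colClasses = c("character", "numeric"))
as_chr$id

# Write the data back out; the IDs are plain strings, so they are
# written exactly as they were read.
outfile <- tempfile(fileext = ".csv")
write.csv(as_chr, outfile, row.names = FALSE, quote = FALSE)
readLines(outfile)
```

Character columns also avoid the overhead of building factor levels for
millions of distinct IDs.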

Provide a subset of your data and the statements that you are using to
read in the data.

On Fri, Oct 22, 2010 at 1:15 PM, ZeMajik <zemajik at gmail.com> wrote:
> Hey,
>
> I'm using R as a pre-processor for a large dataset with IDs which are
> numeric (but have no numeric meaning, so they can be seen as factors).
> I do some data formatting and then write it out to a csv file.
>
> However the problem is that the IDs are very long, 18-22 chars long more
> precisely. R is constantly rewriting these IDs to the abbreviated +eX which
> hinders me from exporting the data to the csv since the IDs are no longer
> intact.
> I've tried telling R that ID column is a factor, but this results in two
> problems: 1) Since I have millions of rows and R is slower handling factors
> than numbers my comp can't run the process in any kind of reasonable time.
> and 2) Some IDs STILL seem to be rewritten somehow. The second point made me
> believe that perhaps R is rewriting upon import?
>
> Does anyone have any tips on how to solve this problem?
>
> Thanks,
> Mike



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


