[Rd] Unexplicable difference between 2 R installations regarding reading numbers

Joris Meys jorismeys at gmail.com
Mon Nov 3 17:06:40 CET 2014


...and apparently I have 3.1.1 installed here, instead of 3.1.0 like on the
server. That illustrates very nicely the lack of coffee I experienced on
this Monday.

Thank you!
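
For anyone finding this thread later, the difference described in the
NEWS entries quoted below can be reproduced with a small sketch. It
assumes R 3.1.1 or later (where the numerals argument exists). The
two-row CSV is made up for illustration, reusing two values from the
output below, and the variable names are hypothetical.

```r
# Made-up two-row CSV; the values are copied from the quoted output.
csv <- "ID,Cd5\n1,11.140969600635804\n2,4.8696848424212105\n"

# Default since R 3.1.1: convert to double and accept the rounding.
by_default <- read.csv(text = csv, numerals = "allow.loss")

# R 3.1.0-style behaviour: values that would lose accuracy as a double
# are not converted to numeric (they become factors or strings,
# depending on stringsAsFactors).
no_loss <- read.csv(text = csv, numerals = "no.loss",
                    stringsAsFactors = TRUE)

# Independent of the default: declare the column types explicitly.
explicit <- read.csv(text = csv, colClasses = c("integer", "numeric"))

str(by_default)
str(no_loss)
str(explicit)
```

Passing colClasses is probably the more robust choice when the same
script has to run on installations with different R versions.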

On Mon, Nov 3, 2014 at 4:41 PM, Simon Urbanek <simon.urbanek at r-project.org>
wrote:

> R version.
>
> NEWS for 3.1.0:
>
>       type.convert() (and hence by default
>       read.table()) returns a character vector or factor when
>       representing a numeric input as a double would lose accuracy.
>       Similarly for complex inputs.
>
> NEWS for 3.1.1:
>
>       type.convert(), read.table() and similar
>       read.*() functions get a new numerals argument,
>       specifying how numeric input is converted when its conversion to
>       double precision loses accuracy.  The default value,
>       "allow.loss", allows accuracy loss, as in R versions before
>       3.1.0.
>
>
> On Nov 3, 2014, at 10:07 AM, Joris Meys <jorismeys at gmail.com> wrote:
>
> > Dear all,
> >
> > A colleague of mine reported a problem that I fail to understand
> > completely. He has a number of .csv files that all look very
> > straightforward, and they all read in perfectly well using read.csv() on
> > both his and my computer.
> >
> > When we try the exact same R version on the university server however,
> > suddenly all numeric variables turn into factors. The problem is resolved
> > by deleting the last digits of every number in the .csv file.  Using
> > as.numeric() on the values works as well.
> >
> > Does anybody have a clue as to what might cause this problem? If needed, I can send
> > an example of a .csv file.
> >
> > Example output on server:
> >
> >> X <- read.csv("Originelen/Originelen/heavymetals.csv")
> >> levels(X[[2]])
> > [1] "11.140969600635804" "11.548972671055257" "11.98554898321271"
> > [4] "16.317868213178677" "17.179218967921898" "18.596573461949852"
> > [7] "18.786014405762298" "18.87978032658098"  "23.604106448719225"
> > [10] "26.75482955698816"  "27.33829851044687"  "29.26619704952923"
> > [13] "33.07842352705811"  "39.296270581233884" "4.8696848424212105"
> > [16] "5.5751725517655295" "6.0256909109049195" "9.117975845892804"
> > [19] "9.26944194868723"
> >> str(X)
> > 'data.frame':   19 obs. of  18 variables:
> > $ ID   : int  1 2 3 4 5 6 7 8 9 10 ...
> > $ Cd5  : Factor w/ 19 levels "11.140969600635804",..: 3 8 6 12 11 10 2 5 14 13 ...
> > $ Cd20 : Factor w/ 19 levels "10.160499999999999",..: 2 8 10 12 5 6 18 9 11 4 ...
> > $ Cr5  : Factor w/ 19 levels "118.43421710855425",..: 6 11 10 17 16 15 7 13 19 18 ...
> > $ Cr20 : Factor w/ 19 levels "100.48101898101898",..: 9 15 14 17 13 11 6 16 18 12 ...
> > $ Cu5  : Factor w/ 19 levels "101.8005401620486",..: 8 17 16 15 14 12 9 18 19 1 ...
> > $ Cu20 : Factor w/ 19 levels "103.67346938775509",..: 11 18 19 2 16 17 14 3 4 1 ...
> > $ Fe5  : Factor w/ 19 levels "17239.349496158833",..: 3 8 10 9 12 14 7 16 19 18 ...
> > $ Fe20 : Factor w/ 19 levels "17701.77893264042",..: 3 14 16 18 10 15 6 17 19 13 ...
> > $ Mn5  : Factor w/ 19 levels "440.37211163349",..: 10 14 4 5 3 17 2 7 18 6 ...
> > $ Mn20 : Factor w/ 19 levels "375.19156134938805",..: 12 2 6 3 1 9 11 7 8 5 ...
> > $ Ni5  : Factor w/ 19 levels "19.54255213010077",..: 4 12 8 10 11 16 6 14 19 18 ...
> > $ Ni20 : Factor w/ 19 levels "21.295222866280234",..: 8 13 15 18 12 16 7 17 19 14 ...
> > $ Pb5  : Factor w/ 19 levels "125.5616926977306",..: 1 11 14 9 13 8 5 12 15 16 ...
> > $ Pb20 : Factor w/ 19 levels "106.96930306969303",..: 3 8 11 12 9 10 4 13 14 15 ...
> > $ Zn5  : Factor w/ 19 levels "1024.909963985594",..: 17 4 7 5 8 3 18 6 9 10 ...
> > $ Zn20 : Factor w/ 19 levels "1247.816195886593",..: 15 4 5 7 2 1 16 6 8 3 ...
> > $ river: int  1 1 1 1 1 1 1 1 1 1 ...
> >
> > Using as.numeric(levels(X[[2]])) works perfectly fine though...
> >
> > Session info for both the server and my own computer:
> >
> >> sessionInfo()
> > R version 3.1.0 (2014-04-10)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> >
> > locale:
> > [1] LC_COLLATE=Dutch_Belgium.1252  LC_CTYPE=Dutch_Belgium.1252
> > [3] LC_MONETARY=Dutch_Belgium.1252 LC_NUMERIC=C
> > [5] LC_TIME=Dutch_Belgium.1252
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] tools_3.1.0
> >
> > --
> > Joris Meys
> > Statistical consultant
> >
> > Ghent University
> > Faculty of Bioscience Engineering
> > Department of Mathematical Modelling, Statistics and Bio-Informatics
> >
> > tel :  +32 (0)9 264 61 79
> > Joris.Meys at Ugent.be
> > -------------------------------
> > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>


-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
