[R] Dealing with factors ???

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Fri Nov 16 07:40:41 CET 2012


Your numeric data appear to contain commas (thousands separators), so read.csv could not parse those columns as numbers and stored them as factors instead. You don't say where you got the data, but Excel does this; if that is the case, a straightforward fix is to load the file in Excel and set the formatting of all numeric columns to "General" before saving again.

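You can also fix it in R, using gsub to replace the commas with empty strings and as.numeric to convert the result to numeric form. Note that calling as.numeric directly on a factor returns the internal level codes rather than the values, so the commas have to be removed from the text first. There are examples of this in the mailing list archives.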
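
A minimal sketch of the gsub/as.numeric approach follows. It rebuilds only a few columns from the first four rows shown in the quoted head(x,4) output, and it assumes the columns were read as factors, as in the quoted str(x). The helper name comma_to_numeric is made up for illustration and is not part of base R.

```r
# Small stand-in for the data frame in the question (first four rows only).
# stringsAsFactors = TRUE reproduces the factor columns shown by str(x).
x <- data.frame(
  Period = c(200812L, 200901L, 200902L, 200903L),
  PA     = c(" 903,231 ", " 880,491 ", " 883,994 ", " 888,021 "),
  TX     = c(" 3,312,088 ", " 3,006,050 ", " 3,247,530 ", " 3,216,730 "),
  dat    = as.Date(c("2008-12-31", "2009-01-31", "2009-03-03", "2009-03-31")),
  stringsAsFactors = TRUE
)

# Hypothetical helper: turn the factor into text, drop commas and
# spaces, then convert the remaining digits to numeric.
comma_to_numeric <- function(v) {
  as.numeric(gsub("[, ]", "", as.character(v)))
}

# Convert every affected column in one step.
cols <- c("PA", "TX")
x[cols] <- lapply(x[cols], comma_to_numeric)

str(x)                          # PA and TX should now be 'num'
plot(x$dat, x$TX, type = "l")   # plots the values, not the factor codes
```

With the real data the cols vector would presumably list PA, NJ, MD, TX and All.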
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

eric <ericstrom at aol.com> wrote:

>I have a data frame x that came from read.csv. It seemed to read in OK, but
>then I tried plotting the values and ran into difficulties.
>The plot command seems to be plotting factors instead of the values. How do
>I get rid of these factors? The plot command I use is: plot(x$dat, x$TX,
>type='l'). I also tried plot(x$dat, levels(x$TX), type='l') but got an
>error:
>
>What am I doing wrong here?
>
>Error in plot.window(...) : need finite 'ylim' values
>In addition: Warning messages:
>1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
>2: In min(x) : no non-missing arguments to min; returning Inf
>3: In max(x) : no non-missing arguments to max; returning -Inf
>
>> head(x,4)
>  Period        PA          NJ        MD          TX         All         dat
>1 200812   903,231   1,985,460   905,422   3,312,088   7,106,201  2008-12-31
>2 200901   880,491   1,924,111   892,980   3,006,050   6,703,631  2009-01-31
>3 200902   883,994   1,926,169   890,021   3,247,530   6,947,714  2009-03-03
>4 200903   888,021   1,901,182   892,593   3,216,730   6,898,526  2009-03-31
>> str(x)
>'data.frame':	41 obs. of  7 variables:
> $ Period: int  200812 200901 200902 200903 200904 200905 200906 200907 200908 200909 ...
> $ PA    : Factor w/ 41 levels " 818,037 "," 823,191 ",..: 26 22 23 25 19 7 10 2 1 12 ...
> $ NJ    : Factor w/ 41 levels " 1,599,113 ",..: 31 28 29 27 22 19 20 17 14 16 ...
> $ MD    : Factor w/ 41 levels " 800,827 "," 807,154 ",..: 27 25 23 24 15 13 11 6 5 3 ...
> $ TX    : Factor w/ 41 levels " 2,472,690 ",..: 41 23 40 39 35 34 32 21 18 27 ...
> $ All   : Factor w/ 41 levels " 6,111,993 ",..: 40 27 38 36 25 21 19 13 11 16 ...
> $ dat   :Class 'Date'  num [1:41] 14244 14275 14306 14334 14365 ...



