[R] convert Factor as numeric

arnaud Gaboury arnaud.gaboury at gmail.com
Thu Apr 29 14:09:56 CEST 2010


Thanks, Petr, I was just trying something like that a few minutes ago :-)

as.numeric(gsub(",", "", S))  does exactly what I want. 
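For reference, here is a minimal sketch of why the earlier attempts returned NA and why stripping the comma first works. The factor `S` below is a small stand-in built from a few of the SETTLEMENT values in the quoted message, not the full column, and the names `bad` and `good` are only illustrative.

```r
# Stand-in factor using a few values from the quoted SETTLEMENT column
S <- factor(c("16.5400", "78.1300", "1,739.4000", "1,353.0000"))

# as.numeric() cannot parse strings that contain a comma,
# so those entries become NA (warning suppressed here)
bad <- suppressWarnings(as.numeric(as.character(S)))

# gsub() coerces the factor to character and drops every comma;
# the cleaned strings then parse as numbers
good <- as.numeric(gsub(",", "", S))

print(bad)
print(good)
```

The direct conversion fails only on the values that carry a thousands separator, which matches the NA warnings in the quoted message.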




> -----Original Message-----
> From: Petr PIKAL [mailto:petr.pikal at precheza.cz]
> Sent: Thursday, April 29, 2010 1:28 PM
> To: arnaud Gaboury
> Cc: r-help at r-project.org
> Subject: Odp: [R] convert Factor as numeric
> 
> Hi
> 
> You have to get rid of the thousands separator first
> 
> as.numeric(gsub(",", "", S))
> 
> Regards
> Petr
> 
> r-help-bounces at r-project.org wrote on 29.04.2010 13:12:44:
> 
> > Dear group,
> >
> > I know this issue has already been covered, and before you reply I must say
> > I have read the R-FAQ and searched the mailing list archive.
> > I still can't manage to change my factor to numeric, as I couldn't find any
> > clear answer.
> >
> > Here is my df :
> >
> > Pose1 <-
> > structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L,
> > 8L), .Label = c(" SUGAR NO.11 May/10 ", "COTTON NO.2 May/10 ",
> > "PLATINUM Jul/10 ", "ROBUSTA COFFEE (10) May/10 ", "WHEAT May/10 ",
> > "PRIMARY NICKEL USD", "PRM HGH GD ALUMINIUM USD",
> > "SPCL HIGH GRADE ZINC USD",
> > "STANDARD LEAD USD"), class = "factor"), POSITION = c(5, 3, -1,
> > 15, 4, 2), SETTLEMENT = structure(c(3L, 5L, 2L, 1L, 4L, 8L), .Label =
> > c("1,353.0000",
> > "1,739.4000", "16.5400", "467.7500", "78.1300", "25,760.8600",
> > "2,415.9000", "2,421.0500", "2,357.1200"), class = "factor")), .Names =
> > c("DESCRIPTION",
> > "POSITION", "SETTLEMENT"), row.names = c("1", "2", "3", "4",
> > "5", "51"), class = "data.frame")
> >
> > >S<-Pose1$SETTLEMENT  #select the last column
> > > S
> > [1] 16.5400    78.1300    1,739.4000 1,353.0000 467.7500   2,421.0500
> > Levels: 1,353.0000 1,739.4000 16.5400 467.7500 78.1300 25,760.8600
> > 2,415.9000 2,421.0500 2,357.1200
> > > str(S)
> >  Factor w/ 9 levels "1,353.0000","1,739.4000",..: 3 5 2 1 4 8
> >
> > Now I need to change S to numeric class
> >
> > > S1<-as.numeric(levels(S))[as.integer(S)]   #doesn't work, numbers are rounded or NA
> > Warning message:
> > NAs introduced by coercion
> >
> > > S1<-as.numeric(levels(S))[S]  #doesn't work, numbers are rounded or NA
> > Warning message:
> > NAs introduced by coercion
> >
> > > S1<-as.numeric(as.character(S))  #doesn't work, numbers are rounded or NA
> > Warning message:
> > NAs introduced by coercion
> >
> > If it helps, my column S is part of a data frame that was obtained via this
> > line:
> >
> >
> > pose=read.csv2("LSCPos1.csv",sep=",",dec=".",as.is=T,h=T,skip=1)[,c(4,8,14,15)]
> >
> > pose <-
> > structure(list(DESCRIPTION = c("WHEAT May/10 ", "WHEAT May/10 ",
> > "WHEAT May/10 ", "WHEAT May/10 ", "COTTON NO.2 May/10 ",
> > "COTTON NO.2 May/10 ",
> > "COTTON NO.2 May/10 ", "PLATINUM Jul/10 ", " SUGAR NO.11 May/10 ",
> > " SUGAR NO.11 May/10 ", " SUGAR NO.11 May/10 ",
> > " SUGAR NO.11 May/10 ",
> > " SUGAR NO.11 May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "ROBUSTA COFFEE (10) May/10 ", "ROBUSTA COFFEE (10) May/10 ",
> > "PRM HGH GD ALUMINIUM USD 09/07/10 ",
> > "PRM HGH GD ALUMINIUM USD 09/07/10 ",
> > "PRIMARY NICKEL USD 04/06/10 ", "PRIMARY NICKEL USD 04/06/10 ",
> > "PRIMARY NICKEL USD 10/06/10 ", "PRIMARY NICKEL USD 10/06/10 ",
> > "STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
> > "STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
> > "STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 01/07/10 ",
> > "STANDARD LEAD USD 01/07/10 ", "STANDARD LEAD USD 06/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 08/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 08/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 08/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 09/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 09/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 09/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 09/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 09/07/10 ",
> > "SPCL HIGH GRADE ZINC USD 13/04/10 ",
> > "SPCL HIGH GRADE ZINC USD 13/04/10 "
> > ), CREATED.DATE = structure(c(14705, 14707, 14707, 14711, 14700,
> > 14700, 14711, 14711, 14708, 14708, 14708, 14711, 14711, 14707,
> > 14707, 14707, 14707, 14707, 14708, 14708, 14708, 14708, 14708,
> > 14708, 14708, 14708, 14708, 14672, 14673, 14678, 14678, 14700,
> > 14700, 14700, 14700, 14700, 14700, 14700, 14705, 14707, 14707,
> > 14707, 14708, 14708, 14708, 14708, 14708, 14622, 14634),
> > class = "Date"),
> >     QUANITY = c(1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 2, 1,
> >     1, 1, 2, 1, 1, 1, 1, 2, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1,
> >     1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1),
> >     CLOSING.PRICE = c("467.7500",
> >     "467.7500", "467.7500", "467.7500", "78.1300", "78.1300",
> >     "78.1300", "1,739.4000", "16.5400", "16.5400", "16.5400",
> >     "16.5400", "16.5400", "1,353.0000", "1,353.0000", "1,353.0000",
> >     "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000",
> >     "1,353.0000",
> >     "1,353.0000", "1,353.0000", "1,353.0000", "1,353.0000",
> >     "2,415.9000",
> >     "2,415.9000", "25,755.7100", "25,755.7100", "25,760.8600",
> >     "25,760.8600", "2,355.9600", "2,355.9600", "2,355.9600",
> >     "2,355.9600", "2,355.9600", "2,355.9600", "2,355.9600",
> >     "2,357.1200",
> >     "2,420.7300", "2,420.7300", "2,420.7300", "2,421.0500",
> >     "2,421.0500",
> >     "2,421.0500", "2,421.0500", "2,421.0500", "2,388.4300",
> >     "2,388.4300"
> >     )), .Names = c("DESCRIPTION", "CREATED.DATE", "QUANITY",
> > "SETTLEMENT"), row.names = c(NA, -49L), class = "data.frame")
> >
> > > str(pose)
> > 'data.frame':   49 obs. of  4 variables:
> >  $ DESCRIPTION : chr  "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " "WHEAT May/10 " ...
> >  $ CREATED.DATE:Class 'Date'  num [1:49] 14705 14707 14707 14711 14700 ...
> >  $ QUANITY     : num  1 1 1 1 1 1 1 -1 1 1 ...
> >  $ SETTLEMENT  : chr  "467.7500" "467.7500" "467.7500" "467.7500" ...
> >
> >
> > "pose$SETTLEMENT" has class "character" when it should have been
> > "numeric". So maybe a solution would be to assign a numeric class when I
> > read my .csv file?
> > I tried to change the class of this column right after read.csv() (using
> > type.convert() left me with a factor), but again got rounded numbers or
> > NA.
> >
> > So, what am I supposed to do??
> >
> > TY for the help.
> >

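On the question in the quoted message about getting a numeric column right after reading the file: one possible approach is to clean the character column in place once the data frame exists. This is only a sketch on a small stand-in data frame, not the real LSCPos1.csv. The name `pose_demo` is hypothetical and its values are a few taken from the quoted dput output.

```r
# Hypothetical stand-in for the data frame returned by
# read.csv2(..., as.is = TRUE), where SETTLEMENT arrives as character
pose_demo <- data.frame(
  QUANITY    = c(1, -1, 1),
  SETTLEMENT = c("467.7500", "1,739.4000", "25,755.7100"),
  stringsAsFactors = FALSE
)

# Remove the thousands separator, then convert the column to numeric
pose_demo$SETTLEMENT <- as.numeric(
  gsub(",", "", pose_demo$SETTLEMENT, fixed = TRUE)
)

str(pose_demo)
```

The same line should work whether the column is character or factor, since gsub() coerces its input to character before substituting.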

