[R] Ordering problem

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
Fri Nov 25 14:01:41 CET 2005


Another possibility would be to use something like:

v1 <- c(1, 2, 3); v2 <- c("a", "b", "c"); v3 <- c("1", "2", "3")
dat <- data.frame(v1, v2, v3)
# convert every column to character, then let type.convert() re-infer its type
dat <- lapply(dat, as.character)
dat <- as.data.frame(lapply(dat, type.convert))

dat
sapply(dat, data.class)
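
If the starting point is a character matrix rather than a data frame, as in John's case, the same idea might be sketched as below. The matrix m and its column names are made up for illustration, and as.is = TRUE is an assumption about the wanted behaviour: it keeps non-numeric columns as character instead of turning them into factors, roughly what read.csv(..., as.is = TRUE) does.

```r
# Hypothetical character matrix standing in for the real data
m <- cbind(ss = c("1", "10", "2"),
           id = c("a", "b", "c"),
           x  = c("1.5", "2", "3.25"))

# Turn the matrix into a data frame of character columns,
# then let type.convert() choose a type for each column
dat2 <- as.data.frame(m, stringsAsFactors = FALSE)
dat2[] <- lapply(dat2, type.convert, as.is = TRUE)

sapply(dat2, class)
```

Assigning into dat2[] keeps the object a data frame, so no second call to as.data.frame() is needed.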



I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "John Logsdon" <j.logsdon at quantex-research.com>
To: <r-help at stat.math.ethz.ch>
Sent: Friday, November 25, 2005 1:25 PM
Subject: Re: [R] Ordering problem


> Thanks to Florence but it needs a little modification.  However as I have
> now discovered the str() command, things are looking up. :))
>
> I have a character matrix so I() just leaves it as characters whereas I
> want the various columns to be integers or whatever they contain.
>
> To take Florence's example slightly extended:
>
>> v1<-c(1,2,3);v2<-c("a","b","c");v3<-c("1","2","3")
>
> Note that the third vector is a character with numerical contents.
>
>> data.frame(v1,v2,v3)
>  v1 v2 v3
> 1  1  a  1
> 2  2  b  2
> 3  3  c  3
>
> so it looks OK, but
>
>> str(data.frame(v1,v2,v3))
> `data.frame':   3 obs. of  3 variables:
> $ v1: num  1 2 3
> $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
> $ v3: Factor w/ 3 levels "1","2","3": 1 2 3
>
> reveals the nasty truth!
>
> whereas
>
>> str(data.frame(v1,v2,I(v3)))
> `data.frame':   3 obs. of  3 variables:
> $ v1: num  1 2 3
> $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
> $ v3:Class 'AsIs'  chr [1:3] "1" "2" "3"
>
> just keeps the character v3 as characters.  I want it to be interpreted as
> numeric so:
>
>> str(data.frame(v1,v2,as.numeric(v3)))
> `data.frame':   3 obs. of  3 variables:
> $ v1            : num  1 2 3
> $ v2            : Factor w/ 3 levels "a","b","c": 1 2 3
> $ as.numeric.v3.: num  1 2 3
>
> actually gives me what I need.
>
> The only problem is that I have to do everything column by column and
> there are 15 cols in all.  So it makes particularly ugly coding to reproduce
> an as.is read from a .csv file.
>
> The other solutions from Baz and Carlos would also work of course - but
> they are still pretty horrible.  Perhaps another way to do this is to
> write it out using cat then read it in again using as.is=TRUE!! ;)
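
The write-out-and-read-back idea can be sketched roughly as follows. This uses write.csv() and a temporary file instead of cat, and the data frame HN here is a small made-up stand-in, not the real data.

```r
# Hypothetical data frame whose columns are all character
HN <- data.frame(ss = c("1", "10", "2"),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv")        # temporary file, removed below
write.csv(HN, tmp, row.names = FALSE)    # write the data out as CSV
HN2 <- read.csv(tmp, as.is = TRUE)       # read back; column types are re-inferred
unlink(tmp)

str(HN2)
```

It works, but it goes through the disk only to trigger the type conversion that read.csv() does internally.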
>
> Thanks to one and all
>
> Best wishes
>
> John
>
> John Logsdon                               "Try to make things as simple
> Quantex Research Ltd, Manchester UK         as possible but not simpler"
> j.logsdon at quantex-research.com          a.einstein at relativity.org
> +44(0)161 445 4951/G:+44(0)7717758675       www.quantex-research.com
>
>
> On Fri, 25 Nov 2005, Florence Combes wrote:
>
>> John,
>>
>> at ?factor, you can see :
>>
>> " Be careful only to compare factors with the
>>   same set of levels (in the same order).  In particular,
>>   'as.numeric' applied to a factor is meaningless, and may happen by
>>   implicit coercion.  To "revert" a factor 'f' to its original
>>   numeric values, 'as.numeric(levels(f))[f]' is recommended and
>>   slightly more efficient than 'as.numeric(as.character(f))'. "
>>
>> 'as.numeric(levels(f))[f]'  worked well for me in the similar situation i.e.
>> to get back numeric values from a factor type.
>> But see also the I() "option" of the data.frame() function, which allows you
>> not to obtain a factor (from a character vector only) if it is not what you
>> want.
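
A small illustration of the ?factor advice quoted above, using a made-up factor f whose labels are numbers stored as strings:

```r
# Hypothetical factor; its levels sort as strings: "10" "13" "2"
f <- factor(c("10", "2", "13", "2"))

as.numeric(f)                 # internal level codes, not the data
as.numeric(levels(f))[f]      # original values, as recommended in ?factor
as.numeric(as.character(f))   # same values, slightly less efficient
```

The first call returns the position of each value in levels(f); the other two convert the labels themselves.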
>>
>> from ?data.frame :
>>
>> "Objects passed to 'data.frame' should have the same number of
>>      rows, but atomic vectors, factors and character vectors protected
>>      by 'I' will be recycled a whole number of times if necessary."
>>
>>
>> see this example:
>> --------------------------------------------------
>> > v1<-c(1,2,3)
>> > v2<-c("a","b","c")
>> > df.A<-data.frame(v1,v2)
>> > str(df.A)
>> `data.frame':   3 obs. of  2 variables:
>>  $ v1: num  1 2 3
>>  $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
>> > df.B<-data.frame(v1,I(v2))
>> > str(df.B)
>> `data.frame':   3 obs. of  2 variables:
>>  $ v1: num  1 2 3
>>  $ v2:Class 'AsIs'  chr [1:3] "a" "b" "c"
>> -------------------------------------------------
>>
>> hope this helps,
>>
>> Florence.
>>
>>
>>
>>
>>
>> On 11/25/05, John Logsdon <j.logsdon at quantex-research.com> wrote:
>> >
>> > I have an ordering and factor problem to which there must be a simple
>> > solution!  The version is R 2.0.1  (2004-11-15) on a Linux platform.
>> >
>> > A data frame H is read in from a .csv file using read.csv with as.is=TRUE.
>> >
>> > Another data frame HN is constructed from data and I want to compare two
>> > columns both named ss of the (sorted) data frames that are the same
>> > length.
>> >
>> > The problem is that HN$ss is always treated as a factor whatever I do
>> > while H$ss is treated as an integer, which is what I want.  Somewhere R is
>> > making an implicit transformation but I can't see how to correct it.
>> >
>> > The data are all integers in the range 1:13 - in fact with no gaps.  If I
>> > tabulate from H:
>> >
>> > > table(H$ss)
>> >
>> >    1    2    3    4    5    6    7    8    9   10   11   12   13
>> > 176  176  176  176  176  176  341 8726 8784 8777 8773 8749 8747
>> >
>> > and for HN:
>> >
>> > > table(HN$ss)
>> >
>> >    1   10   11   12   13    2    3    4    5    6    7    8    9
>> > 176 8777 8773 8749 8747  176  176  176  176  176  341 8726 8784
>> >
>> > At some time while constructing HN, I have to make it a character matrix -
>> > otherwise gsub doesn't work when removing surplus blanks for example - but
>> > I have turned it back into a data frame in the end.
>> >
>> > If I check the modes, both data frames are lists and both columns are
>> > numeric - HN is not reported as a factor.  Yet it appears to be treated as
>> > a factor, for example:
>> >
>> > > table(formatC(H$ss,dig=0,width=2,format="f",flag="0"))
>> >
>> >   01   02   03   04   05   06   07   08   09   10   11   12   13
>> > 176  176  176  176  176  176  341 8726 8784 8777 8773 8749 8747
>> > > table(formatC(HN$ss,dig=0,width=2,format="f",flag="0"))
>> >
>> > yet:
>> >
>> >    1   10   11   12   13    2    3    4    5    6    7    8    9
>> > 176 8777 8773 8749 8747  176  176  176  176  176  341 8726 8784
>> > Warning messages:
>> > 1: "+" not meaningful for factors in: Ops.factor(x, ifelse(x == 0, 1, 0))
>> > 2: "<" not meaningful for factors in: Ops.factor(x, 0)
>> >
>> > I have tried as.numeric but then I get the factor level rather than name
>> > returned:
>> >
>> > > table(formatC(as.numeric(HN$ss),dig=0,width=2,format="f",flag="0"))
>> >
>> >   01   02   03   04   05   06   07   08   09   10   11   12   13
>> > 176 8777 8773 8749 8747  176  176  176  176  176  341 8726 8784
>> >
>> > which obviously is a tabulation of the internal levels rather than the
>> > data.
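
The 1, 10, 11, 12, 13, 2, ... ordering in the tables above is what one would expect if the column became a factor built from character data, since the levels are then sorted as strings. A minimal sketch with made-up values (ss_chr is hypothetical, not the real HN$ss):

```r
# Hypothetical values stored as character, as they would be in a character matrix
ss_chr <- c("1", "2", "10", "13", "2", "10")
ss_fac <- factor(ss_chr)

levels(ss_fac)          # sorted as strings, so "10" and "13" come before "2"
names(table(ss_fac))    # table() follows the level order

# Convert the labels back to numbers before tabulating
ss_num <- as.numeric(levels(ss_fac))[ss_fac]
names(table(ss_num))    # now in numeric order
```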
>> >
>> > TIA
>> >
>> > John
>> >
>> > John Logsdon                               "Try to make things as simple
>> > Quantex Research Ltd, Manchester UK         as possible but not simpler"
>> > j.logsdon at quantex-research.com          a.einstein at relativity.org
>> > +44(0)161 445 4951/G:+44(0)7717758675       www.quantex-research.com
>> >
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide!
>> > http://www.R-project.org/posting-guide.html
>> >
>>
>





