[R] How to create a numeric data.frame

Joshua Wiley jwiley.psych at gmail.com
Mon Jun 13 23:30:27 CEST 2011


On Mon, Jun 13, 2011 at 2:11 PM, Patrizio Frederic
<frederic.patrizio at gmail.com> wrote:
> On Mon, Jun 13, 2011 at 6:47 PM, Aparna <aparna.sampath26 at gmail.com> wrote:
>> Hi Joshua
>>
>> While looking at the data, all the values seem to be in numeric. As i mentioned,
>> the dataset is already in data.frame.
>>
>> As suggested, I used str(mydata) and got the following result:
>>
>>
>> str(leu_cluster1)
>> 'data.frame':   984 obs. of  100 variables:
>>  $ V2  : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...
>
> your data columns are not numeric but factors indeed.
> you may try this one
>
> a <- as.character(rnorm(100))           # some numeric data
> adf <- data.frame(matrix(a,10))         # which are misinterpreted as factors
> adf
> adf[,1]
> class(adf[,1]) # check for the class of the first column
> sapply(adf,function(x)class(x)) # check classes for all columns
>
> b <- sapply(adf,function(x)as.numeric(as.character(x))) #

Coercing to character first works, but it is not the recommended
method; ?factor suggests as.numeric(levels(x))[x], which is slightly
more efficient.  Also, I am leery of using sapply() with data frames,
because it simplifies the result to a matrix, which can cause havoc if
you have different classes of data.  You mentioned that, as a first
step, you had removed the names column from the data frame before
trying to convert it to numeric.  I would simply leave the names in
and then (supposing they are in column 101)

leu_cluster1[, 1:100] <- lapply(leu_cluster1[, 1:100],
  function(x) as.numeric(levels(x))[x])

apply the conversion to numeric on only the necessary columns.  This
simplifies life because you are not making interim data sets.  Using
lapply() allows you to work with (potentially) different classes of
data (although I realize in this particular case you are only dealing
with one class).  So long as you are assigning the results back into a
data frame (as above), the assignment method for data frames will
automatically convert the list back into data frame columns.  If you
are concerned about this, just wrap the call in as.data.frame()

leu_cluster1[, 1:100] <- as.data.frame(lapply(
  leu_cluster1[, 1:100], function(x) as.numeric(levels(x))[x]))
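
As a minimal sketch of the approach above, here is the same conversion
on a small made-up data frame.  The object toy and its columns V1, V2
and id are hypothetical stand-ins for leu_cluster1 and its names
column; they are not taken from the original data.  It also shows why
calling as.numeric() directly on a factor is not enough, and what
sapply() returns by comparison.

```r
# Hypothetical toy data: two numeric-looking columns stored as factors,
# plus a names column that should be left untouched.
toy <- data.frame(
  V1 = factor(c("-0.5", "1.25", "3")),
  V2 = factor(c("10", "2.5", "-1")),
  id = c("geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# as.numeric() on a factor returns the internal integer codes,
# not the values printed in the levels.
codes <- as.numeric(toy$V2)

# sapply() simplifies its result to a matrix rather than a data frame.
m <- sapply(toy[, 1:2], function(x) as.numeric(as.character(x)))

# Convert only the factor columns in place: turn the levels into numbers,
# then index them by the factor so each row gets its own value back.
toy[, 1:2] <- lapply(toy[, 1:2], function(x) as.numeric(levels(x))[x])

str(toy)  # V1 and V2 should now be num, id still chr
```

Because the result is assigned back into columns 1:2 of the existing
data frame, the names column stays where it is and no interim data set
is needed.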

Cheers,

Josh

> as.character: use levels literally, as.numeric: transforms in numbers
> b # look at b
>
> class(b) # which is now a numeric matrix
>
> best regards
>
> PF
>
> --
> +-----------------------------------------------------------------------
> | Patrizio Frederic,
> | http://www.economia.unimore.it/frederic_patrizio/
> +-----------------------------------------------------------------------
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


