[R] inconsistency with cor() - "x must be numeric"

Joshua Wiley jwiley.psych at gmail.com
Mon Dec 13 23:48:04 CET 2010


Hi,

I can certainly understand not wanting to be long-winded, and no
damage done.  Here's a link to the R NEWS file:
http://cran.stat.ucla.edu/src/base/NEWS   and if you search in your
browser for "cor() and cov()" you should find what changed.

At any rate, I could not fully check your code because the object
'accessibility_data' was not found, but my guess is that you created
a matrix (perhaps inadvertently) and at least one of the columns had
character data in it, which pushes *all* the data to character class
(even though a particular column may hold numeric data, it is now
stored as character).  Previously, I think cor() did not check this
and would silently convert using as.numeric().
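
To illustrate the coercion described above, here is a minimal sketch
using made-up values (the loop only imitates the shape of your code; the
three chromosomes and the numbers are assumptions, not your data):

```r
# Hypothetical toy data standing in for acc_averages
acc_averages <- c()
for (i in 1:3) {
  label <- paste("chr", i, sep = "")
  # c() mixes a string and a number, so both elements become character
  acc_averages <- rbind(acc_averages, c(label, i * 1.5))
}

str(acc_averages)               # shows a character matrix
is.numeric(acc_averages[, 2])   # FALSE: column two is stored as character

# In a recent version of R, cor() refuses the character column;
# tryCatch() captures the error message instead of stopping
result <- tryCatch(cor(acc_averages[, 2], c(2, 4, 7)),
                   error = function(e) conditionMessage(e))
print(result)
```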

I would look at:

str(acc_averages)

and I bet you will find that it is not numeric.  If this is the case,
one fix would be:

correlation = cor(as.numeric(acc_averages[,2]),
gene_densities$avg_density[1:23])
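
As a small sketch of what that conversion does (the vectors here are
invented for illustration), as.numeric() turns strings that look like
numbers back into doubles, but anything it cannot parse becomes NA with
a warning, so it is worth checking the result:

```r
# Hypothetical character column, like column two of a character matrix
acc_col <- c("1.5", "3", "4.5")

fixed <- as.numeric(acc_col)
is.numeric(fixed)                 # TRUE, so cor() will accept it
r <- cor(fixed, c(2, 4, 7))

# A string that is not a number becomes NA (with a warning,
# suppressed here only to keep the example quiet)
bad <- suppressWarnings(as.numeric(c("1.5", "abc")))
anyNA(bad)                        # TRUE
```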

Probably a better fix would be to initialize acc_averages as a
data.frame rather than with c(); that way it can store different types
of data without moving everything up the hierarchy of classes.  To see
what I mean, look at ?rbind under the heading "Value", the second
paragraph.
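
A rough sketch of the data.frame approach, assuming a toy version of
accessibility_data with only the V1 and V6 columns and three
chromosomes (the values and the names chroms and gene_density are
hypothetical, chosen just for the example):

```r
# Hypothetical toy input; V1 and V6 mirror the column names in your code
accessibility_data <- data.frame(
  V1 = c("chr1", "chr1", "chr2", "chr2", "chr3", "chr3"),
  V6 = c(1, 3, 2, 6, 5, 7)
)
gene_density <- c(10.19, 6.457, 6.71)  # first three values from your post

chroms <- paste("chr", 1:3, sep = "")

# Labels and means live in separate columns, so the means stay numeric
acc_averages <- data.frame(
  chrom = chroms,
  avg = sapply(chroms, function(ch) {
    mean(accessibility_data$V6[accessibility_data$V1 == ch])
  }),
  stringsAsFactors = FALSE
)

str(acc_averages)   # chrom is character, avg is numeric
correlation <- cor(acc_averages$avg, gene_density)
```

Because the averages never share a vector with the labels, no
as.numeric() call is needed before cor().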

Cheers,

Josh


On Mon, Dec 13, 2010 at 2:23 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:
> I apologize for the lack of example.  I was trying not to be too long
> winded.  Below is the first portion of my function that is causing the
> error. (I'm including both calls to cor(), though it quits after the first
> throws an error).  I do not believe he has redefined cor() as he is a novice
> user and we tried this after starting a fresh session.  And I will look into
> upgrading.  I realize it is a little out of date since it is the version in
> the repository for my distribution and not the latest-and-greatest from R.
>  I just didn't realize a change like that would be made that would
> (seemingly to me) reduce functionality. Thank you again for your help.
> - Fincher
>    # As they don't change, hard code gene density values
>    gene_densities =
>       data.frame(chrom=c("chr1","chr2","chr3","chr4","chr5","chr6","chr7",
>                          "chr8","chr9","chr10","chr11","chr12","chr13",
>                          "chr14","chr15","chr16","chr17","chr18","chr19",
>                          "chr20","chr21","chr22","chrX","chrY"),
>                  avg_density=c(10.19,6.457,6.71,4.917,6.083,7.491,7.453,
>                                5.939,7.27,7.132,11.38,9.429,3.757,
>                                7.607,8.455,11.81,17.84,4.649,26.52,
>                                11.19,6.51,11.28,7.535,2.931))
>
>    acc_averages = c()
>    # subset out relevant data
>    accessibility_data = subset(accessibility_data,
>                                accessibility_data$V9==";color=000000")
>
>    # calculate mean accessibility value for each chromosome
>    for(i in seq(1,22)){
>       sub = paste("chr",i,sep="")
>       temp = subset(accessibility_data,accessibility_data$V1==sub)
>       acc_averages = rbind(acc_averages,c(sub,as.double(mean(temp$V6))))
>    }
>    temp = subset(accessibility_data,accessibility_data$V1=="chrX")
>    acc_averages = rbind(acc_averages,c("chrX",as.double(mean(temp$V6))))
>
>    # Output the correlation without including chromosome Y
>    correlation = cor(acc_averages[,2],gene_densities$avg_density[1:23])
>    cat("Correlation w/o chrY:",correlation,'\n')
>
>    temp = subset(accessibility_data,accessibility_data$V1=="chrY")
>    acc_averages = rbind(acc_averages,c("chrY",mean(temp$V6)))
>    # Output overall correlation
>    correlation = cor(acc_averages[,2],gene_densities$avg_density)
>    cat("Correlation w/chrY:",correlation,'\n')
>
> On Mon, Dec 13, 2010 at 17:06, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>>
>> Hi Fincher,
>>
>> cor() only works on numeric arguments now (as of R 2.11 or 2.10 if
>> memory serves).  So, I would update your function to ensure that you
>> are only passing numeric data to cor() and the error should go away
>> (it will probably be easier on you if you can update your version of R
>> to the latest and greatest...quite a bit has changed since 2.8.1).  If
>> you post a reproducible example of your function, I'm sure we can help
>> update it.
>>
>> Cheers,
>>
>> Josh
>>
>> On Mon, Dec 13, 2010 at 1:56 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:
>> > Howdy,
>> >   I have written a small function to generate a simple plot and my
>> > colleague is having an error when attempting to run it.  Essentially I
>> > loop through categories in a data frame and take the average value for
>> > each category.  The categories are in $V1, subset first then mean taken
>> > and concatenated to previous values using
>> > rbind(c("label",mean(data$V6))).  The result is a two-column matrix
>> > with labels in column one and values in column two.  Within the
>> > function I calculate the correlation of column two and another set of
>> > values that are part of the function.  On my computer (linux box
>> > running R 2.8.1) the function runs correctly.  On my colleague's
>> > computer (Windows box running R 2.12) the function throws an error at
>> > the cor() function call saying that "x must be numeric."  We are
>> > running on the exact same data set and source'ing the same function
>> > definition.  Any help would be appreciated.
>> >
>> > - Fincher
>> >
>> >
>>
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> University of California, Los Angeles
>> http://www.joshuawiley.com/
>>
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


