[R] correlation between a discrete variable and a continuous variable

Greg Snow Greg.Snow at imail.org
Thu Oct 16 18:37:04 CEST 2008

Since water quality is ordered, you could use cor.test with method = "kendall" or method = "spearman".  This tests whether the rank-based correlation is significantly different from 0.  You can often get more interesting information out of the data if you do some more work.  Start by graphing your data to see what is there and whether anything stands out.  Then decide what you really want to learn from the data, since that determines which other tests to run.  If you want to predict nitrogen given water quality, you can use lm or aov (or other related methods).  If you want to predict quality given nitrogen, possibilities include proportional odds logistic regression (polr in MASS, lrm in Design) or recursive partitioning (rpart and other packages).
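
As a rough sketch of the steps above, something like the following could be a starting point.  The data here are simulated purely for illustration: the data frame name (streams), the column names, and the relationship between rating and nitrogen are all assumptions, not the poster's data.  It graphs the data, runs both rank-based tests, and fits one model in each direction.

```r
# Simulated data for illustration only; every name and value here is hypothetical.
set.seed(42)
rating_levels <- c("poor", "fair", "good", "excellent")
quality <- factor(sample(rating_levels, 80, replace = TRUE),
                  levels = rating_levels, ordered = TRUE)
# Assumption: nitrogen tends to fall as the rating improves (invented for the demo).
nitrogen <- 5 - 0.8 * as.integer(quality) + rnorm(80, sd = 1)
streams <- data.frame(quality, nitrogen)

# Graph the data first to see what is there.
boxplot(nitrogen ~ quality, data = streams,
        xlab = "Water quality rating", ylab = "Nitrogen concentration")

# Rank-based correlation tests. The rating has many ties, so ask for the
# asymptotic p-value rather than the exact one.
rating_rank <- as.integer(streams$quality)
sp <- cor.test(rating_rank, streams$nitrogen, method = "spearman", exact = FALSE)
kd <- cor.test(rating_rank, streams$nitrogen, method = "kendall", exact = FALSE)
print(sp)
print(kd)

# Predicting nitrogen given water quality: one-way analysis of variance.
fit_aov <- aov(nitrogen ~ quality, data = streams)
print(summary(fit_aov))

# Predicting quality given nitrogen: proportional odds logistic regression.
library(MASS)
fit_polr <- polr(quality ~ nitrogen, data = streams, Hess = TRUE)
print(summary(fit_polr))
```

With real data you would replace the simulated streams data frame with your own, making sure the rating is stored as an ordered factor with the levels in the intended order; otherwise both the rank tests and polr will use the wrong ordering.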

Hope this helps,

Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of kdebusk
> Sent: Thursday, October 16, 2008 6:54 AM
> To: r-help at r-project.org
> Subject: [R] correlation between a discrete variable and a continuous
> variable
> What test would you use to determine if a correlation exists between a
> discrete variable and a continuous variable?
> For example, I have a rating for stream water quality (excellent,
> good, fair, poor) and a corresponding nitrogen concentration. I want
> to see if there is a correlation between the water quality rating and
> the concentration of nitrogen in the stream.
> Help?
