[R] Statistical analysis

Arien Lam A.Lam at geo.uu.nl
Thu Sep 24 12:26:05 CEST 2009


Hi Chris,

If I understand your question correctly, what you want is both easy and hard.
Easy:
# making a reproducible example, as asked in the posting guide
# fix the seed so the random numbers are the same on every run
set.seed(1)
# two vectors
water <- rnorm(1000)
rain <- rgamma(1000, shape = .5)
# the following does everything you mention and more
summary(lm(water ~ rain))
cor(water, rain)
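
To put a number on "to what extent", something along these lines might help. It is only a sketch on the same kind of simulated vectors as above; the seed and the object names fit, r2 and ct are arbitrary choices made for illustration.

```r
set.seed(42)                  # arbitrary seed, only to make the sketch repeatable
water <- rnorm(1000)
rain <- rgamma(1000, shape = .5)

fit <- lm(water ~ rain)
r2 <- summary(fit)$r.squared  # share of the variance in water explained by rain

ct <- cor.test(water, rain)   # Pearson correlation with a test and confidence interval
ct$estimate                   # the correlation coefficient
ct$conf.int                   # 95% confidence interval for it
ct$p.value                    # p-value for the null hypothesis of zero correlation
```

With a single predictor, r2 is just the square of the correlation coefficient, so the two approaches tell the same story.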

Hard:
The inference you get from lm() (and from any test on cor()) assumes independent observations, 
a linear relation, and normally distributed residuals. Are these assumptions valid for your problem?
Are your datasets time series? Then both will probably be autocorrelated (try ??autocorrelation), 
and there may be a lag between them (see ?lag). Decide whether to estimate and correct for those.
Are there multiple sample locations? Observations from different locations may depend on each other.
Would it make more sense to relate rain to the change in groundwater level, rather than to the level itself?
Etc.
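
A rough sketch of how one might start looking at the autocorrelation, lag, and change-in-level questions with base R. Everything in it is an assumption made up for illustration, not your data: the simulated series, the 3-step delay, the coefficients, and the object names.

```r
set.seed(1)
n <- 200
rain <- rgamma(n, shape = .5)

# hypothetical groundwater level: reacts to the rain of 3 steps earlier
# and carries over 80% of its own previous value (so it is autocorrelated)
lagged_rain <- c(rep(0, 3), rain[1:(n - 3)])
input <- 0.5 * lagged_rain + rnorm(n, sd = 0.1)
level <- as.numeric(stats::filter(input, 0.8, method = "recursive"))

# autocorrelation within one series
acf(level, lag.max = 10, plot = FALSE)

# cross-correlation between rain and the change in level
dlevel <- diff(level)          # change in level, one value shorter than level
cc <- ccf(rain[-1], dlevel, lag.max = 10, plot = FALSE)

# in ccf(x, y) the value at lag k pairs x[t+k] with y[t]
best_lag <- cc$lag[which.max(abs(cc$acf))]
best_lag
```

Drop plot = FALSE to get the usual plots. If the autocorrelation turns out to be strong, the standard errors from a plain lm() will be too optimistic.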

Cheers,

Arien


Chris Li wrote:
> Hi all,
> 
> I have got two datasets, one of them is rainfall data and the other one is
> groundwater level data.
> 
> I would like to see whether there is a correlation between these two
> datasets and if there is, to what extent they are correlated.
> 
> My stats background is limited, therefore any advice on which command I
> should use in R would be greatly appreciated.
> 
> Thanks in advance.
> Chris

-- 
drs. H.A. (Arien) Lam (Ph.D. student)
Department of Physical Geography
Faculty of Geosciences
Utrecht University, The Netherlands



