[R] Curve Fitting/Regression with Multiple Observations

Gabor Grothendieck ggrothendieck at gmail.com
Tue Apr 27 20:35:56 CEST 2010


This will compute a loess curve and plot it (example(loess) runs the
help-page examples, which create the fitted object cars.lo):

example(loess)
plot(dist ~ speed, cars, pch = 20)
lines(cars$speed, fitted(cars.lo))
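
For data with several y values at each x, as in the question quoted below, the same approach should carry over, since loess accepts repeated x values. The following is only a sketch under assumptions: the data frame `sim`, its generating function and the noise level are made up for illustration and are not the poster's data. It also shows how to get the curve values on a grid with predict().

```r
# Made-up stand-in for the poster's data: 3 repetitions at each x
set.seed(1)
x <- rep(1:10, each = 3)
y <- 2 * (1 - exp(-0.3 * x)) + rnorm(length(x), sd = 0.2)
sim <- data.frame(x = x, y = y)

# Fit loess to all observations, not to the per-x means
sim.lo <- loess(y ~ x, data = sim)

# Evaluate the fitted curve (and its standard errors) on a grid
newx <- data.frame(x = seq(1, 10, length.out = 50))
pred <- predict(sim.lo, newdata = newx, se = TRUE)

# Raw points, fitted curve, and rough +/- 2 SE bands
plot(y ~ x, sim, pch = 20)
lines(newx$x, pred$fit)
lines(newx$x, pred$fit + 2 * pred$se.fit, lty = 2)
lines(newx$x, pred$fit - 2 * pred$se.fit, lty = 2)
```

The amount of smoothing is controlled by the span argument of loess; the default is used here and may need tuning for real data.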

Alternatively, this plots the smooth directly but does not give you the
values of the curve separately:

library(lattice)
xyplot(dist ~ speed, cars, type = c("p", "smooth"))
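
Since the question below asks about splines specifically, here is a hedged sketch with smooth.spline from the stats package, which also accepts tied x values (its help page describes how they are pooled). The data are again made up for illustration, and the smoothing parameter is left to the default GCV choice, which may or may not be smooth enough for a given purpose.

```r
# Made-up replicated data: 3 repetitions at each of 10 x values
set.seed(2)
x <- rep(1:10, each = 3)
y <- 2 * (1 - exp(-0.3 * x)) + rnorm(length(x), sd = 0.2)

# Smoothing spline fitted to all observations;
# the smoothing parameter is chosen by GCV by default
fit <- smooth.spline(x, y)

# Curve values on a grid: a list with components x and y
xs <- seq(1, 10, length.out = 50)
curve_vals <- predict(fit, xs)

plot(x, y, pch = 20)
lines(curve_vals)
```

Passing df or spar to smooth.spline forces a smoother or rougher curve if the default is not suitable.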



On Tue, Apr 27, 2010 at 1:30 PM, Kyeong Soo (Joseph) Kim
<kyeongsoo.kim at gmail.com> wrote:
> I recently came to realize the true power of R for statistical
> analysis -- mainly for post-processing of data from large-scale
> simulations -- and have been converting many of my existing Python (SciPy)
> scripts to ones based on R and/or Perl.
>
> In the middle of this conversion, I revisited the problem of curve
> fitting for simulation data with multiple observations resulting from
> repetitions.
>
> In the past, I first processed the simulation data (i.e., multiple y's
> from repetitions) to get a mean with a confidence interval for each
> value of x (the independent variable) and then applied a spline procedure
> to those mean values only (i.e., unique pairs (x_i, y_i) for i = 1,
> 2, ...) to get a smoothed curve. Because of the rather large confidence
> intervals, however, the resulting curves were hardly smooth enough for
> my purpose, so I had to fix the functional form to an exponential and
> use least squares to fit its parameters to the data.
>
> From a plot with confidence intervals, it's rather easy to work out a
> smoothed curve by eye and by hand.
> So I'm now thinking of applying a spline (or whatever regression
> procedure suits this purpose) directly to the simulation data with
> repetitions rather than to the means. The simulation data in this case
> look like this (assuming three repetitions):
>
> # x    y
> 1      1.2
> 1      0.9
> 1      1.3
> 2      2.2
> 2      1.7
> 2      2.0
> ...      ....
>
> So my idea is to let the spline procedure handle the fluctuations in
> the data (i.e., across repetitions) by itself.
> But I wonder whether applying spline procedures directly to data with
> multiple observations makes sense from a statistical (i.e.,
> theoretical) point of view.
>
> It may be a stupid question and quite obvious to many, but personally
> I don't know where to start.
> I would greatly appreciate it if anyone could shed some light on
> this.
>
> Many thanks in advance,
> Joseph
>


