[R] interpolation using R for PCR quantification
jdnewmil at dcn.davis.ca.us
Tue Dec 29 00:02:07 CET 2015
There is some terminology confusion here. Interpolation, as implemented by approx() or spline(), usually means estimating values between known points. You seem to have approximate (not exactly known) points, and are looking to apply a linear regression model to estimate missing data. Beware that mixing estimates (a.k.a. yhat) with measured data can be misleading, because yhat lacks the error term(s) that are present in the original data.
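As a rough sketch of that distinction, here are the standards (sam == 0) copied from the quoted message. approx() only joins neighbouring known points with straight lines, so a query cycle that falls between two replicates of the same standard simply gets that standard's concentration back, whereas lm() fits a single line through all the standards. The names std_cyc, std_logcon and query_cyc are made up for this illustration.

```r
# Standards (sam == 0) from the quoted message: cycle values and log10 of
# the known concentrations, three replicates per dilution.
std_cyc <- c(20.787, 20.494, 20.475, 23.991, 24.084, 23.863,
             26.298, 28.007, 27.413, 31.363, 30.979, 32.013,
             35.219, 34.096, 38.088)
std_logcon <- log10(rep(c(148060, 14806, 1480.6, 148.06, 14.806), each = 3))

# Two query cycles from the post. 26.992 lies between two replicates of the
# 1480.6 standard (26.298 and 27.413); 30.576 lies between a 1480.6
# replicate (28.007) and a 148.06 replicate (30.979).
query_cyc <- c(26.992, 30.576)

# Interpolation: straight line between the two neighbouring known points.
# approx() sorts x internally, so the replicates form flat segments.
interp <- approx(x = std_cyc, y = std_logcon, xout = query_cyc)$y

# Regression: one line through all the standards.
fit <- lm(std_logcon ~ std_cyc)
reg <- predict(fit, newdata = data.frame(std_cyc = query_cyc))

10^interp  # first value is just the 1480.6 standard; second matches the
           # 202.319 reported in the post
10^reg     # estimates read off the fitted line instead
```

This would explain why most of the approx() output in the post repeats the 'con' values: those queries sit between replicates that share one concentration.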
If you really want to do this after reconsidering, then perhaps the following does what you want?
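# Fit the standard curve; lm() drops rows where con is NA by default.
logconlm <- lm( log10( con ) ~ cyc, data = df )
# Predict log10(con) for every row, standards and queries alike.
df$yhat <- predict( logconlm, newdata=df )
# Keep the measured value where there is one, otherwise use the estimate.
df$logconfixed <- ifelse( is.na( df$con ), df$yhat, log10( df$con ) )
plot( df$cyc, df$logconfixed, pch=df$sam )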
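Because yhat carries no error term, one option is to ask predict() for a prediction interval and keep the estimates in their own columns rather than merging them with the measurements. A minimal sketch, assuming a data frame shaped like df in the quoted message; the data frame toy and the logcon_* column names are invented here, and only a few rows of the posted data are used.

```r
# A few rows from the quoted data: standards have a known 'con',
# query samples have con = NA.
toy <- data.frame(
  cyc = c(20.787, 20.494, 20.475, 23.991, 24.084, 23.863,
          26.298, 28.007, 27.413, 26.992, 26.961),
  con = c(148060, 148060, 148060, 14806, 14806, 14806,
          1480.6, 1480.6, 1480.6, NA, NA)
)

# Rows with missing 'con' are dropped by lm()'s default na.action,
# so only the standards contribute to the fit.
fit <- lm(log10(con) ~ cyc, data = toy)

# interval = "prediction" returns a matrix with columns fit, lwr, upr:
# the point estimate plus a 95% interval for a new observation.
pred <- predict(fit, newdata = toy, interval = "prediction")

# Keep measured and estimated values in separate columns so that they
# cannot be mistaken for one another later.
toy$logcon_measured <- log10(toy$con)
toy$logcon_est <- pred[, "fit"]
toy$logcon_lwr <- pred[, "lwr"]
toy$logcon_upr <- pred[, "upr"]

# Show the query rows only.
toy[is.na(toy$con), ]
```

Plotting logcon_lwr and logcon_upr alongside logcon_est for the query rows would make it visible that these are estimates, not measurements.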
On December 24, 2015 8:12:13 AM PST, Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
>I am a newbie in interpolation using R and I would like to learn
>better the procedure.
>I am applying interpolation to quantify nucleic acid targets using an
>assay known as PCR. To do this, I have two sets of variables: standard
>of known concentrations and query for which I need to identify the
>concentration.
>For each variable I have the output of the assay (cyc) and an
>approximation of the concentration expressed in relation to the
>concentration of the standard, so 5 means 10^5 etc.
>Given that the actual concentration of the standards is given in the
>'con' variable, the relation is that x=log10(con) and y = cyc, as
>represented in the first plot of the following example. In black are
>depicted the standard and in red the query samples.
>Now, to obtain interpolation the only function that I know is
>approx(). The first problem is that I need to switch the x-y variables
>because the values specifying where interpolation is to take place go
>in the 'xout' parameter and I have y outputs. If I maintain the
>original x/y orientation the output from approx() is empty. How can I
>keep the original layout? I must admit, anyhow, that the construct
>x=log10(con) and y = cyc is an artifact of the PCR analysis, since the
>independent variable is indeed the cyc value.
>The second problem I am facing -- and the most important -- is that
>the output seems weird. The values I get are simply the concentration
>input as such and not calculated by interpolation. In the example, the
>output I obtain is:
>  NA 1480.600 1480.600 148.060 202.319 148.060 14.806
>the first and last value are OK because the cyc values are outside the
>dynamic range under evaluation, but the only value that seems genuine
>is 202.319, the others are just the values I placed in the 'con'
>variable. For instance the second and third values have cyc = 26.992
>and 26.961 and yet they are both assigned to 1480.600.
>What am I getting wrong?
>Thank you (and merry Christmas!)
>dil <- c(5, 5, 5, 5, 4, 4, 4, 3, 3, 3,
>3, 3, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1,
> 1, 1, 1)
>sam <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
>1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1,
> 1, 1, 1)
>cyc <- c(20.787, 20.494, 20.475, 20.189, 23.991,
>24.084, 23.863, 26.298, 28.007, 27.413, 26.992,
>26.961, 31.363, 30.979, 32.013, 31.004, 30.576,
>31.195, 35.219, 34.096, 38.088, 34.934, 35.101,
>con <- c(148060, 148060, 148060, NA, 14806, 14806,
>14806, 1480.6, 1480.6, 1480.6, NA, NA, 148.06,
>148.06, 148.06, NA, NA, NA, 14.806, 14.806,
>14.806, NA, NA, NA, NA)
>df <- data.frame(dil, sam, cyc, con)
>std <- subset(df, sam == 0)
>qry <- subset(df, sam == 1)
>plot(std$cyc ~ std$dil)
>points(qry$dil, qry$cyc, col ="red")
>Q <- approx(x=std$cyc, y=log10(std$con), xout=qry$cyc,
>method="linear", rule = 1)
>abline(lm(log10(df$con) ~ df$cyc))