R: [R] slope estimations of teeth like data

Petr Pikal petr.pikal at precheza.cz
Wed Jun 16 13:54:20 CEST 2004


Thank you for your insights to my problem. I finally came up with 
function which sets an index for increasing resp decreasing parts of 
my data.

increasing<-function(x, cutoff=0, s=3, cas.osa=cas.osa, ...)
{
x.agg<-aggregate(x, list(hod=cut(cas.osa,"5 min")), mean, 
na.rm=T)

index<-diff(x.agg[,2, drop=T])<cuttof

ind<-outer(which(index),(-s):(s*2),"+")
	ind<-ind[ind>0 & ind <= length(index)]
	index[ind]<-TRUE
idx<-rep(index,each=5)
idx<-c(rep(idx[1],5),idx)
invisible(idx)
}


I have much more points than in my toy example so there is no big 
problem with throwing away some data. However, both packages 
you (Achim, Vito) suggested seems to be useful in further 
processing of my time series data

Thanks again.
Petr


On 15 Jun 2004 at 17:33, Gabor Grothendieck wrote:

> 
> 
> I am not entirely sure what is essential and what is inessential in
> this problem.  In the yy data shown, the peaks occur in one step and
> all the peaks are equal.  Just plotting
> 
> plot(diff(yy))
> 
> shows that there is a huge region of potential cutoffs for separating
> out the large single step increases from the other differences.
> 
> If it continues to be true that the increases are large relative to
> everything else but the cutoff is not quite so obvious as here, we can
> calculate a cutoff by just sorting the unique differences and
> partitioning them in every possible way into small and large
> differences and then choosing the partition whose residual variance is
> least.   
> 
> Its actually pretty simple to do that from first pinciples, as
> shown here, but there might be a function in R that does it
> even more directly.
> 
> diffyy <- diff(yy)
> yy. <- unique(sort(diffyy)) # unique diffs sorted
> yy. <- yy.[-1] - diff(yy.)/2  # midpoints
> 
> # For each 2 class partition into small & large, get residual var
> res <- sapply(yy., function(x) 
>                      var(resid(lm(diffyy ~ factor(diffyy < x)))) )
> 
> cutoff <- yy.[which.min(res)]
> 
> # indices of points involved in rise
> which(diffyy > cutoff) + 1
> 
> ---
> 
> Petr Pikal <petr.pikal at precheza.cz>
> 
> 
> On 15 Jun 2004 at 13:52, Vito Muggeo wrote: 
> 
> > Dear Petr, 
> > Probably I don't understand exactly what you are looking for. 
> > 
> > However your "plot(x,c(y,z))" suggests a broken-line model for the
> > response "c(y,x)" versus the variables x. Therefore you could
> > estimate a segmented model to obtain (different) slope (and
> > breakpoint) estimates. See the package segmented. 
> 
> Thank you Vito, but it is not what I want. plot(x,c(y,z)) shows only
> one "spike" and I have many such spikes in actual data. 
> 
> My actual data look like those 
> 
> set.seed(1) 
> y <- 0.03*x[1:100]+rnorm(100, mean=.001, sd=.03) 
> z <- 3-rep(seq(1,100,10),each=10)*.03+rnorm(100,mean=.001, sd=.03) yy
> <- NULL for( i in 1:10) yy <- c(yy,c(y,z)[1:floor(runif(1)*200)]) y.l
> <- length(yy) plot(1:y.l, yy) 
> 
> x axis is actually a time and y is a weight of gradually filled
> conteiner, which is irregularly emptied. I want to do an hourly and/or
> daily averages of increases in weight (it can by done by aggregate) 
> 
> myfac <- gl(y.l/12,12,length=1271) #hopefully length is ok 
> 
> y.agg <- aggregate(diff(yy), list(myfac), mean) 
> ## there will be list(hod=cut(time.axis,"hour")) construction actually
> ## 
> 
> 0.03 can be expected average result and some aggregated values ar OK
> but some are wrong as they include values from emptying time. 
> 
> *** This*** is probably what I need, I need to set some logical vector
> which will be TRUE when there was a filling time and FALSE during
> other times. And I need to specify it according a data I have
> available. 
> 
> Best what I was able to do was to consider filling time as a time when
> let say 
> 
> diff(yy) >= 0 
> 
> was between prespecified limits, but you know how it is with real life
> and prespecified limits. 
> 
> Or I can plot my data against time, manually find out regions which
> are correct and make a aggregation only with correct data. But there
> are 24*60*3 values each day so I prefer not to do it manually. 
> 
> Or finally I can throw away any hourly average which is not in set
> limits, but I prefer to throw away as little data as possible. 
> 
> I hope I was able to clarify the issue a bit. 
> 
> Thank you 
> Best regards 
> Petr 
> 
> 
> > 
> > best, 
> > vito 
> > 
> > 
> > 
> > ----- Original Message ----- 
> > From: Petr Pikal <petr.pikal at precheza.cz> 
> > To: <r-help at stat.math.ethz.ch> 
> > Sent: Tuesday, June 15, 2004 1:11 PM 
> > Subject: [R] slope estimations of teeth like data 
> > 
> > 
> > > Dear all 
> > > 
> > > Suppose I have teeth like data similar like 
> > > 
> > > x <- 1:200 
> > > y <- 0.03*x[1:100]+rnorm(100, mean=.001, sd=.03) 
> > > z <- 3-rep(seq(1,100,10),each=10)*.03+rnorm(100,mean=.001, sd=.03)
> > > plot(x,c(y,z)) 
> > > 
> > > and I want to have a gradient estimations for some values from
> > > increasing 
> > part of 
> > > data 
> > > 
> > > like 
> > > 
> > > y.agg <- aggregate(diff(c(y,z)), 
> > > list(rep(seq(1,200,10),each=10)[1:199]), 
> > mean) 
> > > 
> > > y.agg[1:10,] ##is OK, I want that 
> > > y.agg[11:20,] ##is not OK, I do not want that 
> > > 
> > > actual data are similar but more irregular and have subsequent
> > > gradual 
> > increases 
> > > and decreases, more like 
> > > 
> > > set.seed(1) 
> > > yy<-NULL 
> > > for( i in 1:10) yy <- c(yy,c(y,z)[1:floor(runif(1)*200)]) 
> > > length(yy) 
> > > [1] 1098 
> > > 
> > > plot(1:1098,yy) 
> > > 
> > > Is there anybody who has some experience with such data, mainly
> > > how to 
> > extract 
> > > only increasing portions or to filter values of "yy" such as only 
> > aggregated slopes 
> > > from increasing parts are computed and other parts are set to NA. 
> > Sometimes 
> > > actual data have so long parts of steady or even slightly
> > > increasing 
> > values at 
> > > decreasing part that aggregated values are slightly positive
> > > although they 
> > are 
> > > actually from decreasing portion of data. 
> > > 
> > > Thank you 
> > > Petr Pikal 
> > > petr.pikal at precheza.cz 
> > > 
> 
> 
> _______________________________________________
> No banners. No pop-ups. No kidding.


Petr Pikal
petr.pikal at precheza.cz




More information about the R-help mailing list