[R] regression methods for circular(?) data.

Witold Eryk Wolski W.E.Wolski at ncl.ac.uk
Mon Sep 26 20:01:48 CEST 2005


Ted,

I agree with you that if you unwrap the data you can use lm.
And you can separate the data in the way you describe. However, if you 
have thousands of such datasets I do not want to do it by "looking at 
the graph".

Yes the scatter may be larger as in the example and range(y) may be 
larger than 2.

And as you said in order to unwrap the data you have to separate them 
first. It would be easy to do it using for example single linkage 
clustering if they were no overlaps (but they do sometimes). So I were 
just wondering if there are no more fancy methods to do this.

Thanks,

cheers



(Ted Harding) wrote:
> On 26-Sep-05 Witold Eryk Wolski wrote:
> 
>>Hi,
>>
>>I do not know the intercept and slope.
>>And you have to know them in order to do something like:
>>ix<-(y < 0.9*(x-50)/200
>>
>>I am right?
>>
>>cheers
> 
> 
> Although I really knew them from the way you generated the data,
> I "pretended" I did not know them.
> 
> Read below: "If you know the modulus (in your case 1.0)" -- I did
> assume that this was known, i.e. that the data "wrap round" to 0
> above 1.0. Also: "the constants 0.9/200, -50 being chosen to give
> a good separation on the graph" -- I plotted the data, and saw that
> the "wrapped" data were well separated, and that 0.9*(x-50)/200
> was an adequate discriminant function. This was estimated purely by
> eye, by looking at the graph, to find some line that went between
> the two groups of data; no attempt was made to calculate anything
> precisely. Apart from assuming that the modulus was 1.0, and that
> the well-separated data at the bottom right of the graph were
> "wrapped round" data, no other information was used by me!
> 
> So the question remains: If you can assume that the modulus is 1.0,
> and that the wrapped-round data will be well separated, then all
> is simple. All you need to do is to "unwrap" the "wrapped" data
> by adding 1.0, having first identified them by virtue of their
> obvious separation. Then you can estimate the slope by using 'lm'.
> 
> But:-- if you, Witold, can not assume these two things for your
> real data, what can we assume in considering your question?
> Is the modulus unknown, for instance? Is the scatter so large that
> the groups are not well separated? Might we have "twice-wrapped"
> data (i.e. original y > 2)? 
> 
> In short, do your real data look like the data you sent us, and
> are they wrapped at 1.0? or what?
> 
> With thanks, and best wishes,
> Ted.
> 
> 
>>(Ted Harding) wrote:
>>
>>>On 26-Sep-05 nwew wrote:
>>>
>>>
>>>>Dear R-users,
>>>>
>>>>I have the following data
>>>>
>>>>x <- runif(300,min=1,max=230)
>>>>
>>>>y <- x*0.005 + 0.2
>>>>y <- y+rnorm(100,mean=0,sd=0.1)
>>>>y <- y%%1 #  <------- modulo operation
>>>>plot(x,y)
>>>>
>>>>and would like to recapture the slope (0.005) and intercept(0.2).
>>>>I wonder if there are any clever algorithms to do this. I was
>>>>looking at the function lm.cirucalar. Is this the method to use?
>>>>If, which of the references is best too look at?
>>>>
>>>>Eryk
>>>
>>>
>>>Hi Eryk,
>>>
>>>If you know the modulus (in your case 1.0) and you get data that
>>>look like the result of your "plot(x,y)", then I wouldn't mess
>>>about.
>>>
>>>I would simply do something like
>>>
>>>y1<-y
>>>ix <- ix<-(y < 0.9*(x-50)/200)
>>>y1[ix] <- y1[ix]+1.0
>>>lm(y1~x)
>>>
>>>(the constants 0.9/200, -50 being chosen to give a good separation
>>>on the graph).
>>>
>>>On the other hand, if there are good reasons why this very simple
>>>approach is not suitable, then if we knew what they were a more
>>>helpful reply would be easier to formulate!
>>>
>>>Best wishes,
>>>Ted.
>>>
>>>
>>>--------------------------------------------------------------------
>>>E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
>>>Fax-to-email: +44 (0)870 094 0861
>>>Date: 26-Sep-05                                       Time: 15:56:48
>>>------------------------------ XFMail ------------------------------
>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide!
>>>http://www.R-project.org/posting-guide.html
>>>
>>>
> 
> 
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 26-Sep-05                                       Time: 18:08:28
> ------------------------------ XFMail ------------------------------
> 
> 


More information about the R-help mailing list