[R] Variance of multiple non-contiguous time periods?

CJ Davies cjohndavies at gmail.com
Tue Nov 4 14:50:25 CET 2014


On 04/11/14 09:11, Jim Lemon wrote:
> On Mon, 3 Nov 2014 12:45:03 PM CJ Davies wrote:
>> ...
>> On 30/10/14 21:33, Jim Lemon wrote:
>> If I understand, you mean to calculate deviations for each individual
>> 'chunk' of each transition & then aggregate the results? This is what
>> I'd been thinking about, but is there a sensible manner within R to
>> achieve this, or is it something for which it would be easier to
>> preprocess the data in an external tool? Is there some way to subset the
>> data such that I can work over just contiguous 'chunks'?
>>
> Exactly. If there is some combination of existing variables that can be
> combined to make a set of unique values for each "chunk", you can
> calculate the deviations within each "chunk", then average the squared
> deviations for each type of "chunk", weighting by the duration of the
> "chunks" so that you don't bias the pooled variance toward the longer
> "chunks".
>
> Jim
>
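
To make the weighting concrete, here is a minimal R sketch of one reading
of the suggestion above. It assumes a data frame with hypothetical columns
`value` (the measurement), `transition` (the type of chunk) and `chunk` (a
unique id per contiguous chunk); none of these names come from the real
log, and the numbers are toy data. It sums the squared deviations from each
chunk's own mean and divides by the pooled degrees of freedom, which is the
usual pooled variance. If every chunk should instead count equally
regardless of its duration, the per-chunk variances from
`tapply(d$value, d$chunk, var)` could be averaged within each transition.

```r
# Toy data: the column names are illustrative assumptions, not real log fields.
d <- data.frame(
  transition = c("t1", "t1", "t1", "t2", "t2", "t1", "t1", "t2", "t2", "t2"),
  chunk      = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4),
  value      = c(1, 2, 3, 10, 12, 5, 7, 20, 21, 25),
  stringsAsFactors = FALSE
)

# Deviation of every observation from the mean of its own chunk.
d$dev <- d$value - ave(d$value, d$chunk)

# Pooled variance per transition type: within-chunk sum of squares
# divided by (number of observations - number of chunks).
pooled_var <- sapply(split(d, d$transition), function(g) {
  sum(g$dev^2) / (nrow(g) - length(unique(g$chunk)))
})
pooled_var
```

The denominator is zero when every chunk of a transition holds a single
observation, so such cases would need handling separately.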

I am stumped for a way of automating this process, though. Each line of
log data looks like this:

2406	55.4	(-11.2, 1.0, -0.9)	(-4.1, 1.0, 0.0)	7.077912	0.9203392	(0.0, 
0.7, -0.1, 0.7)	8.129684	89.41537	-8.212769	(0.0, 0.7, -0.1, 0.7) 
8.129684	89.41537	351.7872	1	0	0	False	0.15	3	37.76761	True	False	0 
transition 1

The last field identifies which transition is currently active. However,
separating these data into 'chunks' would mean comparing each line of
data with the preceding line to determine whether both belong to the same
contiguous 'chunk'. Would this be better achieved with external
preprocessing written in a language I am more familiar with? I haven't
the foggiest how I would approach it within R.
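
For what it is worth, the line-by-line comparison need not be written as
an explicit loop in R. A minimal sketch, assuming the log has been read
into a data frame whose last column is called `transition` (a hypothetical
name, as is the toy data below): `rle()` finds runs of identical
consecutive values, and numbering those runs gives every contiguous chunk
its own id.

```r
# Toy stand-in for the log; only the transition column matters here.
log_data <- data.frame(
  transition = c("transition 1", "transition 1", "transition 2",
                 "transition 2", "transition 2", "transition 1"),
  stringsAsFactors = FALSE
)

# rle() collapses consecutive repeats into runs; each run is one chunk.
runs <- rle(log_data$transition)

# Repeat each run's index by its length to label every row with its chunk.
log_data$chunk <- rep(seq_along(runs$lengths), times = runs$lengths)

# Equivalent explicit comparison of each row with the preceding one:
# a new chunk starts wherever the value differs from the row before.
n <- nrow(log_data)
changed <- c(TRUE, log_data$transition[-1] != log_data$transition[-n])
stopifnot(identical(log_data$chunk, cumsum(changed)))

log_data$chunk
```

Once each row carries a chunk id, `split(log_data, log_data$chunk)` or
`tapply()` can work over the chunks one at a time.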

Regards,
CJ Davies


