[R] How can I map "by" results to original list of indices or first difference of column of data.frame with two factors?
Mikhail Titov
mlt at gmx.us
Sun Mar 4 04:16:25 CET 2012
"R. Michael Weylandt" <michael.weylandt at gmail.com> writes:
> It'd be doubly helpful if you could post desired output as well.
I beg everyone's pardon; I suddenly realized that in my case the solution is
trivial. Here is an example with mock-up data.
Let's generate some data:
#+begin_src R
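# Full grid: one row per (day, bar, foo); day varies fastest
qq <- expand.grid(
  day = seq(ISOdate(2011, 1, 1), ISOdate(2011, 12, 31), by = 'day'),
  bar = 1:4,
  foo = factor(c('A', 'B', 'G', 'I'))
)
# One sine period over the year: amplitude from bar, phase shift from foo
ww <- within(qq,
  val <- bar * sin(as.double(day - day[1], "days")
                   / as.double(diff(range(day)), "days")
                   * 2 * pi
                   + as.numeric(foo) / 2
                   )
)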
#+end_src
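The shortcut used further down depends on how expand.grid orders its rows: the first argument varies fastest, so every (foo, bar) partition ends up as one contiguous block of rows with days in increasing order. A quick sanity check, as a sketch only: it rebuilds a 5-day miniature of the data so it can run on its own, and the names part, runs, contiguous and sorted_within are introduced here purely for illustration.

#+begin_src R
# Miniature of the mock-up data: 5 days, 2 x 2 partitions (illustrative sizes)
qq <- expand.grid(
  day = seq(ISOdate(2011, 1, 1), ISOdate(2011, 1, 5), by = "day"),
  bar = 1:2,
  foo = factor(c("A", "B"))
)
# Label each (foo, bar) partition and collect its runs in row order
part <- interaction(qq$foo, qq$bar, drop = TRUE)
runs <- rle(as.character(part))
# Exactly one run per partition means each partition is a contiguous block
contiguous <- length(runs$lengths) == nlevels(part)
# Days must be in increasing order within every partition
sorted_within <- all(tapply(as.numeric(qq$day), part,
                            function(d) !is.unsorted(d)))
stopifnot(contiguous, sorted_within)
#+end_src

If either check fails on real data, sorting by the factors and then by day first should restore this layout.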
We can take a look at it with
#+begin_src R :results graphics :exports both :file z.png
library(lattice)
xyplot(val ~ day | foo, ww, groups = bar, type = 'l')
#+end_src
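Since we drop the first element of each partition anyway,
we can apply diff to the entire column at once.
The only wrong values are the differences that straddle a partition boundary,
and they land on the first row of each partition, which is exactly what we drop next.
This relies on the rows being grouped by the factors and sorted by date.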
#+begin_src R
ww[-1,"diff"] <- diff(ww$val)
ee <- subset(ww, day>ISOdate(2011,1,1))
#+end_src
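For data that is not laid out so conveniently, a per-partition difference can be written back in the original row order with ave() from base R, which is closer to what I originally asked about "by". The following is a minimal sketch rather than a definitive recipe: it uses a self-contained miniature data set with made-up values, and the column names diff_grp and diff_all are introduced only for this comparison.

#+begin_src R
# Miniature data: 2 x 2 partitions of 5 days each, arbitrary made-up values
ww <- expand.grid(
  day = seq(ISOdate(2011, 1, 1), ISOdate(2011, 1, 5), by = "day"),
  bar = 1:2,
  foo = factor(c("A", "B"))
)
ww$val <- ww$bar * seq_len(nrow(ww))^2

# First difference within each (foo, bar) partition; the leading NA keeps
# the result the same length as the partition, so ave() can map it back
ww$diff_grp <- ave(ww$val, ww$foo, ww$bar,
                   FUN = function(v) c(NA, diff(v)))

# Whole-column difference, for comparison with the shortcut
ww$diff_all <- c(NA, diff(ww$val))

# After dropping the first day of each partition the two must agree
ee <- subset(ww, day > ISOdate(2011, 1, 1))
same <- isTRUE(all.equal(ee$diff_grp, ee$diff_all))
stopifnot(same)
#+end_src

The ave() version does not require partitions to be contiguous, only that rows are in date order within each partition.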
And the final result:
#+begin_src R :results graphics :exports both :file x.png
xyplot(diff ~ day | foo, ee, groups = bar, type = 'l')
#+end_src
> If you haven't seen it before, the easiest way to post R data is to
> use the dput() function to get a plain-text (mailing list friendly)
> representation. If your data is large, dput(head(DATA, 30)) should
> suffice.
>
> (We wouldn't want to clog those internet tubes...)
>
> Michael
>
> On Sat, Mar 3, 2012 at 8:55 PM, jim holtman <jholtman at gmail.com> wrote:
>> If you would post a subset of your data so that we can see what you
>> are talking about, we could probably help you come up with a solution.
>>
>> On Sat, Mar 3, 2012 at 7:50 PM, Mikhail Titov <mlt at gmx.us> wrote:
>>> Hello!
>>>
>>> I have stacked data in a data.frame with 2 factors, an ordered POSIXct column, and the actual value as numeric (as if for lattice::xyplot).
>>>
>>> I would like to calculate the first difference using the “diff” function
>>> within corresponding subsets/partitions. Since the data.frame is
>>> organized by factors and has sorted dates, it seems like "by" is a
>>> good candidate for the job. However it returns just a dumb list of
>>> vectors.
>>>
>>> It seems that I can either use expand.grid to remap the results of "by" and hope that I don't mess up the order, or use "unique(subset(x,select=c(foo,bar)))".
>>>
>>> Overall, that looks like quite a few steps for such a task, not
>>> counting the assignment of those differences back to the original
>>> data.frame starting from the 2nd position in each partition (as diff
>>> returns a shorter vector).
>>>
>>> Am I on the right track or is there an easier way to do that?
>>>
>>> Mikhail
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
--
Mikhail
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x.png
Type: image/png
Size: 6578 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120303/77d11214/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: z.png
Type: image/png
Size: 6807 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120303/77d11214/attachment-0001.png>