[R] how to merge within range?

René Mayer mayer at psychologie.tu-dresden.de
Sat May 14 21:08:12 CEST 2011


sqldf is impressive - compiled it now;
the trick with findInterval is nice, too.
thanks guys!!




Zitat von "David Winsemius" <dwinsemius at comcast.net>:

>
> On May 14, 2011, at 2:27 PM, William Dunlap wrote:
>
>> You could use findInterval() along with a trick with c(rbind(...)):
>>
>>> i <- findInterval(x=df.1$time, vec=c(rbind(df.2$from, df.2$to)))
>>> i
>> [1] 1 1 1 2 3 3 3 5 5 6
>
> That's nice. I was working on a slightly different "trick"
>
> findInterval( df.1[,1],t(df.2[,1:2]))
>  [1] 1 1 1 2 3 3 3 5 5 6
>
> I was then trying to get the right indices with (.)'%%' 2 and (.) '%/%' 2
>
>
>> The even-valued outputs would map to NA's, the odds
>> to value[(i+1)/2], but you can use the c(rbind(...)) trick again:
>>
>>> c(rbind(df.2$value, NA))[i]
>> [1]  1  1  1 NA  3  3  3  5  5 NA
>
> I'd like to understand that. Maybe, maybe... ah, got it. At first I  
> didn't realize those were the final answers since they looked like  
> indices. My t(.) trick doesn't generalize as well.
>
>
> My earlier suggestion tht two merges woul do it was based on my  
> erroneous interpretation of the example, since  I thought the task  
> was to match on the end points of the intervals.
>
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of René Mayer
>>> Sent: Saturday, May 14, 2011 11:06 AM
>>> To: David Winsemius
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] how to merge within range?
>>>
>>> thanks David and Ian,
>>> let me make a better example as the first one was flawed
>>>
>>> df.1=data.frame(round((1:10)*100+rnorm(10)), value=NA)
>>> names(df.1) = c("time", "value")
>>> df.1
>>>   time value
>>> 1   101    NA
>>> 2   199    NA
>>> 3   301    NA
>>> 4   401    NA
>>> 5   501    NA
>>> 6   601    NA
>>> 7   700    NA
>>> 8   800    NA
>>> 9   900    NA
>>> 10 1000    NA
>>>
>>> # from and to define ranges within time,
>>> # note that from and to may not match the numbers given in time
>>> df.2=data.frame(from=c(99,500,799),to=c(303,702,950), value=c(1,3,5))
>>> df.2
>>>  from  to value
>>> 1   99 303     1
>>> 2  500 702     3
>>> 3  799 950     5
>>>
>>> what I want is:
>>>   time value
>>> 1   101    1
>>> 2   199    1
>>> 3   301    1
>>> 4   401    NA
>>> 5   501    3
>>> 6   601    3
>>> 7   700    3
>>> 8   800    5
>>> 9   900    5
>>> 10 1000    NA
>>>
>>> @David I don't know what you mean by 2 merges,
>>> René
>>>
>>>
>>>
>>>
>>>
>>> Zitat von "David Winsemius" <dwinsemius at comcast.net>:
>>>
>>>>
>>>> On May 14, 2011, at 9:16 AM, Ian Gow wrote:
>>>>
>>>>> If I assume that the third column in data.frame.2 is named
>>> "val" then in
>>>>> SQL terms it _seems_ you want
>>>>>
>>>>> SELECT a.time, b.val FROM data.frame.1 AS a LEFT JOIN
>>> data.frame.2 AS b ON
>>>>> a.time BETWEEN b.start AND b.end;
>>>>>
>>>>> Not sure how to do that elegantly using R subsetting/merge,
>>>>
>>>> Huh? It's just two merge()'s (... once you fix the error in
>>> the example.)
>>>>
>>>> -- 
>>>> David
>>>>
>>>>> but you might
>>>>> try a package that allows you to use SQL, such as sqldf.
>>>>>
>>>>>
>>>>> On 5/14/11 8:03 AM, "David Winsemius"
>>> <dwinsemius at comcast.net> wrote:
>>>>>
>>>>>>
>>>>>> On May 14, 2011, at 8:12 AM, René Mayer wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> how can one merge
>>>>>>
>>>>>> And what happened when you typed:
>>>>>>
>>>>>> ?merge
>>>>>>
>>>>>>> two data frames when in the second data frame one column
>>> defines the
>>>>>>> start values
>>>>>>> and another defines the end value of the to be merged range.
>>>>>>> data.frame.1
>>>>>>> time ...
>>>>>>> 13
>>>>>>> 24
>>>>>>> 35
>>>>>>> 46
>>>>>>> 55
>>>>>>> ...
>>>>>>> data.frame.2
>>>>>>> start end
>>>>>>> 24 37 ?h? ?
>>>>>>> ...
>>>>>>>
>>>>>>> should result in this
>>>>>>> 13 NA
>>>>>>> 24 ?h?
>>>>>>> 35 ?h?
>>>>>>> 46 NA
>>>>>>> 55
>>>>>>> ?
>>>>>>
>>>>>> And _why_ would that be?
>>>>>>
>>>>>>
>>>>>>> thanks,
>>>>>>> René
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained,
>>> reproducible code.
>>>>>>
>>>>>> David Winsemius, MD
>>>>>> West Hartford, CT
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
> David Winsemius, MD
> West Hartford, CT
>
>



More information about the R-help mailing list