[R] Bus stop sequence matching problem

Adam Lawrence alaw005 at gmail.com
Sat Aug 30 02:46:17 CEST 2014


I am hoping someone can help me with a bus stop sequencing problem in R,
where I need to match counts of people getting on and off a bus to the
correct stop in the bus route stop sequence. I have searched online and in
forums for sequence matching, but the results seem to cover numeric sequences
or DNA matching and are over my head. I am after a simple example, if anyone
can please help.

I have two data series (from a database), shown below, that I want to
combine. In this example “stop_sequence” holds the sequence (seq) of bus
stops and “stop_onoff” is a count of people getting on and off at certain
stops (there is no entry if no one gets on or off).

stop_sequence <- data.frame(seq=c(10,20,30,40,50,60),
                            ref=c('A','B','C','D','B','A'))
##   seq ref
## 1  10   A
## 2  20   B
## 3  30   C
## 4  40   D
## 5  50   B
## 6  60   A
stop_onoff <- data.frame(ref=c('A','D','B','A'),
                         on=c(5,0,10,0), off=c(0,2,2,6))
##   ref on off
## 1   A  5   0
## 2   D  0   2
## 3   B 10   2
## 4   A  0   6

I need to match the stop_onoff numbers to the right stop in the sequence,
with the correctly matched output as follows (load is a cumulative count of
on and off):

desired_output <- data.frame(seq=c(10,20,30,40,50,60),
                             ref=c('A','B','C','D','B','A'),
                             on=c(5,'-','-',0,10,0),
                             off=c(0,'-','-',2,2,6),
                             load=c(5,0,0,3,11,5))
##   seq ref on off load
## 1  10   A  5   0    5
## 2  20   B  -   -    0
## 3  30   C  -   -    0
## 4  40   D  0   2    3
## 5  50   B 10   2   11
## 6  60   A  0   6    5

In this example the stop “B” record is matched to the second stop “B” in the
stop sequence and not the first, because that on/off record comes after the
one for stop “D”.
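
The matching rule described above could be sketched roughly as follows. This
is only one possible reading of the problem, not a tested solution: it
assumes the rows of stop_onoff are already in travel order, and it matches
each record to the earliest stop with the same ref that lies after the
previously matched stop. The helper name match_in_order is made up for this
illustration, and NA is used here in place of '-' so the columns stay numeric.

```r
stop_sequence <- data.frame(seq = c(10, 20, 30, 40, 50, 60),
                            ref = c("A", "B", "C", "D", "B", "A"),
                            stringsAsFactors = FALSE)
stop_onoff <- data.frame(ref = c("A", "D", "B", "A"),
                         on  = c(5, 0, 10, 0),
                         off = c(0, 2, 2, 6),
                         stringsAsFactors = FALSE)

# Hypothetical helper (not from any package): for each on/off record, pick
# the first route row after the previous match that has the same ref.
match_in_order <- function(route_ref, obs_ref) {
  idx  <- integer(length(obs_ref))
  last <- 0L
  for (i in seq_along(obs_ref)) {
    candidates <- which(route_ref == obs_ref[i])
    candidates <- candidates[candidates > last]
    if (length(candidates) == 0L) {
      stop("no remaining stop matches record ", i)
    }
    idx[i] <- candidates[1]
    last   <- idx[i]
  }
  idx
}

idx <- match_in_order(stop_sequence$ref, stop_onoff$ref)

# Copy the counts onto the matched route rows; unmatched stops stay NA.
result <- stop_sequence
result$on  <- NA_real_
result$off <- NA_real_
result$on[idx]  <- stop_onoff$on
result$off[idx] <- stop_onoff$off

# Running load: treat missing counts as zero, then accumulate on minus off.
on0  <- ifelse(is.na(result$on), 0, result$on)
off0 <- ifelse(is.na(result$off), 0, result$off)
result$load <- cumsum(on0 - off0)

print(result)
```

With the example data this matches the records to rows 1, 4, 5 and 6, so “B”
lands on seq 50 rather than seq 20. Note that a plain running total carries
the load of 5 through stops B and C, whereas the desired output above shows 0
there. The earliest-match rule is also a guess when a ref could fit more than
one remaining stop, so it may need refining for real routes.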

Any guidance much appreciated.

Regards
Adam



