[R] Conditional Data Manipulation -Cumulative Product

David L Carlson dcarlson at tamu.edu
Wed Oct 8 05:55:10 CEST 2014


I think this works, at least for your example data. The function SSRuns gets the index values of the starting points and, for each one, finds the first stopping point whose index is greater than or equal to it. It then cycles through the starting points and generates the index values from start to stop. Those are combined into a single vector, which is used to create each column of the mask for the data.

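SSRuns <- function(x, y, rows) {
	# row indices of the start signals and of the stop signals
	a <- which(x>0)
	b <- which(y>0)
	# for each start, take the rows from the start to the first stop at or after it
	d <- unlist(lapply(seq_along(a), function(i) 
		a[i]:head(b[a[i] <= b], 1)))
	# 1 inside a run, 0 elsewhere
	v <- rep(0, rows)
	v[d] <- 1
	return(v)
}
# one mask column per signal column (the Date column is dropped first)
mask <- sapply(StartSignals[,-1], SSRuns, y=StopSignals$Stop, 
	rows=nrow(MainData))
Results <- data.frame(Date=MainData$Date, MainData[,-1]*mask)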
Results
         Date   X1   X2   X3   X4   X5
1  2014-01-01 0.00 0.00 0.00 0.00 0.00
2  2014-01-02 0.00 1.51 0.00 0.00 1.24
3  2014-01-03 0.00 0.09 0.20 0.00 0.30
4  2014-01-04 0.00 0.00 0.00 0.00 0.00
5  2014-01-05 1.04 0.00 0.00 0.00 1.23
6  2014-01-06 0.00 0.00 0.76 0.00 0.00
7  2014-01-07 0.00 0.00 1.22 0.66 0.00
8  2014-01-08 0.00 0.00 0.27 0.09 0.00
9  2014-01-09 0.00 0.00 0.00 0.00 0.00
10 2014-01-10 0.00 0.00 1.68 0.98 0.00
11 2014-01-11 0.43 0.00 1.98 1.46 0.00
12 2014-01-12 1.51 0.78 1.63 0.46 1.84
13 2014-01-13 0.26 0.34 0.34 0.97 1.13
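
The mask above keeps the raw values inside each run. The expected results posted earlier in the thread appear to be the running product within each run instead (for example, X2 on 1/3/2014 is 1.51 * 0.09, about 0.14). A minimal sketch of that extra step follows, assuming runs do not overlap and skipping any start that has no stop at or after it. The helper name CompoundRuns and its argument names are illustrative and not from the original post.

```r
# Hypothetical helper (not from the original post): cumulative product of
# `values` within each start-to-stop run, 0 outside the runs.
CompoundRuns <- function(values, start_sig, stop_sig) {
  a <- which(start_sig > 0)          # rows where a run starts
  b <- which(stop_sig > 0)           # rows where a run may stop
  out <- numeric(length(values))     # zeros everywhere by default
  for (s in a) {
    e <- head(b[b >= s], 1)          # first stop at or after this start
    if (length(e) == 0) next         # assumption: ignore a start with no stop
    out[s:e] <- cumprod(values[s:e]) # compound inside the run only
  }
  out
}

# X3 and its signals, copied from the dput output in this thread
x3 <- c(0.83, 1.21, 0.2, 1.57, 1.72, 0.76, 1.22, 0.27, 0.59,
        1.68, 1.98, 1.63, 0.34)
x3_start <- c(0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
stops    <- c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1)
x3_compounded <- round(CompoundRuns(x3, x3_start, stops), 2)
```

For all columns at once, something like mapply(CompoundRuns, MainData[,-1], StartSignals[,-1], MoreArgs=list(stop_sig=StopSignals$Stop)) should pair each data column with its own start column.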

David C

-----Original Message-----
From: Pooya Lalehzari [mailto:plalehzari at platinumlp.com] 
Sent: Tuesday, October 7, 2014 8:06 PM
To: David L Carlson
Subject: RE: [R] Conditional Data Manipulation -Cumulative Product

Hi David,
I also made a dput of the Expected Results in case you want to read it in:
> dput(ExpResults)
structure(list(Date = c("1/1/2014", "1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014", "1/6/2014", "1/7/2014", "1/8/2014", "1/9/2014", "1/10/2014", "1/11/2014", "1/12/2014", "1/13/2014"), X1 = c(0, 0, 0, 0, 1.04, 0, 0, 0, 0, 0, 0.43, 0.65, 0.17), X2 = c(0, 1.51, 0.14, 0, 0, 0, 0, 0, 0, 0, 0, 0.78, 0.27), X3 = c(0, 0, 0.2, 0, 0, 0.76, 0.93, 0.25, 0, 1.68, 3.33, 5.42, 1.84), X4 = c(0, 0, 0, 0, 0, 0, 0.66, 0.06, 0, 0.98, 1.43, 0.66, 0.64), X5 = c(0, 1.24, 0.37, 0, 1.23, 0, 0, 0, 0, 0, 0, 1.84, 2.08)), .Names = c("Date", "X1", "X2", "X3", "X4", "X5"), class = "data.frame", row.names = c(NA,
-13L))

-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: Tuesday, October 07, 2014 5:03 PM
To: Pooya Lalehzari
Cc: R help
Subject: RE: [R] Conditional Data Manipulation -Cumulative Product

The earlier table may be clearer to read, but the dput() output is much easier to load into R. After loading it, adding

StartSignals$Date <- as.Date(StartSignals$Date, "%m/%d/%Y")
MainData$Date <- as.Date(MainData$Date, "%m/%d/%Y")
StopSignals$Date <- as.Date(StopSignals$Date, "%m/%d/%Y")

creates Date objects out of the character strings.

But what should the final result look like? For example, X1 has two start dates, "2014-01-05" and "2014-01-11", and you have stop dates of "2014-01-03", "2014-01-05", "2014-01-08", and "2014-01-13". So for X1, "2014-01-05" is both a start and a stop date (value 1.04), and the second start/stop run would be "2014-01-11" to "2014-01-13" (values 0.43, 1.51, 0.26). What do you mean by compounding?
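
The pairing described above can be checked directly. This is a small sketch using the X1 start column and the Stop column from the dput output; the variable names are illustrative only.

```r
# X1 start signals and the common stop signals, copied from the dput output
x1_start <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0)
stops    <- c(0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1)
dates    <- seq(as.Date("2014-01-01"), by = "day", length.out = 13)

a <- which(x1_start > 0)   # start rows
b <- which(stops > 0)      # stop rows
# for each start row, the first stop row at or after it
e <- sapply(a, function(s) b[b >= s][1])

# one line per run: start date and matching stop date
pairs <- data.frame(start = dates[a], stop = dates[e])
```

Here the first run starts and stops on the same row, which is why X1 keeps a single value on 2014-01-05.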

David C


-----Original Message-----
From: Pooya Lalehzari [mailto:plalehzari at platinumlp.com]
Sent: Tuesday, October 7, 2014 2:59 PM
To: David L Carlson
Subject: RE: [R] Conditional Data Manipulation -Cumulative Product

Dear David,
Here is the dput output, though I think the previous email showed the data more clearly.


> dput(StartSignals)
structure(list(Date = c("1/1/2014", "1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014", "1/6/2014", "1/7/2014", "1/8/2014", "1/9/2014", "1/10/2014", "1/11/2014", "1/12/2014", "1/13/2014"), X1 = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L), X2 = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L), X3 = c(0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L), X4 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L), X5 = c(0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L)), .Names = c("Date", "X1", "X2", "X3", "X4", "X5"), class = "data.frame", row.names = c(NA, -13L))
> dput(MainData)
structure(list(Date = c("1/1/2014", "1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014", "1/6/2014", "1/7/2014", "1/8/2014", "1/9/2014", "1/10/2014", "1/11/2014", "1/12/2014", "1/13/2014"), X1 = c(1.92, 0.67, 1.09, 1.81, 1.04, 1.69, 1.57, 0.5, 0, 1.31, 0.43, 1.51, 0.26), X2 = c(1.38, 1.51, 0.09, 1.33, 0.38, 1.12, 1.3, 1.75, 1.26, 1.57, 1.63, 0.78, 0.34), X3 = c(0.83, 1.21, 0.2, 1.57, 1.72, 0.76, 1.22, 0.27, 0.59, 1.68, 1.98, 1.63, 0.34), X4 = c(1.25, 0.06, 1.62, 1.68, 1.98, 1.45, 0.66, 0.09, 0.4, 0.98, 1.46, 0.46, 0.97), X5 = c(1.12, 1.24, 0.3, 1.41, 1.23, 1.99, 1.75, 1.91, 1.81, 1.79, 0.81, 1.84, 1.13)), .Names = c("Date", "X1", "X2", "X3", "X4", "X5"), class = "data.frame", row.names = c(NA,
-13L))
> dput(StopSignals)
structure(list(Date = c("1/1/2014", "1/2/2014", "1/3/2014", "1/4/2014", "1/5/2014", "1/6/2014", "1/7/2014", "1/8/2014", "1/9/2014", "1/10/2014", "1/11/2014", "1/12/2014", "1/13/2014"), Stop = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L)), .Names = c("Date", "Stop"), class = "data.frame", row.names = c(NA, -13L))


-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: Tuesday, October 07, 2014 3:13 PM
To: Pooya Lalehzari; R help
Subject: RE: [R] Conditional Data Manipulation -Cumulative Product

You need to use plain text, not HTML, in your email. Your data came through scrambled (see below). It is better to send your data using the R dput() function:

dput(StartSignals)
dput(MainData)
dput(StopSignals)
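
As a small illustration of why dput() is convenient: its output is R code that rebuilds the object, so it survives plain-text email. The toy data frame below is made up for the example and is not part of the poster's data.

```r
# Toy data frame, invented for illustration
toy <- data.frame(Date = c("1/1/2014", "1/2/2014"), Stop = c(0L, 1L),
                  stringsAsFactors = FALSE)

# dput() prints R code that recreates the object; capture it as text here
txt <- paste(capture.output(dput(toy)), collapse = "\n")

# A reader pastes that text into R; parsing and evaluating it does the same
toy2 <- eval(parse(text = txt))
same <- isTRUE(all.equal(toy, toy2))
```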

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Pooya Lalehzari
Sent: Tuesday, October 7, 2014 11:55 AM
To: R help
Subject: [R] Conditional Data Manipulation -Cumulative Product

Hello,
I have three datasets, StartSignals, MainData, and StopSignals, and need to compound the data for each variable in MainData over the dates that fall between the Start and Stop signals. (The Stop signals are common to all of the X1:X5 variables.) Please see the sample below.
One way I was thinking of doing this was to set up a nested "FOR" loop and go through the three data matrices. Is there a more elegant way of doing this?
Thank you.

StartSignals:
Date       X1 X2 X3 X4 X5
1/1/2014    0  0  0  0  0
1/2/2014    0  1  0  0  1
1/3/2014    0  0  1  0  0
1/4/2014    0  0  0  0  0
1/5/2014    1  0  0  0  1
1/6/2014    0  0  1  0  0
1/7/2014    0  0  0  1  0
1/8/2014    0  0  0  0  0
1/9/2014    0  0  0  0  0
1/10/2014   0  0  1  1  0
1/11/2014   1  0  0  0  0
1/12/2014   0  1  0  0  1
1/13/2014   0  0  0  0  0

MainData:
Date        X1   X2   X3   X4   X5
1/1/2014  1.92 1.38 0.83 1.25 1.12
1/2/2014  0.67 1.51 1.21 0.06 1.24
1/3/2014  1.09 0.09 0.2  1.62 0.3
1/4/2014  1.81 1.33 1.57 1.68 1.41
1/5/2014  1.04 0.38 1.72 1.98 1.23
1/6/2014  1.69 1.12 0.76 1.45 1.99
1/7/2014  1.57 1.3  1.22 0.66 1.75
1/8/2014  0.5  1.75 0.27 0.09 1.91
1/9/2014  0    1.26 0.59 0.4  1.81
1/10/2014 1.31 1.57 1.68 0.98 1.79
1/11/2014 0.43 1.63 1.98 1.46 0.81
1/12/2014 1.51 0.78 1.63 0.46 1.84
1/13/2014 0.26 0.34 0.34 0.97 1.13

StopSignals:
Date      Stop
1/1/2014     0
1/2/2014     0
1/3/2014     1
1/4/2014     0
1/5/2014     1
1/6/2014     0
1/7/2014     0
1/8/2014     1
1/9/2014     0
1/10/2014    0
1/11/2014    0
1/12/2014    0
1/13/2014    1

ExpectedResult:
Date        X1   X2   X3   X4   X5
1/1/2014  0    0    0    0    0
1/2/2014  0    1.51 0    0    1.24
1/3/2014  0    0.14 0.2  0    0.37
1/4/2014  0    0    0    0    0
1/5/2014  1.04 0    0    0    1.23
1/6/2014  0    0    0.76 0    0
1/7/2014  0    0    0.93 0.66 0
1/8/2014  0    0    0.25 0.06 0
1/9/2014  0    0    0    0    0
1/10/2014 0    0    1.68 0.98 0
1/11/2014 0.43 0    3.33 1.43 0
1/12/2014 0.65 0.78 5.42 0.66 1.84
1/13/2014 0.17 0.27 1.84 0.64 2.08

***
We are pleased to announce that, as of October 20th, 2014, we will be moving to our new office at:
Platinum Partners
250 West 55th Street, 14th Floor, New York, NY 10019
T: 212.582.2222 | F: 212.582.2424
***
THIS E-MAIL IS FOR THE SOLE USE OF THE INTENDED RECIPIENT(S) AND MAY CONTAIN CONFIDENTIAL AND PRIVILEGED INFORMATION.ANY UNAUTHORIZED REVIEW, USE, DISCLOSURE OR DISTRIBUTION IS PROHIBITED. IF YOU ARE NOT THE INTENDED RECIPIENT, PLEASE CONTACT THE SENDER BY REPLY E-MAIL AND DESTROY ALL COPIES OF THE ORIGINAL E-MAIL.
	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





