[R] How to create a new data.frame based on calculation of subsets of an existing data.frame
Ioannou, Ioanna
ioanna.ioannou using ucl.ac.uk
Fri Dec 20 22:33:33 CET 2019
Hello Jim ,
Thank you ever so much for your help. I was truly stuck!
This looks much better, and yes, I can turn them into a matrix, no problem. Indeed, I need only the results for ER+ETR_H1,PGA and ER+ETR_H2,Sa. One minor point: as it stands, VC holds 4 values for three cases instead of the aforementioned two, and the third is in fact identical to the first. Could you please optimize it?
Thank you very much again,
Best,
ioanna
-----Original Message-----
From: Jim Lemon [mailto:drjimlemon using gmail.com]
Sent: Friday, December 20, 2019 9:04 PM
To: Ioannou, Ioanna <ioanna.ioannou using ucl.ac.uk>
Cc: r-help mailing list <r-help using r-project.org>
Subject: Re: [R] How to create a new data.frame based on calculation of subsets of an existing data.frame
Hi Ioanna,
We're getting somewhere, but there are four unique combinations of Taxonomy and IM.type:
ER+ETR_H1,PGA
ER+ETR_H2,PGA
ER+ETR_H1,Sa
ER+ETR_H2,Sa
Perhaps you mean that ER+ETR_H1 only occurs with PGA and ER+ETR_H2 only occurs with Sa. I handled that by checking that there were any rows that corresponded to the condition requested.
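As a minimal sketch of that check, assuming a reduced copy of D that keeps only the two grouping columns, a cross-tabulation shows which combinations of Taxonomy and IM.type actually occur. The names `combo_counts` and `observed` are illustrative and do not appear in the original code.

```r
# Reduced copy of the two grouping columns from the data frame D in this thread
D <- data.frame(
  IM.type  = rep(c("PGA", "Sa"), each = 4),
  Taxonomy = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  stringsAsFactors = FALSE
)

# cross-tabulate: a zero cell means that combination never occurs
combo_counts <- table(D$Taxonomy, D$IM.type)
print(combo_counts)

# the combinations that do occur, one per row
observed <- unique(D[, c("Taxonomy", "IM.type")])
print(observed)
```

Looping over the rows of `observed` instead of over all possible pairs would avoid the empty cases altogether.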
Also you want a matrix for each row containing Taxonomy and IM.type in the output. When I run what I think you are asking, I only get a two element list, each a vector of values. Maybe this is what you want, and it could be coerced into matrix format:
D<- data.frame(Ref.No = c(1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629),
Region = rep(c('South America'), times = 8),
IM.type = c('PGA', 'PGA', 'PGA', 'PGA', 'Sa', 'Sa', 'Sa', 'Sa'),
Damage.state = c('DS1', 'DS2', 'DS3', 'DS4','DS1', 'DS2', 'DS3', 'DS4'),
Taxonomy = c('ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H1','ER+ETR_H2','ER+ETR_H2','ER+ETR_H2','ER+ETR_H2'),
Prob.of.exceedance_1 = c(0,0,0,0,0,0,0,0),
Prob.of.exceedance_2 = c(0,0,0,0,0,0,0,0),
Prob.of.exceedance_3 =
c(0.26,0.001,0.00019,0.000000573,0.04,0.00017,0.000215,0.000472),
Prob.of.exceedance_4 =
c(0.72,0.03,0.008,0.000061,0.475,0.0007,0.00435,0.000405),
stringsAsFactors=FALSE)
# names of the variables used in the calculations
calc_vars<-paste("Prob.of.exceedance",1:4,sep="_")
# get the rows for the four damage states
DS1_rows <-D$Damage.state == "DS1"
DS2_rows <-D$Damage.state == "DS2"
DS3_rows <-D$Damage.state == "DS3"
DS4_rows <-D$Damage.state == "DS4"
# create an empty list
VC<-list()
# set an index variable for VC
VCindex<-1
# step through all possible values of IM.type and Taxonomy
for(IM in unique(D$IM.type)) {
 for(Tax in unique(D$Taxonomy)) {
  # get a logical vector of the rows to be used in this calculation
  calc_rows <- D$IM.type == IM & D$Taxonomy == Tax
  cat(IM,Tax,calc_rows,"\n")
  # check that there are any such rows in the data frame
  if(sum(calc_rows)) {
   # if so, fill in the four values for these rows
   VC[[VCindex]] <- 0.0 * (1- D[calc_rows & DS1_rows,calc_vars]) +
    0.02* (D[calc_rows & DS1_rows,calc_vars] -
     D[calc_rows & DS2_rows,calc_vars]) +
    0.10* (D[calc_rows & DS2_rows,calc_vars] -
     D[calc_rows & DS3_rows,calc_vars]) +
    0.43 * (D[calc_rows & DS3_rows,calc_vars] -
     D[calc_rows & DS4_rows,calc_vars]) +
    1.0* D[calc_rows & DS4_rows,calc_vars]
   # increment the index
   VCindex<-VCindex+1
  }
 }
}
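One possible way to get the result directly as a matrix, with one row per combination that actually occurs, is sketched below. It is an alternative arrangement of the loop above rather than the code that was run in this thread. It assumes each combination has exactly one row per damage state DS1 to DS4, and the names `combos`, `vc_row` and `VCmat` are hypothetical.

```r
# Same values as D in this thread, reduced to the columns the calculation needs
D <- data.frame(
  IM.type = rep(c("PGA", "Sa"), each = 4),
  Damage.state = rep(paste0("DS", 1:4), times = 2),
  Taxonomy = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  Prob.of.exceedance_1 = 0,
  Prob.of.exceedance_2 = 0,
  Prob.of.exceedance_3 =
    c(0.26, 0.001, 0.00019, 0.000000573, 0.04, 0.00017, 0.000215, 0.000472),
  Prob.of.exceedance_4 =
    c(0.72, 0.03, 0.008, 0.000061, 0.475, 0.0007, 0.00435, 0.000405),
  stringsAsFactors = FALSE
)
calc_vars <- paste("Prob.of.exceedance", 1:4, sep = "_")

# only the (Taxonomy, IM.type) pairs that are present in the data
combos <- unique(D[, c("Taxonomy", "IM.type")])

# hypothetical helper: the four VC values for a single combination
vc_row <- function(Tax, IM) {
  sub <- D[D$Taxonomy == Tax & D$IM.type == IM, ]
  # exceedance probabilities as a 4 x 4 matrix, rows ordered DS1..DS4
  P <- as.matrix(sub[match(paste0("DS", 1:4), sub$Damage.state), calc_vars])
  # same weights as in the loop above
  0.0 * (1 - P[1, ]) +
    0.02 * (P[1, ] - P[2, ]) +
    0.10 * (P[2, ] - P[3, ]) +
    0.43 * (P[3, ] - P[4, ]) +
    1.0 * P[4, ]
}

# mapply returns one column per combination; transpose to one row each
VCmat <- t(mapply(vc_row, combos$Taxonomy, combos$IM.type))
rownames(VCmat) <- paste(combos$Taxonomy, combos$IM.type, sep = ",")
print(VCmat)
```

With this data the result should have two rows (ER+ETR_H1,PGA and ER+ETR_H2,Sa) and four columns, one per Prob.of.exceedance variable.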
I think we'll get there.
Jim
On Sat, Dec 21, 2019 at 12:45 AM Ioannou, Ioanna <ioanna.ioannou using ucl.ac.uk> wrote:
>
> Hello Jim,
>
> I made some changes to the code; essentially, I substituted each set of 4 lines (DS1-4) with one. I estimate VC, which in an ideal world should be a matrix with 4 columns, one for every exceedance_probability_1-4, and 2 rows, one for each unique combination of Taxonomy and IM.type. Could you please check the code I sent last and, based on that, give your solution?