[R] lmom and lmomRFA Upper and Lower Bounds Simulation Questions

Mon May 11 16:17:58 CEST 2015

On 2015-05-11 00:01, Douglas Hultstrand wrote:
> Hello,
>
> I am using the lmom and lmomRFA to compute the return frequencies using
> the GEV distribution.Iam trying to generate upper and lower bound
> frequency estimates.
>
> I provided a working example of the code that I am using to estimate the
> upper and lower bounds. Specific questions I have are:
>
> 1) Is the methodology appropriate and applied correctly?
>
> 2) In this example, are the simulated qhat values better than the actual
> fitted data estimates (GEV)?
>
> _*Code Example Below:*_
>
> library(lmom); library(lmomRFA)
> set.seed(1234)
>
> # Example data
> max_data <-
> c(11.14,10.95,10.21,9.88,9.85,9.74,9.74,9.73,9.62,8.95,8.38,8.20)
> moments = samlmu(max_data, sort.data = TRUE)
> parGEV <- pelgev(moments)  # GEV
>
> x <- c(10,25,50,100,500,1000)
> Fx <- (1-(1/x))
> GEV = Fx
> PDS <- 1/(log(x)-log(x-1))
>
> for (i in seq_along(Fx)) {
>       GEV[i] = round(quagev(Fx[i], parGEV), 3)
> }
>
> ################
> #### BOUNDS ###
> ################
> # A data sample
> zz_gev <- quagev(runif(500), parGEV)
>
> # Compute L-moments of the sample, considered as a 1-site region
> rdat_gev <- regsamlmu(zz_gev)
>
> # Fit GEV distribution to the regional L-moments
> rfit_gev <- regfit(rdat_gev,"gev")
>
> # Generate simulations of an artificial 1-site region with frequency
> distribution fitted to the actual data
> sim_gev <- regsimq(rfit_gev$qfunc, nrec = rdat_gev$n, f = 1 - 1 / x )
>
> # Compute error bounds for quantiles of the site's frequency distribution
> CI_gev <- sitequantbounds(sim_gev, rfit_gev)
>
> # 90% Error Bounds/CI
> Clwr <- round(CI_gev$bound.0.05, 3)
> Cupr <- round(CI_gev$bound.0.95, 3)
> qhat <- round(CI_gev$Qhat, 3)
>
> # print data
> data.frame(Fx, GEV, Clwr, Cupr, qhat)
>
> ####################
> #### END BOUNDS ##
> ####################
>
>
> #Plot and visualize the data
> plot(x, GEV, log="x", ylim=c(10.5,12.5), col="blue")
> lines(PDS,GEV, col="blue")
> lines(PDS,Clwr, lty=3)
> lines(PDS,Cupr, lty=3)
> lines(PDS,qhat, col="red", lty=5)
>
> Thanks for your help,
>
> Doug
>

 > 1) Is the methodology appropriate and applied correctly?

Without knowing what you are trying to achieve, it is not possible
to answer that question.  The code that you provide appropriately
and correctly evaluates error bounds for quantile estimates of a
GEV distribution fitted to the data set 'zz_gev'.

The larger point here is that regional frequency analysis for a "region"
containing a single site is equivalent to analysis of a single data set.
In particular, the error bounds returned by sitequantbounds() for a
1-site region are identical to confidence intervals for quantiles
computed for a single data set using a parametric bootstrap.

 > 2) In this example, are the simulated qhat values better than the
 > actual fitted data estimates (GEV)?

See the first sentence of my reply to 1).  'GEV' are the quantiles
of a GEV distribution fitted to 'max_data'.  'CI_gev$Qhat' are the
quantiles of a GEV distribution fitted to 'zz_gev' (you can verify that
'CI_gev$Qhat' is the same as 'quagev(Fx,pelgev(samlmu(zz_gev)))').
For inference about 'max_data', 'GEV' is likely to be better.
For inference about 'zz_gev', 'CI_gev$Qhat' is likely to be better.

J. R. M. Hosking
jrmhosking at gmail.com