--- title: "Incorporating Bayesian Statistics" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Incorporating Bayesian Statistics} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This vignette shows how to incorporate Bayesian modeling using the package [brms](https://paul-buerkner.github.io/brms/) in specr. When we fit a model with `brms`, the package actually calls Rstan in the background, which, in turn, is an R interface to the statistical programming language `Stan`. Stan is built in the programming language C++ and models have to be compiled using C++ to be run. This is all taken care of by brms, so you just need to run `brm(…)` to fit any model. `brms` uses a syntax for specifying model formulae that is based on the syntax of the commonly known lme4 package (for multilevel modeling). The comparatively easy syntax of `brms` is then converted into Stan code automatically. That said, since the models have to be compiled in C++, you need to set up your computer so that it can actually use C++. This has to be done only once, before installing brms. Unfortunately, as almost always, how you install C++ depends on your operating system. Check [this vignette](https://github.com/masurp/workshop_bayes/blob/main/exercises/Exercise_02.md) to if you have problems installing brms. We are also loading the library `furrr` as we want to parallelize the computations. ```r library(tidyverse) library(specr) library(brms) library(broom.mixed) library(ggridges) ``` ## Setting up custom functions Although you can simply pass the function `brm` to `setup` and run `specr()` with `workers = 1` without any specific custum function, it usually make sense to set up some custom model fitting and extraction functions to make sure `specr()` does exactly what we want. For example, I would suggest to create a custom model fitting function and specify relevant parameters for the Bayesian models. Here, I switch off parallel processing as I want to do this via `specr()` (i.e., setting `cores = 1` in `brm`). I also reduce the number of chains to 1 to speed up the fitting process (but we can bump this up if we want). This custom function also allows me to load `brms` itself and `broom.mixed`, which provides the tidy function to extract parameters from the `brm.fit` object. Furthermore, I also create a custom fit extract function. Most importantly, I add the full `brms.fit` object to the `glance` output. This way, we can later access the entire object in order to e.g., extract posterior distributions or other aspects of the Bayesian models. 
```r
# Customized fitting function; not necessary if standard values are to be used
brm_new <- function(formula, data) {
  require(brms)      # make sure brms is available (also on parallel workers)
  brm(formula, data,
      cores = 1,     # no within-model parallelization; specr() handles that
      silent = 2,    # don't print progress
      refresh = 0,   # remove message (not necessary, but nicer in the output)
      iter = 1000)   # I just set this here to reduce computing time, can be higher, of course
}

# New fun2, the fit extraction function
glance_brm <- function(x) {
  require(broom.mixed)        # provides the glance() method for brmsfit objects
  fit2 <- broom::glance(x)
  fit2$full_model <- list(x)  # add full model
  return(fit2)
}

# Setting up specifications
specs <- setup(data = example_data,
               x = c("x1", "x2", "x3"),
               y = c("y1", "y2"),
               model = "brm_new",
               controls = c("c1", "c2"),
               fun2 = glance_brm)

# Check specs
summary(specs)
#> Setup for the Specification Curve Analysis
#> -------------------------------------------
#> Class: specr.setup -- version: 1.0.0
#> Number of specifications: 24
#> 
#> Specifications:
#> 
#>   Independent variable:     x1, x2, x3
#>   Dependent variable:       y1, y2
#>   Models:                   brm_new
#>   Covariates:               no covariates, c1, c2, c1 + c2
#>   Subsets analyses:         all
#> 
#> Function used to extract parameters:
#> 
#>   function(x) broom::tidy(x, conf.int = TRUE)
#> 
#> 
#> Head of specifications table (first 6 rows):
#> # A tibble: 6 × 6
#>   x     y     model   controls      subsets formula          
#>   <chr> <chr> <chr>   <chr>         <chr>   <glue>           
#> 1 x1    y1    brm_new no covariates all     y1 ~ x1 + 1      
#> 2 x1    y1    brm_new c1            all     y1 ~ x1 + c1     
#> 3 x1    y1    brm_new c2            all     y1 ~ x1 + c2     
#> 4 x1    y1    brm_new c1 + c2       all     y1 ~ x1 + c1 + c2
#> 5 x1    y2    brm_new no covariates all     y2 ~ x1 + 1      
#> 6 x1    y2    brm_new c1            all     y2 ~ x1 + c1     
```

## Estimating the models

Because we are estimating 24 Bayesian models, this may take a while. Here, the models are fitted sequentially; a sketch of how to distribute the specifications across several cores follows at the end of this section.

```r
results <- specr(specs)
```

Now we can again summarize and plot our results.

```r
summary(results)
#> Results of the specification curve analysis
#> -------------------
#> Technical details:
#> 
#>   Class:                        specr.object -- version: 1.0.0
#>   Cores used:                   1
#>   Duration of fitting process:  9.25 sec elapsed
#>   Number of specifications:     24
#> 
#> Descriptive summary of the specification curve:
#> 
#>  median  mad   min  max   q25  q75
#>    0.16 0.25 -0.34 0.62 -0.07 0.25
#> 
#> Descriptive summary of sample sizes: 
#> 
#>  median  min  max
#>    1000 1000 1000
#> 
#> Head of the specification results (first 6 rows):
#> 
#> # A tibble: 6 × 18
#>   x     y     model   controls      subsets formula       effect component group estimate std.error
#>   <chr> <chr> <chr>   <chr>         <chr>   <glue>        <chr>  <chr>     <chr>    <dbl>     <dbl>
#> 1 x1    y1    brm_new no covariates all     y1 ~ x1 + 1   fixed  cond      <NA>      0.62      0.04
#> 2 x1    y1    brm_new c1            all     y1 ~ x1 + c1  fixed  cond      <NA>      0.6       0.04
#> 3 x1    y1    brm_new c2            all     y1 ~ x1 + c2  fixed  cond      <NA>      0.61      0.04
#> 4 x1    y1    brm_new c1 + c2       all     y1 ~ x1 + c…  fixed  cond      <NA>      0.59      0.04
#> 5 x1    y2    brm_new no covariates all     y2 ~ x1 + 1   fixed  cond      <NA>     -0.33      0.04
#> 6 x1    y2    brm_new c1            all     y2 ~ x1 + c1  fixed  cond      <NA>     -0.34      0.04
#> # … with 7 more variables: conf.low <dbl>, conf.high <dbl>, fit_algorithm <chr>,
#> #   fit_pss <dbl>, fit_nobs <int>, fit_sigma <dbl>, fit_full_model <list>

plot(results, choices = c("x", "y", "controls"))
```

*(Plot: specification curve across the choices of x, y, and controls.)*

Again, the coefficients are median estimates from the posterior, and the intervals are the respective 95% HDIs.

## Inspecting specific models

Because we kept the entire brms models in our results data set, we can explore specific models using the `pull()` function.
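Before we do, one note on runtime: as mentioned above, the fitting process can be parallelized across specifications rather than within `brm`, which is where `furrr` comes into play. A minimal sketch, assuming a machine with at least four available cores (see `?specr` for the details of the `workers` argument):

```r
# Hypothetical parallel run: each worker fits a subset of the 24 specifications.
# The require() calls inside brm_new() and glance_brm() ensure that brms and
# broom.mixed are loaded on every worker.
results <- specr(specs, workers = 4)
```

With that noted, let's pull the full model objects out of the results.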
```r
# Pull the full models from the list column
models <- results %>%
  as_tibble %>%
  pull(fit_full_model)

# Summarize, e.g., the first model
summary(models[[1]])
#>  Family: gaussian 
#>   Links: mu = identity; sigma = identity 
#> Formula: y1 ~ x1 + 1 
#>    Data: data (Number of observations: 1000) 
#>   Draws: 4 chains, each with iter = 1000; warmup = 500; thin = 1;
#>          total post-warmup draws = 2000
#> 
#> Population-Level Effects: 
#>           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept    -1.10      0.04    -1.17    -1.02 1.00     1915     1356
#> x1            0.62      0.04     0.55     0.69 1.00     1900     1631
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma     1.14      0.03     1.09     1.18 1.00     2903     1542
#> 
#> Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
```

From these model fit objects, we can also extract the posterior distributions for the parameter of interest and store them in one data frame.

```r
posteriors <- results %>%
  as_tibble %>%
  pull(fit_full_model) %>%
  map(function(x) as_tibble(as_draws_df(x))[, 2] %>%  # second column = coefficient of the focal predictor
        gather(key, value)) %>%
  bind_rows(.id = "id")

# First 10 draws from the first specification (predictor `x1`)
head(posteriors, n = 10)
#> # A tibble: 10 × 3
#>    id    key   value
#>    <chr> <chr> <dbl>
#>  1 1     b_x1  0.642
#>  2 1     b_x1  0.655
#>  3 1     b_x1  0.647
#>  4 1     b_x1  0.650
#>  5 1     b_x1  0.611
#>  6 1     b_x1  0.736
#>  7 1     b_x1  0.565
#>  8 1     b_x1  0.624
#>  9 1     b_x1  0.602
#> 10 1     b_x1  0.611
```

## Plotting posterior distributions for all specifications

Using the `ggridges` package, we can plot the posterior distributions of the different specifications.

```r
posteriors %>%
  mutate(id = factor(id, levels = 1:24)) %>%
  ggplot(aes(x = value, y = id, fill = key)) +
  geom_density_ridges() +
  scale_fill_brewer(palette = "Blues") +
  theme_minimal() +
  labs(x = "Posterior", y = "Specifications", fill = "")
```

*(Plot: ridgeline plot of the posterior distributions, one ridge per specification.)*

Of course, the posterior distributions are not sorted in any way. With a bit of data wrangling, we can create a specification curve consisting of posterior distributions.
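Before building the full figure, here is a quick sketch of how the reported estimates relate to these draws: we can compute the median and a central 95% interval per specification directly from the `posteriors` data frame (note that specr reports HDIs, which can differ slightly from the quantile intervals used here):

```r
# Median and central 95% quantile interval of each specification's posterior
posteriors %>%
  group_by(id) %>%
  summarize(median    = median(value),
            conf.low  = quantile(value, 0.025),
            conf.high = quantile(value, 0.975))
```

Now to the sorted specification curve itself.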
```r
# First panel: posterior distributions, sorted by the median estimate
p1 <- results %>%
  as_tibble %>%
  mutate(id = as.character(1:nrow(.))) %>%
  arrange(estimate) %>%
  mutate(specifications = factor(as.character(1:nrow(.)), levels = c(1:24))) %>%
  left_join(posteriors) %>%
  ggplot(aes(x = value, y = specifications)) +
  stat_density_ridges(quantile_lines = TRUE,
                      quantiles = c(0.025, 0.975),
                      alpha = .5,
                      fill = "lightblue",
                      color = "black") +
  coord_flip() +
  theme_minimal() +
  labs(x = "Posterior", y = "") +
  theme(strip.text = element_blank(),
        axis.line = element_line("black", size = .5),
        legend.position = "none",
        panel.spacing = unit(.75, "lines"),
        axis.text = element_text(colour = "black"))

# Second panel: the analytical choices behind each specification
p2 <- results %>%
  as_tibble %>%
  arrange(estimate) %>%
  mutate(specifications = factor(as.character(1:nrow(.)), levels = c(1:24))) %>%
  gather(key, value, c("x", "y", "controls")) %>%
  mutate(key = factor(key, levels = c("x", "y", "controls"))) %>%
  ggplot() +
  geom_point(aes(x = specifications, y = value), shape = 124, size = 3.35) +
  theme_minimal() +
  facet_grid(.data$key ~ 1, scales = "free_y", space = "free_y") +
  theme(axis.line = element_line("black", size = .5),
        legend.position = "none",
        panel.spacing = unit(.75, "lines"),
        axis.text = element_text(colour = "black"),
        strip.text.x = element_blank()) +
  labs(x = "", y = "")

# Combine both panels (plot_grid() comes from the cowplot package)
plot_grid(p1, p2,
          ncol = 1,
          align = "hv",
          axis = "tblr",
          rel_heights = c(3, 2))
```

*(Plot: specification curve made of posterior distributions, with the analytical choices shown in the panel below.)*
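Finally, because the full `brmsfit` objects are stored, convergence can be checked across all specifications at once. A minimal sketch using `brms::rhat()` and the `models` list created above:

```r
# Largest Rhat per specification; values close to 1 suggest converged chains
map_dbl(models, ~ max(rhat(.x), na.rm = TRUE))
```

This is worth checking before interpreting the curve, as a single non-converged specification could distort the overall picture.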