---
title: "4. Algorithm for visualising the model overlaid on high-dimensional data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{4. Algorithm for visualising the model overlaid on high-dimensional data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
options(rmarkdown.html_vignette.check_title = FALSE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE, 
  message = FALSE
)
```

This walks through the algorithm for constructing a model from 2-D embedding data and visualising it alongside high-dimensional data. The process involves two major steps:

1. **Constructing the model in the 2-D embedding space** using hexagonal binning and triangulation.
2. **Lifting the model into high dimensions** to link it back to the original data space.


```{r setup}
library(quollr)
library(ggplot2)
library(tibble)
library(dplyr)
library(stats)
```

## Step 1: Construct the 2-D model
  
### Hexagonal Binning

To begin, we preprocess the 2-D embedding and create hexagonal bins over the layout.

```{r}
## To pre-process the data
nldr_obj <- gen_scaled_data(nldr_data = scurve_umap)

## Obtain the hexbin object
hb_obj <- hex_binning(nldr_obj = nldr_obj, b1 = 15, q = 0.1)

all_centroids_df <- hb_obj$centroids
counts_df <- hb_obj$std_cts
```

### Extract bin centroids

Next, we extract the centroid coordinates and standardised bin counts. These will be used to identify densely populated regions in the 2-D space.

```{r}
## To extract all bin centroids with bin counts
df_bin_centroids <- extract_hexbin_centroids(centroids_data = all_centroids_df,
                                             counts_data = counts_df)

benchmark_highdens <- 0

## To extract high-densed bins
model_2d <- df_bin_centroids |>
  dplyr::filter(n_h > benchmark_highdens)

glimpse(model_2d) 
```

### Triangulate the bin centroids

We then triangulate the hexagon centroids to build a wireframe of neighborhood relationships.

```{r}
## Wireframe
tr_object <- tri_bin_centroids(centroids_data = df_bin_centroids)
str(tr_object)
```

### Generate edges from triangulation

Using the triangulation object, we generate edges between centroids. We retain only edges connecting densely populated bins.

```{r}
trimesh_data <- gen_edges(tri_object = tr_object, a1 = hb_obj$a1) |>
  dplyr::filter(from_count > benchmark_highdens,
                to_count > benchmark_highdens)

## Update the edge indexes to start from 1
trimesh_data <- update_trimesh_index(trimesh_data)
glimpse(trimesh_data)
```

### Visualise the triangular mesh

```{r, fig.alt="Triangular mesh."}
trimesh <- ggplot(model_2d, aes(x = c_x, y = c_y)) +
  geom_trimesh() +
  coord_equal() +
  xlab(expression(C[x]^{(2)})) + ylab(expression(C[y]^{(2)})) +
  theme(axis.text = element_text(size = 5),
        axis.title = element_text(size = 7))

trimesh
```


## Step 2: Lift the model into high dimensions
  
### Map bins to high-dimensional observations

We begin by extracting the original data with their assigned hexagonal bin IDs.

```{r}
nldr_df_with_hex_id <- hb_obj$data_hb_id
glimpse(nldr_df_with_hex_id)
```

### Compute high-dimensional coordinates for bins

We calculate the average high-dimensional coordinates for each bin and retain only the ones matching the 2-D model bins.

```{r} 
model_highd <- avg_highd_data(highd_data = scurve, scaled_nldr_hexid = nldr_df_with_hex_id)

model_highd <- model_highd |>
  dplyr::filter(h %in% model_2d$h)

glimpse(model_highd)
```

## Step 3: Visualise the high-dimensional model

We now combine all components—high-dimensional data, the 2-D model, lifted high-dimensional centroids, and the triangulation—and render the model using an interactive tour.

### Prepare data for visualisation

```{r}
df_exe <- comb_data_model(highd_data = scurve, 
                          model_highd = model_highd, 
                          model_2d = model_2d)

```

### Interactive tour of model overlay

```{r}
tour1 <- show_langevitour(point_data = df_exe, edge_data = trimesh_data)

tour1
```