Type: | Package |
Title: | Biclustering via Latent Block Model Adapted to Overdispersed Count Data |
Version: | 0.1.2 |
Description: | Implementation of a probabilistic method for biclustering adapted to overdispersed count data. It is a Gamma-Poisson Latent Block Model. It also implements two selection criteria in order to select the number of biclusters. |
License: | GPL-3 |
URL: | https://github.com/julieaubert/cobiclust |
BugReports: | https://github.com/julieaubert/cobiclust/issues |
Depends: | R (≥ 3.5.0) |
Imports: | assertthat, cluster, stats, testthat |
Suggests: | spelling |
Encoding: | UTF-8 |
Language: | en-US |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2024-02-16 09:49:46 UTC; jaubert |
Author: | Julie Aubert |
Maintainer: | Julie Aubert <julie.aubert@inrae.fr> |
Repository: | CRAN |
Date/Publication: | 2024-02-16 12:10:02 UTC |
Calculate the matrix of interaction terms between groups of species and groups of samples
Description
Calculate the matrix of interaction terms between groups of species and groups of samples
Usage
alpha_calculation(
s_ik = s_ik,
t_jg = t_jg,
nu_j = nu_j,
mu_i = mu_i,
K = K,
G = G,
x = x,
exp_utilde = exp_utilde
)
Arguments
s_ik |
s_ik. |
t_jg |
t_jg. |
nu_j |
nu_j. |
mu_i |
mu_i. |
K |
K. |
G |
G. |
x |
a matrix of observations. Columns correspond to biological samples and rows to microorganisms. |
exp_utilde |
exp_utilde. |
Value
a matrix of dimension (K, G) of the interaction terms.
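The shapes involved can be illustrated with a small base-R sketch. It assumes that s_ik (n x K) and t_jg (m x G) are membership matrices, that mu_i and nu_j are row and column effects, and that exp_utilde is an n x m matrix. The ratio-of-sums estimator below is an assumption chosen for illustration, not necessarily the formula the package uses.

```r
# Toy dimensions: n rows (species), m columns (samples), K x G blocks.
n <- 6; m <- 4; K <- 2; G <- 2

# Hard memberships encoded as indicator matrices (assumed layout).
s_ik <- diag(K)[rep(1:K, each = n / K), , drop = FALSE]   # n x K
t_jg <- diag(G)[rep(1:G, each = m / G), , drop = FALSE]   # m x G

mu_i <- rep(1, n)                # row effects (assumed all equal to 1)
nu_j <- rep(1, m)                # column effects (assumed all equal to 1)
exp_utilde <- matrix(1, n, m)    # expected latent multipliers (assumed 1)

# Toy counts: every cell of block (k, g) holds the value 10 * k + g.
block_value <- outer(1:K, 1:G, function(k, g) 10 * k + g)
x <- s_ik %*% block_value %*% t(t_jg)

# Hypothetical estimator: block-wise observed totals over expected totals.
expected <- outer(mu_i, nu_j) * exp_utilde
alpha_sketch <- (t(s_ik) %*% x %*% t_jg) / (t(s_ik) %*% expected %*% t_jg)

dim(alpha_sketch)   # K x G
```

With unit effects and constant blocks, alpha_sketch recovers block_value exactly, which makes the sketch easy to sanity-check.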
Perform a biclustering adapted to overdispersed count data.
Description
Perform a biclustering adapted to overdispersed count data.
Usage
cobiclust(
x,
K = 2,
G = 3,
nu_j = NULL,
a = NULL,
akg = FALSE,
cvg_lim = 1e-05,
nbiter = 5000,
tol = 1e-04
)
Arguments
x |
the input matrix of observed data. |
K |
an integer specifying the number of groups in rows. |
G |
an integer specifying the number of groups in columns. |
nu_j |
a numeric vector corresponding to a column (sampling effort) effect. |
a |
a numeric dispersion parameter (parameter of the gamma distribution). |
akg |
a logical variable indicating whether to use a common dispersion parameter. |
cvg_lim |
a number specifying the threshold used for the convergence criterion (default: 1e-05). |
nbiter |
the maximal number of iterations for the global loop of the variational EM algorithm (default: 5000). |
tol |
the level of relative iteration convergence tolerance (default: 1e-04). |
Value
An object of class cobiclustering
See Also
cobiclustering for the cobiclustering class.
Examples
# Simulate a 100 x 120 count matrix with 2 row groups and 3 column groups
npc <- c(50, 40) # nodes per class
KG <- c(2, 3) # classes
nm <- npc * KG # nodes
# Indicator matrices of the true row (Z) and column (W) memberships
Z <- diag(KG[1]) %x% matrix(1, npc[1], 1)
W <- diag(KG[2]) %x% matrix(1, npc[2], 1)
# Block-wise expected counts
L <- 70 * matrix(runif(KG[1] * KG[2]), KG[1], KG[2])
M_in_expectation <- Z %*% L %*% t(W)
# Negative binomial (overdispersed) counts around the block means
size <- 50
M <- matrix(
  rnbinom(
    n = length(as.vector(M_in_expectation)),
    mu = as.vector(M_in_expectation), size = size
  ),
  nm[1], nm[2]
)
rownames(M) <- paste('OTU', 1:nrow(M), sep = '_')
colnames(M) <- paste('S', 1:ncol(M), sep = '_')
res <- cobiclust(M, K = 2, G = 3, nu_j = rep(1, 120), a = 1 / size, cvg_lim = 1e-5)
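The example simulates counts with rnbinom(). As a minimal sketch of why such data suit a Gamma-Poisson model, the base-R code below checks that Poisson counts whose rates are multiplied by Gamma(shape = size, rate = size) variables have the negative binomial mean mu and variance mu + mu^2 / size. The values of mu_demo and size_demo are arbitrary, and how size maps onto the package's parametrisation of a is not stated here.

```r
set.seed(1)
n_draws   <- 1e5
mu_demo   <- 10    # arbitrary mean, for illustration only
size_demo <- 2     # arbitrary negative binomial size

# Direct negative binomial draws.
y_nb <- rnbinom(n_draws, mu = mu_demo, size = size_demo)

# Equivalent two-stage draw: Gamma multiplier with mean 1, then Poisson.
u     <- rgamma(n_draws, shape = size_demo, rate = size_demo)
y_mix <- rpois(n_draws, lambda = mu_demo * u)

# Theoretical variance shared by both constructions.
var_theory <- mu_demo + mu_demo^2 / size_demo

c(mean_nb = mean(y_nb), mean_mix = mean(y_mix))
c(var_nb = var(y_nb), var_mix = var(y_mix), var_theory = var_theory)
```

Both sample variances should sit well above the mean, which is the overdispersion that a plain Poisson model cannot capture.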
Creation of the cobiclustering class.
Description
Creation of the cobiclustering class.
Usage
cobiclustering(
data = matrix(nrow = 3, ncol = 3, NA),
K = 2,
G = 2,
classification = list(length = 2),
strategy = list(),
parameters = list(),
info = list()
)
Useful function to estimate the parameter a
Description
Useful function to estimate the parameter a
Usage
foo_a(x, nb, left_bound, right_bound)
Arguments
x |
x. |
nb |
nb. |
left_bound |
left_bound. |
right_bound |
right_bound. |
Value
a numeric.
Initialisation of the co-clusters by the partitioning around medoids method.
Description
Initialisation of the co-clusters by the partitioning around medoids method.
Usage
init_pam(x, nu_j = NULL, a = NULL, K = K, G = G, akg = FALSE)
Arguments
x |
The output of the cobiclust function. |
nu_j |
a numeric vector corresponding to a column effect, which may be interpreted as a sampling effort. Its length is equal to the number of columns. |
a |
a numeric. |
K |
an integer specifying the number of groups in rows. |
G |
an integer specifying the number of groups in columns. |
akg |
a logical variable indicating whether to use a common dispersion parameter. |
Value
A list of
nu_j
nu_j.
mu_i
mu_i.
t_jg
t_jg.
s_ik
s_ik.
pi_c
pi.
rho_c
rho.
a
a.
exp_utilde
exp_utilde.
exp_logutilde
exp_logutilde.
alpha_c
alpha.
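A rough idea of a partitioning-around-medoids initialisation can be sketched with cluster::pam(), from the cluster package listed in Imports. The log(1 + x) transform, the toy matrix x_toy, and the conversion of hard labels into indicator matrices are assumptions made for illustration; the package's own initialisation may differ.

```r
library(cluster)

set.seed(42)
# Toy count matrix with two row groups and two column groups (illustrative).
row_grp <- rep(1:2, each = 10)
col_grp <- rep(1:2, each = 6)
rates   <- matrix(c(2, 20, 40, 5), 2, 2)       # block-wise Poisson means
lambda_mat <- rates[row_grp, col_grp]          # 20 x 12 matrix of means
x_toy <- matrix(rpois(length(lambda_mat), lambda = lambda_mat),
                nrow = 20, ncol = 12)

K <- 2; G <- 2
x_log <- log1p(x_toy)                          # assumed variance-taming transform

row_cl <- pam(x_log, k = K)$clustering         # cluster the rows
col_cl <- pam(t(x_log), k = G)$clustering      # cluster the columns

# One-hot (indicator) versions of the hard labels.
s_ik_init <- diag(K)[row_cl, , drop = FALSE]   # 20 x K
t_jg_init <- diag(G)[col_cl, , drop = FALSE]   # 12 x G

table(row_cl, row_grp)
table(col_cl, col_grp)
```

Cluster labels are arbitrary, so agreement with the simulated groups should be judged up to a permutation of the labels.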
Is an object of class cobiclustering?
Description
Is an object of class cobiclustering?
Usage
is.cobiclustering(object)
Arguments
object |
an object of class cobiclustering. |
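For S3 classes, a predicate of this kind usually amounts to an inherits() check. The sketch below uses a hand-built stand-in object and a hypothetical helper, is_cobiclustering_sketch(); it does not reproduce the package's code.

```r
# Hypothetical stand-in: a list tagged with the S3 class name.
toy <- structure(list(K = 2, G = 3), class = "cobiclustering")

# Hypothetical predicate mirroring what an S3 class check typically does.
is_cobiclustering_sketch <- function(object) {
  inherits(object, "cobiclustering")
}

is_cobiclustering_sketch(toy)
is_cobiclustering_sketch(list(K = 2, G = 3))
```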
Calculate the lower bound
Description
Calculate the lower bound
Usage
lb_calculation(
x = x,
qu_param = qu_param,
s_ik = s_ik,
pi_c = pi_c,
t_jg = t_jg,
rho_c = rho_c,
mu_i = mu_i,
nu_j = nu_j,
alpha_c = alpha_c,
a = a,
akg = TRUE
)
Arguments
x |
a matrix of observations. Columns correspond to biological samples and rows to microorganisms. |
qu_param |
qu_param. |
s_ik |
s_ik. |
pi_c |
pi_c. |
t_jg |
t_jg. |
rho_c |
rho_c. |
mu_i |
mu_i. |
nu_j |
nu_j. |
alpha_c |
a matrix of the interaction terms. |
a |
a. |
akg |
a logical variable indicating whether to use a common dispersion parameter. |
Value
a list of 2 elements.
lb
value of the lower bound.
ent
value of the entropy term.
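As an illustration of what an entropy term looks like for categorical membership probabilities, the sketch below computes -sum(p * log(p)) for row and column membership matrices. Treating s_ik and t_jg as such probability matrices, and the helper name entropy_sketch(), are assumptions; the package's ent value may include further terms.

```r
# Entropy of a matrix of membership probabilities (each row sums to 1).
# Zero probabilities contribute 0, by the convention 0 * log(0) = 0.
entropy_sketch <- function(p) {
  -sum(ifelse(p > 0, p * log(p), 0))
}

s_toy <- rbind(c(1, 0), c(0.5, 0.5), c(0.9, 0.1))   # 3 rows, K = 2
t_toy <- rbind(c(1, 0, 0), c(1/3, 1/3, 1/3))        # 2 columns, G = 3

ent_rows <- entropy_sketch(s_toy)
ent_cols <- entropy_sketch(t_toy)
ent_rows + ent_cols
```

A hard assignment such as c(1, 0) contributes nothing, while a uniform assignment over G groups contributes log(G), the maximum possible.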
Calculate the BIC penalty
Description
Calculate the BIC penalty
Usage
penalty(x)
Arguments
x |
an object of class cobiclustering. |
Value
the value of the BIC penalty.
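For orientation, a BIC-type penalty often used for latent block models with one parameter per block can be sketched as below. The formula and the helper name lbm_bic_penalty_sketch() are assumptions; the package may count parameters differently, for example to account for row and column effects.

```r
# Assumed form of a latent block model BIC-type penalty:
# (K - 1) / 2 * log(n) + (G - 1) / 2 * log(m) + K * G / 2 * log(n * m)
lbm_bic_penalty_sketch <- function(n, m, K, G) {
  (K - 1) / 2 * log(n) + (G - 1) / 2 * log(m) + (K * G) / 2 * log(n * m)
}

# Dimensions of the simulated example matrix: 100 rows and 120 columns.
lbm_bic_penalty_sketch(n = 100, m = 120, K = 2, G = 3)
```

The penalty grows with both K and G, so richer block structures must improve the fit enough to be preferred.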
Calculate the approximate conditional moments of the third hidden layer U
Description
Calculate the approximate conditional moments of the third hidden layer U
Usage
qu_calculation(
s_ik = s_ik,
t_jg = t_jg,
x = x,
mu_i = mu_i,
nu_j = nu_j,
alpha_c = alpha_c,
a = a
)
Arguments
s_ik |
s_ik. |
t_jg |
t_jg. |
x |
a matrix of observations. Columns correspond to biological samples and rows to microorganisms. |
mu_i |
mu_i. |
nu_j |
a numeric vector corresponding to a column (sampling effort) effect. |
alpha_c |
alpha_c. |
a |
a numeric dispersion parameter (parameter of the gamma distribution). |
Value
A list of 4 elements.
a_tilde
a_tilde.
b_tilde
b_tilde.
exp_utilde
exp_utilde.
exp_logutilde
exp_logutilde.
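If a_tilde and b_tilde are read as the shape and rate of a Gamma distribution (an assumption; the parametrisation is not stated here), the mean and the expected logarithm have closed forms: a_tilde / b_tilde and digamma(a_tilde) - log(b_tilde). The sketch below checks both against simulation, using arbitrary values.

```r
set.seed(123)
a_tilde_demo <- 3.5   # arbitrary shape, for illustration only
b_tilde_demo <- 2     # arbitrary rate, for illustration only

# Closed-form Gamma moments.
exp_u_closed    <- a_tilde_demo / b_tilde_demo
exp_logu_closed <- digamma(a_tilde_demo) - log(b_tilde_demo)

# Monte Carlo check of both moments.
u_draws <- rgamma(2e5, shape = a_tilde_demo, rate = b_tilde_demo)
c(closed = exp_u_closed, simulated = mean(u_draws))
c(closed = exp_logu_closed, simulated = mean(log(u_draws)))
```

Note that the expected logarithm is smaller than the logarithm of the mean, which is why the two moments are tracked separately.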
Calculate the approximate conditional moments of the third hidden variable U and its log
Description
Calculate the approximate conditional moments of the third hidden variable U and its log
Usage
qukg_calculation(
s_ik = s_ik,
t_jg = t_jg,
x = x,
mu_i = mu_i,
nu_j = nu_j,
alpha_c = alpha_c,
a = a
)
Arguments
s_ik |
s_ik. |
t_jg |
t_jg. |
x |
a matrix of observations. Columns correspond to biological samples and rows to microorganisms. |
mu_i |
mu_i. |
nu_j |
nu_j. |
alpha_c |
alpha_c. |
a |
a0. |
Value
A list of 4 elements.
a_tilde
a_tilde.
b_tilde
b_tilde.
exp_utilde
exp_utilde.
exp_logutilde
exp_logutilde.
Calculate selection criteria.
Description
Calculate selection criteria.
Usage
selection_criteria(x, K = NULL, G = NULL)
Arguments
x |
The output of the cobiclust function. |
K |
The number of groups in rows. |
G |
The number of groups in columns. |
Value
A data frame with 7 columns.
vICL
the vICL selection criterion.
BIC
the BIC selection criterion.
penKG
the value of the BIC penalty.
lb
the value of the lower bound of the log-likelihood.
entZW
the value of the entropy of the latent variables Z and W.
K
the number of groups in rows.
G
the number of groups in columns.
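One way such a table might be used is to stack the rows obtained for several candidate (K, G) pairs and keep the pair with the best criterion. The sketch below uses a placeholder table with the seven documented columns; the numbers are made up for illustration and are not package output, and the rule "larger is better" is an assumption.

```r
# Placeholder criteria table; values are invented, NOT results from the package.
crit <- data.frame(
  vICL  = c(-5210, -5105, -5130),
  BIC   = c(-5200, -5090, -5110),
  penKG = c(30, 42, 55),
  lb    = c(-5170, -5048, -5055),
  entZW = c(10, 15, 20),
  K     = c(2, 2, 3),
  G     = c(2, 3, 3)
)

# Assumption: the preferred model maximises the chosen criterion.
best_vicl <- crit[which.max(crit$vICL), c("K", "G")]
best_bic  <- crit[which.max(crit$BIC), c("K", "G")]
best_vicl
best_bic
```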
Summary of an object of class cobiclustering
Description
Summary of an object of class cobiclustering
Usage
## S3 method for class 'cobiclustering'
summary(object, ...)
Arguments
object |
an object of class cobiclustering. |
... |
ignored |