Variational Inference#
The generative model#
All spatialvi-tools models are based on amortized variational inference (AVI). The core idea is to learn a probabilistic generative model \(p_\theta(x \mid z)\) of observed gene expression \(x\) conditioned on a low-dimensional latent variable \(z\), alongside an approximate posterior \(q_\phi(z \mid x)\) parameterized by an encoder neural network.
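Training maximizes the Evidence Lower Bound (ELBO):
\[
\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)
\]
where \(p(z)\) is the prior over the latent variable. The first term rewards accurate reconstruction of \(x\); the second keeps the approximate posterior close to the prior.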
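The two terms of the ELBO, a reconstruction log-likelihood and a KL penalty, can be illustrated with a minimal sketch. It assumes a diagonal Gaussian approximate posterior and a standard normal prior, neither of which is stated above, and the function names (gaussian_kl, reparameterize, elbo) are hypothetical helpers, not part of the spatialvi-tools API.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def elbo(log_likelihood, mu, log_var, kl_weight=1.0):
    """Single-sample ELBO estimate: log p(x | z) minus the (weighted) KL term.

    `log_likelihood` is log p(x | z) evaluated at one sample z ~ q(z | x).
    """
    return log_likelihood - kl_weight * gaussian_kl(mu, log_var)


# Illustrative usage with made-up encoder outputs for a 2-dimensional latent space.
rng = random.Random(0)
mu, log_var = [1.0, -0.5], [0.0, -1.0]
z = reparameterize(mu, log_var, rng)
# In a real model, a decoder would map z to log p(x | z); here it is a placeholder value.
placeholder_log_likelihood = -10.0
value = elbo(placeholder_log_likelihood, mu, log_var)
```

When the posterior equals the prior (zero mean, unit variance), the KL term is exactly zero and the ELBO reduces to the reconstruction term.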
Gene likelihood#
Most models support two gene likelihood distributions:
Negative Binomial (NB): default; models overdispersed count data.
Poisson: simpler; suitable for very low-count spatial data.
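To make the difference between the two likelihoods concrete, the sketch below evaluates both log-probabilities for a single count. It assumes the Negative Binomial is parameterized by a mean mu and an inverse dispersion theta (variance mu + mu^2 / theta), which is a common convention but is not confirmed above; the function names are illustrative and not part of the library.

```python
import math


def poisson_log_prob(x, mu):
    """log Poisson(x | mu) for a non-negative integer count x and mean mu > 0."""
    return x * math.log(mu) - mu - math.lgamma(x + 1)


def nb_log_prob(x, mu, theta):
    """log NB(x | mu, theta) with mean mu and inverse dispersion theta.

    Assumed parameterization: variance = mu + mu**2 / theta, so a small theta
    means strong overdispersion and theta -> infinity recovers the Poisson.
    """
    log_theta_mu = math.log(theta + mu)
    return (
        math.lgamma(x + theta)
        - math.lgamma(theta)
        - math.lgamma(x + 1)
        + theta * (math.log(theta) - log_theta_mu)
        + x * (math.log(mu) - log_theta_mu)
    )


# A large count is far more plausible under an overdispersed NB than under a
# Poisson with the same mean.
count, mean = 15, 2.0
lp_poisson = poisson_log_prob(count, mean)
lp_nb = nb_log_prob(count, mean, theta=0.5)
```

With the same mean, the NB assigns much more mass to large counts when theta is small, which is why it suits overdispersed data; as theta grows, the two log-probabilities converge.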
Spatial priors (scVIVA, ResolVI)#
Spatial models extend the standard variational autoencoder (VAE) with a niche-aware prior that conditions the latent distribution on the cellular neighbourhood, encoding microenvironment structure directly into the latent space.
KL annealing#
To stabilize early training, the weight on the KL divergence term is annealed from 0 to 1 over the first
n_epochs_kl_warmup epochs. This keeps the approximate posterior from collapsing to the prior before the
encoder has learned a meaningful representation.
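A warmup schedule of this kind can be sketched as follows. The text above gives only the endpoints (0 and 1) and the duration, so the linear ramp is an assumption, and kl_weight is a hypothetical helper rather than the library's implementation.

```python
def kl_weight(epoch, n_epochs_kl_warmup):
    """Weight applied to the KL term at a given (0-indexed) epoch.

    Assumption: the weight rises linearly from 0 to 1 over the first
    `n_epochs_kl_warmup` epochs and stays at 1 afterwards.
    """
    if not n_epochs_kl_warmup or n_epochs_kl_warmup <= 0:
        # No warmup requested: use the full KL term from the start.
        return 1.0
    return min(1.0, epoch / n_epochs_kl_warmup)


# Weights at a few epochs for a 400-epoch warmup (400 is an illustrative value).
schedule = [kl_weight(epoch, 400) for epoch in (0, 100, 200, 400, 1000)]
```

During training, the loss at each epoch would then be the negative reconstruction term plus kl_weight(epoch, n_epochs_kl_warmup) times the KL divergence.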
References#
Lopez et al. (2018) Deep generative modeling for single-cell transcriptomics. Nature Methods.
Levy et al. (2025) scVIVA. bioRxiv.