The Boltzmann machine is a universal learning machine

The Boltzmann machine can, in principle, learn arbitrary probability distributions over its visible units by minimizing the KL divergence between the data distribution and the distribution of activity in those units. During inference (or “sleep”), the input units are clamped while the remaining units run freely, and the machine confabulates an output over time by sampling from the Boltzmann distribution using a Metropolis sampler. However, the sampler needs many steps to approach equilibrium, so vanilla Boltzmann machines are impractical for all but small networks. Restricted Boltzmann machines, which allow connections only between the visible and hidden layers, make the approach borderline tractable.
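
The clamped sampling described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: it assumes binary units with states in {0, 1}, a symmetric weight matrix with zero diagonal, and the usual energy E(s) = -1/2 sᵀWs - bᵀs. The names `energy`, `metropolis_sample`, `clamped`, and the toy network size are hypothetical choices made for the example.

```python
import numpy as np


def energy(s, W, b):
    """Energy of state s, assuming symmetric W with a zero diagonal."""
    return -0.5 * s @ W @ s - b @ s


def metropolis_sample(W, b, s, clamped, n_steps, rng, T=1.0):
    """Run single-unit Metropolis updates on the unclamped units only.

    `clamped` is a boolean mask; units where it is True never change.
    """
    s = s.copy()
    free = np.flatnonzero(~clamped)  # indices of units allowed to flip
    for _ in range(n_steps):
        i = rng.choice(free)
        # Energy change caused by flipping unit i (states are 0 or 1).
        delta = (2 * s[i] - 1) * (W[i] @ s + b[i])
        # Metropolis rule: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < np.exp(-delta / T):
            s[i] = 1 - s[i]
    return s


# Toy network (assumed sizes and random weights, for illustration only).
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n))
W = (A + A.T) / 2          # make the weights symmetric
np.fill_diagonal(W, 0.0)   # no self-connections
b = rng.normal(size=n)

clamped = np.array([True, True, False, False, False, False])
s0 = np.array([1, 0, 0, 0, 0, 0], dtype=float)  # first two units are the input

sample = metropolis_sample(W, b, s0, clamped, n_steps=1000, rng=rng)
```

Each step touches a single unit, and many steps are needed before the samples resemble the equilibrium distribution, which is where the cost of inference comes from. In a restricted Boltzmann machine the units within one layer are conditionally independent given the other layer, so a whole layer can be resampled at once instead of one unit at a time.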