Hypothesis Subspace

How does contrastive dreaming relate to concrete challenges in alignment?

Contrastive dreaming offers a small class of tricks for improving the generalization of objective functions implemented as evaluators that are trained via gradient descent. It aims to refine the model's representation of human intent, which helps with objective robustness when data is limited. At the same time, contrastive dreaming might help surface edge cases for humans or more expensive evaluators to weigh in on during oversight schemes.
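
To make the edge-case idea more concrete, here is a minimal sketch under loud assumptions. It reads "dreaming" as taking gradient steps on an input, with the evaluator's weights frozen, to push the evaluator's score up or down, in the spirit of feature visualization. The toy logistic evaluator, the names `score`, `dream`, and `review_queue`, and all hyperparameters are hypothetical and only for illustration; this is one possible reading, not a definitive implementation of the method.

```python
import numpy as np

def score(w, b, x):
    """Toy evaluator (assumed): logistic score in (0, 1) for input vector x."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def dream(w, b, seed, steps=50, lr=0.1, direction=1.0):
    """Take gradient steps on the INPUT (weights frozen) to move the score.

    direction=+1.0 raises the evaluator's score, direction=-1.0 lowers it.
    """
    x = seed.copy()
    for _ in range(steps):
        s = score(w, b, x)
        grad_x = s * (1.0 - s) * w  # d(score)/dx for a logistic evaluator
        x += direction * lr * grad_x
    return x

rng = np.random.default_rng(0)
w = rng.normal(size=8)   # stand-in for a trained evaluator's weights
b = 0.0
seed = rng.normal(size=8)  # stand-in for a real sample

# Dream in both directions to get a contrastive set around the seed.
dreamed_up = dream(w, b, seed, direction=+1.0)
dreamed_down = dream(w, b, seed, direction=-1.0)

# Hypothetical hand-off: queue the set, with the cheap evaluator's scores,
# for a human or a more expensive evaluator to check.
review_queue = [
    {"input": x, "evaluator_score": float(score(w, b, x))}
    for x in (dreamed_down, seed, dreamed_up)
]
```

If the overseer disagrees with the cheap evaluator's ordering of these inputs, the disagreement could in principle be fed back as extra training signal for the evaluator. How that feedback is used is left open here.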