In its vanilla formulation, contrastive dreaming would improve the robustness of evaluators trained via gradient descent, especially in settings with limited data (i.e. virtually all settings except obscure ones running on synthetic data). More broadly, the technique could be seen as intentionally crafting an input percept which maximizes the evaluator's reported evaluation, and which in practice invariably turns out to be nasty.
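A minimal sketch of the "dreaming" half, under heavy assumptions: the evaluator architecture, input shape, step count, and learning rate below are placeholders I made up, not anything the technique prescribes. The point is just gradient ascent on the percept itself, with the dreamed result then available as a (likely nasty) negative example the next time the evaluator is fine-tuned.

```python
import torch
import torch.nn as nn

# Hypothetical learned evaluator: maps a flat percept to a scalar evaluation.
evaluator = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

def dream_percept(evaluator, dim=128, steps=200, lr=0.05):
    """Craft a percept that maximizes the evaluator's reported score."""
    percept = torch.randn(dim, requires_grad=True)
    opt = torch.optim.Adam([percept], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -evaluator(percept).sum()  # ascend the reported evaluation
        loss.backward()
        opt.step()
    return percept.detach()

# The dreamed percept scores highly but was never vetted against human
# judgment, so it can be fed back as a contrastive negative example when
# the evaluator is next trained.
adversarial_percept = dream_percept(evaluator)
```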
Besides objective robustness and facilitating oversight in a way reminiscent of active learning (i.e. prioritizing impactful data points), contrastive dreaming might also help craft adversarial circumstances for an agent to act in. For instance, it might be desirable for the agent to aim to revert such a pseudo-wireheaded scenario into one closer to the original, even though doing so means going directly against the reward gradient.
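Here is a hedged toy sketch of that reverting idea: the agent starts from an adversarially dreamed state and is rewarded for moving the world back toward a reference, un-hacked state, rather than for the learned evaluator's own (hacked) score. The policy, dynamics, and distance-based reward are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def revert_reward(current_state, reference_state):
    """Reward for undoing the dreamed scenario: negative distance to the
    original state, deliberately ignoring the evaluator's reported score."""
    return -torch.norm(current_state - reference_state).item()

def rollout(policy, dreamed_state, reference_state, horizon=50):
    """Drop the agent into the dreamed state and collect revert rewards."""
    state, rewards = dreamed_state.clone(), []
    for _ in range(horizon):
        action = policy(state)        # hypothetical policy network
        state = state + 0.1 * action  # toy linear dynamics, assumed
        rewards.append(revert_reward(state, reference_state))
    return rewards

# e.g. with a toy linear policy and the zero vector standing in for the
# original, un-dreamed state:
policy = nn.Linear(128, 128)
rewards = rollout(policy, torch.randn(128), torch.zeros(128))
```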
I'm reminded here of the multiple leagues of agents used to train AlphaStar: perhaps agents could be matched with evaluators of varying performance (in reflecting human intent) and learn to systematically outplay their biases. Huh. It'd be like pointing the AGI not at a target specified by one evaluator, but at a target implicitly specified by a trajectory of increasingly precise evaluators.
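One way such matchmaking could look, purely as a sketch: each episode's reward comes from an evaluator sampled out of a pool ordered by how faithfully it reflects human intent, with the sampling weight drifting toward the more precise ones as training progresses. The league contents and the weighting schedule below are made-up placeholders.

```python
import random
import torch.nn as nn

# Hypothetical league: index 0 is the crudest evaluator, the last is the most
# faithful proxy for human intent available so far.
league = [nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 1))
          for _ in range(5)]

def sample_evaluator(league, progress):
    """Bias sampling toward later (more precise) evaluators as training
    progress goes from 0.0 to 1.0, while still revisiting earlier ones so
    the agent keeps outplaying their known biases."""
    weights = [(i + 1) ** (1 + 4 * progress) for i in range(len(league))]
    return random.choices(league, weights=weights, k=1)[0]

# Halfway through training, later evaluators dominate but are not exclusive.
current_evaluator = sample_evaluator(league, progress=0.5)
```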