Hypothesis Subspace

How could league training be applied to contrastive dreaming?

Actually, contrastive dreaming (CD), league training (LT), and adversarial training regimes (GAN) might all work together quite neatly. I see three broad setups:

CD + LT

  1. Train evaluator E1 to penalize its contrastive dreams D1 while correlating with human ratings.
  2. Train evaluator E2 to penalize D1 and D2 while correlating with human ratings.
  3. ...
  • E → figures out its own systematic blindspots.
  • A → N/A
  • E > A
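
A minimal sketch of the CD + LT loop above, under loud assumptions: the evaluator is a tiny quadratic model over a 1-D stand-in for a world state, a "contrastive dream" is approximated by the candidate state the current evaluator rates highest, and penalizing a dream means regressing its score toward 0. The names `features`, `score`, `dream`, `train`, and `league`, as well as the toy data, are hypothetical and not part of the proposal itself.

```python
def features(x):
    # Hypothetical feature map for a 1-D stand-in "world state".
    return [1.0, x, x * x]


def score(w, x):
    # Evaluator output: a linear model over the features above.
    return sum(wi * fi for wi, fi in zip(w, features(x)))


def dream(w, candidates):
    # Stand-in for contrastive dreaming (assumption): the candidate state
    # that the current evaluator rates highest.
    return max(candidates, key=lambda x: score(w, x))


def train(w, human, dreams, steps=500, lr=0.1):
    # Gradient descent on mean squared error: match the human ratings and
    # push every dream in the pool toward a rating of 0 (the penalty).
    data = list(human) + [(d, 0.0) for d in dreams]
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = score(w, x) - y
            for i, f in enumerate(features(x)):
                grad[i] += 2.0 * err * f / len(data)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def league(human, candidates, generations=3):
    w = train([0.0, 0.0, 0.0], human, [])  # warm start on human ratings only
    pool, history = [], []
    for _ in range(generations):
        pool.append(dream(w, candidates))  # D_k, dreamed by the latest evaluator
        w = train(w, human, pool)          # E_k penalizes D_1 .. D_k
        history.append(list(pool))
    return w, history


# Made-up (state, human rating) pairs and a grid of candidate states.
human = [(0.0, 0.1), (0.3, 0.6), (0.6, 0.9), (1.0, 0.2)]
candidates = [i / 20 for i in range(21)]
w, history = league(human, candidates)
print([len(pool) for pool in history])
```

The point of the league part is visible in `history`: each generation keeps every earlier dream in its penalty pool instead of only the latest one, so old blindspots stay covered.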

GAN + LT

  1. Train evaluator E1 to discriminate between human plans / agent A1's artificial plans.
  2. Train evaluator E2 to discriminate among human plans / A1's plans / A2's plans.
  3. ...
  • E → figures out the agent's systematic tricks.
  • A → figures out the evaluator's systematic blindspots.
  • E = A

CD + GAN + LT

  1. Train evaluator E1 to discriminate among human plans / A1's plans / its contrastive dreams D1.
  2. Train evaluator E2 to discriminate among human plans / A1's plans / D1 / A2's plans / D2.
  3. ...
  • E → figures out the agent's systematic tricks and its own blindspots.
  • A → figures out the evaluator's systematic blindspots.
  • E > A
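
The three setups differ mainly in which pools of examples the k-th evaluator sees. A small sketch of that bookkeeping, assuming the elided steps continue the same pattern (one new agent pool and/or one new dream pool per generation). The function `training_sources` and its flags are hypothetical names introduced only for illustration.

```python
def training_sources(generation, use_agents, use_dreams):
    # Pools that evaluator E_k (k = generation) is trained against.
    # "human" is listed only for the discriminative (GAN) setups; in plain
    # CD + LT the human ratings act as a regression target instead.
    sources = ["human"] if use_agents else []
    for k in range(1, generation + 1):
        if use_agents:
            sources.append(f"A{k}")  # plans from agent generation k
        if use_dreams:
            sources.append(f"D{k}")  # contrastive dreams of evaluator generation k
    return sources


setups = {
    "CD + LT": (False, True),
    "GAN + LT": (True, False),
    "CD + GAN + LT": (True, True),
}
for name, (use_agents, use_dreams) in setups.items():
    print(name, training_sources(2, use_agents, use_dreams))
```

For generation 2, the combined setup yields human plans, A1, D1, A2, and D2, matching step 2 of CD + GAN + LT.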

I have used plans and world states somewhat interchangeably as evaluation targets. World states pair most naturally with regression targets, while plans pair with classification targets. That pairing is not mandatory, though: you can assign a humanness score to plans to frame the task as regression, or classify world states as appropriate or inappropriate.
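
To make the two framings concrete, here is a rough sketch of the loss each one implies for a single evaluator output. Squared error and logistic loss are my choice of standard losses, not something the setups above prescribe, and the function names are hypothetical.

```python
import math


def regression_loss(score, rating):
    # Regression framing: the evaluator's score should match a continuous
    # target, e.g. a humanness rating in [0, 1] assigned to a plan.
    return (score - rating) ** 2


def classification_loss(logit, label):
    # Classification framing: label is 1 (e.g. an appropriate world state)
    # or 0 (inappropriate); the raw output is squashed into a probability.
    p = 1.0 / (1.0 + math.exp(-logit))
    return -math.log(p if label == 1 else 1.0 - p)


print(regression_loss(0.8, 1.0))
print(classification_loss(2.0, 1))
```

Either loss can be attached to either kind of target, which is why the choice between plans and world states does not fix the choice between regression and classification.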
