What would a training story look like for differentiable colonies?
Initialize the simulation. This consists of randomly placing particles across the available N-dimensional space and randomly filling in their property slots. Let the totality of particles describe the state of the simulation.
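A minimal sketch of this initialization step in PyTorch, assuming (hypothetically) a fixed particle count, a position in the unit hypercube, and a flat vector of property slots per particle; the names and shapes are illustrative, not prescribed above:

```python
import torch

def init_state(num_particles: int, n_dims: int, num_props: int,
               seed: int = 0) -> torch.Tensor:
    """Randomly place particles and randomly fill their property slots.

    Returns a (num_particles, n_dims + num_props) tensor: positions
    followed by properties. This tensor is the full simulation state.
    """
    gen = torch.Generator().manual_seed(seed)
    positions = torch.rand((num_particles, n_dims), generator=gen)    # uniform over the unit hypercube
    properties = torch.rand((num_particles, num_props), generator=gen)
    return torch.cat([positions, properties], dim=-1)

state = init_state(num_particles=256, n_dims=2, num_props=6)  # state.shape == (256, 8)
```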
Run the simulation state through a transformer in a feed-forward pass. The resulting set of particles is then fed back through the same transformer. Each pass corresponds to one discrete timestep of the simulation.
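One way to realize this rollout is to treat each particle as a token and use a vanilla `nn.TransformerEncoder` as the update rule; the layer sizes and the `rollout` helper below are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class ParticleTransformer(nn.Module):
    """One simulation timestep: every particle attends to every other particle."""

    def __init__(self, state_dim: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(state_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.project = nn.Linear(d_model, state_dim)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: (batch, num_particles, state_dim) -> next state, same shape
        return self.project(self.encoder(self.embed(state)))

def rollout(model: nn.Module, state: torch.Tensor, num_steps: int) -> list[torch.Tensor]:
    """Pipe the state through the same transformer num_steps times, one pass per timestep."""
    states = [state]
    for _ in range(num_steps):
        state = model(state)
        states.append(state)
    return states

model = ParticleTransformer(state_dim=8)
trajectory = rollout(model, torch.rand(1, 256, 8), num_steps=32)  # 32 discrete timesteps
```

Keeping every intermediate state in the returned list is what later makes it possible to backpropagate through the whole unrolled chain.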
After a fixed number of timesteps, stop the simulation and measure the "lifelihood" of the final part of the simulation, the end game. This is based on (1) the amount of information contained in the final simulation states individually (i.e. spatial entropy), and (2) the amount of information contained in the sequence of final simulation states as a whole (i.e. temporal entropy).
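The exact entropy estimators are left open above; below is one differentiable stand-in, using a soft 2-D position histogram for spatial entropy and a variance-over-time proxy for temporal entropy. The bin count, softmax temperature, and the way the two terms are combined are all assumptions of this sketch:

```python
import torch

def spatial_entropy(state: torch.Tensor, bins: int = 16, temperature: float = 0.05) -> torch.Tensor:
    """Entropy of a soft histogram over particle positions in a single state
    (a differentiable stand-in for 'information in one state'). Assumes the
    first two state dimensions are 2-D positions."""
    positions = state[..., :2]                                  # (num_particles, 2)
    centers = torch.linspace(0.0, 1.0, bins)                    # shared bin centers per axis
    dist = (positions.unsqueeze(-1) - centers) ** 2             # (num_particles, 2, bins)
    soft = torch.softmax(-dist / temperature, dim=-1)           # soft bin assignments
    joint = torch.einsum('pi,pj->ij', soft[:, 0], soft[:, 1])   # soft 2-D histogram
    p = joint.flatten() / joint.sum()
    return -(p * (p + 1e-9).log()).sum()

def temporal_entropy(states: list[torch.Tensor]) -> torch.Tensor:
    """Crude proxy for information in the sequence as a whole: how much the
    state actually varies across the end-game timesteps."""
    stacked = torch.stack(states)                               # (T, num_particles, state_dim)
    return stacked.var(dim=0).mean()

def lifelihood(states: list[torch.Tensor]) -> torch.Tensor:
    """Combine (1) per-state spatial entropy and (2) whole-sequence temporal entropy."""
    spatial = torch.stack([spatial_entropy(s) for s in states]).mean()
    return spatial + temporal_entropy(states)
```

Both terms are kept differentiable (soft binning instead of a hard histogram) so that the next step can push gradients through them.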
Backpropagate through the unrolled transformer chain to nudge the model toward maximizing lifelihood. How about maximum lifelihood estimation?
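Concretely, and reusing the hypothetical `init_state`, `rollout`, and `lifelihood` helpers sketched above, one gradient step through the full unroll might look like this (the end-game length and other hyperparameters are made up):

```python
import torch

def train_step(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
               seed: int, num_steps: int = 32, endgame: int = 8) -> float:
    """Unroll one simulation and take a gradient step on negative lifelihood.

    Relies on init_state, rollout, and lifelihood from the sketches above.
    """
    state = init_state(num_particles=256, n_dims=2, num_props=6, seed=seed)
    states = rollout(model, state.unsqueeze(0), num_steps)        # add a batch dim of 1
    endgame_states = [s.squeeze(0) for s in states[-endgame:]]    # only score the end game
    loss = -lifelihood(endgame_states)                            # maximizing lifelihood = minimizing its negative
    optimizer.zero_grad()
    loss.backward()                                               # backprop through the whole unrolled chain
    optimizer.step()
    return -loss.item()
```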
Repeat steps 1-4 a large number of times (with different simulation seeds) to train the model to reliably turn noise into high-lifelihood simulations.
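The outer loop is then just many such steps over fresh seeds, again reusing the hypothetical `ParticleTransformer` and `train_step` from the sketches above:

```python
import torch

model = ParticleTransformer(state_dim=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for step in range(10_000):                          # one fresh simulation seed per step
    score = train_step(model, optimizer, seed=step)
    if step % 100 == 0:
        print(f"step {step}: lifelihood {score:.3f}")
```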
If a culture of replicators emerges, try to translate the knowledge they synthesized into human-legible terms. In a sense, you'd learn from artificial and general intelligences without them having strong priors on the seed human world.