Hypothesis Subspace

Are training arrangements inspired by parametric ecologies likely to have capability externalities?

Most model arrangements which seem compatible with the parametric ecologies framing (e.g. adversarial training, backtranslation, diffusion) have only been recognized as such retroactively, in a descriptive way. Taking stock of these possible manifestations of a broader alphabet of arrangements, it seems that they have mostly been developed specifically to bolster capabilities. For instance, diffusion led to unprecedented performance in image generation, and backtranslation has helped push the state of the art in machine translation.

This raises the question of whether pursuing research on parametric ecologies as a generalization of training regimes might similarly push capabilities. This feels like a sensible concern. Often, the best outcome seems to be that both (1) models developed for practical use and (2) models developed to oversee and evaluate the former get pushed further. In a sense, the concerning model might be kept in check by an evaluator which proxies human preferences, while both co-evolve towards higher capabilities at the same time. For instance, in a GAN-like training regime, both generator and discriminator improve steadily over time (given some experience in the alchemy of hyperparameter tuning).
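As an illustration of that co-evolution, here is a minimal sketch of such a GAN-like arrangement on toy one-dimensional data (PyTorch assumed; architectures and hyperparameters are arbitrary choices, not anything prescribed by the framing). Each discriminator update changes the objective the generator faces and vice versa, so capability accrues on both sides of the arms race.

```python
# Minimal GAN-like co-training sketch on toy 1-D data (illustrative only).
import torch
import torch.nn as nn

torch.manual_seed(0)

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, out_dim))

generator = mlp(8, 1)       # maps noise to samples
discriminator = mlp(1, 1)   # scores samples as real vs. generated
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0   # "data" distribution: N(3, 0.5)
    noise = torch.randn(64, 8)
    fake = generator(noise)

    # Discriminator update: get better at telling real from generated samples.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: get better at fooling the (now slightly stronger) discriminator.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

# Generated samples should have drifted towards the data mean (~3.0).
print(generator(torch.randn(256, 8)).mean().item())
```

Neither network improves in isolation: each one's progress is the other's training signal, which is exactly why capability gains tend to come in pairs in such arrangements.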

However, that seems like a neutral outcome at best. What if the generator is a step ahead? Is it okay if the arms race escalates beyond human oversight capabilities? We'd ideally want training regimes and model arrangements which bolster the safety mechanisms and alignment-critical components in particular, leaving raw capabilities trailing a bit behind. We'd perhaps want a stronger focus on stability beyond the delicate equilibrium of an arms race. The feasibility of this better outcome probably depends on the specifics and generativity of the parametric ecologies.

What's more, scaling laws indicate that data availability is currently the main impediment to pushing the state of the art. As parametric ecologies focus on clever uses of data and models to guide learning with limited resources (e.g. denoising, backtranslation), this might be particularly problematic: the resulting capability gains could be counterfactual, in the sense that they would be unlikely to materialize otherwise.
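To make the data-leverage point concrete, below is a schematic sketch of backtranslation. The "models" are stand-ins (illustrative assumptions, not anything from the source), but the data flow mirrors the actual technique: a reverse, target-to-source model turns plentiful monolingual target-side text into synthetic parallel pairs, which are mixed with the scarce real pairs to train the forward model, squeezing extra learning signal out of data that would otherwise go unused.

```python
# Schematic backtranslation data flow (stand-in models, illustrative only).

def reverse_translate(target_sentence: str) -> str:
    # Stand-in for a trained target->source translation model.
    return "[synthetic source for] " + target_sentence

real_parallel = [("le chat dort", "the cat sleeps")]            # scarce
monolingual_target = ["the cat sits on the mat",
                      "it rains a lot in autumn"]               # plentiful

# Convert monolingual target-side text into synthetic parallel pairs.
synthetic_parallel = [(reverse_translate(tgt), tgt) for tgt in monolingual_target]

training_set = real_parallel + synthetic_parallel
for source, target in training_set:
    # Stand-in for one supervised update of the forward (source->target) model.
    print(f"train forward model on: {source!r} -> {target!r}")
```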