Defensibility Maximization
A research agenda aimed at developing AI systems that act in morally defensible ways.
I'm interested in developing autonomous systems that act ethically by design. My work is organized into two complementary agendas. Defensibility maximization is a relatively mature line of work focused on bootstrapping a morally defensible alignment target, while agency regularization is a more nascent direction focused on ensuring that this target is actually pursued, by weeding out dissident tendencies.
I'm grateful to have had my work supported, whether financially or through compute, and whether recently or further back, by the Long-Term Future Fund, TPU Research Cloud, Conjecture, Andreas Stuhlmüller, Stability AI, and AI Safety Camp. If I'm not working on alignment, chances are I'm running or hosting a game night.
An interactive book series focused on rendering philosophy computable, quantifiable, and verifiable.
A learning project focused on recent trends in machine learning.