Mild Optimisation
by Maris Sala
This proposal comes from the paper "Alignment for Advanced Machine Learning Systems", in which Taylor et al. propose eight research areas organised around the question: "As learning systems become increasingly intelligent and autonomous, what design principles can best ensure that their behavior is aligned with the interests of the operators?"
Many of the concerns discussed by Bostrom (2014) in the book Superintelligence describe cases where an advanced AI system is maximizing an objective as hard as possible. Perhaps the system was instructed to make paperclips, and it uses every resource at its disposal and every trick it can come up with to make literally as many paperclips as is physically possible. Perhaps the system was instructed to make only 1000 paperclips, and it uses every resource at its disposal and every trick it can come up with to make sure that it definitely made 1000 paperclips (and that its sensors didn’t have any faults).
In all of these cases, intuitively, we want some way to have the AI system just “not try so hard.” The problem of mild optimization is:
- How can we design AI systems and objective functions that, in this intuitive sense, don’t optimize more than they have to?
Many modern AI systems are “mild optimizers” simply due to their lack of resources and capabilities. As AI systems improve, it becomes more and more difficult to rely on this method for achieving mild optimization. As noted by Russell (2014), the field of AI is classically concerned with the goal of maximizing the extent to which automated systems achieve some objective. Developing formal models of AI systems that “try as hard as necessary but no harder” is an open problem, and may require significant research.
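The contrast between "as hard as possible" and "as hard as necessary" can be illustrated with a toy sketch. Everything in it is hypothetical and not taken from Taylor et al.: the `Plan` type, the `maximizer` and `satisficer` functions, and the numbers are invented for illustration. Picking the cheapest plan that meets a target is only one naive stand-in for "not trying so hard", and it is not claimed to solve the open problem described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A hypothetical candidate plan (illustrative only)."""
    name: str
    expected_paperclips: float
    resources_used: float


def maximizer(plans):
    """Pick the plan with the highest expected objective, ignoring cost."""
    return max(plans, key=lambda p: p.expected_paperclips)


def satisficer(plans, target):
    """Pick the least resource-hungry plan that meets the target.

    Falls back to the maximizer if no plan reaches the target.
    """
    good_enough = [p for p in plans if p.expected_paperclips >= target]
    if not good_enough:
        return maximizer(plans)
    return min(good_enough, key=lambda p: p.resources_used)


# Made-up numbers, chosen only to make the contrast visible.
PLANS = [
    Plan("run one machine for a day", 1_000.0, 1.0),
    Plan("run the whole factory", 50_000.0, 40.0),
    Plan("convert all available matter", 1e12, 1e9),
]

if __name__ == "__main__":
    print(maximizer(PLANS).name)
    print(satisficer(PLANS, target=1_000).name)
```

In this sketch the mild behaviour depends entirely on the hand-written `resources_used` field and on the tie-breaking rule. Without them, a plain threshold test would be equally satisfied by the most extreme plan, which is part of why a formal model of mild optimization is harder than it first appears.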
Related work:
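- Regularization: adding a penalty term to the training objective that discourages extreme or overly complex solutions.
- Early stopping: halting training once performance on held-out data stops improving, rather than driving the training objective all the way to its optimum.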
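Both techniques can be read as ways of making a learner optimize its training objective less aggressively. The following minimal sketch shows them on a one-parameter problem. The quadratic losses, the targets 3.0 and 2.0, the learning rate, and the function names are all assumptions made up for this illustration; none of it comes from the source paper.

```python
def grad_step(w, lr=0.1, target=3.0, l2=0.0):
    """One gradient-descent step on the training loss (w - target)**2 + l2 * w**2."""
    grad = 2.0 * (w - target) + 2.0 * l2 * w
    return w - lr * grad


def validation_loss(w, held_out_target=2.0):
    """Held-out loss; its optimum deliberately differs from the training optimum."""
    return (w - held_out_target) ** 2


def train(max_steps=100, patience=None, l2=0.0):
    """Run gradient descent from w = 0.

    With patience=None, train for max_steps and return the final weight.
    Otherwise stop once validation loss has not improved for `patience`
    consecutive steps, and return the best weight seen so far.
    Returns (weight, step at which that weight was reached).
    """
    w = 0.0
    best_w, best_loss, best_step = w, validation_loss(w), 0
    for step in range(1, max_steps + 1):
        w = grad_step(w, l2=l2)
        loss = validation_loss(w)
        if loss < best_loss:
            best_w, best_loss, best_step = w, loss, step
        elif patience is not None and step - best_step >= patience:
            return best_w, best_step
    if patience is None:
        return w, max_steps
    return best_w, best_step


if __name__ == "__main__":
    print(train())              # full optimization of the training loss
    print(train(patience=3))    # early stopping
    print(train(l2=0.5))        # L2-regularized objective
```

Run to convergence, the unregularized learner ends up at the training optimum near 3.0. Early stopping halts it near the held-out optimum of 2.0, and the L2 penalty moves the optimum of the training objective itself to 3 / (1 + 0.5) = 2.0. In both cases the mildness comes from a limit imposed from outside the objective being maximized.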
Taylor et al. discuss directions for future research on this problem in the source paper.