Shortest and not the steepest path will fix the inner-alignment problem
by Sabrina Zaki
Replacing stochastic gradient descent (SGD) with something that takes the shortest path, not the steepest, should just about fix the whole inner-alignment problem.
Deep Learning