AI Safety Ideas
Open-ended
Open

Convert a simple dense network into symbolic code

by Esben Kran

If given a fully trained model, can you meaningfully reduce the dimensionality / make it more interpretable to different degrees by creating a compiler that compiles neural networks into a program?

  1. Write out a program that takes the weights of a neural networks and outputs a program directly representing the network.
  2. Implement a "variable of fuzzy representation" μ to allow for different compressions of the symbolic program based on the variable's representation of "how precisely should the program represent the network".

Examples of possible implementations at different μ are:

  • Complete re-representation of the network through direct activation function + variable execution
  • Continuous neural inputs into binary if/else
  • Abstracting the layer representation

I recommend using a 3-layer, sub-50 neuron network for iteration and ability to execute the compiled program.

Deep LearningInterpretability & ExplainabilityTheory

Answers

No answers yet.

Discussion