AI Safety Ideas
Open-ended
Open

ML Chinese Whispers

by Gurkenglas

Given a trained network, train a second network to reconstruct the input from the activations of the bottom half of the first network's third layer.

Given an input, sample ten input-guesses to visualize what that half-layer remembers about the input.
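
A minimal sketch of the idea above, under several assumptions that are not in the original post. The "given network" is replaced by a small random ReLU MLP. "Bottom half" is taken to mean the second half of the unit indices. The reconstructing network is replaced by a ridge-regression decoder with a diagonal Gaussian noise model, which makes sampling ten guesses trivial. All function names here are hypothetical. A real experiment would presumably use a trained network and a learned generative decoder instead.

```python
import numpy as np

def make_network(sizes, rng):
    """Random fixed MLP weights, a stand-in for the given network (assumption)."""
    return [(rng.normal(0, 1 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def half_layer(net, x, layer=3):
    """Run x through the first `layer` layers and keep half of the units.

    'Bottom half' is read as the second half of the unit indices (assumption).
    """
    h = x
    for w, b in net[:layer]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU activation
    return h[:, h.shape[1] // 2:]

def fit_decoder(z, x, ridge=1e-3):
    """Fit x ~ [z, 1] @ coef by ridge regression; also return residual std per input dim."""
    feats = np.hstack([z, np.ones((len(z), 1))])
    gram = feats.T @ feats + ridge * np.eye(feats.shape[1])
    coef = np.linalg.solve(gram, feats.T @ x)
    resid_std = (x - feats @ coef).std(axis=0)
    return coef, resid_std

def sample_guesses(coef, resid_std, z_row, rng, n=10):
    """Draw n input-guesses from a diagonal Gaussian centred on the decoder's mean."""
    mean = np.append(z_row, 1.0) @ coef
    return mean + resid_std * rng.normal(size=(n, mean.shape[0]))

rng = np.random.default_rng(0)
net = make_network([8, 16, 16, 16, 4], rng)

# Train the decoder on (half-layer activation, input) pairs.
x_train = rng.normal(size=(2000, 8))
coef, resid_std = fit_decoder(half_layer(net, x_train), x_train)

# For one new input, sample ten guesses of what the input might have been.
x = rng.normal(size=(1, 8))
guesses = sample_guesses(coef, resid_std, half_layer(net, x)[0], rng)
print(guesses.shape)  # (10, 8)
```

How much the ten guesses vary is the thing to look at: input dimensions on which they agree are ones this simple decoder can recover from the half-layer, and dimensions on which they scatter are ones it cannot. A linear decoder can only lower-bound what the half-layer remembers, so a more expressive decoder may recover more.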

Deep Learning, Interpretability & Explainability, Review

Discussion

  • Esben Kran

    This seems like a project that can really be expanded for many network types and interesting cases.

  • Gurkenglas

    Followup projects include "find an interesting half of a layer" and "fine-tune the original network to produce more interesting parts".