Open-ended

Open

Circuit investigation: Compare tasks between an nL model and an (n+1)L model

by Sabrina Zaki

Look for tasks that an n-layer (nL) model cannot do but an (n+1)L model can, then look for the circuit responsible!

Proposal:

Build the infrastructure to do this: run the two models over a lot of text and look for tokens with big log prob differences (maybe floor the per-token log probs at e.g. -5, so the comparison is not dominated by the few tokens where one network was incredibly wrong).
Just take a bunch of text with interesting patterns, run the models over it, look for tokens they do really well on, and try to reverse engineer what's going on. I expect there's a lot of stuff in here!
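
The comparison step in the proposal could be sketched roughly as follows. This is only a minimal illustration in plain Python, not a reference implementation: it assumes you already have each model's logits for the same token sequence, it assumes the floor is -5 (log probs are never positive), and the helper names `token_log_probs` and `biggest_gaps` are hypothetical.

```python
import math

def token_log_probs(logits, tokens):
    """Log prob a model assigns to each actual next token.

    logits: list of per-position lists of vocabulary logits, where
            logits[i] is the prediction made after reading tokens[:i + 1].
    tokens: list of integer token ids.
    Returns one log prob per predicted token (tokens[1:]).
    """
    out = []
    for i in range(len(tokens) - 1):
        row = logits[i]
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        out.append(row[tokens[i + 1]] - log_z)
    return out

def biggest_gaps(small_lp, large_lp, floor=-5.0, top_k=10):
    """Rank token positions by how much the larger model beats the smaller.

    Each log prob is clamped at `floor` (an assumed value) before taking
    the difference, so one catastrophic miss cannot dominate the ranking.
    Returns (difference, token_position) pairs, largest difference first.
    """
    gaps = []
    for i, (s, l) in enumerate(zip(small_lp, large_lp)):
        diff = max(l, floor) - max(s, floor)
        gaps.append((diff, i + 1))  # i + 1 is the index of the predicted token
    gaps.sort(reverse=True)
    return gaps[:top_k]
```

The positions returned by `biggest_gaps` are candidates for tokens the (n+1)L model predicts well and the nL model does not, which is where one might start looking for a circuit.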
