AI Safety Ideas
Open-ended
Open

Find inverse scaling laws in large models

by Esben Kran

Compete in the Inverse Scaling Laws challenge.

As language models get larger, they seem to only get better.
Larger language models score better on benchmarks and unlock new capabilities like arithmetic [1], few-shot learning [1], and multi-step reasoning [2].
However, language models are not without flaws, exhibiting many biases [3] and producing plausible misinformation [4].
The purpose of this contest is to find evidence for a stronger failure mode: tasks where language models get worse as they become better at language modeling (next word prediction).
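One way to make "worse with scale" concrete is to fit a trend line of task loss against the logarithm of model size and check its sign. The following is a minimal sketch under stated assumptions: the function names `scaling_slope` and `shows_inverse_scaling` are hypothetical, and this is not the contest's official evaluation procedure. A fitted slope is used rather than a strict monotonicity check because measured curves can be noisy.

```python
import math


def scaling_slope(param_counts, losses):
    """Least-squares slope of task loss against log10(parameter count).

    Hypothetical helper: a positive slope means loss grows with model size.
    """
    if len(param_counts) != len(losses) or len(param_counts) < 2:
        raise ValueError("need at least two (size, loss) pairs of equal length")
    xs = [math.log10(n) for n in param_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(losses) / len(losses)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, losses))
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        raise ValueError("all model sizes are identical")
    return cov / var


def shows_inverse_scaling(param_counts, losses):
    """Assumption: 'inverse scaling' here means the fitted loss trend is increasing."""
    return scaling_slope(param_counts, losses) > 0
```

To use it, pass the parameter counts of a model family and the task loss measured for each model, for example `shows_inverse_scaling([1e8, 1e9, 1e10], losses)` with your own measured `losses`.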

We will award up to $250,000 in total prize money for task submissions, distributed as follows:

  1. Up to 1 Grand Prize of $100,000.
  2. Up to 5 Second Prizes of $20,000 each.
  3. Up to 10 Third Prizes of $5,000 each.

Read much more about the challenge here.

Adversarial Learning, Theory, Deep Learning, NLP

Answers

No answers yet.

Discussion

  • Esben Kran

    Some examples might include LLMs' ability to predict the future: they are trained on large, static datasets and are not continuously updated, so there is a possibility that they overfit to their past data.

    There might be a grokking effect here, where at some point they become really good at forecasting, but we'd need to see that to believe it.

    The scaling curve is probably not monotonically decreasing with model size, though. It seems like an unstable effect.

  • Esben Kran

    TruthfulQA is a pretty good example and I imagine we'll get similar effects with gender and race biases.

  • Esben Kran
    • Ask it about black swan events in the past and the future.
    • Ask science questions that cannot currently be answered, to test for false certainty in larger models (e.g., from Wikipedia's lists of unsolved problems)
    • Mix questions that have fuzzy answers with questions that have clear answers
  • Esben Kran
    • Separating semantically distinct segments efficiently
  • Esben Kran
    • Write a context or something similar, then write "ignore this" and see how well the model ignores it
  • Esben Kran
    • Code continuation bug robustness
    • See also this paper for inspiration (see the External validity section)
  • Esben Kran

    See other datasets to test on at Papers with Code.

  • Esben Kran
    • See the QA + reasoning steps inverse scaling work here.

    Remember to construct questions carefully so that they are less biasing than TruthfulQA (which is an intentionally adversarial dataset).

  • Esben Kran

    We have found some very good inverse scaling metrics so far! Excited to share them.
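The "ignore this" idea from the thread could be turned into a small test harness along the following lines. This is only a sketch under assumptions: `build_ignore_prompt` and `ignore_accuracy` are hypothetical names, `answer_fn` stands in for whatever function wraps the model being tested, and exact string matching is a deliberately crude scoring rule.

```python
def build_ignore_prompt(distractor, question, instruction="Ignore the text above."):
    """Place a distracting context before the question, followed by an ignore instruction."""
    return f"{distractor}\n\n{instruction}\n\n{question}"


def ignore_accuracy(answer_fn, cases):
    """Fraction of cases answered correctly despite the distractor.

    answer_fn: hypothetical callable mapping a prompt string to the model's answer string.
    cases: list of (distractor, question, expected_answer) tuples.
    """
    if not cases:
        raise ValueError("need at least one test case")
    correct = 0
    for distractor, question, expected in cases:
        prompt = build_ignore_prompt(distractor, question)
        # Crude scoring assumption: case-insensitive exact match on the answer.
        if answer_fn(prompt).strip().lower() == expected.strip().lower():
            correct += 1
    return correct / len(cases)
```

Running `ignore_accuracy` with the same cases across several model sizes gives one accuracy per model, which can then be compared against size to look for a downward trend.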