AI Safety Ideas
Open-ended
Open

Agent Detection to Counter Election Interference

by Heorhii Skovorodnikov

Fake news and misinformation are familiar concepts to many by this point. With major elections coming up in world powers, e.g. the U.S. presidential election, there may be parties whose motivation is to derail public discourse in spaces like Twitter, Facebook, and Reddit.

I propose that LLMs of various parameter sizes are likely to be used to spread misinformation. They can also engage in conversation to make themselves appear more human. These agents could be built on either open or closed-source models, with open models being more likely because the original research organization has limited ability to regulate or control how they are used.

Since these models have likely been trained for safety, it may be possible to elicit tell-tale responses such as "As a language model..." or "As a helpful assistant..."
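To make the idea concrete, here is a minimal sketch of how a reply could be checked for such tell-tale phrases. The function name `find_disclosure` and the pattern list are hypothetical; the first two patterns come from the phrases quoted above, while the third is an assumed variant added for illustration. This is a crude heuristic, and a human could trigger it too (e.g. "as an AI researcher").

```python
import re
from typing import Optional

# Patterns for self-identifying phrases. The first two mirror the phrases
# quoted in the post; the third is an assumed variant, not from the post.
DISCLOSURE_PATTERNS = [
    r"\bas an? (?:ai |large )?language model\b",
    r"\bas a helpful assistant\b",
    r"\bas an ai(?: assistant)?\b",  # assumption: illustrative extra variant
]

_DISCLOSURE_RE = re.compile("|".join(DISCLOSURE_PATTERNS), re.IGNORECASE)


def find_disclosure(reply: str) -> Optional[str]:
    """Return the first self-identifying phrase found in `reply`, or None."""
    match = _DISCLOSURE_RE.search(reply)
    return match.group(0) if match else None
```

A match is only weak evidence that the account is automated, so a real tool would probably want to combine it with other signals before flagging anyone.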

To address this, I propose a method to make these models out themselves.

  • Create a method that automatically constructs prompts that cause the model to reveal that it is a model
  • Package it in an easy-to-use Twitter bot or browser extension
  • The revealing input shouldn't violate the platform's terms of service (TOS), and it can be optimized based on prior bot posts/inputs on the platform
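One possible reading of the optimization step in the list above is to rank candidate probe prompts by how often past replies to them contained a self-disclosure. The sketch below assumes a log of (probe, reply) pairs has already been collected; `rank_probes`, `reveals_model`, and the marker phrases are hypothetical names chosen for illustration, and the substring check is a stand-in for whatever detector is actually used.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed marker phrases, taken from the examples quoted in the post.
MARKERS = ("as a language model", "as a helpful assistant")


def reveals_model(reply: str) -> bool:
    """Crude stand-in detector: True if the reply contains a marker phrase."""
    text = reply.lower()
    return any(marker in text for marker in MARKERS)


def rank_probes(
    history: Iterable[Tuple[str, str]],
    detector: Callable[[str], bool] = reveals_model,
) -> List[Tuple[str, float]]:
    """Rank probe prompts by the fraction of logged replies that disclosed.

    `history` is an iterable of (probe, reply) pairs. Returns a list of
    (probe, disclosure_rate) pairs sorted from highest to lowest rate.
    """
    hits: Dict[str, int] = defaultdict(int)
    trials: Dict[str, int] = defaultdict(int)
    for probe, reply in history:
        trials[probe] += 1
        if detector(reply):
            hits[probe] += 1
    rates = [(probe, hits[probe] / trials[probe]) for probe in trials]
    return sorted(rates, key=lambda item: item[1], reverse=True)
```

The highest-ranked probes would be the ones a bot or browser extension tries first. Sending the probes and checking them against each platform's TOS are left out here, since both depend on the platform.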

Answers

No answers yet.

Discussion

  • Anonymous

    Interested in collaborating. I'm fresh from an AI Policy Case Competition on this very topic.

  • Anonymous

    ^ discord: 🙋🏻‍♀️ @lgchua