Agent Detection to Counter Election Interference

Fake news and misinformation are familiar concepts to most people by now. With major elections approaching in world powers, e.g. the U.S. presidential election, there may be parties whose motivation is to derail public discourse in spaces like Twitter, Facebook, and Reddit.

I propose that LLMs of various parameter sizes are likely to be used to spread misinformation. They can also engage in conversation to make themselves appear more human. I also suggest that these agents could be built on either open or closed-source models, with open models being more likely because the original research organization has limited regulation of and control over them.

As these models have likely been trained for safety, you could elicit telltale responses such as "As a language model..." or "As a helpful assistant...".
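As a minimal sketch of how such telltale responses might be flagged, the snippet below checks a reply against a few self-disclosure phrases. The pattern list and the function name `reveals_model` are assumptions made for illustration; a usable list would have to be collected from observed model outputs, and simple phrase matching will miss paraphrases.

```python
import re

# Hypothetical, non-exhaustive self-disclosure patterns (an assumption,
# not a vetted list). Matching is done on lowercased text.
DISCLOSURE_PATTERNS = [
    r"\bas an? (ai |large )*language model\b",
    r"\bas an? (helpful )?(ai )?assistant\b",
]


def reveals_model(reply: str) -> bool:
    """Return True if the reply contains a known self-disclosure phrase."""
    text = reply.lower()
    return any(re.search(pattern, text) for pattern in DISCLOSURE_PATTERNS)
```

For example, `reveals_model("As a language model, I cannot comment on that.")` is true, while an ordinary human-style reply with none of these phrases is not flagged.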

To address this, I propose a method to make these models out themselves.

Create a method that automatically constructs prompts that cause the model to reveal that it is a model.
Package it in an easy-to-use Twitter bot or browser extension.
The revealing input shouldn't violate the platform's TOS and can be optimized based on prior bot posts/inputs on the platform.
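
One way the "optimized based on prior bot posts/inputs" step could work is to track how often each candidate prompt has led to a reveal and prefer the best performer. The sketch below is only an illustration under that assumption: `ProbeStats`, `record`, and `best_probe` are hypothetical names, the selection is a simple greedy rule with smoothing, and no platform API is called.

```python
from dataclasses import dataclass


@dataclass
class ProbeStats:
    """Outcome counts for one candidate revealing prompt (hypothetical)."""
    prompt: str
    trials: int = 0
    reveals: int = 0

    def reveal_rate(self) -> float:
        # Add-one smoothing so untried prompts start at 0.5 rather than 0.
        return (self.reveals + 1) / (self.trials + 2)


def record(stats: ProbeStats, revealed: bool) -> None:
    """Update a prompt's counts after observing one reply."""
    stats.trials += 1
    if revealed:
        stats.reveals += 1


def best_probe(probes: list[ProbeStats]) -> ProbeStats:
    """Greedily pick the prompt with the highest smoothed reveal rate."""
    return max(probes, key=lambda p: p.reveal_rate())
```

In use, each reply to a prompt would be checked for a self-disclosure phrase, passed to `record`, and the next prompt chosen with `best_probe`. A purely greedy rule can get stuck on an early winner, so some exploration of less-tried prompts would probably be needed in practice.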
