Open-ended
How can we use evaluations as a defense for deployed AI systems?
Basically, where are evaluations actually useful in the production systems we deploy into the world?
Areas of implementation to consider:
- Financial markets
- Military applications
- Foundation model training
- Risk governance
For related projects on this topic, see e.g. Lakera.ai and the National Institute of Standards and Technology (NIST).
Field-Building, AI Governance, Review