[Draft] AI misuse enforcement has a blind spot

AI providers ban accounts, not users. That distinction matters more than it sounds.

When a provider detects misuse, whether it is an influence operation, an attempt to extract dangerous information, or large-scale model distillation, the standard response is to restrict the offending account. But accounts are cheap. A new email address, an anonymous SIM card, and a VPN are enough to start over. OpenAI has reported that in multiple disruption rounds against state-linked influence campaigns, banned accounts were “consistently replaced by new ones exhibiting similar usage patterns.” North Korean threat actors whose accounts were terminated were observed re-registering with different email addresses and resuming similar activities. Anthropic has documented comparable patterns.

Mar 12, 2026

[Draft] Preventing gaming of AI evaluations: a case study of the Volkswagen diesel emissions scandal

This is a first draft of a post written by Vinay Hiremath, with initial research done together with Rebecca Hawkins and David Varga as part of a one-day AI governance research sprint.

This post uses the Volkswagen diesel emissions scandal, uncovered in 2014, as a case study in the effective enforcement of AI evaluations governance. It discusses the incentives, motivations, and divergent outcomes under two different regulatory frameworks in the scandal, both to underline the importance of robust governance of AI evaluations and to better inform the design of such governance.

Mar 17, 2024