The Trump administration has made a significant reversal on AI oversight. The US Department of Commerce has reached agreements with Google DeepMind, Microsoft, and xAI to review early versions of their new AI models before they are released to the public — a move that signals the administration is no longer comfortable with a purely hands-off approach to frontier AI.

The agreements were confirmed by CAISI (the Center for AI Standards and Innovation), the Commerce Department's successor to the AI Safety Institute, which will conduct the evaluations. The news came alongside separate reporting that the White House is studying an executive order that would require AI models to pass security vetting before commercial release.

A Quiet Policy Shift

The significance of the move lies in its context. When the Trump administration took office, it largely dismantled the Biden-era AI safety framework, rolling back the October 2023 executive order and initially signalling that AI regulation would be minimal. The new agreements suggest that position has evolved — driven, sources say, by national security concerns rather than consumer protection.

CAISI, the Commerce Department's AI safety testing body, will evaluate frontier models from Google, Microsoft, and xAI before public release under the new voluntary agreements.

The concern is specific: that adversaries could exploit vulnerabilities in frontier AI models — or that models could be used to assist in the development of weapons of mass destruction, cyberattacks on critical infrastructure, or large-scale disinformation campaigns. The vetting framework is designed to identify those risks before models reach the public.

Voluntary, For Now

The current agreements are voluntary. Google DeepMind, Microsoft, and xAI have agreed to participate, but there is no legal obligation to do so — and notably, OpenAI and Anthropic are not listed among the initial signatories. The White House executive order under study would change that, potentially making pre-release security testing mandatory for all frontier models.

"The Trump administration, which took a noninterventionist approach to artificial intelligence, is now discussing imposing oversight on AI models before they are released."

— The New York Times, May 4, 2026

What the Tests Actually Cover

CAISI has not published a detailed methodology, but the evaluations are expected to focus on biosecurity risks (can the model provide meaningful uplift to someone attempting to create a biological weapon?), cybersecurity risks (can it assist in developing novel malware or exploiting critical infrastructure?), and broader safety properties including deception and manipulation capabilities.

The framework mirrors, in broad strokes, the approach taken by the UK AI Safety Institute — which has conducted evaluations of frontier models from several major labs. The difference is that the US framework is being implemented under an administration that has been publicly sceptical of AI regulation.

Industry Response

The three companies that signed the agreements have framed their participation as responsible leadership. Google DeepMind CEO Demis Hassabis has previously argued that safety testing is in the industry's long-term interest. Microsoft's position is more complex — the company is simultaneously a major government AI contractor and a commercial AI provider, giving it strong incentives to be seen as a cooperative partner on safety.

xAI's participation is the most surprising. Elon Musk has been publicly critical of AI regulation and was a vocal opponent of the Biden administration's AI executive order. His company's agreement to participate in pre-release testing suggests either a pragmatic calculation about government relations or a genuine shift in position.

What Comes Next

The executive order, if it materialises, would be the most consequential US AI governance action in years. Watch for two things: whether it applies only to the largest frontier models or to a broader class of AI systems, and whether it includes enforcement mechanisms or remains aspirational. The difference between a toothless framework and a meaningful one will lie in those details.