
On June 12, the U.S. government imposed export controls on Anthropic’s latest artificial intelligence models, Claude Fable 5 and Claude Mythos 5. The order required the company to deny access to these models to users both within and outside the U.S. Because the order took effect immediately and users’ nationalities could not be verified in real time, Anthropic temporarily suspended access to both models for all users.
As of June 30, the restrictions on Fable 5 and Mythos 5 have been lifted. Starting tomorrow, July 1, Claude Fable 5 will again be available to users worldwide via the Claude Platform, Claude.ai, Claude Code, and Claude Cowork. For customers on Pro, Max, Team, and select Enterprise plans, Fable 5 will be included in usage limits for the first week, after which it will be accessible through usage credits. The company aims to restore access on platforms such as AWS, Google Cloud, and Microsoft Foundry as quickly as possible.
Access to Mythos 5 has also resumed for a select number of U.S. organizations, following government approval received on June 26. Anthropic is continuing its efforts to expand access for a broader range of partners involved in its Glasswing program.
Anthropic’s update also covers a timeline of the events leading to the export controls, the company’s approach to implementing safeguards, the development of a shared framework for the industry, and closer collaboration with the U.S. government. It explains how the measures were designed to protect against misuse of AI models and why a consistent methodology for addressing potential vulnerabilities matters.
Fable 5 and Mythos 5 were launched on June 9, with the former including stronger safeguards intended to make it safer for general use. The export controls followed a report from Amazon researchers describing methods to bypass certain safety measures, which allowed the model to reveal software vulnerabilities. Anthropic’s investigation found that similar vulnerabilities could also be identified by less advanced models.
In response, Anthropic has developed an improved safety classifier that blocks the identified bypass technique over 99% of the time. The trade-off is that the classifier may block more benign requests, including legitimate cybersecurity tasks considered low-risk. Anthropic plans to keep refining this behavior.
The company clarified that while Claude Mythos 5 can effectively uncover and exploit software weaknesses, Claude Fable 5 offers no such capabilities because of its stringent safeguards. The team has effectively doubled its resources to ensure that these safety mechanisms function properly and mitigate misuse.
Anthropic also identified the lack of a uniform industry standard for evaluating the severity of AI jailbreak techniques, a gap that complicates the response to new vulnerabilities. The company is collaborating with other major companies, such as Amazon and Microsoft, to create a framework for assessing jailbreaks.
This collaborative approach aims to produce accurate assessments of how these jailbreaks could be weaponized, improving the overall safety of advanced AI models. The proposed framework will use four criteria to gauge jailbreak severity: capability gain, breadth of effectiveness, ease of use, and discoverability.
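As a rough illustration of how the four criteria named above might be recorded for a single jailbreak report, here is a minimal Python sketch. The class name `JailbreakAssessment`, the 0 to 3 rating scale, and the `top_criteria` helper are all hypothetical; the article does not describe how the framework scores or combines the criteria.

```python
from dataclasses import dataclass, fields

# Hypothetical 0-3 scale; the article does not say how criteria are scored.
MIN_RATING, MAX_RATING = 0, 3


@dataclass(frozen=True)
class JailbreakAssessment:
    """One rating per criterion named in the proposed framework."""
    capability_gain: int
    breadth_of_effectiveness: int
    ease_of_use: int
    discoverability: int

    def __post_init__(self):
        # Reject ratings outside the assumed scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not MIN_RATING <= value <= MAX_RATING:
                raise ValueError(
                    f"{f.name} must be in {MIN_RATING}..{MAX_RATING}, got {value}"
                )

    def top_criteria(self):
        """Return the names of the criteria with the highest rating."""
        ratings = {f.name: getattr(self, f.name) for f in fields(self)}
        peak = max(ratings.values())
        return [name for name, value in ratings.items() if value == peak]


# Illustrative ratings only, not data from any real report.
example = JailbreakAssessment(
    capability_gain=3,
    breadth_of_effectiveness=1,
    ease_of_use=2,
    discoverability=1,
)
print(example.top_criteria())
```

Keeping the four ratings as separate fields, rather than collapsing them into one number, avoids assuming a weighting that the article does not specify.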
Looking ahead, Anthropic intends to strengthen its partnerships with the U.S. government on AI security, following the June 2 executive order on promoting AI innovation and security. These collaborations involve enhanced pre-release evaluations of models and rapid information sharing about jailbreaks.
As AI continues to evolve, Anthropic’s efforts reflect a commitment to the safe and responsible development of new technologies, prioritizing security and collaboration with industry partners and government entities to address the emerging risks and benefits of AI advances.
