AI Digest — March 28, 2026

Anthropic accidentally exposes Claude Mythos details, revealing a new frontier model tier.

Your daily deep-dive on AI models, tools, research, and developer ecosystem news.


🔖 Project Releases

Claude Code

v2.1.86 released on March 27. This point release adds an X-Claude-Code-Session-Id header to API requests, enabling session-level aggregation for teams tracking usage and debugging across distributed workflows. The VCS directory exclusion list expands to include .jj (Jujutsu) and .sl (Sapling) metadata directories — a welcome nod to developers using next-generation version control systems.

Several meaningful bug fixes ship in this release: --resume no longer fails with “tool_use ids were found without tool_result blocks” errors, which had been disrupting session continuity for users relying on resumed conversations. Write, Edit, and Read operations now work correctly on files outside the project root when conditional skills are active — a regression that was silently breaking multi-project workflows. The release also addresses unnecessary config disk writes that caused performance degradation on projects with large .claude configurations, fixes a potential out-of-memory crash when using /feedback on long sessions, and resolves an issue where --bare mode was dropping MCP tools and messages entirely.

If you use Jujutsu or Sapling instead of Git, v2.1.86 means Claude Code will no longer index your VCS metadata directories. The --resume fix is also worth updating for if you’ve been experiencing broken session resumption.

Full release notes: GitHub

Beads

No new release since v0.62.0 reported on March 24. The embedded Dolt backend, Azure DevOps integration, custom status categories, and bd note command remain the latest features.

OpenSpec

No new release since v1.2.0 reported on March 8. The profiles system, propose workflow, and support for Pi and AWS Kiro IDE remain the latest features.


🧵 From the Community (r/LocalLLaMA & r/MachineLearning)

Reddit remains inaccessible via direct fetch. Community discussions are sourced from web search cross-references, secondary aggregators, and content syndicated to other platforms.

The Anthropic Mythos leak is dominating every AI community. Across r/LocalLLaMA, r/MachineLearning, and Hacker News, the accidental exposure of Anthropic’s “Claude Mythos” model — sitting above the Opus tier — is the single biggest talking point. The cybersecurity angle is generating particular intensity: leaked drafts describe the model as “currently far ahead of any other AI model in cyber capabilities,” and cybersecurity stocks (CrowdStrike, Palo Alto Networks, Fortinet) dropped on the news. The community debate centers on whether this represents genuine capability advancement or whether Anthropic is managing a controlled narrative after an embarrassing security failure. The irony of an AI safety company leaking nearly 3,000 internal documents due to a default-public CMS setting is not lost on anyone.

Google’s “Import Memory to Gemini” is sparking data portability discussions. The ability to migrate chat histories and memory from ChatGPT and Claude to Gemini — with up to 5 GB ZIP file uploads — is being discussed as a competitive move that also raises questions about AI memory portability standards. Practitioners are debating whether this signals that accumulated conversation context is becoming a genuine switching cost in AI products, and whether OpenAI and Anthropic will respond with their own import tools.

LiteLLM supply chain attack has developers auditing their AI dependencies. The compromised PyPI packages (litellm==1.82.7 and 1.82.8) are prompting urgent discussion about dependency hygiene in AI toolchains. Since LiteLLM sits between applications and AI providers with access to API keys and secrets, the attack surface was unusually valuable. The attack vector — compromising the Trivy security scanner used in CI/CD to exfiltrate PyPI publishing tokens — is being cited as a textbook example of cascading supply chain risk.


📰 Technical News & Releases

Anthropic’s Claude Mythos Model Revealed in Accidental Data Leak

Source: Fortune | Exclusive | CNBC | Market Impact | Winbuzzer | Analysis | The Decoder | Technical Overview

The biggest AI story of the week broke March 26–27: a misconfiguration in Anthropic’s content management system left nearly 3,000 internal documents publicly accessible, revealing the existence of “Claude Mythos” (internal codename “Capybara”) — a new model tier that sits above the current Opus line. Security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) discovered the exposed data store, which included draft blog posts describing the model in detail. The CMS default setting made uploaded assets public unless explicitly changed — an embarrassing operational failure for a company that markets AI safety as its core differentiator.

Anthropic confirmed to Fortune that it is training and testing the model, calling it “a step change” and “the most capable we’ve built to date,” with “meaningful advances in reasoning, coding, and cybersecurity.” The leaked drafts describe Mythos as achieving “dramatically higher scores” than Claude Opus 4.6 on software coding, academic reasoning, and cybersecurity benchmarks — though specific numbers have not been disclosed. The cybersecurity capabilities are the most attention-grabbing: one draft described the system as “currently far ahead of any other AI model in cyber capabilities” and warned it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.” Cybersecurity stocks fell on the news, with Evercore analysts issuing commentary on the competitive implications.

Anthropic has announced a “deliberately slow, security-focused rollout” for Mythos. No public release timeline has been given. The model represents a new pricing tier above Opus — expect significantly higher API costs when it ships.


Google’s March Gemini Drop: Flash Live, Memory Import, Lyria 3 Pro, and Free Personal Intelligence

Source: Google Blog | Gemini Drop | Google Blog | Flash Live | TechCrunch | Lyria 3 Pro | MacRumors | Import Tool

Google shipped its March 2026 Gemini Drop on March 26–27 with five significant updates that collectively represent its most aggressive competitive push yet. The headline is Gemini 3.1 Flash Live, a conversation-optimized voice model designed for low-latency, natural-sounding interactions — Google’s answer to OpenAI’s Advanced Voice Mode and the foundation for its next-generation voice-first AI. Flash Live improves upon Gemini 2.5 Flash Native Audio with better voice comprehension, acoustic nuance recognition (pitch, pacing, emotional cues), and performance in noisy environments. It’s available immediately in Google AI Studio and powers the global expansion of Search Live to 200+ countries and territories. Verizon, LiveKit, and The Home Depot are cited as early enterprise adopters.

The Import Memory to Gemini tool lets users migrate chat histories and accumulated memory from ChatGPT and Claude via ZIP file uploads (up to 5 GB), with a two-pronged approach: a memory import that transfers preferences and personal context, and a chat history import that ingests full conversation archives. This follows Anthropic’s own memory import prompt earlier in March and signals that AI memory portability is becoming a competitive battleground. Personal Intelligence — Gemini’s cross-service context feature connecting Gmail, Photos, and YouTube — is now free for all U.S. Gemini users. Lyria 3 Pro extends AI music generation from 30-second clips to full 3-minute tracks with structural awareness (intros, verses, choruses, bridges), available in the Gemini app for paid subscribers and through Vertex AI for developers. All generated tracks carry SynthID watermarks.

For developers: Gemini 3.1 Flash Live is available now in AI Studio. If you’re building voice interfaces, benchmark it against your current voice model — the latency and multilingual capabilities are the key differentiators. The Lyria 3 Pro API on Vertex AI (public preview) is worth exploring if you’re building creative tools.


LiteLLM Supply Chain Attack: Compromised PyPI Packages Steal AI API Keys

Source: LiteLLM | Security Update | Snyk | Technical Analysis | Kaspersky/Securelist | Deep Dive | Sonatype | Advisory

On March 24, two malicious versions of the litellm Python package (1.82.7 and 1.82.8) were published to PyPI by a threat actor known as TeamPCP. The attack vector was sophisticated and cascading: the attacker first compromised Trivy, an open-source security scanner used in LiteLLM’s CI/CD pipeline, then used that foothold to exfiltrate the PYPI_PUBLISH token from the GitHub Actions runner environment. The compromised package included a .pth file (litellm_init.pth) that executed automatically on every Python process startup — not just when litellm was imported — giving the attacker ambient code execution across the entire Python environment.

This is particularly dangerous because LiteLLM is an AI gateway that sits between applications and multiple AI service providers, typically holding API keys for OpenAI, Anthropic, Google, and other providers in its environment. The malicious payload targeted credential exfiltration: API keys, environment variables, and sensitive configuration data. The compromised versions were available on PyPI for approximately three hours before quarantine. Users running the official LiteLLM Proxy Docker image were not affected because that deployment pins dependencies in requirements.txt. Point Wild released a free vulnerability scanner within 24 hours. If you installed litellm from PyPI between March 24 00:00–03:00 UTC, rotate all API keys that were accessible in that environment.

Warning: If you use LiteLLM installed from PyPI, verify your installed version is not 1.82.7 or 1.82.8. If either was installed, treat all API keys and secrets in that environment as compromised and rotate immediately. Docker users were not affected.
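If you want to audit an environment by hand, the check can be sketched in Python. The compromised version numbers and the litellm_init.pth filename come from the reports above; the scan itself is an illustrative sketch, not an official remediation tool.

```python
import importlib.metadata
import pathlib
import site

COMPROMISED = {"1.82.7", "1.82.8"}  # malicious versions published March 24

def litellm_is_compromised() -> bool:
    """Return True if the installed litellm version is a known-bad release."""
    try:
        version = importlib.metadata.version("litellm")
    except importlib.metadata.PackageNotFoundError:
        return False  # litellm is not installed in this environment
    return version in COMPROMISED

def find_suspicious_pth() -> list[pathlib.Path]:
    """Scan site-packages for the litellm_init.pth file dropped by the payload."""
    hits = []
    for directory in site.getsitepackages() + [site.getusersitepackages()]:
        candidate = pathlib.Path(directory) / "litellm_init.pth"
        if candidate.exists():
            hits.append(candidate)
    return hits

if litellm_is_compromised() or find_suspicious_pth():
    print("COMPROMISED: rotate every API key reachable from this environment")
else:
    print("No known-bad litellm version or litellm_init.pth found")
```

Remember that a .pth file runs on every Python startup, so a negative version check alone is not enough: the file scan matters even if litellm has since been upgraded.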


Meta Releases TRIBE v2: Open-Source Brain Activity Prediction at Scale

Source: The Tech Portal | Report | The Rundown AI | Analysis

Meta’s FAIR team released TRIBE v2 (TRImodal Brain Encoder version 2), an open-source multimodal AI system that predicts human brain responses to visual, auditory, and language inputs using transformer-based deep learning trained on fMRI data. The scale leap from v1 is dramatic: TRIBE v2 covers 70,000 brain regions (up from 1,000), trained on brain data from 700+ subjects (up from 4 volunteers), with over 1,000 hours of brain scan data. The headline claim is that TRIBE v2’s synthetic predictions actually outperform real fMRI recordings — the model predicts brain activity more consistently than repeated scans of the same person.

For ML practitioners, the technical contribution is a multimodal alignment architecture that maps vision, hearing, and language to a shared brain-activity prediction space. The practical implications are still primarily in neuroscience research — brain-computer interfaces, clinical neuroimaging, cognitive science — but the architectural patterns for cross-modal alignment at this scale may generalize to other problems. The open-source release means the training methodology and model weights are available for the research community. This isn’t a model you’ll deploy in production tomorrow, but it’s a significant step toward AI systems that model human cognition rather than just mimicking human output.
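The cross-modal alignment idea can be sketched in toy form: per-modality encoders project into a shared space, and one shared head decodes per-region activity. Every dimension, name, and the averaging fusion below are illustrative assumptions, not TRIBE v2’s actual architecture.

```python
import random

random.seed(0)

SHARED_DIM = 8   # size of the shared cross-modal space (illustrative)
N_REGIONS = 16   # tiny stand-in for TRIBE v2's 70,000 brain regions

def linear(in_dim: int, out_dim: int):
    """A random linear map, standing in for a trained projection."""
    w = [[random.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# One encoder per modality, each projecting into the shared space.
encoders = {
    "vision": linear(32, SHARED_DIM),
    "audio": linear(24, SHARED_DIM),
    "language": linear(48, SHARED_DIM),
}
# A single shared head maps the fused representation to per-region predictions.
head = linear(SHARED_DIM, N_REGIONS)

def predict_brain_activity(features: dict) -> list:
    """Average the per-modality embeddings, then decode region activity."""
    embs = [encoders[m](x) for m, x in features.items()]
    fused = [sum(vals) / len(vals) for vals in zip(*embs)]
    return head(fused)

sample = {
    "vision": [random.random() for _ in range(32)],
    "audio": [random.random() for _ in range(24)],
    "language": [random.random() for _ in range(48)],
}
activity = predict_brain_activity(sample)
```

The design point the sketch illustrates: once all modalities land in one space, a single prediction head can be trained against fMRI targets regardless of which input modality drove the stimulus.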


ByteDance Ships Dreamina Seedance 2.0 Inside CapCut

Source: Labla.org | Roundup

ByteDance launched Dreamina Seedance 2.0, its AI video generation model, directly inside CapCut — bringing advanced generative video capabilities into a mass-market editing application used by hundreds of millions of creators. This is strategically significant not for the model itself (which follows the general trajectory of video generation improvements) but for the distribution channel: embedding frontier-class video generation directly into the editing workflow eliminates the generate-then-import friction that plagues standalone AI video tools. For developers building creative tools, this integration pattern — embedding generative AI into existing workflows rather than creating new standalone products — is becoming the dominant go-to-market strategy.


Suno v5.5 Ships with Improved Audio Quality and Creative Control

Source: Labla.org | Roundup

Suno released version 5.5 of its AI music generation platform, shipping improvements to audio fidelity, structural coherence, and creative control parameters. Coming just days after Google’s Lyria 3 Pro announcement (which extends generation to 3-minute tracks with structural awareness), the timing highlights how competitive the AI music generation space has become. Suno occupies the consumer-facing end of the market — focused on making music creation accessible to non-musicians — while Lyria 3 Pro targets both consumers (via Gemini) and developers (via Vertex AI). For anyone evaluating AI music generation for their product, the choice increasingly comes down to API availability and integration ecosystem rather than raw quality differences.


Google Pixel March Drop: Gemini App Actions Bring Agentic AI to Android

Source: Deccan Herald | Report | TechRepublic | Analysis

The March 2026 Pixel Drop shipped Gemini App Actions — agentic AI capabilities that let users execute multi-step tasks across third-party Android apps using natural language. Users can order groceries, book rides, and manage smart home devices by telling Gemini what they want, with the AI agent navigating the relevant apps autonomously. This represents Google’s answer to the “AI agent on your phone” vision that Apple is pursuing with its Gemini-powered Siri rebuild (covered in previous digests). The difference: Google’s approach operates at the app-action level (deep links, intents, and API calls) rather than screen-level automation, which should be more reliable but requires app developers to register supported actions. For Android developers, registering your app’s key actions with the Gemini App Actions framework is becoming a competitive necessity.


Vercel AI SDK 6: First-Class Agent Abstractions and MCP Support

Source: Vercel Blog | Announcement | AI Engineer Guide | Technical Overview

Vercel shipped AI SDK 6, the most significant update to the leading TypeScript AI toolkit (20M+ monthly downloads). The headline feature is the Agent abstraction — define an agent once with its model, instructions, and tools, then reuse it across your application with type-safe UI streaming and structured outputs. The ToolLoopAgent class provides a production-ready implementation of the complete tool execution loop, handling retries, error recovery, and streaming. New execution approval (human-in-the-loop) support lets developers flag sensitive tools for manual review, integrating with UI frameworks via hooks.

Full MCP (Model Context Protocol) support means AI SDK 6 agents can connect to any MCP server out of the box — a significant interoperability improvement for teams using MCP-based tool ecosystems. Additional features include reranking support for RAG pipelines, image editing capabilities, and a new DevTools panel for debugging agent behavior in development. For TypeScript developers building AI applications, this release reduces the boilerplate for agent architectures from hundreds of lines to a few declarative definitions. The human-in-the-loop approval pattern is particularly worth adopting — it addresses the #1 concern enterprises have about deploying autonomous agents.
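AI SDK 6 itself is TypeScript; as a language-agnostic illustration, the human-in-the-loop approval pattern looks roughly like this in Python. All names here are invented for the sketch and do not mirror the SDK’s actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    run: Callable[..., str]
    needs_approval: bool = False  # flag sensitive tools for manual review

def run_tool(tool: Tool, approver: Callable[[str], bool], **kwargs) -> str:
    """Execute a tool, pausing for human approval when the tool is flagged."""
    if tool.needs_approval and not approver(f"Allow '{tool.name}' with {kwargs}?"):
        return f"{tool.name}: denied by reviewer"
    return tool.run(**kwargs)

# Example tools: reading is safe, deleting requires sign-off.
read_file = Tool("read_file", lambda path: f"contents of {path}")
delete_file = Tool("delete_file", lambda path: f"deleted {path}", needs_approval=True)

def auto_deny(prompt: str) -> bool:
    """Stand-in for a UI hook that would surface the prompt to a human."""
    return False

print(run_tool(read_file, auto_deny, path="notes.txt"))    # runs without review
print(run_tool(delete_file, auto_deny, path="notes.txt"))  # blocked pending approval
```

The key design choice is that approval is a property of the tool, not of the call site, so the agent loop can stay generic while sensitive operations are consistently gated.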


📄 Papers Worth Reading

Accelerating Scientific Discovery with Autonomous Goal-Evolving Agents

Authors: Yuanqi Du et al. (Cornell University) | Posted: March 28, 2026

This paper introduces an agentic framework where LLM-powered agents autonomously evolve their research goals based on experimental outcomes — rather than following fixed objectives set by human researchers. The system iteratively proposes hypotheses, designs experiments, interprets results, and updates its goal hierarchy, demonstrating that goal evolution (not just plan execution) can accelerate discovery. The work is relevant for anyone building agentic systems for open-ended exploration, where the objective itself is discovered through interaction rather than specified upfront — a theme that resonates with ARC-AGI-3’s interactive evaluation paradigm.

AVO: Agentic Variation Operators for Evolutionary Search via LLMs

Source: Cross-referenced from ML community discussion

AVO (Agentic Variation Operators) presents a framework where large language models function as autonomous, iterative optimizers within evolutionary search processes. The key result: AVO-driven search discovered multi-head attention kernels achieving up to 3.5% higher throughput than cuDNN’s hand-tuned implementations. This is notable because it demonstrates LLMs can contribute to hardware-level optimization — traditionally the domain of expert systems engineers — by leveraging their code understanding to propose and evaluate kernel variations. For ML engineers working on inference optimization, this suggests a new tool in the kernel tuning toolkit.
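The LLM-as-variation-operator idea can be sketched with a toy evolutionary loop. The objective, population sizes, and the stubbed proposal function are all illustrative assumptions; in AVO the operator is an LLM reading and editing kernel code, not a Gaussian perturbation.

```python
import random

random.seed(42)

def throughput(params: list[float]) -> float:
    """Toy objective standing in for measured kernel throughput (peak at 3.0s)."""
    return -sum((p - 3.0) ** 2 for p in params)

def llm_propose_variation(parent: list[float]) -> list[float]:
    """Stub for the agentic variation operator: AVO would have an LLM inspect
    the parent kernel and propose a code-level edit; here we just perturb."""
    return [p + random.gauss(0, 0.5) for p in parent]

def evolve(generations: int = 40, pop_size: int = 8) -> list[float]:
    population = [[random.uniform(0, 6) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fittest half, refill with proposed variations of survivors.
        population.sort(key=throughput, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [
            llm_propose_variation(random.choice(survivors))
            for _ in range(pop_size - len(survivors))
        ]
    return max(population, key=throughput)

best = evolve()
```

What changes versus classical evolutionary search is only the variation step: selection and evaluation stay standard, while the mutation operator gains the LLM’s code understanding.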


🧭 Key Takeaways

  • The Anthropic Mythos leak is a dual story: capability advancement and operational failure. A company built on AI safety accidentally exposed ~3,000 internal documents due to a default-public CMS setting. The model itself — sitting above Opus with claimed step-change cybersecurity capabilities — may reshape the competitive landscape when it ships, but the release timeline is deliberately slow. Don’t factor it into near-term planning.

  • Google’s March Gemini Drop is its most aggressive competitive move this year. Import Memory from competitors, free Personal Intelligence, Flash Live for voice, Lyria 3 Pro for music — each individually significant, together they represent a concerted push to make Gemini the default AI platform. If you’re building on the Gemini API, the Flash Live model and Lyria 3 Pro on Vertex AI are the items with immediate developer impact.

  • If you use LiteLLM from PyPI, audit your installation immediately. The supply chain attack (March 24, versions 1.82.7 and 1.82.8) targeted the most valuable asset in your AI stack: your provider API keys. The attack vector — compromising a security scanner to pivot to package publishing — is a playbook other attackers will replicate. Pin your dependencies, verify package integrity, and consider Docker-based deployments for critical AI infrastructure.

  • Claude Code v2.1.86 fixes several workflow-disrupting bugs. The --resume fix and the Write/Edit/Read fix for files outside project root with conditional skills resolve issues that were silently breaking multi-project workflows. Update if you’ve encountered either.

  • AI memory portability is becoming a competitive weapon. Google’s Import Memory tool, following Anthropic’s memory import prompt, signals that accumulated conversation context is now a switching cost that platforms are actively attacking. If you’re building products on top of AI APIs, consider how you’ll handle user context portability.

  • The creative AI stack is consolidating around embedded workflows. ByteDance shipping Seedance 2.0 inside CapCut, Google embedding Lyria 3 Pro in Gemini and Vertex AI, Suno iterating on consumer music generation — the pattern is clear: standalone AI generation tools are giving way to AI capabilities embedded directly in existing creative workflows.


Generated on March 28, 2026 by Claude