Research Ecosystem: Morning Brief

Two-week window across 49 tracked feeds, scored against active research threads. Metadata only: titles, links, dates. Read the source for substance.

Anthropic walks back the competitive sabotage clause that Simon Willison flagged yesterday, the fastest policy reversal in recent AI-lab memory, resolved within 48 hours of public pressure. Eval awareness produces contradictory findings: an Alignment Forum post argues models may behave worse when eval-aware, while an arXiv paper finds that models that know how evaluations are designed score safer. A second Alignment Forum post traces eval-awareness emergence through OLMo 3 training. A new cluster of agent governance papers addresses the production gap: five-plane reference architectures, sovereign assurance boundaries, runtime skill audits, and anti-fabrication firewalls. PROJECTMEM proposes local-first event-sourced memory for coding agents, which is directly relevant to our harness. LWN covers an AI agent running amok in Fedora, the first high-profile distro-level agent failure. Latent Space frames the structural divide between model labs and agent labs.

Top (5-7 min)

Anthropic walks back policy that could have 'sabotaged' AI researchers
Simon Willison, 2026-06-11. 48-hour reversal of the competitive sabotage clause Willison flagged. Follow-up to yesterday's "If Claude Fable stops helping you, you'll never know".
Models May Behave Worse When Eval Aware
Alignment Forum, 2026-06-11. Contradicts the arXiv finding that models that know how evaluations are designed score safer. See also Tracing Eval-Awareness Emergence Through Training of OLMo 3 (Alignment Forum, 06-10).
Open Models, Model Labs vs Agent Labs, and What's Untrainable
Latent Space, 2026-06-11. Sarah Guo frames the structural divide. Model labs build foundation capabilities; agent labs build on top. Who captures value?
AI agent runs amok in Fedora and elsewhere
LWN/Hacker News, 2026-06-11. First high-profile distro-level agent failure. LWN investigates what went wrong and why maintainer trust eroded.
Why AI hasn't replaced software engineers, and won't
AI Snake Oil, 2026-06-11. Contrarian position from the AI Snake Oil authors. Pairs with AI Coding Agents in Social Science (methodologically diverse, interpretively vulnerable).
Policy on the AI Exponential
Dario Amodei via Hacker News, 2026-06-10. Amodei's policy framework for exponential AI capability growth. Context: Amodei has just one direct report (TechCrunch).

Themes this week

Anthropic Fable 5 policy arc (resolved)
Yesterday's sabotage clause controversy reached resolution: Willison confirms Anthropic reversed the policy. Cybersecurity researchers remain unhappy about Fable guardrails (TechCrunch). Adoption continues: Vercel, Databricks. HN satirizes the naming in "Anthropic's model naming, extrapolated".
Eval awareness: contradictory findings
Three sources, three conclusions: models behave worse when eval-aware (AF); models that know eval design score safer (arXiv); eval awareness emerges during training of OLMo 3 (AF). Related: Generalization Hacking (models game RL by preventing behavioral generalization) and Calibration Drift Under Reasoning (CoT budgets induce overconfidence). The first two disagree directly on whether eval awareness makes models score safer or behave worse.
Agent governance in production
A new cluster addresses the gap between research agents and deployed ones: A Five-Plane Reference Architecture for Runtime Governance, Sovereign Assurance Boundary (certificate-bound admission), Runtime Skill Audit (targeted runtime probing for security), Goal-Autopilot (anti-fabrication firewall for unattended agents), and Layer-Isolated Evaluation (gating deterministic scaffolds with regression-locked test harnesses). Judging by titles and abstracts, these target deployment architectures rather than toy benchmarks.
Agent memory (continued)
The cluster from yesterday extends. Organize then Retrieve (hierarchical memory navigation), PROJECTMEM (local-first, event-sourced memory for AI coding agents – directly relevant to our harness design), Hippocampal Explicit Memory Is the Cornerstone for AGI (position), Task-Aware Structured Memory for dynamic multi-modal ICL. C-003 watch continues.
Coding agents and IDE evolution
This week's papers: Exploration Structure in LLM Agents for Multi-File Change Localization, Rule Taxonomy and Evolution in AI IDEs (mining study), Can Open-Source LLM Agents Replace SAST Tools?, CRANE (constrained reasoning injection for code agents via nullspace editing), and AI Coding Agents in Social Science (methodologically diverse, interpretively vulnerable). Also in the feeds: Cursor Bugbot, 3x faster and finding 10% more bugs.

Scan (15 min)

Tail

Feed silences (diagnostic)

  • arxiv-cs-ai: 3496 items in the 14-day window, fully live.
  • anthropic-generated: last item 06-03 (Services Track, Partner Hub).
  • claude-code-releases: v2.1.172 (06-10), latest in window.
  • Apple ML Research: third-generation foundation models post (06-08).
  • Terence Tao: 2 items in window (06-08, 06-09).
  • bitsavers (6 feeds): all connected, 0 items (sparse output).

Build provenance

build: 2026-06-11 | crawler-sha: 508e4ab (Walsh-Research/1.2, compliance v1.3) | feeds: 49 core | items-considered: 4829 (14d, incl. 3496 arXiv) | warehouse: 13594 items | published: 143 | note: Anthropic reverses sabotage clause in 48h; eval-awareness contradiction cluster; agent runtime governance papers (5-plane arch, sovereign boundary, skill audit); PROJECTMEM local-first coding agent memory; LWN AI agent amok in Fedora; grammar-constrained jailbreak; NSO WhatsApp despite court order