Research Ecosystem: Morning Brief
Two-week window across the tracked feeds (60 core feeds this run: agents, evals, interp, formal methods, surveillance, BSD, Clojure/Scheme, SDR, aviation), scored against active research threads. Metadata only: titles, links, dates. Read the source for substance. (what we track, how we crawl)
The arXiv cs.AI firehose is back after the weekend lull (654 items in window, the largest single source); yesterday's brief shed it as Sunday-quiet. The four generated gap-fill feeds (DeepMind, Cursor, Claude Code, Anthropic) remain in the crawl, with second-hand provenance flagged inline; of these, DeepMind has now aged out of the 14-day window (last post 05-01).
Top (5-7 min)
- No honor among (ad-tech) thieves
- Pluralistic, 2026-05-25. Doctorow on ad-tech firms defrauding each other, the surveillance-economy underbelly that the adversarial ad-tech thread tracks. Fresh today and the cleanest statement of how the incentives rot from the inside.
- Claude Compliance API support with Cloudflare CASB
- Cloudflare, 2026-05-21. Agent governance shipped as a CASB product, paired with Claude Managed Agents. The commercial mirror of the Walsh-Research compliance contract, from the infra seat rather than the spec seat.
- Agentic software development hypothesis
- Marc Brooker, 2026-05-20. States the agentic-SDLC claim as something falsifiable rather than assumed. The CPRR move applied to the loudest claim in the field this quarter.
- Frontier Risk Report (February to March 2026)
- METR, 2026-05-19. Capability elicitation and risk evaluation from a named eval org. Primary material for the eval-under-constraints thread, the empirical counterweight to lab self-reporting.
- Vega: zero-knowledge proofs for digital identity in the age of AI
- Microsoft Research, 2026-05-21. ZK identity proofs pitched as the answer to agent-era verification. The constructive counterpart to the age-verification critique below; worth holding the two side by side.
Themes this week
- Every model lab is now an agent lab
- Latent Space says it outright; OpenAI's feed is wall-to-wall Codex (Gartner leader, Codex from anywhere); Cursor ships Composer 2.5; Microsoft ships MagenticLite and Fara. The product surface has converged on agents, which puts the weight on the harness, sandbox, and compliance layers rather than the model.
- Identity and surveillance get an infrastructure layer
- The same week brings Doctorow on ad-tech fraud, an FTC active-listening settlement, a Markup win on data brokers, and Microsoft's ZK identity proposal. The critique and the build-out are now arguing over the same substrate.
- Eval realism is the contested ground
- Alignment Forum argues for evaluating model behaviors and that risk reports must address deployment-time spread; METR ships a frontier risk report; Microsoft adds SocialReasoning-Bench. The move is from leaderboard deltas to whether an evaluation measures behavior that shows up under deployment.
- Agent-build claims meet scrutiny
- Booster framing (Latent Space, OpenAI) meets empirical pushback (AI Snake Oil on the $916 OS, Torvalds on kernel bugs) and falsifiable claims (Brooker's hypothesis).
Scan (15 min)
- Agents and harnesses
- All Model Labs are now Agent Labs, Latent Space, 05-23, the thesis stated outright
- Giving Agents Computers – Ivan Burazin, Daytona, Latent Space, 05-21, sandbox-as-substrate
- Railway: The Agent-Native Cloud – Jake Cooper, Latent Space, 05-20, infra repositioned around agents
- Datasette Agent, Simon Willison, 05-21, a concrete local-first agent over SQLite
- The Open Agent Leaderboard, Hugging Face, 05-18, an open eval surface for agents
- Expert Clojure Workflows for AI Agents: Four Skills, Planet Clojure, 05-14, agent skills in a Lisp shop
- Composer 2.5, Cursor, 05-18, Cursor's in-house agent model (generated feed)
- Claude Code v2.1.150, Claude Code releases, 05-23, the harness shipping multiple builds a day (generated feed)
- AI labs
- OpenAI named a Leader in enterprise coding agents (Gartner), OpenAI, 05-22
- An OpenAI model disproved a central conjecture in discrete geometry, OpenAI, 05-20
- Speed-of-light text generation with Nemotron-Labs diffusion LMs, Hugging Face, 05-23
- Open model bonanza: Gemma 4, DeepSeek V4, Kimi K2.6, GLM-5.1, Interconnects, 05-16
- Empirical Research Assistance (ERA): from Nature publication to computational discovery, Google Research, 05-19
- Project Glasswing: an initial update, Anthropic, 05-22, the lab side of the Cloudflare red-team writeup (generated feed)
- Eval, interpretability, safety
- The Case for Evaluating Model Behaviors, Alignment Forum, 05-20
- Risk reports need to address deployment-time spread of misalignment, Alignment Forum, 05-15
- The safe-to-dangerous shift is a fundamental problem for eval realism, Alignment Forum, 05-14
- SocialReasoning-Bench: whether agents act in users' best interests, Microsoft Research, 05-11
- Further Notes on AI Delegation and Long-Horizon Reliability, Microsoft Research, 05-15
- On AI Security, Schneier, 05-20
- Formal methods, distsys, correctness
- When did the bug start?, Antithesis, 05-11, causality analysis on deterministic traces
- Assumptions weaken properties, Hillel Wayne, 05-20, the cost of every assumption in a spec
- What's Easy Now? What's Hard Now?, Marc Brooker, 05-18
- Chess invariants, Murat Demirbas, 05-21, invariants as a teaching device
- How we 40x'd the performance of a class of temporal queries, XTDB, 05-11
- Surveillance and critique
- No honor among (ad-tech) thieves, Pluralistic, 05-25, the ad-tech economy defrauding itself
- FTC settlement over the "active listening" AI marketing claim, Simon Willison, 05-22
- Californians can more easily escape data brokers after a Markup investigation, The Markup, 05-22
- Bypassing On-Camera Age-Verification Checks, Schneier, 05-15, the technical case against age-gating
- Signal warns it would pull out of Canada over the lawful-access bill, Citizen Lab, 05-14
- The FBI wants to buy nationwide access to license plate readers, 404 Media, 05-18
- Cloudflare and bot infrastructure
- Claude Compliance API support with Cloudflare CASB, Cloudflare, 05-21, agent governance as a product
- Announcing Claude Managed Agents on Cloudflare, Cloudflare, 05-19, a managed control plane for third-party agents
- Project Glasswing: what Mythos showed us, Cloudflare, 05-18, frontier-model red-team writeup
- When idle is not idle: a Linux kernel optimization became a QUIC bug, Cloudflare, 05-12
- Systems, BSD, homelab
- OpenBSD 7.9 released, LWN, 05-21
- FreeBSD Foundation Executive Director tries daily-driving FreeBSD on a laptop, Phoronix, 05-24
- 664: No one misses SPARC, BSD Now, 05-21
- Which ZFS Storage Metrics Matter for Database Performance, Klara Systems, 05-13
- BPF support in GCC 16 and beyond, LWN, 05-21
- SDR, RF, aviation
- GopherTrunk: a pure-Go trunked radio scanner (P25, DMR, TETRA, NXDN), RTL-SDR, 05-19
- MHI RJ to retrofit CRJ fleet with traffic-awareness (ADS-B In), The Air Current, 05-22
- ACAS X is already in the cockpit: the AI-and-ATC debate is three years too late, Leeham, 05-24
- Air France, Airbus guilty of corporate manslaughter in the 2009 AF447 crash, Slashdot, 05-23
- Clojure and Scheme
- Clojure Deref (May 19, 2026), Planet Clojure, 05-19
- Clojure 1.12.5, Planet Clojure, 05-12
- Hoot 0.9.0 released, Spritely Institute, 05-13, Scheme to WebAssembly
- soot, solar, sedimentation, sin, and 'centers, Andy Wingo, 05-16
- Agent-Ready Stack, Planet Clojure, 05-11, a Lisp shop's agent tooling baseline
Tail
- Pope Leo XIV's first encyclical says AI must serve humanity, not the powerful few, Hacker News, 05-25
- California executive order directs agencies to prepare for AI-driven workforce disruption, Slashdot, 05-25
- IBM spins off the first pure-play quantum chip foundry, Hacker News, 05-25
- Jira Is Turing-Complete, Hacker News, 05-25
- Microsoft open-sources the earliest DOS source code discovered to date, Hacker News, 05-24
- The last six months in LLMs in five minutes, Simon Willison, 05-19
Feed silences (diagnostic)
- arxiv-cs-ai: recovered. 276 items this run, 654 in the 14-day window, the single largest source. Yesterday's brief shed it as Sunday-quiet; the weekday announcement batch is back.
- deepmind-blog (generated): aged out of the window. Latest post is 05-01, now outside 14 days, so the DeepMind gap-fill contributed nothing this run. Monthly cadence; lean on first-party labs until the next post.
- Dan Luu, James Bornholt, Netflix Tech Blog: errors this run (XML parse deep in the archive feed; host-side connection; TLS path). Bornholt and Netflix are transient and left in place to re-check; Dan Luu stays demoted.
- harvard-seas: 0 items again (the Localist API returned no events this run).
- Logic Magazine: still demoted (feed URL serves a Netlify 404 page). No longer crawled.
- Generated feeds
- cursor-blog, claude-code-releases, anthropic-generated current via 304; deepmind-blog stale (see above). Third-party LLM-scraped RSS for publishers with no first-party feed; provenance second-hand, on a C-004 stability watch.
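The age-out described above is plain date arithmetic, and a minimal sketch may make it concrete. The function name `in_window` and the inclusive cutoff are assumptions for illustration, not the crawler's actual code; the inclusive boundary is chosen only because 05-11 items still appear in this brief.

```python
from datetime import date, timedelta

WINDOW_DAYS = 14  # window length stated in the brief


def in_window(latest_post: date, build_day: date, window_days: int = WINDOW_DAYS) -> bool:
    """Return True if latest_post is no older than window_days before build_day.

    Hypothetical helper: the inclusive cutoff is an assumption.
    """
    cutoff = build_day - timedelta(days=window_days)
    return cutoff <= latest_post <= build_day


build_day = date(2026, 5, 25)        # build date from the provenance line
deepmind_latest = date(2026, 5, 1)   # last DeepMind post noted in the brief

# A post from 05-01 is 24 days old on 05-25, so it falls outside the window.
print((build_day - deepmind_latest).days)
print(in_window(deepmind_latest, build_day))
# With a 14-day window the cutoff lands on 05-11.
print(in_window(date(2026, 5, 11), build_day))
```

Under these assumptions the DeepMind feed stays out of scope until it publishes again, which matches the "contributed nothing this run" note.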
Build provenance
build: 2026-05-25 | crawler-sha: 44e3db1 (Walsh-Research/1.2, compliance v1.2) | feeds: 60 core (incl. 4 generated gap-fill, 1 aged out) | items-considered: 1059 (14d, incl. 654 arXiv) | published: 49 | note: arXiv firehose recovered post-weekend; DeepMind generated feed aged out of window; 3 feed errors (Dan Luu/Bornholt/Netflix)
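For anyone diffing builds, the provenance line above can be split into fields with a few lines of Python. This is a sketch that assumes the " | " separator and "key: value" layout seen in this brief stay stable; `parse_provenance` and `leading_int` are hypothetical helper names, not part of the crawler.

```python
PROVENANCE = (
    "build: 2026-05-25 | crawler-sha: 44e3db1 (Walsh-Research/1.2, compliance v1.2) | "
    "feeds: 60 core (incl. 4 generated gap-fill, 1 aged out) | "
    "items-considered: 1059 (14d, incl. 654 arXiv) | published: 49 | "
    "note: arXiv firehose recovered post-weekend; DeepMind generated feed aged out of window; "
    "3 feed errors (Dan Luu/Bornholt/Netflix)"
)


def parse_provenance(line: str) -> dict[str, str]:
    """Split a provenance line into a key -> value mapping (hypothetical helper)."""
    fields = {}
    for part in line.split(" | "):
        # Split on the first ": " only, so values may themselves contain colons.
        key, _, value = part.partition(": ")
        fields[key.strip()] = value.strip()
    return fields


def leading_int(value: str) -> int:
    """Read the integer that opens a value such as '1059 (14d, incl. 654 arXiv)'."""
    return int(value.split()[0])


fields = parse_provenance(PROVENANCE)
considered = leading_int(fields["items-considered"])
published = leading_int(fields["published"])
print(fields["build"], considered, published)
```

Keeping the values as raw strings and extracting numbers on demand avoids guessing at the meaning of the parenthetical annotations.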