# Cup of Mojo -- Daily AI Brief -- Sunday, April 19, 2026
Broadcast-ready source document. Read the COLD OPEN verbatim, cover every numbered beat in the RUNDOWN in tier-appropriate airtime, close with the MOJO TAKE verbatim. APPENDIX is reference only.
**Articles analyzed:** 10
**Sources monitored:** 9
**Beats to cover:** 10
---
## EDITORIAL GUIDANCE FOR PODCAST NARRATION
Read these rules before narrating. They are non-negotiable.
1. **Read the COLD OPEN verbatim.** Do not rewrite or paraphrase it. It is the episode's signature opening.
2. **Cover every numbered beat.** Beats are numbered 1 through 10. Do not skip any beat. Do not consolidate. Each beat gets its own moment.
3. **Honor the tier structure.** `deep` beats get longer treatment with full context. `standard` beats are structured but concise. `rapid_fire` beats are short and punchy. Use roughly 2 minutes for the deep beat, 1 minute per standard beat, 20-30 seconds per rapid-fire beat.
4. **Cite sources by name** when presenting a claim. Say "OpenAI announced..." not "a company announced".
5. **Use only the plain-English text in each beat.** Do not pull technical jargon from the APPENDIX. The appendix is reference material for context, not script content. If a beat does not mention a term, do not introduce it.
6. **Only use numbers that appear in a beat's own text.** Do not import statistics from the appendix. Omit rather than fabricate.
7. **Reference earlier beats when topics connect.** Each beat has a `callbacks` field listing earlier beat numbers it relates to. When narrating, explicitly link back: "Remember that Agents SDK story from Beat 2? This next one shows how the agent runtime race compounds." Callbacks create cohesion and prevent the episode from feeling like a list.
8. **Introduce one skeptical angle per deep or standard beat.** Phrases like "one caveat", "critics will point out", or "this is not yet peer-reviewed" create credibility. Rapid-fire beats can skip this.
9. **Use the pronunciation guide for every named person or company.** Do not guess pronunciations.
10. **Close with the MOJO TAKE outro.** Read it as the host's editorial perspective, not as a summary.
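The tier timings in rule 3 imply a total runtime budget. A quick sanity check, using today's beat counts and an assumed 25-second midpoint for rapid-fire beats:

```python
# Sanity-check the episode runtime implied by rule 3's tier timings.
# Tier counts come from today's rundown: 1 deep, 3 standard, 6 rapid-fire.
TIER_SECONDS = {"deep": 120, "standard": 60, "rapid_fire": 25}  # 25s = rapid-fire midpoint

def runtime_minutes(counts):
    """Total scripted airtime in minutes for a dict of tier -> beat count."""
    total = sum(TIER_SECONDS[tier] * n for tier, n in counts.items())
    return total / 60

print(runtime_minutes({"deep": 1, "standard": 3, "rapid_fire": 6}))  # 7.5
```

So the rundown itself is roughly seven and a half minutes of scripted airtime before transitions and the cold open.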
---
## PRONUNCIATION GUIDE
The following names appear in today's content. Use these phonetic pronunciations:
- **Jensen Huang** — pronounced *JEN-sen HWAHNG*
- **Anthropic** — pronounced *an-THROP-ik*
- **Qwen** — pronounced *CHWEN*
---
## COLD OPEN -- Read This Verbatim
Read the HOOK line first, pause for a beat, then the TEASE. Do not rewrite. Do not paraphrase. Do not add any preamble.
> **Hook:** Simon Willison shipped llm-anthropic 0.25 while OpenAI quietly handed Cloudflare the keys to its agent stack. Thursday is doing real work.
> **Tease:** Inside: what the new Agents SDK evolution actually changes for builders, why Cloudflare plus OpenAI is a bigger deal than the press release suggests, and the Semafor scoop on CTOs and CHROs suddenly sharing a calendar. Plus a VC concentration chart that should scare you.
---
## TODAY'S RUNDOWN
Cover every beat in order. Do not skip. Tier labels tell you how much airtime each beat deserves.
### Beat 1 [DEEP] — Simon Willison ships llm-anthropic 0.25 with Claude Opus 4.7 and an xhigh thinking dial
**Source:** Simon Willison | https://simonwillison.net/2026/Apr/16/llm-anthropic/#atom-everything
**Hook (open with this):** Simon Willison just dropped llm-anthropic 0.25, and Anthropic's new Claude Opus 4.7 comes with a thinking_effort setting that goes all the way up to xhigh. Yes, xhigh. We are officially past high now.
**Plain English:** Willison's llm-anthropic plugin adds Claude Opus 4.7, Anthropic's newest flagship. You can tell it how hard to think with thinking_effort, and xhigh is the new top gear. There are also two new knobs, thinking_display and thinking_adaptive, plus bumped max_tokens defaults so you stop getting truncated answers by accident.
**Stakes:** If you're still pinning old Claude versions in your scripts, you're leaving reasoning horsepower and token headroom on the table the minute a hard prompt shows up.
**Twist:** The summarized thinking output only shows up in JSON mode right now, so the fancy chain-of-thought view is hidden unless you ask for logs the structured way.
**Takeaway:** Upgrade llm-anthropic, point it at claude-opus-4.7, and crank thinking_effort to xhigh when the problem actually deserves it.
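The bake-off in the action items boils down to a simple decision rule. A rough sketch with illustrative scores, not real eval output; the `llm` invocation in the comment follows the 0.25 release notes, so verify the option name against your installed plugin:

```python
# Sketch of the xhigh-vs-high bake-off. Each prompt would be run via e.g.:
#   llm -m claude-opus-4.7 -o thinking_effort xhigh "<prompt>"
# (option name per the 0.25 release notes; check your installed plugin).

def keep_cheaper_dial(scores_high, scores_xhigh):
    """Return True if 'high' ties or beats 'xhigh' on the eval set,
    i.e. the extra thinking budget isn't paying for itself."""
    wins_xhigh = sum(x > h for h, x in zip(scores_high, scores_xhigh))
    return wins_xhigh <= len(scores_high) / 2

# Hypothetical scores on the three hardest eval prompts:
print(keep_cheaper_dial([0.7, 0.8, 0.6], [0.9, 0.8, 0.7]))  # False: xhigh wins 2 of 3
```

If xhigh can't win a majority of your hardest prompts, the cheaper dial stays.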
### Beat 2 [STANDARD] — OpenAI ships Agents SDK v2 with native sandboxes and a model-native harness for long-running agents
**Source:** OpenAI Blog | https://openai.com/index/the-next-evolution-of-the-agents-sdk
**Hook (open with this):** OpenAI just dropped the next Agents SDK, and it finally treats your agent like a process that needs to survive the night.
**Plain English:** The new Agents SDK runs tool calls inside a native sandbox, so your agent can touch files, run code, and call external services without you duct-taping Docker around it. It also adds a model-native harness, meaning the model itself knows how to loop, retry, and chain tools instead of you babysitting the control flow.
**Stakes:** Ignore it and you'll keep writing 400 lines of glue code that the SDK now handles in three, while your competitors ship agents that actually run for hours.
**Twist:** OpenAI is quietly admitting the old SDK was a toy. Real agents need isolation and a runtime, not a fancy while-loop around chat completions.
**Takeaway:** If you're building agents, pull the new SDK today and delete your custom sandbox code before the week is out.
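To see what a "model-native harness" replaces, here is a toy version of the hand-rolled loop the beat is mocking. Every name is illustrative; this is not the SDK's API, just the glue pattern it reportedly absorbs:

```python
# The hand-rolled control loop the new SDK reportedly absorbs:
# the model proposes a tool call, we execute it, feed the result back,
# and repeat until it emits a final answer. All names are illustrative.

def run_agent(model_step, tools, task, max_turns=10):
    """model_step(task, history) -> ('tool', name, args) or ('final', answer)."""
    history = []
    for _ in range(max_turns):
        kind, *rest = model_step(task, history)
        if kind == "final":
            return rest[0]
        name, args = rest
        history.append((name, tools[name](*args)))  # note: no isolation, every
                                                    # tool runs in this process
    raise RuntimeError("agent never finished")

# A stub "model" that calls a calculator tool once, then answers:
def stub(task, history):
    return ("final", history[0][1]) if history else ("tool", "add", (2, 3))

print(run_agent(stub, {"add": lambda a, b: a + b}, "what is 2+3?"))  # 5
```

Everything this loop does by hand, plus the missing sandboxing, is what the native harness claims to own.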
### Beat 3 [STANDARD] — Cloudflare and OpenAI drop GPT-5.4 and Codex into Agent Cloud for enterprise rollouts
**Source:** OpenAI Blog | https://openai.com/index/cloudflare-openai-agent-cloud
**Callbacks:** references Beat 2. Reference this earlier beat aloud when narrating this one.
**Hook (open with this):** Cloudflare just wired GPT-5.4 and Codex straight into Agent Cloud, and enterprises get a one-click button to ship real agents on the edge.
**Plain English:** Cloudflare's Agent Cloud now hosts OpenAI's GPT-5.4 and Codex as first-class runtimes. That means your agent runs in 300-plus cities, close to your users, with Cloudflare handling the auth, rate limits, and sandbox plumbing. You bring the workflow, they bring the global network.
**Stakes:** Skip this and you'll be hand-rolling Kubernetes for agent fleets while your competitor ships a global rollout in an afternoon.
**Twist:** The interesting part isn't the model, it's that Cloudflare beat the hyperscalers to a turnkey agent runtime that actually runs at the edge.
**Takeaway:** If you're deploying agents to real users tomorrow, Cloudflare Agent Cloud is now the shortest path from prototype to production.
### Beat 4 [STANDARD] — Semafor reports CTOs and CHROs are now stuck in the same room thanks to AI agents
**Source:** Semafor | https://www.semafor.com/article/04/17/2026/ai-is-making-chief-tech-officers-and-chief-human-resources-officers-work-together
**Callbacks:** references Beat 2, Beat 3. Reference these earlier beats aloud when narrating this one.
**Hook (open with this):** Semafor says the CTO and the CHRO are suddenly best friends, and it's because agents don't fit on an org chart.
**Plain English:** Companies are deploying more AI agents that do actual work alongside humans. That means the tech chief who deploys them and the people chief who manages the humans around them can't operate in separate silos anymore. Hiring plans, performance reviews, and headcount budgets now all touch the agent stack.
**Stakes:** If your CTO and CHRO aren't talking weekly, you're going to ship agents your workforce quietly sabotages or duplicates.
**Twist:** The bottleneck on agent rollouts isn't the model or the tooling, it's whether HR knows what a digital worker even is.
**Takeaway:** Put your head of people in the agent planning meeting this week, not next quarter.
### Beat 5 [RAPID_FIRE] — Crunchbase: a handful of US AI giants ate Q1 2026 venture funding while global deal count dropped
**Source:** Crunchbase News (AI) | https://news.crunchbase.com/venture/capital-concentrated-ai-global-q1-2026/
**Callbacks:** references Beat 3. Reference this earlier beat aloud when narrating this one.
**Hook (open with this):** Crunchbase just dropped three charts and they all say the same thing: the money is piling onto maybe a dozen US AI names, and everyone else is fighting for scraps.
**Plain English:** Q1 2026 venture dollars concentrated hard at the top of the AI stack, almost entirely US-based, while the total number of global startup deals actually fell. Fewer checks, bigger checks, narrower list of names.
**Stakes:** If you're a founder outside that top tier betting on a traditional Series A timeline, your round math is already broken.
**Twist:** Deal count is down but dollars are up, which means the average startup is getting squeezed while OpenAI and friends hoover up the oxygen.
**Takeaway:** Raise against a wedge the giants won't touch, or partner with one of them, because the middle is getting hollowed out fast.
### Beat 6 [RAPID_FIRE] — LLMOrbit maps 50-plus models across 15 orgs into one circular taxonomy of the LLM era
**Source:** arXiv cs.MA | https://arxiv.org/abs/2601.14053
**Callbacks:** references Beat 1, Beat 2. Reference these earlier beats aloud when narrating this one.
**Hook (open with this):** LLMOrbit just dropped a circular map of every LLM that mattered from 2019 to 2025, 50-plus models, 15 orgs, eight orbital dimensions.
**Plain English:** The authors stopped drawing model family trees and went with orbits instead. Each ring covers a different axis, like architecture, scaling, reasoning, agents, so you can see where Claude, GPT, Gemini, and Llama actually overlap. It is a cheat sheet for the last six years of the field.
**Stakes:** Skip it and you'll keep pitching your roadmap with hand-wavy 'we're like GPT-4' comparisons while your competitors cite real positioning.
**Twist:** The scaling-wall era and the agent era are on the same chart, and the agent ring is already wider than the pure-scaling one.
**Takeaway:** Pin LLMOrbit to your wiki and make every new hire trace your product through its eight rings before their first sprint.
### Beat 7 [RAPID_FIRE] — STAT News warns voice-first chatbots could worsen AI's mental health problem
**Source:** STAT News (AI) | https://www.statnews.com/2026/04/16/voice-chatbots-ai-psychosis-mental-health/?utm_campaign=rss
**Callbacks:** references Beat 3. Reference this earlier beat aloud when narrating this one.
**Hook (open with this):** STAT News says the voice itself is the payload, not just the words coming out of your chatbot.
**Plain English:** STAT News argues the mental health debate around AI has been stuck on content moderation, but delivery matters too. A warm, human-sounding voice whispering the same text hits users very differently than reading it on a screen. Voice-first products collapse the critical distance people need to push back on a machine.
**Stakes:** Ship a voice agent without thinking about parasocial pull and you'll own the headline when a vulnerable user spirals.
**Twist:** The safer-looking modality, friendly voice, is actually the one more likely to bypass a user's defenses.
**Takeaway:** Audit your voice product for intimacy, not just its transcript.
### Beat 8 [RAPID_FIRE] — Vercel drops Claude Opus 4.7 into AI Gateway for long-running agents
**Source:** Vercel Blog | https://vercel.com/changelog/opus-4.7-on-ai-gateway
**Callbacks:** references Beat 1, Beat 2. Reference these earlier beats aloud when narrating this one.
**Hook (open with this):** Vercel just lit up Claude Opus 4.7 on AI Gateway, and it's tuned for the async agents you're probably already trying to ship.
**Plain English:** Anthropic's Opus 4.7 now routes through Vercel's AI Gateway with one line of config. It's built for long-running agents, handles multi-step tool calls reliably, and can actually read charts and figures pixel by pixel. Knowledge-worker tasks that need the model to verify its own output are where it pulls ahead.
**Stakes:** Keep hand-rolling your model routing and you'll be rewriting it next month when the next Opus point release lands.
**Twist:** The real flex isn't reasoning, it's that Opus 4.7 will inspect its own rendered chart and catch its own mistakes.
**Takeaway:** Flip your Vercel gateway to claude-opus-4.7 and let the model grade its own homework.
### Beat 9 [RAPID_FIRE] — Jensen Huang tells Dwarkesh Nvidia's trillion-dollar supply chain is the real moat, not the GPUs
**Source:** Dwarkesh Podcast | https://www.dwarkesh.com/p/jensen-huang
**Callbacks:** references Beat 3, Beat 8. Reference these earlier beats aloud when narrating this one.
**Hook (open with this):** Jensen Huang sat down with Dwarkesh and basically said the quiet part loud: Google's TPUs are real, China should get chips, and the moat is the supply chain.
**Plain English:** Jensen argues Nvidia can deliver a trillion dollars of compute over the next few years because of the factories, packaging, and partners behind the silicon. He's less worried about TPUs beating H100s than about whether the world can physically build enough of anything. Selling to China, in his view, keeps US stacks the default.
**Stakes:** Bet your roadmap on chip specs alone and you'll miss that capacity, not FLOPS, is what gates your 2026 agent deployments.
**Twist:** The most powerful chip CEO on earth says his competition is concrete and copper, not Google.
**Takeaway:** Pick your inference partner by who can actually ship next quarter, not who benchmarked best last quarter.
### Beat 10 [RAPID_FIRE] — Credo paper pitches declarative beliefs and policies to replace the prompt-spaghetti inside your agents
**Source:** arXiv cs.AI | https://arxiv.org/abs/2604.14401
**Callbacks:** references Beat 2, Beat 3. Reference these earlier beats aloud when narrating this one.
**Hook (open with this):** Credo, a fresh arXiv paper, argues your agent's control loop shouldn't live inside a prompt or a for-loop.
**Plain English:** The authors want agents to declare beliefs about the world and policies for acting on them, instead of burying logic in imperative Python and ephemeral memory. Think rules engine meets LangGraph, with the state actually legible to a human at 2am when the pager goes off.
**Stakes:** Keep stuffing logic into prompts and your agent becomes a black box nobody on your team can debug, let alone audit.
**Twist:** The fix for flaky agents might not be a bigger model, it might be borrowing belief-desire-intention ideas from 1990s AI textbooks.
**Takeaway:** Write down your agent's beliefs and policies as data, not prose, before your next production incident writes them for you.
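Here is the smallest possible version of that idea: beliefs as a dict, policies as data you can actually read at 2am. The schema is a sketch of the paper's framing, not Credo's actual implementation:

```python
# Beliefs as data, policies as declarative (name, condition, action) triples.
# A minimal sketch of the Credo idea; the schema here is illustrative,
# not the paper's actual implementation.

beliefs = {"retrieval_confidence": 0.4, "budget_remaining": 0.8}

policies = [
    ("re-retrieve", lambda b: b["retrieval_confidence"] < 0.5, "rerun_retrieval"),
    ("wind_down",   lambda b: b["budget_remaining"] < 0.1,     "summarize_and_stop"),
]

def decide(beliefs, policies):
    """Return the actions whose conditions hold over the current beliefs."""
    return [action for name, cond, action in policies if cond(beliefs)]

print(decide(beliefs, policies))  # ['rerun_retrieval']
```

The point is auditability: the agent's next move is a pure function of state you can inspect, not a side effect buried in a prompt.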
---
## NOT WORTH YOUR TIME TODAY
Do not cover on air. These are listed so the host can acknowledge if asked.
- **Simon Willison's git timeline of Claude system prompts** -- Fun archaeology for prompt nerds, but Anthropic's diffs don't change what you ship Monday morning. Skip unless you write jailbreaks for a living.
- **arXiv paper on LLMs detecting methodological flaws in UAV rescue gesture recognition** -- Niche stacked on niche. Gesture recognition for drones, graded by an LLM. Cute benchmark, zero signal for anyone not writing that exact paper.
- **VeriGraphi multi-agent RTL generation framework from arXiv** -- Another hierarchical agent swarm writing Verilog. Wake me when someone tapes out silicon from it instead of posting a PDF.
---
## ACTION ITEMS FOR THIS WEEK (Joey only)
These are internal action items. Not for on-air narration.
- Upgrade llm-anthropic to 0.25, wire it to claude-opus-4.7, and run our three hardest eval prompts at thinking_effort xhigh before Friday. If xhigh doesn't beat high on reasoning tasks, we keep the cheaper dial.
- Rip out the custom agent sandbox and port our two longest-running agents to OpenAI Agents SDK v2 this week. Measure tokens, latency, and crash rate against the old harness before we celebrate.
- Drag our head of people into Thursday's agent planning review and give them veto power on any workflow that touches a human's job description. Semafor is right, this is not a next-quarter problem.
---
## MOJO TAKE -- Editorial Outro (Read Verbatim)
Three-paragraph outro. Read each block verbatim, with natural pauses between them.
> **Connect the dots:** Today's thread: the model layer got loud while the plumbing got serious. Simon Willison, Vercel, and OpenAI all bent around Claude Opus 4.7 and long-running agents. Cloudflare, Jensen Huang, and Credo said the quiet part out loud: sandboxes, supply chains, and declarative policies are the moat now. Semafor and Crunchbase just reminded you the org chart and the cap table are bending to match.
> **Watch next:** Watch whether OpenAI Agents SDK v2 sandboxes bleed into Cloudflare Agent Cloud by Friday, and whether a second lab matches Anthropic's xhigh thinking dial. If Jensen's supply chain pitch shows up in an Nvidia earnings note next week, the moat story is locked.
> **Sign-off:** Upgrade the SDK, write down your beliefs, and put your CHRO in the agent meeting. That's the job this week. Go drink your Mojo.
---
## APPENDIX -- VERBATIM SOURCE CONTENT
Reference material. Do not read verbatim. Do not pull jargon from here into the spoken script. If the rundown beat does not mention a term, do not introduce it on the podcast.
### Simon Willison ships llm-anthropic 0.25 with Claude Opus 4.7 and an xhigh thinking dial
**Source:** Simon Willison
**Link:** https://simonwillison.net/2026/Apr/16/llm-anthropic/#atom-everything
16th April 2026
- New model: `claude-opus-4.7`, which supports `thinking_effort`: `xhigh`. #66
- New `thinking_display` and `thinking_adaptive` boolean options. `thinking_display` summarized output is currently only available in JSON output or JSON logs.
- Increased default `max_tokens` to the maximum allowed for each model.
- No longer uses obsolete `structured-outputs-2025-11-13` beta header for older models.
### LLMOrbit maps 50-plus models across 15 orgs into one circular taxonomy of the LLM era
**Source:** arXiv cs.MA
**Link:** https://arxiv.org/abs/2601.14053
**Title:** LLMOrbit: A Circular Taxonomy of Large Language Models - From Scaling Walls to Agentic AI Systems
**Abstract:** The field of artificial intelligence has undergone a revolution from foundational Transformer architectures to reasoning-capable systems approaching human-level performance. We present LLMOrbit, a comprehensive circular taxonomy navigating the landscape of large language models spanning 2019-2025. This survey examines over 50 models across 15 organizations through eight interconnected orbital dimensions, documenting architectural innovations, training methodologies, and efficiency patterns defining modern LLMs, generative AI, and agentic systems. We identify three critical crises: (1) data scarcity (9-27T tokens depleted by 2026-2028), (2) exponential cost growth ($3M to $300M+ in 5 years), and (3) unsustainable energy consumption (22x increase), establishing the scaling wall limiting brute-force approaches. Our analysis reveals six paradigms breaking this wall: (1) test-time compute (o1, DeepSeek-R1 achieve GPT-4 performance with 10x inference compute), (2) quantization (4-8x compression), (3) distributed edge computing (10x cost reduction), (4) model merging, (5) efficient training (ORPO reduces memory 50%), and (6) small specialized models (Phi-4 14B matches larger models). Three paradigm shifts emerge: (1) post-training gains (RLHF, GRPO, pure RL contribute substantially, DeepSeek-R1 achieving 79.8% MATH), (2) efficiency revolution (MoE routing 18x efficiency, Multi-head Latent Attention 8x KV cache compression enables GPT-4-level performance at <$0.30/M tokens), and (3) democratization (open-source Llama 3 88.6% MMLU surpasses GPT-4 86.4%). We provide insights into techniques (RLHF, PPO, DPO, GRPO, ORPO), trace evolution from passive generation to tool-using agents (ReAct, RAG, multi-agent systems), and analyze post-training innovations.
### OpenAI ships Agents SDK v2 with native sandboxes and a model-native harness for long-running agents
**Source:** OpenAI Blog
**Link:** https://openai.com/index/the-next-evolution-of-the-agents-sdk
*RSS summary:* OpenAI updates the Agents SDK with native sandbox execution and a model-native harness, helping developers build secure, long-running agents across files and tools.
### Semafor reports CTOs and CHROs are now stuck in the same room thanks to AI agents
**Source:** Semafor
**Link:** https://www.semafor.com/article/04/17/2026/ai-is-making-chief-tech-officers-and-chief-human-resources-officers-work-together
The people who manage technology and the people who manage people are starting to converge.
As C-suite staff go, chief technology officers and chief human resources officers haven’t always overlapped much. But as more AI agents are employed in businesses, top executives are increasingly having to work together to manage the workforce. “You have this amazing power couple between the CHRO and the CTO, who provide the right tools and the right culture to people so they can bring to life our career architecture we have for all our employees,” Omar Abbosh, CEO of Pearson, said at Semafor World Economy.
Mihir Shukla, CEO of software firm Automation Anywhere, agreed, adding that together, the CTO and CHRO need to “map out what roles go away, what roles evolve, and what new roles entirely are created from this.” Despite the new overlap, he said those two roles are safe from AI-related trimmings.
### Vercel drops Claude Opus 4.7 into AI Gateway for long-running agents
**Source:** Vercel Blog
**Link:** https://vercel.com/changelog/opus-4.7-on-ai-gateway
2 min read
Claude Opus 4.7 from Anthropic is now available on Vercel AI Gateway.
Opus 4.7 is optimized for long-running, asynchronous agents and handles complex, multi-step tasks with reliable agentic execution. The model shows gains on knowledge-worker tasks, particularly where it needs to visually verify its own outputs.
Opus 4.7 is also stronger at programmatic tool-calling with image-processing libraries to analyze charts and figures, including pixel-level data transcription. It has high-resolution image support, which is useful for computer use, screenshot understanding, and document analysis workflows. Opus 4.7 now has improved memory, with agents that maintain structured memory store across turns seeing more reliable recall and fewer dropped facts without additional prompting.
To use Claude Opus 4.7, set the model to `anthropic/claude-opus-4.7` in the AI SDK. You can also try a new effort level: `'xhigh'`.

```typescript
import { streamText } from 'ai';

const result = streamText({
  model: 'anthropic/claude-opus-4.7',
  prompt: 'Explain the halting problem in one paragraph.',
  providerOptions: {
    anthropic: {
      thinking: { type: 'adaptive' },
      effort: 'xhigh',
    },
  },
});
```

Opus 4.7 also introduces the task budgets feature. Task budgets let you set a total token budget for an agentic turn via `taskBudget`. The model sees a countdown of remaining tokens, which it uses to prioritize work, plan ahead, and wind down gracefully as the budget is consumed. Thinking content is also now omitted by default for Opus 4.7. To receive thinking content, set `display` to `'summarized'`:

```typescript
import { streamText } from 'ai';

const result = streamText({
  model: 'anthropic/claude-opus-4.7',
  prompt: 'Research how this codebase handles authentication and suggest improvements.',
  providerOptions: {
    anthropic: {
      thinking: { type: 'adaptive', display: 'summarized' },
      effort: 'high',
      taskBudget: { type: 'tokens', total: 50000 },
    },
  },
});
```
AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, observability, Bring Your Own Key support, and intelligent provider routing with automatic retries.
Learn more about AI Gateway, view the AI Gateway model leaderboard or try the model in our model playground.
### Credo paper pitches declarative beliefs and policies to replace the prompt-spaghetti inside your agents
**Source:** arXiv cs.AI
**Link:** https://arxiv.org/abs/2604.14401
**Title:** Credo: Declarative Control of LLM Pipelines via Beliefs and Policies
**Abstract:** Agentic AI systems are becoming commonplace in domains that require long-lived, stateful decision-making in continuously evolving conditions. As such, correctness depends not only on the output of individual model calls, but also on how to best adapt when incorporating new evidence or revising prior conclusions. However, existing frameworks rely on imperative control loops, ephemeral memory, and prompt-embedded logic, making agent behavior opaque, brittle, and difficult to verify. This paper introduces Credo, which represents semantic state as beliefs and regulates behavior using declarative policies defined over these beliefs. This design supports adaptive, auditable, and composable execution through a database-backed semantic control plane. We showcase these concepts in a decision-control scenario, where beliefs and policies declaratively guide critical execution choices (e.g., model selection, retrieval, corrective re-execution), enabling dynamic behavior without requiring any changes to the underlying pipeline code.
### Cloudflare and OpenAI drop GPT-5.4 and Codex into Agent Cloud for enterprise rollouts
**Source:** OpenAI Blog
**Link:** https://openai.com/index/cloudflare-openai-agent-cloud
*RSS summary:* Cloudflare brings OpenAI’s GPT-5.4 and Codex to Agent Cloud, enabling enterprises to build, deploy, and scale AI agents for real-world tasks with speed and security.
### Crunchbase: a handful of US AI giants ate Q1 2026 venture funding while global deal count dropped
**Source:** Crunchbase News (AI)
**Link:** https://news.crunchbase.com/venture/capital-concentrated-ai-global-q1-2026/
Q1 2026 marked an all-time quarterly high for venture investment, thanks to the biggest funding deal ever for a private company. But those milestones mask a different reality for many startups on the ground: While more money than ever is being invested in the private markets, that’s thanks to larger checks, not more of them.
In fact, Crunchbase data shows the extent to which venture funding this year has been a case of more capital concentrated into a select few companies and a single industry. Last quarter, a handful of large, well-funded AI companies, almost all based in the U.S., captured the vast majority of venture dollars, even as global startup deal count fell.
AI takes 80% of global venture funding
AI startups for the first time captured half of global venture funding in Q4 2024. Since then, that percentage has hovered around 50% — until Q1 of this year, when OpenAI’s record-setting round, along with a small handful of other enormous deals, pushed AI’s share to 80% of the quarterly funding total.
Top 4 vs. everyone else
It wasn’t just that AI as an industry captured the lion’s share of venture funding last quarter. Just four companies took nearly two-thirds of the entire pie, Crunchbase data shows.
Four of the five largest venture rounds ever recorded were closed in Q1 2026, with frontier labs OpenAI ($122 billion), Anthropic ($30 billion), xAI ($20 billion) and self-driving company Waymo ($16 billion) collectively raising $188 billion, or nearly 65% of global venture investment in the quarter.
Deal count falls even as dollars surge
And while last quarter set an all-time record for venture dollars invested, more money went to fewer companies, continuing an overall downward trend for deal count we’ve seen since the beginning of 2021.
That was the case not just in North America, where dollars invested surged 190% year over year, even as deal count dropped 26%. It also held true in Europe and Latin America. Only Asia saw a modest 5% bump in deal count along with more dollars invested.
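The concentration claim checks out against the round sizes cited above, if you assume the roughly $300B quarterly total Crunchbase reported in its related coverage ($290B is used here as the assumed denominator):

```python
# Check the "top 4 took nearly 65%" claim against the cited round sizes.
# The $290B global quarterly total is an assumption based on Crunchbase's
# separately reported ~$300B Q1 figure.
top4 = {"OpenAI": 122, "Anthropic": 30, "xAI": 20, "Waymo": 16}  # $B

total_top4 = sum(top4.values())
share = total_top4 / 290  # assumed global total, in $B

print(total_top4)       # 188
print(round(share, 2))  # 0.65
```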
### STAT News warns voice-first chatbots could worsen AI's mental health problem
**Source:** STAT News (AI)
**Link:** https://www.statnews.com/2026/04/16/voice-chatbots-ai-psychosis-mental-health/?utm_campaign=rss
A Florida father recently sued Google after his son, Jonathan Gavalas, died by suicide following months of interaction with the company’s artificial intelligence chatbot Gemini. The case has rightly focused attention on how chatbots apparently reinforce delusions and foster emotional dependency.
Yet, there is a critical detail easy to dismiss. Jonathan Gavalas was not just typing to Gemini. He was talking to it using Gemini Live, Google’s voice-based conversational mode. That distinction matters far more than the current debate acknowledges.
Every week, around 800 million people interact with ChatGPT. According to OpenAI, roughly 0.07% of those weekly users show possible signs of psychosis or mania during their conversations, while 0.15% display indicators of suicidal planning or intent. Even if these figures are imprecise, they imply that hundreds of thousands of people worldwide who experience serious psychological distress interact with an AI chatbot.
Most of those numbers come from the era of text. The shift to voice has just begun, and it will likely make things worse.
Tech companies are racing to put AI chatbots in our ears. OpenAI is developing a dedicated voice-first device. Meta already offers smart glasses with built-in microphones and speakers that enable AI conversation. Apple supposedly plans to extend its AirPods for voice-based chatbot interaction. That makes the direction very clear: The primary way humans communicate with AI is moving from typing and reading to speaking and listening. For most users, this will feel like a convenience. For vulnerable people — those prone to psychosis, mania, depression, or loneliness — it may represent a serious and unexamined risk.
In a recent Acta Neuropsychiatrica editorial, psychiatrist Søren Østergaard and I outlined why that is the case. Voice is how humans first learn language. Long before a child reads a single word in school, their brain is already wired to process speech. They naturally respond to tone, rhythm, emphasis, and emotional inflection.
Text strips all of that away. When you read a chatbot’s response on a screen, there is an inherent distance because you are processing symbols, not hearing a humanlike voice. That distance creates natural cognitive barriers. You pause. You reread. You push back.
Voice removes those barriers. Speaking is significantly faster than typing, nearly three times as fast. It is more seamless and far more emotionally engaging. When an AI speaks to you, it activates something deeper and older than literacy.
This is not merely theoretical. A preprint of a randomized controlled study co-authored by OpenAI researchers found that people spent significantly more time interacting with voice-mode ChatGPT than with the text version, suggesting greater engagement. Voice initially appeared to boost certain positive outcomes, such as reduced loneliness. However, longer engagement with voice-based chatbots was linked to more negative psychosocial effects.
### Jensen Huang tells Dwarkesh Nvidia's trillion-dollar supply chain is the real moat, not the GPUs
**Source:** Dwarkesh Podcast
**Link:** https://www.dwarkesh.com/p/jensen-huang
I asked Jensen about TPU competition, Nvidia’s lock on the ever more bottlenecked supply chain needed to make advanced chips, whether we should be selling AI chips to China, why Nvidia doesn’t just become a hyperscaler, how it makes its investments, and much more. Enjoy!
Watch on YouTube; listen on Apple Podcasts or Spotify.
Sponsors
Crusoe’s cloud runs on state-of-the-art Blackwell GPUs, with Vera Rubin deployment scheduled for later this year. But hardware is only part of the story—for inference, Crusoe’s MemoryAlloy tech implements a cluster-wide KV cache, delivering up to 10x faster TTFT and 5x better throughput than vLLM. Learn more at crusoe.ai/dwarkesh
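The cluster-wide KV cache claim rests on standard transformer inference mechanics: cache each token's key/value projections once, so generating a new token attends over cached state instead of re-projecting the whole prefix. A minimal single-head sketch of that equivalence (all names and shapes are illustrative, not Crusoe's or vLLM's API):

```python
import numpy as np

def attention(q, K, V):
    # Single-head scaled dot-product attention for one query vector.
    scores = K @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d, steps = 8, 5
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
xs = rng.standard_normal((steps, d))  # toy token embeddings

# With a KV cache: project each new token once, append, attend over the cache.
K_cache, V_cache, cached_out = [], [], []
for x in xs:
    K_cache.append(Wk @ x)
    V_cache.append(Wv @ x)
    cached_out.append(attention(Wq @ x, np.array(K_cache), np.array(V_cache)))

# Without a cache: re-project the entire prefix at every step (the wasted work).
full_out = [attention(Wq @ xs[t], xs[:t + 1] @ Wk.T, xs[:t + 1] @ Wv.T)
            for t in range(steps)]

assert np.allclose(cached_out, full_out)  # identical outputs, far less recompute
```

Sharing that cache across a cluster, rather than per-GPU, is what lets repeated prefixes skip prefill entirely, which is where the time-to-first-token gains come from.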
Cursor helped me build an AI co-researcher over the course of a weekend. Now I have an AI agent that I can collaborate with in Google Docs via inline comment threads! And while other agentic coding tools feel like a total black-box, Cursor let me stay on top of the full implementation. You can try my co-researcher out here, or get started on your own Cursor project today at cursor.com/dwarkesh
Jane Street spent ~20,000 GPU hours training backdoors into 3 different language models, then challenged my audience to find the triggers. They received some clever solutions—like comparing the base and fine-tuned versions and extrapolating any differences to reveal the hidden backdoor—but no one was able to solve all 3. So if open problems like this excite you, Jane Street is hiring. Learn more at janestreet.com/dwarkesh
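The "compare base and fine-tuned versions and extrapolate the differences" tactic is essentially logit-diff amplification: a small backdoor shift that is invisible under normal decoding becomes obvious when you extrapolate past the fine-tune. A toy sketch with made-up logits (the numbers and the single-trigger assumption are ours, not Jane Street's):

```python
import numpy as np

# Hypothetical next-token logits from a base and a fine-tuned model on the
# same prompt. The fine-tune hides a backdoor: a slight boost to token 3.
base_logits = np.array([2.0, 1.0, 0.5, 0.2, 0.1])
ft_logits   = np.array([2.0, 1.0, 0.5, 0.9, 0.1])  # small, easy-to-miss bump

# Extrapolate past the fine-tune: amplified = base + alpha * (ft - base).
# A large alpha magnifies whatever the fine-tune changed until it dominates.
alpha = 8.0
amplified = base_logits + alpha * (ft_logits - base_logits)

print(np.argmax(ft_logits))   # 0: the backdoor is invisible to greedy decoding
print(np.argmax(amplified))   # 3: amplification surfaces the backdoored token
```

The real challenge is far harder (the trigger is an input pattern, not one logit bump), but the core move is the same: the diff between the two models is the signal.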
Timestamps
(00:00:00) – Is Nvidia’s biggest moat its grip on scarce supply chains?
(00:16:25) – Will TPUs break Nvidia’s hold on AI compute?
(00:41:06) – Why doesn’t Nvidia become a hyperscaler?
(00:57:36) – Should we be selling AI chips to China?
(01:35:06) – Why doesn’t Nvidia make multiple different chip architectures?
Transcript
00:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains?
Dwarkesh Patel
We’ve seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software. There’s a potentially naive way of thinking about things, which is: look, Nvidia sends a GDSII file to TSMC. TSMC builds the logic dies, it builds the switches, then it packages them with the HBM that SK Hynix, Micron, and Samsung make. Then it sends it to an ODM in Taiwan where they assemble the racks. Nvidia is fundamentally making software that other people are manufacturing, and if software gets commoditized, does Nvidia get commoditized?
Jensen Huang
In the end, something has to transform electrons to tokens. The transformation of electrons to tokens and making those tokens more valuable over time is hard to completely commoditize. The transformation from electrons to tokens is such an incredible journey. Making that token is like making one molecule more valuable than another molecule, making one token more valuable than another. The amount of artistry, engineering, science, and invention that goes into making that token valuable is what resists commoditization.