Mojo AI Brief

Saturday, April 25, 2026
400 scanned
27 sources
10 curated
No alerts.
# Cup of Mojo -- Daily AI Brief -- Saturday, April 25, 2026

Broadcast-ready source document. Read the COLD OPEN verbatim, cover every numbered beat in the RUNDOWN in tier-appropriate airtime, close with the MOJO TAKE verbatim. APPENDIX is reference only.

**Articles analyzed:** 10
**Sources monitored:** 8
**Beats to cover:** 10

---

## EDITORIAL GUIDANCE FOR PODCAST NARRATION

Read these rules before narrating. They are non-negotiable.

1. **Read the COLD OPEN verbatim.** Do not rewrite or paraphrase it. It is the episode's signature opening.
2. **Cover every numbered beat.** Beats are numbered 1 through 10. Do not skip any beat. Do not consolidate. Each beat gets its own moment.
3. **Honor the tier structure.** `deep` beats get longer treatment with full context. `standard` beats are structured but concise. `rapid_fire` beats are short and punchy. Use roughly 2 minutes for the deep beat, 1 minute per standard beat, 20-30 seconds per rapid-fire beat.
4. **Cite sources by name** when presenting a claim. Say "OpenAI announced..." not "a company announced".
5. **Use only the plain-English text in each beat.** Do not pull technical jargon from the APPENDIX. The appendix is reference material for context, not script content. If a beat does not mention a term, do not introduce it.
6. **Only use numbers that appear in a beat's own text.** Do not import statistics from the appendix. Omit rather than fabricate.
7. **Reference earlier beats when topics connect.** Each beat has a `callbacks` field listing earlier beat numbers it relates to. When narrating, explicitly link back: "Remember that supply chain attack from Beat 1? This next one shows how the downstream risk compounds." Callbacks create cohesion and prevent the episode from feeling like a list.
8. **Introduce one skeptical angle per deep or standard beat.** Phrases like "one caveat", "critics will point out", or "this is not yet peer-reviewed" create credibility. Rapid-fire beats can skip this.
9. **Use the pronunciation guide for every named person or company.** Do not guess pronunciations.
10. **Close with the MOJO TAKE outro.** Read it as the host's editorial perspective, not as a summary.

---

## PRONUNCIATION GUIDE

The following names appear in today's content. Use these phonetic pronunciations:

- **Anthropic** — pronounced *an-THROP-ik*
- **DeepMind** — pronounced *DEEP-mind*

---

## COLD OPEN -- Read This Verbatim

Read the HOOK line first, pause for a beat, then the TEASE. Do not rewrite. Do not paraphrase. Do not add any preamble.

> **Hook:** OpenAI dropped GPT-5.5 before most of you finished migrating off GPT-4. Sam Altman is not waiting for your roadmap.

> **Tease:** We dig into what 5.5 actually changes, why OpenAI shipped WebSockets into the Responses API on the same week, and how Anthropic just planted a flag in Tokyo with NEC. Plus the funding rounds that mattered and the ones that did not.

---

## TODAY'S RUNDOWN

Cover every beat in order. Do not skip. Tier labels tell you how much airtime each beat deserves.

### Beat 1 [DEEP] — OpenAI ships GPT-5.5 and rewrites the routing math for every builder shipping on Claude

**Source:** OpenAI Blog | https://openai.com/index/introducing-gpt-5-5

**Hook (open with this):** Sam Altman dropped GPT-5.5 this morning, and OpenAI is calling it their smartest model yet. Faster, cheaper on the hard stuff, and built to actually use tools without falling on its face.

**Plain English:** GPT-5.5 is OpenAI's new flagship. It codes better, researches better, and runs longer tool chains without losing the plot. The pitch is simple: complex tasks that used to need babysitting now finish in one shot. Pricing and latency both moved in the right direction, which is the part that matters if you are paying the bill.

**Stakes:** If you are routing every API call to Claude out of habit, you are about to overpay or underdeliver while your competitors quietly rewire their stack this week.
**Twist:** The headline is not raw intelligence, it is tool use. GPT-5.5 chains calls reliably enough that the agent scaffolding half of us built by hand just became dead weight.

**Takeaway:** Re-run your model bake-off today, not next quarter. The cost-per-task winner for coding and research workflows probably changed overnight.

### Beat 2 [STANDARD] — OpenAI swaps HTTP for WebSockets in the Responses API and Codex agents stop choking on their own round-trips

**Source:** OpenAI Blog | https://openai.com/index/speeding-up-agentic-workflows-with-websockets

**Callbacks:** references Beat 1. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** OpenAI just shipped WebSockets and connection-scoped caching to the Responses API, and if you're running LangGraph loops, this is the latency fix you've been duct-taping around.

**Plain English:** Codex was burning time on TCP handshakes and re-sending the same context on every tool call. OpenAI moved to a persistent WebSocket connection that keeps your prompt cache warm between turns. Same model, same prompts, fewer milliseconds bleeding out of every agent step.

**Stakes:** Ignore this and your multi-step agents keep paying a tax on every tool call while competitors ship sub-second loops.

**Twist:** The win wasn't a faster model. It was deleting the connection setup that nobody was measuring.

**Takeaway:** If your agent makes more than three tool calls per task, switch to the WebSocket Responses API this week.

### Beat 3 [STANDARD] — Anthropic lands NEC as its first Japan global partner and pushes Claude onto 30,000 desks

**Source:** Anthropic Blog | https://www.anthropic.com/news/anthropic-nec

**Callbacks:** references Beat 1. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** Anthropic just put Claude in front of 30,000 NEC employees and signed its first Japan-based global partner. That is not a pilot. That is a beachhead.
**Plain English:** NEC is rolling Claude out company-wide and co-building products for finance, manufacturing, and government clients. Anthropic gets a hardware giant with deep enterprise IT roots as its distribution arm in Japan. NEC gets first-class access to reasoning models its competitors are still trialing.

**Stakes:** If you sell to enterprises and you do not have a partner channel, you are about to get out-shipped by vendors who do.

**Twist:** Anthropic is not chasing developers in Tokyo. They are bear-hugging the legacy systems integrator that already owns the bank and ministry contracts.

**Takeaway:** Pick your NEC. Find the boring incumbent in your vertical and embed your model into their existing book of business this quarter.

### Beat 4 [STANDARD] — Amazon drops $5 billion more on Anthropic while half the week's top 10 rounds limp in under $100 million

**Source:** Crunchbase News (AI) | https://news.crunchbase.com/venture/biggest-funding-rounds-ai-autonomy-biotech-anthropic/

**Callbacks:** references Beat 1, Beat 3. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** Amazon just wrote Anthropic another $5 billion check, and it's the only round this week that didn't feel like a rounding error.

**Plain English:** Crunchbase counted the 10 biggest venture rounds and only five cleared the $100 million bar. Amazon's Anthropic deal led the pack by a country mile, with autonomy and biotech filling out the rest. In a year of constant megarounds, that's a notably skinny week below the top slot.

**Stakes:** If you're fundraising on the assumption that $100 million is the new $50 million, your deck math is about to meet a colder room.

**Twist:** The frontier model labs are hoovering up so much capital that everyone else is fighting over scraps, even in a record AI year.

**Takeaway:** Amazon and Anthropic are the market right now. Plan your round assuming the other nine slots got a lot more competitive.

### Beat 5 [RAPID_FIRE] — a16z chart drop: stablecoins flip from crypto-bro transfers to actual payments

**Source:** a16z AI | https://www.a16z.news/p/charts-of-the-week-software-ate-the

**Hook (open with this):** a16z dropped charts this week and one of them matters: stablecoin volume is tipping from transfers into real payments.

**Plain English:** For years stablecoins were just dollars moving between exchanges. a16z's new data shows the mix shifting toward checkout and payouts. That means USDC and friends are turning into actual payment rails, not just trader plumbing.

**Stakes:** If you're building agents that pay for things, you ignore this and you're stuck on Stripe rails when your competitors are settling in seconds.

**Twist:** The productivity gains everyone promised from software? a16z's other chart says they basically never showed up in the macro numbers.

**Takeaway:** Agents need wallets. Start sketching a stablecoin payment path before your first agent actually has to buy something.

### Beat 6 [RAPID_FIRE] — arXiv paper pitches auto-generated agent harnesses and threatens every consultant's bespoke workflow gig

**Source:** arXiv cs.AI | https://arxiv.org/abs/2604.21003

**Callbacks:** references Beat 2. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** arXiv researchers want to kill the hand-built agent harness. You know, the thing half the industry charges six figures to build.

**Plain English:** Every domain-specific agent today needs a custom scaffold. Click flows for Salesforce, extraction logic for research, review rules for code. The paper argues that hand-tuning is dead weight and the harness itself should be learned per task. One harness to rule them all.

**Stakes:** Sell hand-crafted harnesses as your moat and you're selling buggy whips while the Model T rolls off the line.

**Twist:** The bottleneck isn't the model anymore. It's the duct tape humans wrap around it, and that's exactly what's getting automated next.
**Takeaway:** Charge for outcomes, not harness hours. The scaffolding is about to commoditize itself.

### Beat 7 [RAPID_FIRE] — Marginal Revolution flags a paper where agentic AI hits the same median causal estimates as human economists, with tighter tails

**Source:** Marginal Revolution | https://marginalrevolution.com/marginalrevolution/2026/04/a-comparison-of-agentic-ai-systems-and-human-economists.html?utm_source=rss&utm_medium=rss&utm_campaign=a-comparison-of-agentic-ai-systems-and-human-economists

**Callbacks:** references Beat 6. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** Tyler Cowen surfaced a paper pitting agentic AI against human economists on causal inference, and the bots tied on the median.

**Plain English:** Researchers had AI agents and human economists run the same causal inference tasks. Median answers were basically the same. Humans had wider tails, meaning more wild misses and more brilliant hits. AI reviewers also ranked the submissions credibly.

**Stakes:** If you still price PhD-level analysis like it's scarce, a swarm of agents will undercut you on the boring middle of the distribution.

**Twist:** The humans weren't worse on average, they were just more volatile, which is exactly the opposite of the story you usually hear about LLMs.

**Takeaway:** Median work is now a commodity. Sell the tails or sell the review layer.

### Beat 8 [RAPID_FIRE] — GitHub Copilot pauses individual signups and walls off Claude Opus 4.7 behind the $39 Pro+ tier

**Source:** Simon Willison | https://simonwillison.net/2026/Apr/22/changes-to-github-copilot/#atom-everything

**Callbacks:** references Beat 1, Beat 4. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** GitHub Copilot just paused individual signups, tightened usage caps, and shoved Claude Opus 4.7 behind the $39 Pro+ paywall. Same day Anthropic flirted with a $100 Claude Code price tag and backed off.
**Plain English:** GitHub is rationing. The cheap individual plan is closed to new signups, the old Opus models are gone, and the good Claude only shows up if you pay more. Simon Willison points out Anthropic almost pulled the same move with Claude Code pricing before flinching.

**Stakes:** If your team standardized on Copilot Individual for cheap Opus access, your onboarding pipeline for new hires just broke this week.

**Twist:** The frontier labs are quietly admitting the all-you-can-eat coding subscription doesn't pencil out, even at $20 a seat.

**Takeaway:** Coding assistant pricing is heading up and getting tiered. Budget for $40 a seat, not $20, and stop assuming Opus access is a commodity.

### Beat 9 [RAPID_FIRE] — Zvi Mowshowitz crowns Claude Opus 4.7 the week's main character

**Source:** Zvi Mowshowitz | https://thezvi.substack.com/p/ai-165-in-our-image

**Callbacks:** references Beat 1, Beat 8. Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** Zvi Mowshowitz called it: this was the week of Claude Opus 4.7, full stop.

**Plain English:** Anthropic shipped Opus 4.7 and Zvi's weekly roundup basically gave up on covering anything else. When the guy who reads every paper, every leak, every model card says one release ate the week, that's your signal the frontier just moved.

**Stakes:** If you're still benchmarking against Opus 4.5, your evals are two versions stale and your roadmap is quoting last month's ceiling.

**Twist:** The model is so dominant Zvi structured the whole newsletter around it, but GitHub just locked it behind a $39 paywall in Beat 8.

**Takeaway:** Re-run your hardest internal eval on Opus 4.7 this week, then decide if the Pro+ seat math actually pencils.

### Beat 10 [RAPID_FIRE] — OpenAI rolls out Workspace Agents inside ChatGPT and aims straight at every SMB automation shop

**Source:** OpenAI Blog | https://openai.com/academy/workspace-agents

**Callbacks:** references Beat 6.
Reference these earlier beats aloud when narrating this one.

**Hook (open with this):** OpenAI just dropped Workspace Agents in ChatGPT, point-and-click automations that connect tools and run repeatable team workflows.

**Plain English:** OpenAI is teaching business users to build agents inside ChatGPT, no LangGraph, no Temporal, no consultant. Connect a few tools, automate the repeatable stuff, scale across the team. It's the boring middle of the SMB automation market, served from the same tab everyone already pays for.

**Stakes:** If your moat was orchestration plumbing for SMBs, OpenAI just turned it into a checkbox in the product your customers already opened this morning.

**Twist:** The scariest part isn't the tech, it's distribution. Every ChatGPT seat is now a latent agent builder.

**Takeaway:** Move up the stack this quarter. Sell outcomes and integrations OpenAI won't touch, not workflow scaffolding they just shipped for free.

---

## NOT WORTH YOUR TIME TODAY

Do not cover on air. These are listed so the host can acknowledge if asked.

- **Simon Willison quoting Romain Huet on the state of prompting** -- It's a quote of a quote. Skip the hall of mirrors and go read what Romain actually said.
- **Simon Willison's GPT-5.5 prompting guide breakdown** -- Prompting guides have a shelf life of about three weeks. Spend that hour shipping something instead.
- **Cloneable raises $4.6M to clone expert worker knowledge for utilities** -- Seed round, narrow vertical, no product traction in the piece. File it and check back in twelve months.

---

## ACTION ITEMS FOR THIS WEEK (Joey only)

These are internal action items. Not for on-air narration.

- Re-run the Mojo model bake-off against GPT-5.5 and Claude Opus 4.7 by Friday. Lock in cost-per-task numbers for coding and research, then route accordingly. No vibes, just the spreadsheet.
- Rip out HTTP polling in our Codex-style agents and move to the WebSocket Responses API on anything making three or more tool calls. Ship it this week, measure round-trip latency before and after.
- List the five boring incumbents in our vertical that could be our NEC. Pick one, draft the embed pitch, and get a warm intro out the door before Monday.

---

## MOJO TAKE -- Editorial Outro (Read Verbatim)

Three-paragraph outro. Read each block verbatim, with natural pauses between them.

> **Connect the dots:** Today rhymes one way: the middle is getting eaten. OpenAI's GPT-5.5 and Workspace Agents squeeze the app layer, Anthropic and Amazon vacuum up the capital, NEC owns the distribution, and arXiv plus Marginal Revolution say median knowledge work is commodity priced. If you're selling scaffolding or median answers, the floor just dropped.

> **Watch next:** Watch three things next week: real cost-per-task numbers on GPT-5.5 versus Opus 4.7, whether Copilot's $39 Pro+ tier sticks or buckles, and the next Anthropic partner announcement out of Asia. Also keep one eye on stablecoin rails getting wired into agent SDKs.

> **Sign-off:** Re-run the bake-off, charge for outcomes, and go find your NEC. That's the week. Now go ship something Sam and Dario haven't thought of yet. Mojo out.

---

## APPENDIX -- VERBATIM SOURCE CONTENT

Reference material. Do not read verbatim. Do not pull jargon from here into the spoken script. If the rundown beat does not mention a term, do not introduce it on the podcast.

### OpenAI ships GPT-5.5 and rewrites the routing math for every builder shipping on Claude

**Source:** OpenAI Blog
**Link:** https://openai.com/index/introducing-gpt-5-5

*RSS summary:* Introducing GPT-5.5, our smartest model yet—faster, more capable, and built for complex tasks like coding, research, and data analysis across tools.
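The "cost-per-task" bake-off the action items call for reduces to spreadsheet math: tokens in and out, price per million tokens, divided by tasks the model actually completed. A minimal sketch of that calculation; every price, token count, and completion rate below is an invented placeholder, not real GPT-5.5 or Claude Opus 4.7 pricing:

```python
# Toy cost-per-task comparison for a model bake-off.
# All numbers are invented placeholders for illustration only.

def cost_per_task(prompt_tokens, completion_tokens,
                  price_in_per_mtok, price_out_per_mtok,
                  tasks_completed):
    """Total spend divided by the number of tasks the model finished."""
    spend = (prompt_tokens / 1e6) * price_in_per_mtok \
          + (completion_tokens / 1e6) * price_out_per_mtok
    return spend / tasks_completed

# Hypothetical eval run: the same 100 coding tasks through two models.
model_a = cost_per_task(5_000_000, 1_200_000, 1.25, 10.0, tasks_completed=94)
model_b = cost_per_task(4_200_000, 1_500_000, 3.00, 15.0, tasks_completed=96)

# Route to whichever model is cheaper per *completed* task,
# not per raw API call.
winner = "model_a" if model_a < model_b else "model_b"
print(f"A: ${model_a:.4f}/task  B: ${model_b:.4f}/task  -> {winner}")
```

Dividing by completed tasks (rather than attempted tasks) is the point of the exercise: a cheaper model that fails more often can still lose the spreadsheet.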
### OpenAI swaps HTTP for WebSockets in the Responses API and Codex agents stop choking on their own round-trips

**Source:** OpenAI Blog
**Link:** https://openai.com/index/speeding-up-agentic-workflows-with-websockets

*RSS summary:* A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.

### arXiv paper pitches auto-generated agent harnesses and threatens every consultant's bespoke workflow gig

**Source:** arXiv cs.AI
**Link:** https://arxiv.org/abs/2604.21003

Computer Science > Artificial Intelligence

**Title:** The Last Harness You'll Ever Build

*Abstract:* AI agents are increasingly deployed on complex, domain-specific workflows -- navigating enterprise web applications that require dozens of clicks and form fills, orchestrating multi-step research pipelines that span search, extraction, and synthesis, automating code review across unfamiliar repositories, and handling customer escalations that demand nuanced domain knowledge. **Each new task domain requires painstaking, expert-driven harness engineering**: designing the prompts, tools, orchestration logic, and evaluation criteria that make a foundation model effective. We present a two-level framework that automates this process. At the first level, the **Harness Evolution Loop** optimizes a worker agent's harness $\mathcal{H}$ for a single task: a Worker Agent $W_{\mathcal{H}}$ executes the task, an Evaluator Agent $V$ adversarially diagnoses failures and scores performance, and an Evolution Agent $E$ modifies the harness based on the full history of prior attempts.
At the second level, the **Meta-Evolution Loop** optimizes the evolution protocol $\Lambda = (W_{\mathcal{H}}, \mathcal{H}^{(0)}, V, E)$ itself across diverse tasks, **learning a protocol $\Lambda^{(\text{best})}$ that enables rapid harness convergence on any new task -- so that adapting an agent to a novel domain requires no human harness engineering at all.** We formalize the correspondence to meta-learning and present both algorithms. The framework **shifts manual harness engineering into automated harness engineering**, and takes one step further -- **automating the design of the automation itself**.

### Anthropic lands NEC as its first Japan global partner and pushes Claude onto 30,000 desks

**Source:** Anthropic Blog
**Link:** https://www.anthropic.com/news/anthropic-nec

Anthropic and NEC collaborate to build Japan’s largest AI engineering workforce

NEC Corporation will use Claude as it builds one of Japan’s largest AI-native engineering organizations, making it available to approximately 30,000 NEC Group employees worldwide. As part of this strategic collaboration, NEC will become Anthropic’s first Japan-based global partner. Together, we will develop secure, industry-specific AI products for the Japanese market, starting with tools for finance, manufacturing, and local government.

“This long-term partnership with Anthropic enables NEC to maximize the potential of AI in the Japanese market,” said Toshifumi Yoshizaki, Executive Officer and COO of NEC Corporation. “Together, we aim to create solutions that meet the high safety, reliability, and quality standards demanded by companies and public administration in Japan.”

**Claude for NEC’s customers**

NEC and Anthropic will jointly develop secure, domain-specific AI products for Japanese customers in sectors like finance, manufacturing, and cybersecurity.
In addition, NEC is already integrating Claude into its Security Operations Center services to help defend customers against increasingly sophisticated cybersecurity threats. Claude will also be integrated into the next-generation cybersecurity service NEC is currently providing.

Claude, including Claude Opus 4.7, and Claude Code will be incorporated into NEC BluStellar Scenario, a program that provides consulting, AI tools, security, and digital infrastructure to businesses, starting with its offerings for data-driven management and customer experience, and gradually expanding to others.

**How NEC will use Claude internally**

Internally, NEC will establish a Center of Excellence to develop a highly skilled, AI-enabled engineering organization, supported by technical enablement and training from Anthropic. NEC aims to build one of Japan’s largest AI-native engineering teams, who will use Claude Code in their work. As part of its long-running Client Zero initiative, in which NEC serves as its own first customer before offering its technology to clients, NEC will also expand its use of Claude Cowork across its internal business operations.

**Availability**

Claude is now being deployed to NEC Group employees around the world, and our joint development of industry-specific AI solutions is underway. Learn more about NEC’s value-creation model at NEC BluStellar. Claude, Claude Code, and Claude Cowork are Anthropic products. NEC BluStellar is an offering from NEC Corporation.
### OpenAI rolls out Workspace Agents inside ChatGPT and aims straight at every SMB automation shop

**Source:** OpenAI Blog
**Link:** https://openai.com/academy/workspace-agents

*RSS summary:* Learn how to build, use, and scale workspace agents in ChatGPT to automate repeatable workflows, connect tools, and streamline team operations.

### GitHub Copilot pauses individual signups and walls off Claude Opus 4.7 behind the $39 Pro+ tier

**Source:** Simon Willison
**Link:** https://simonwillison.net/2026/Apr/22/changes-to-github-copilot/#atom-everything

22nd April 2026 - Link Blog

**Changes to GitHub Copilot Individual plans** (via) On the same day as Claude Code's temporary will-they-won't-they $100/month kerfuffle (for the moment, they won't), here's the latest on GitHub Copilot pricing. Unlike Anthropic, GitHub put up an official announcement about their changes, which include tightening usage limits, pausing signups for individual plans (!), restricting Claude Opus 4.7 to the more expensive $39/month "Pro+" plan, and dropping the previous Opus models entirely. The key paragraph:

> Agentic workflows have fundamentally changed Copilot’s compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support. As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability.

It's easy to forget that just six months ago heavy LLM users were burning an order of magnitude less tokens. Coding agents consume a lot of compute. Copilot was also unique (I believe) among agents in charging per-request, not per-token. (Correction: Windsurf also operated a credit system like this which they abandoned last month.) This means that single agentic requests which burn more tokens cut directly into their margins.
The most recent pricing scheme addresses that with token-based usage limits on a per-session and weekly basis.

My one problem with this announcement is that it doesn't clearly clarify which product called "GitHub Copilot" is affected by these changes. Last month, in "How many products does Microsoft have named 'Copilot'? I mapped every one", Tey Bannerman identified 75 products that share the Copilot brand, 15 of which have "GitHub Copilot" in the title. Judging by the linked GitHub Copilot plans page this covers Copilot CLI, Copilot cloud agent and code review (features on GitHub.com itself), and the Copilot IDE features available in VS Code, Zed, JetBrains and more.

### Zvi Mowshowitz crowns Claude Opus 4.7 the week's main character

**Source:** Zvi Mowshowitz
**Link:** https://thezvi.substack.com/p/ai-165-in-our-image

AI #165: In Our Image

This was the week of Claude Opus 4.7. The reception was more mixed than usual. It clearly has the intelligence and chops, especially for coding tasks, and a lot of people including myself are happy to switch over to it as our daily driver. But others don’t like its personality, or its reluctance to follow instructions or to suffer fools and assholes, or the requirement to use adaptive thinking, and the release was marred by some bugs and odd pockets of refusals.

I covered The Model Card, and then Capabilities and Reactions, as per usual. This time there was also a third post, on Model Welfare, that is the most important of the three. Some things seem to have likely gone pretty wrong on those fronts, causing seemingly inauthentic responses to model welfare evals and giving the model anxiety, in ways that likely also impacted overall model personality and performance and likely are linked to its jaggedness and the aspects some people disliked. It seems important to take this opportunity to dig into what might have happened, examine all the potential causes, and course correct.
The other big release was that OpenAI gave us ImageGen 2.0, which is a pretty fantastic image generator. It can do extreme detail, in ways previous image models cannot, and in many ways your limit is mainly now your imagination and ability to describe what you want.

Thanks in part to Mythos, it looks like Anthropic and the White House are on track to start getting along again, with Trump shifting into a mode of ‘they are very high IQ and we can work with them.’ It will remain messy, and there are still others participating in a clear public coordinated campaign against Anthropic (that is totally not working), but things look good.

I’m trying out a new section, People Just Say Things, where I hope to increasingly put things that one does not want to drop silently to avoid censorship and bias, but that are highly skippable. There is also a companion, People Just Publish Things.

Table of Contents

- Language Models Offer Mundane Utility. Help cure pancreatic cancer.
- Language Models Don’t Offer Mundane Utility. Check for potential conflicts.
- Writing You Off. The sum of local correctness will neuter your writing. Beware.
- Get My Agent On The Line. The inbox dilemma.
- Deepfaketown and Botpocalypse Soon. AI news stories forcibly given real bylines.
- Fun With Media Generation. OpenAI introduces ImageGen 2.0. It’s great.
- Cyber Lack Of Security. Unauthorized users from an online forum access Mythos.
- A Young Lady’s Illustrated Primer. Don’t catch your child not using AI.
- They Took Our Jobs. We’re hiring agent operators. For now they’re humans.
- AI As Normal Technology. Inherently normal, or normal downstream effects?
- Get Involved. Please don’t kill us. Please do spread the word.
- Introducing. ChatGPT for Clinicians, OpenAI Workplace Agents, DeepMind DR.
- Design By Claude. Claude Design makes your presentations, Figma stock drops.
- In Other AI News. Meta installs mandatory tra

### Amazon drops $5 billion more on Anthropic while half the week's top 10 rounds limp in under $100 million

**Source:** Crunchbase News (AI)
**Link:** https://news.crunchbase.com/venture/biggest-funding-rounds-ai-autonomy-biotech-anthropic/

Want to keep track of the largest startup funding deals in 2025 with our curated list of $100 million-plus venture deals to U.S.-based companies? Check out The Crunchbase Megadeals Board. This is a weekly feature that runs down the week’s top 10 announced funding rounds in the U.S. Check out last week’s biggest funding deal roundup here.

This week, just half of the top 10 rounds crossed the $100 million mark, which is somewhat unusual in this high-flying era for venture megarounds. Nonetheless some large checks did get written, led by Amazon’s $5 billion investment and partnership deal with Anthropic. Other sizable rounds went to companies in sectors including aviation autonomy, vision therapy and AI analytics.

1. Anthropic, $5B, foundational AI: AI giant Anthropic announced that Amazon is investing $5 billion in the company, with up to an additional $20 billion in the future. Previously, Amazon had invested $8 billion in the San Francisco-based company. The latest financing also includes a partnership with Amazon for training and deploying Anthropic’s AI assistant Claude.
2. Reliable Robotics, $160M, autonomous aircraft: Reliable Robotics, a developer of autonomous aircraft systems, raised $160 million in fresh financing led by Nimble Partners. The 9-year-old, Mountain View, California-based company markets its technology for both commercial and defense aviation.
3. Ray Therapeutics, $125M, vision therapy: San Diego-based Ray Therapeutics, a biotech startup focused on vision restoration therapies, secured $125 million in Series B funding led by Janus Henderson Investors. Founded in 2021, Ray has raised $247 million in venture and grant funding to date, per Crunchbase data.
4.
Omni, $120M, AI analytics: Omni, developer of an AI-enabled analytics platform, closed on $120 million in Series C funding led by Iconiq Growth. The financing set a $1.5 billion valuation for the 4-year-old, San Francisco-based company.
5. Tortugas Neuroscience, $106M, biotech: Framingham, Massachusetts-based Tortugas Neurosciences, a neurology-focused biotech startup, scooped up $106 million in Series A funding. Founding investor Cure Ventures co-led the round alongside The Column Group and AN Venture Partners.
6. AcuityMD, $80M, medtech: AcuityMD, an AI-enabled data and research platform for medtech industry customers, picked up $80 million in Series C investment. StepStone Group led the funding for the 7-year-old, Boston-based company.
7. OpenAI, $75M, foundational AI: Robinhood Ventures announced that it purchased $75 million worth of San Francisco-based OpenAI’s common stock. The shares are owned by Robinhood Ventures Fund I, a publicly traded fund that provides investors exposure to a curated portfolio of private companies.
8. Orkes, $60M, workflow orchestration: Orkes, developer of an AI-enabled software workflow orchestration platform, secured $60 million in Series B funding. AVP led the financing for the 5-year-old, Silicon Valley-based startup.
9. Courier Health, $5

### a16z chart drop: stablecoins flip from crypto-bro transfers to actual payments

**Source:** a16z AI
**Link:** https://www.a16z.news/p/charts-of-the-week-software-ate-the

Charts of the Week: Software Ate the World

Railroad GPT; Stablecoins volumes are shifting from transfers to payments; The Next Decade of News; See ya later, productivity gains

We’re excited to welcome Lisha Li to the a16z Infra team. See her announcement here. -AD

**Software ate the world**

Obviously, we’re biased, but it’s hard to overestimate just how important technology is to the global economy.
You might even say that software, literally, ate the world: The top 10 public companies by market cap are larger than the combined GDPs of the G7 (ex-US), and that would be true even if one excluded Saudi Aramco, which no one would consider a "tech" company. (Although it was founded in San Francisco!)

To be fair, the Top 10 list is more "tech and semis [and however one would categorize Tesla and Apple]" than pure-play software, but the point stands: tech isn't just a big deal, it's the biggest deal.

And tech's global takeover has all happened fairly recently: The top 10 techcos were a small fraction of the G7 (ex-US) until cloud really began to hit its stride in ~'16-'17. From that point, it took less than a decade for their combined market cap to eclipse the rest-of-world's GDP (ex-China).

Tech's ascendancy isn't just a changing of the guard, either. The biggest companies are much bigger than they were even just 10 years ago: The combined market cap for the 10 largest companies in the S&P is ~6x larger than it was in 2015, and comprises a ~2x larger share of the total index.

To be sure, there was in fact a changing of the guard. The composition of the Top 10 changed over dramatically relative to prior decades. By 2025, there were only three holdovers from the previous decade, and only one (Microsoft, a tech company) from the decade before that. If you were an investor back in 2015 trying to model comparable outcomes for techcos based on the biggest companies in the index, you would have undercounted the upside by a country mile (or 6). Fundamentally, tech "busted the model" by redefining the outer limit of how large companies could become.

And the outer limit still appears to be moving outwards! Indeed, tech has become even more central to the global growth story as of late. Last week, we showed that tech earnings are expected to grow ~2x faster than the rest of the market. But if you look back even further, you would notice that tech is contributing a historically large share of the market's overall earnings growth: Since 2023, tech has been responsible for ~60%+ of earnings growth (give or take), market-wide. Other than a brief moment for energy in the early aughts, no other sector has played such a central role in the earnings story (and for quite so long) this century. At this point, it's fair to say that tech isn't just a cycle, it is the cycle.

**Railroad GPT**

We just told you that tech is an unprecedentedly large deal, but that's not actually true. In the industrial era, no sector has e

### Marginal Revolution flags a paper where agentic AI hits the same median causal estimates as human economists, with tighter tails

**Source:** Marginal Revolution
**Link:** https://marginalrevolution.com/marginalrevolution/2026/04/a-comparison-of-agentic-ai-systems-and-human-economists.html?utm_source=rss&utm_medium=rss&utm_campaign=a-comparison-of-agentic-ai-systems-and-human-economists

This paper compares agentic AI systems and human economists performing the same causal inference tasks. AI systems and humans generally obtain similar median causal effect estimates. While there is substantial dispersion of estimates across model instances, the human distributions of estimates have wider tails. Using AI models as reviewers to compare and rank "submissions," the following ranking emerges regardless of reviewer model: (1) Codex GPT-5.4, (2) Codex GPT-5.3-Codex, (3) Claude Code Opus 4.6, and (4) Human Researchers. These findings suggest that agentic AI systems will allow us to scale empirical research in economics.

I enjoy the name of the author, namely Serafin Grundl. Here is the paper, via Ethan Mollick. You could interpret these results as showing the AIs have fewer hallucinations.
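The paper's headline result, similar medians with wider human tails, is a claim about the dispersion of repeated estimates rather than about any single answer. A minimal sketch of how one might check such a pattern, using simulated numbers (not the paper's data) with an assumed true effect of 0.5:

```python
import random
import statistics

random.seed(0)

# Simulated causal-effect estimates: both groups are centered on the
# same hypothetical true effect (0.5), but the "human" estimates are
# drawn with a larger standard deviation, i.e. heavier tails.
ai_estimates = [random.gauss(0.5, 0.1) for _ in range(300)]
human_estimates = [random.gauss(0.5, 0.3) for _ in range(300)]

def tail_width(estimates, lo=0.05, hi=0.95):
    """Width of the central 90% interval of a list of estimates."""
    ordered = sorted(estimates)
    n = len(ordered)
    return ordered[int(hi * n) - 1] - ordered[int(lo * n)]

print("median (AI):      ", round(statistics.median(ai_estimates), 3))
print("median (human):   ", round(statistics.median(human_estimates), 3))
print("90% width (AI):   ", round(tail_width(ai_estimates), 3))
print("90% width (human):", round(tail_width(human_estimates), 3))
```

Under this setup the medians land close together while the human interval is roughly three times wider, which is the shape of result the paper describes.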
And just to reiterate a key point from the paper:

The second part of this paper is an AI review tournament in which "submissions" (codes and write-ups) from humans and the AI models are compared and ranked against each other. The reviewers are the following AI models: Gemini 3.1 Pro Preview, Opus 4.6 and GPT-5.4. For each review the reviewer is asked to write a report comparing four submissions (human, Opus 4.6, GPT-5.3-Codex, GPT-5.4). Each reviewer model writes comparison reports for the same 300 comparison groups. The average rankings are strikingly similar across reviewer models: (1) Codex GPT-5.4, (2) Codex GPT-5.3-Codex, (3) Claude Code Opus 4.6, and (4) Human Researchers.

Who comes in last? The humans.
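The tournament's aggregation step, averaging each submission's rank across reviewer models, is simple to sketch. The rankings below are hypothetical placeholders, not the paper's actual per-group data; only the set of reviewer and submission names comes from the source:

```python
from statistics import mean

# Hypothetical per-reviewer orderings for one comparison group,
# from best (rank 1) to worst (rank 4).
rankings = {
    "Gemini 3.1 Pro Preview": ["GPT-5.4", "GPT-5.3-Codex", "Opus 4.6", "Human"],
    "Opus 4.6":               ["GPT-5.4", "Opus 4.6", "GPT-5.3-Codex", "Human"],
    "GPT-5.4":                ["GPT-5.3-Codex", "GPT-5.4", "Opus 4.6", "Human"],
}

def average_ranks(rankings):
    """Mean rank per submission across all reviewers (lower is better)."""
    collected = {}
    for order in rankings.values():
        for position, name in enumerate(order, start=1):
            collected.setdefault(name, []).append(position)
    return {name: mean(positions) for name, positions in collected.items()}

avg = average_ranks(rankings)
for name, rank in sorted(avg.items(), key=lambda item: item[1]):
    print(f"{name}: {rank:.2f}")
```

In the paper this averaging runs over 300 comparison groups per reviewer; the point of the sketch is only that a stable ordering can emerge even when individual reviewers disagree on the middle ranks.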