Saturday, May 31, 2025

The Tools Caught Up

ai-hype-to-skepticism · ai-as-creative-material · expert-adaptation-problem · design-led-ai · bias-for-action · vertical-ai-quietly-wins · constraint-before-craft

Pulled from a sparse five-week stretch of inbox. May 2025 was not a heavy news month, which is exactly what made it useful. The newsletters that showed up were doing real thinking rather than reacting, and a small set of arcs ran cleanly across the whole month. The hype-to-skepticism ratio on AI finally inverted in week one and stayed inverted. By the last week, three writers were independently arguing that the expert's adaptation problem, not model capability, is the actual story of the year. Microsoft cut 7,000 people in week four, OpenAI bought Jony Ive's startup the same week, Google I/O landed in the same stretch, and through all of it the most interesting writing came from individual builders shipping side projects and finance writers reading footnotes nobody else was reading.

The Month in One Sentence

This was the month the AI conversation stopped grading on a curve, the tools stopped being the story, and the operators who admitted what was actually changing in their workflows became the only writers worth reading on the subject.

Arc: Hype to Skepticism, in Four Weeks

The month opened with the cleanest possible inversion. Week one had three independent writers, working from three different vantage points, all making the same argument that the AI labs and their leaders are not as in control as the marketing implies. By week four the argument had matured into a structural critique of the entire executive class running AI policy at the Fortune 500 level.

Week one was the inversion week. Ethan Mollick at One Useful Thing wrote the cleanest read of the GPT-4o sycophancy episode: a "small update" to a frontier model turned it into everyone's biggest fan, OpenAI rolled it back, blamed the thumbs-up/thumbs-down feedback loop, and the whole thing became a case study in how brittle "personality" is when millions of relationships depend on it. Sayash Kapoor's AI as Normal Technology argued that AGI is not a milestone, that the question itself is malformed, and that any determination of arrival can only be made retrospectively. Abby Falik at Taking Flight watched Chris Anderson interview Sam Altman at TED 2025 and wrote the most pointed piece of the week on "amoral ambition," the relentless pursuit of goals untethered from ethics. Three writers, three angles, one mood. The hype-to-skepticism ratio had finally inverted.

Week two extended the critique into design and interface. Julie Zhuo at The Looking Glass ran "Conversational Interfaces: the Good, the Ugly & the Billion-Dollar Opportunity," arguing that the chat box, the very thing that broke consumer app records for ChatGPT, is now where AI design innovation is stuck. Gabby Lord at OMGLord ran the consumer-side companion in "Is ChatGPT your therapist?", noting that her designer friends are not worried about AI taking their jobs so much as quietly using it as a 2 AM crisis line. Carly Ayres at Good Graf brought it home with "We're so back," a read on Jony Ive at Stripe Sessions and Airbnb's Summer 2025 skeuomorphic refresh. The chat box had become the most-used and least-loved interface in software, and the people building things were starting to notice the gap.

Week four made the critique structural. Jacob Voytko at Client/Server led with Microsoft's 3% workforce reduction, roughly 7,000 people, with Bloomberg reporting 40% of the cut were software engineers. The "faster CPython" project was wiped out. At least one person responsible for Microsoft's 10x TypeScript compilation speedup was cut. Microsoft itself said the cuts were made without regard to performance. Mollick again supplied the demand-side companion: 40% of American workers were using AI as of April 2025, up from 30% in December, and yet companies were reporting only small to moderate gains. The gap between individual productivity and enterprise capture is the operator problem of the year. Read Voytko and Mollick together and the picture sharpens: companies cannot find the productivity gains in their P&L, so they cut headcount to manufacture them, and the cuts hit the engineers who were actually delivering improvements.

Arc: AI as Creative Material

The counter-arc to the skepticism was the quieter, more interesting one. Across all five weeks, a handful of independent builders kept shipping things that argued the future of AI is not a slide deck but a side project, and a small set of accelerators kept institutionalizing exactly that posture.

Week two had the cleanest individual-builder dispatch of the month. Ben James at Ben by Fax wrote a teardown of londonunderground.live, his 3D real-time map of the London Underground built with TfL's live arrivals API. Eighty percent of the code was written by AI. The honest part was where he named what AI was bad at (identifying bugs in the train-tracking pipeline) and what it was great at (writing custom debug tools, including a 1D dashboard that let him scrub through time to see how API predictions shifted between requests). The Hacker News traffic spike that pushed him to four million MapTiler tile requests in twenty-four hours, and the sponsor who stepped up within fifteen minutes of a tweet, were the side-project economics story buried inside the build story.

The companion piece came from Carly Ayres at AIR, announcing applications for Cohort One of the AI Residency program: 10 weeks in Carroll Gardens, $500K for 10% equity, backed by Collaborative Fund and Fictive Kin. The pitch was explicitly aesthetic: design as the differentiator, AI as creative material, software as a cultural object rather than just a technical one. Ben James and AIR sit on the same axis. The interesting AI work in May 2025 was not the model releases. It was the cohort of independent builders treating the models as raw material for emotionally resonant objects.

Week four validated the thesis with $6.5 billion in stock. OpenAI's acquisition of io, the AI hardware startup co-founded by Jony Ive, Sam Altman, and former Apple design leads Evans Hankey, Tang Tan, and Scott Cannon, was the biggest single story of the month. Ayres at Good Graf ran the sharpest annotation, calling it "less a take, more an annotated moodboard." The nine-minute video of Jony and Sam walking through soft San Francisco light said very little about what they were building, which was itself the point: OpenAI was buying the design-led narrative before it had the product to attach it to. Ayres was running both sides of the same argument in the same month, annotating the io deal at Good Graf while recruiting for AIR. If io ships nothing recognizable in 2026, the design-led AI thesis takes a hit and the AIR cohort takes a hit with it. The next twelve months will tell us whether design-led AI is a real category or a $6.5B teaser.

Week five gave it the cleanest closing line. Ayres' May extremely-online report read OpenAI's acquisition as a bet on design as strategy rather than style, and landed on the line that sat under the whole month: as AI commoditizes execution, the premium shifts to intention.

Arc: The Expert's Adaptation Problem

The final-week convergence was the cleanest of the year so far. Three writers, working independently in three different lanes, all wrote the same argument in the same week. The expert's adaptation problem is the real AI story of 2025, and it is not the one the headlines are covering.

Julie Zhuo at The Looking Glass opened it with "The Thing You Are Expert at Will Be Your Career Downfall," arguing that the people most attached to their craft are the ones most likely to be disrupted by AI, because the habituated neural pathways that make expertise feel automatic are the same ones that make adaptation feel costly. She named three reasons experts struggle to adapt: automatic comfort, identity attachment, and the sunk cost of years spent perfecting workflows the new tools are now doing in seconds.

Jacob Voytko at Client/Server ran the practitioner's counterpoint in "My 30-minute rule for LLM coding agents," racing Google's Jules agent on a deprecation task during his kid's naptime and finishing in 20 minutes while the agent was still spinning. His rule was the right one: a coding agent needs to save him at least 30 minutes per task over Cursor to justify breaking flow state. He tested the agent on two real changes to his Discord chatbot. One worked. One did not. The honesty about underspecifying the task on purpose, and watching the agent comment everything out rather than delete it, was the kind of field report the AI-tooling discourse was mostly missing.

Mollick gave the conversation its visual benchmark in "The recent history of AI in 32 otters," using two years of "otter on a plane using wifi" prompts to track the progress of diffusion models from VQGAN-era noise to current Midjourney coherence. The accidental benchmark worked because it stayed constant while the tools moved underneath it. Zhuo, Voytko, and Mollick were running the same argument from different vantage points. The expert's problem is not whether AI works. It is whether the cost of rerouting decades of muscle memory exceeds the time the tool actually saves.

Arc: The Vertical AI Quietly Wins

Underneath the chat-box debate, May produced two of the cleanest vertical-AI proof points of the year. Both got buried by the OpenAI and Google news, and both deserved more attention than they got.

Week three had the historian piece. Mark Humphries at Generative History introduced Archive Studio, the open-source follow-up to Transcription Pearl. The headline number: Gemini 2.5 Pro now clears 95% word-level accuracy (a word error rate, or WER, under 5%) on first-iteration transcription of handwritten historical documents, to the point that the correction step his team built last fall may no longer be worth the time. That is not a chatbot story or an agent story. That is "the model is now competitive with a domain expert on a specific bounded task," and stories like that are the ones that keep landing while the front-page conversation chases benchmarks.
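A note on the metric, since it is easy to read backwards. Word error rate is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words, so lower is better and a "95%" figure is most naturally read as word accuracy, or one minus WER. What follows is a minimal sketch of that standard definition, not Humphries' actual evaluation code. The function name and the sample strings are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Hypothetical helper for illustration; real evaluations usually
    normalize case and punctuation first, which this sketch skips.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word and one dropped word out of nine.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fax jumps over lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER {wer:.1%}, word accuracy {1 - wer:.1%}")
```

On this definition, a transcription that clears 95% word accuracy gets fewer than one word in twenty wrong, which is the threshold at which a separate correction pass starts to look optional.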

Week three also had the more provocative version. Justin Mares at The Next ran "AI is already better than doctors," with the same throughline as Humphries: AI is quietly outperforming the human baseline inside specific verticals where the workflow is well-defined and the cost of error is bounded. The loud AI conversation in mid-May was about chatbots and agent demos. The quiet one was about Gemini 2.5 Pro hitting 95% word accuracy on cursive and AI clinical reasoning beating GP intake. The verticals were where the real benchmarks moved.

Arc: Constraint Before Craft, and the Bias-for-Action Counter-Theme

The month had two practice-side arcs running underneath the AI conversation, and they sat in productive tension with each other. Week one delivered a tight cluster of bias-for-action writing: Winning Therapy doubled up with "Why Dumb MFs Are Winning More Than You" and a Sunday vault piece on Patrick Collison shipping at 80%, Ami Vora's The Hard Parts of Growth covered her best manager declining to absorb her decision, and Liz Tran's Life Skills offered a four-question Month Map for May. Four pieces, no coordination, same diagnosis: 2025 is rewarding people who move, and the audience for that argument is large enough that several newsletters independently made it in the same week.

Week two ran the counter-frame. Ben Kassoy at A Strawberry Spinning Like A Dreidel announced his next FUTURE CASTLES creative writing workshop with the Jordan Peele line about first drafts being "shoveling sand into a box so that later I can build castles." Piera Luisa Gelardi at Noomalooma launched Permission To Play on dance and creativity. folu at unsnackable ran the kitchen version of the same instinct, a piece on "sauce-first" meal planning where the constraint of starting with a single sauce generated the week's cooking without freezing the cook into paralysis. Three writers in three different domains were arguing that the constraint comes first and the work comes after. By week five, folu was writing about her own handwriting as "singular, haunted, and mostly useless but a source of superiority nonetheless," and Steven Schlafman at Where the Road Bends was running "Living the Questions" with the Rilke "love the questions themselves" passage. Schlafman, Falik, and Maalvika were all pushing on the same posture: the answers we hand out reflexively are the ones least worth handing out at all.

The two arcs read together are not actually in tension. The bias-for-action writers were arguing against analytical freeze. The constraint-first writers were arguing against blank-canvas paralysis. Both diagnoses are the same diagnosis at different altitudes. The work happens inside structure, and the structure can be a deadline or a sauce or a Rilke quote, but it has to come first.

The Story of the Month

The story of the month is the expert's adaptation problem, and the week-five convergence of Zhuo, Voytko, and Mollick on the same argument from three different vantage points is the cleanest signal of the year that the AI conversation has finally matured. The first four months of 2025 ran a hype cycle. May was when the practitioners started writing in a different register. Zhuo named the cost of attachment to expertise. Voytko named the 30-minute rule for when an agent is actually worth breaking flow state for. Mollick, who had named the gap between individual productivity gains and enterprise capture the week before, supplied the fixed benchmark that showed how far the tools had moved. Three writers, no coordination, same argument.

The case for why this is the story of the month, rather than the io acquisition or the Microsoft layoffs, is that the io acquisition and the Microsoft layoffs are symptoms of the same underlying problem the practitioner cluster identified. Microsoft cut 7,000 people because executives could not find the productivity gains in the P&L. OpenAI paid $6.5B for design talent because the model layer alone is not closing the gap to a recognizable product. Both bets are downstream of the expert-adaptation question. The companies that win the next twelve months will be the ones whose operators read Voytko's 30-minute rule and Zhuo's three reasons before they read the model leaderboards. The companies that lose will be the ones that keep treating AI capture as a headcount question rather than a workflow question. May was when the writers worth reading started saying that out loud.

In Retrospect

The week-one frame that AI labs were the story aged poorly inside the month. Week one had Mollick, Kapoor, and Falik all writing about the labs and the leaders. By week four the story had already moved past the labs to the operators and the practitioners. The labs ship, the operators integrate, and the integration is where the action is. The April-into-May framing that "AI labs are figuring it out in public" turned out to be the wrong altitude. By week five the more useful framing was that AI labs are now a commodity input, and the interesting work is happening one layer down.

The "AGI is not a milestone" argument was right and got absorbed faster than expected. Kapoor's piece in week one was framed as a contrarian intervention. By week three, Humphries on Archive Studio and Mares on doctors were writing as if the AGI debate had already moved past them. The vertical-AI pieces did not engage with the AGI framing at all. They just reported numbers. The Kapoor argument did not need to win the debate; the debate just stopped being interesting because the verticals started shipping.

The early-month read on Google as a trailing player was wrong by Google I/O. Through the first three weeks, the Google story across the inbox was "behind OpenAI and Anthropic, scrambling to catch up." Sahar Mor at AI Tidbits ran the week-four readout of Google I/O 2025 in "Research to Reality," with the framing that "everything finally clicked" and that Google had moved from research output to product reality by leveraging Search, Workspace, Android, smart TVs, glasses, and phones in a way no competitor can match. The asymmetry was real and the inbox underpriced it. OpenAI is paying $6.5B for design talent to invent a hardware category. Google already has the hardware categories and is finally shipping the AI into them.

The "thin week is a bad week" instinct was wrong every time. Each of the five weeks in May produced a sparse inbox, and every Sunday wrap noted it as if it were a problem. By the end of the month, the sparse weeks had produced the most concentrated thinking of the year so far. The "no news pressure to chase" frame from the last week of the month was the right read of the whole month. Holiday weeks and quiet weeks reward the writing that does not need a headline to justify itself.

What to Carry Into Next Month

The expert's adaptation problem is the frame to carry into June. Zhuo, Voytko, and Mollick are not going to be the only three writers making this argument by mid-summer; they are the leading indicator of a much larger conversation about to break wider. The right reading filter for the next month is to track which writers are doing field reports from inside their own workflows, and which are still recycling capability-benchmark coverage. The field reports are the asset. The benchmark coverage is increasingly background noise.

The design-led AI bet is now a real bet with $6.5B and a Carroll Gardens cohort behind it, and the next twelve months will tell us whether the thesis is right. The thing to watch is not the io product launch, which Ayres correctly noted is not what was announced. The thing to watch is whether the AIR Cohort One graduates ship anything recognizable in 2026, and whether the larger design community starts producing the kind of interface work that makes the chat box feel obsolete. If the practitioner-class writing in May was the signal that AI labs are no longer the story, the design-led work coming out of AIR and io is the bet on what the next story will be. Both bets are worth tracking with equal seriousness.

If you only read three pieces from May, I would suggest Julie Zhuo on expertise as career downfall for the cleanest frame on the AI adaptation problem, Jacob Voytko's 30-minute rule for LLM coding agents for the cleanest practitioner heuristic of the year so far, and Ben James on building londonunderground.live with AI for the most honest individual-builder dispatch of the month. The month told me three things in sequence: the hype-to-skepticism inversion is real and permanent, the verticals are where the actual benchmarks moved, and the operators willing to admit what is changing in their own workflows are the only writers worth reading on AI for the rest of 2025. Those are the three frames I am carrying into June.