Week 47 · 2025-11-17 → 2025-11-23 · 30 newsletters

Gemini Three Lands

gemini-3-launch · ai-economy-maturation · macro-and-markets · epistemics-and-explanation

A 31-email week with one clear news anchor (Gemini 3.0, launched Tuesday) and quieter through-lines on AI product maturation, late-year macro positioning, and clear thinking. The inbox was not dense, but the signal was well-aligned.

Gemini 3 Launch: Two Smart Takes on the Same Tuesday

The week's news event was Gemini 3.0, and the two best reads on it came from very different angles. Ethan Mollick at One Useful Thing ran the demonstration-not-benchmark approach. He gave Gemini 3 a screenshot of his own November 2022 Substack post on GPT-3 and asked it to "show how far AI has come since this post by doing stuff." Gemini built him a fully interactive Candy-Powered FTL Starship Simulator. The framing is the line worth keeping: "In 2022, AI could describe the engine. In 2025, AI can code the engine, design the interface, and let you pilot the ship yourself."

Mark Humphries at Generative History ran the more rigorous version. He had early access through DeepMind and tested the model on his actual research workload: transcribing 18th-century handwritten ledgers, a task he calls the Sugar Loaf Test. His finding is that Gemini 3 now operates below a 1% error rate on his test set and, more importantly, does the same things the same way over and over. Humphries' framing is the one that matters for knowledge work: the bottleneck on LLM adoption was never benchmark scores; it was repeatability and trust. Coding adopted AI first because the code either runs or it doesn't, which is a built-in automated check. Most knowledge work has no such check, so reliability is the whole game.
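For a sense of what a "below 1%" transcription error rate means in practice, here is a minimal sketch of character error rate (edit distance divided by reference length). This is an assumption about the metric, not necessarily the one Humphries used, and the ledger line below is invented for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edits needed to fix the transcription, per character of ground truth."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical ledger line and a model transcription with one wrong letter.
truth = "2 loaves sugar"
model_output = "2 loaves suger"
cer = character_error_rate(truth, model_output)
print(f"character error rate: {cer:.1%}")
```

One wrong letter in a 14-character line is already about 7%, so staying under 1% means roughly one slip per hundred characters or better, sustained across the whole test set.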

The take: read these two together. Mollick is the right read on capability ceilings, what the model can now do. Humphries is the right read on capability floors, whether you can actually trust it to do the same boring task tomorrow. Both ceilings and floors moved this week, and the floor move is the underrated one. If you only chase headline benchmarks, you miss that the reliability story is finally turning.

AI Economy: PMF Treadmill, Day 2 Problems, and the Prompt Bar UX

The week's most cohesive cluster was about what happens to AI products after launch. Elena Verna opened with the cleanest frame: "The Product-Market Fit Treadmill," arguing that the old PMF model, reach it, ride it, then invest in the next horizon, is dead, because in AI the foundations of your value proposition are renegotiated weekly. signull ran the companion piece in lowercase prose: "day 0 is loud. day 2 is real." His argument is that AI made day 0 almost free, which means it has made day 2 brutally expensive. The seed soil is flooded. The only question that survives the noise is whether a real person comes back because their life feels worse without you.

Yaakov Carno guesting at Kyle Poyar's Growth Unhinged mapped the consequence at the UX layer. He audited 40+ AI products and found the prompt bar has quietly become "the new front door to value." Canva, Notion, Vercel, Lovable, Airtable: they all now lead with a clean hero, a floating input box, and a few "try this" suggestions. The risk is the "beautiful illusion" of "ask me anything," which sets users up for a value moment the product cannot reliably deliver. The Alephic Forward Deployed podcast Episode 2 added the builder-side companion: Noah and Lance on Claude Code skills as "instructing a new hire," the 10% skill-hit-rate problem when you stack 10+ skills, and the Cerebras Llama 120B hook running at 3,000 tokens per second as an invisible routing layer.

Dan Shipper at Every ran the Black Friday "What Is Every" post with day-zero model reviews and the software slate of Cora, Sparkle, Spiral, and Monologue. The aside that mattered was the product list itself: an AI publication that ships four software products. The publishing-plus-software model is the live experiment.

The take: the AI conversation moved from "what can the model do" to "what does the second visit look like." That is a real maturation. Verna's treadmill and signull's day-two framing are the same observation from the operator and the founder seats. If you are shipping AI product in 2026, the prompt bar is table stakes and retention is the moat.

Macro and Markets: Mo-vember, Foreign Banks, and Asymmetric Bets

Three macro reads landed in the same week and broadly reinforce each other. Citrini Research ran "Macro Memo: Mo-vember," forecasting continued labor-market deterioration, a deluge of post-shutdown data unleashing volatility, the Fed holding in December out of caution, and a Santa Rally driven by 2026 fiscal expansion being priced in. The call is risk-off through mid-December on speculative areas like crypto, quantum, and bitcoin miners, then a resolution into a year-end rally. In short, a sticky bull thesis with a near-term pullback.

Matthew Klein at The Overshoot ran the deeper structural piece: "The 'Sell America Trade', QT, and Foreign Banks." Markets stopped pricing the sell-America trade after April, but Klein's data shows foreign banks have been shedding assets in their U.S. branches since July at one of the fastest rates on record. The decline has been masked by the $500B drop in Fed deposits driven by Treasury rebuilding the TGA after the debt-ceiling raise and the shutdown restricting spending. The piece is the kind of pattern only visible if you read both the headline numbers and the underlying flows.

The Stonkstack ran the asymmetric-bet philosophy in "Buy When There's Blood on the Carpet," invoking the Babe Ruth Effect: frequency matters far less than magnitude, and the most mispriced names are often the ones that look risky.
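The magnitude-over-frequency point is easiest to see as expected-value arithmetic. The sketch below uses two made-up bets (the probabilities and payoffs are illustrative assumptions, not figures from the Stonkstack piece): one that wins 90% of the time and one that wins only 30% of the time.

```python
def expected_value(outcomes):
    """Sum of probability * payoff over (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)


# Hypothetical bet A: wins often, but the rare loss is large.
frequent_small_wins = [(0.90, 1.0), (0.10, -15.0)]

# Hypothetical bet B: loses most of the time, but the rare win is large.
rare_big_wins = [(0.30, 10.0), (0.70, -1.0)]

ev_a = expected_value(frequent_small_wins)  # 0.9 - 1.5, about -0.6 per unit staked
ev_b = expected_value(rare_big_wins)        # 3.0 - 0.7, about +2.3 per unit staked

print(f"Bet A (90% win rate): EV = {ev_a:+.2f}")
print(f"Bet B (30% win rate): EV = {ev_b:+.2f}")
```

Under these assumed numbers, the bet that loses seven times out of ten is the one worth taking, which is the Babe Ruth Effect in miniature: the hit rate tells you little until you weight it by the size of the hits and misses.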

The take: Citrini is the tactical read, Klein the structural one, Stonkstack the temperament one. Together they argue that the second-order story of the U.S. financial regime change is still developing under the surface even when the price tape looks fine. Foreign banks pulling back at one of the fastest rates on record while markets stay indifferent is the kind of dislocation that gets named in retrospect.

Epistemics: How to Think When the Models Are This Good

A surprisingly tight cluster on the discipline of clear thinking. Addy Osmani at Elevate ran "Critical Thinking during the age of AI," structured around the classic Who/What/Where/When/Why/How framework. His TL;DR is the rule worth pinning: don't rely on AI as an oracle, define the real problem before rushing to a solution, use the 5 Whys to find root causes, and communicate with evidence not opinions. The whole essay is the playbook for engineers working with AI without surrendering judgment to it.

Annie Duke at Thinking in Bets brought on cognitive scientist Tania Lombrozo to dig into the science of a good explanation. The key move Lombrozo describes is shifting from "could it be true?" to "must it be true?", which is the same epistemic upgrade Osmani describes from the engineering side. Both pieces argue that satisfying explanations regularly trump correct ones, and the discipline of probabilistic thinking is what protects you from your own preferred narratives.

dynomight ran the contrarian post on the same theme: "Make product worse, get money." He pushes back on the dating-app theory that platforms have a perverse incentive to keep users single, pointing out that the same logical structure could be used to claim pizza restaurants want you hungry and automakers want unsafe cars. Bad incentive analysis is the most flattering form of explanation and therefore the most easily believed. A perfect companion to Lombrozo. George Milton at Gross to Net launched his podcast this week with the same instinct: stop optimizing gross, do the real net math on the costs of how you spend your time.

The take: the strongest cross-domain pattern of the week was the case for slower thinking. Osmani's checklist, Lombrozo's "must it be true," dynomight's incentive-fable warning, and Milton's gross-vs-net all point at the same discipline. The models are good enough now that the binding constraint is no longer the answer; it is the quality of the question.

Ideas Worth Reading

Peter Teague at The All American ran two pieces. "Spark & Flag" on the Epstein files release makes the case that the opposition's biggest danger is believing a single focus on Republican hypocrisy will finally break Trump, when his actual play is creating ambiguity and stoking partisan divides. The second, by Jason Mangone, returns to Oliver Wendell Holmes' 1884 Memorial Day speech to argue patriotism can be rebuilt from family ritual and local community engagement. Zach Everson at 1100 Pennsylvania reported, via Forbes, that Alt5 Sigma, the publicly traded company that accumulated $1.5B of World Liberty Financial cryptocurrency in August in a deal routing more than $500M to a Trump-affiliated entity, may have violated SEC disclosure rules by waiting six weeks before reporting that its CEO had been placed on leave. Julia Maltby wrote a clean five-months-as-an-LP piece for emerging GPs, with the rule worth keeping: showcase process as much as output. SeattleDataGuy interviewed Brian Femiano, a staff data engineer at Apple, on what actually separates senior from staff: less technology, more orchestration. Kerman Kohli opened his "Investing in the Machine Economy" series with a dependency-chain framework for picking compute-stack names beyond NVIDIA.

Outside Interests

Liz Prueitt at Have Your Cake ran the maple-pecan shortbread recipe with the fraisage hand technique. Wendy MacNaughton at DrawTogether ran "From Deficit to Abundance," a gratitude-focused drawing lesson framed around holding the two truths of Thanksgiving at once. Lingthusiasm interviewed Danny Bate on the history of Proto-Indo-European, with Laura Spinney's book Proto as the recent-evidence anchor. Emily Kramer at MKT1 ran five marketing planning thought experiments for breaking out of strategy ruts.

Three Takeaways from the Week

Gemini 3 is the news, but the underrated story is that reliability is finally turning. Mollick's interactive Candy-Powered FTL Starship Simulator is the fun read; Humphries on sub-1% handwriting transcription error rates is the read that matters for actual knowledge work. Capability ceilings get covered to death. Capability floors are where adoption lives.

The AI economy moved from day-zero mode to day-two mode this week. Elena Verna's PMF treadmill, signull's day-two framing, and Yaakov Carno's prompt-bar audit all point at the same shift: the easy part is fully commoditized, and the hard part, getting the user to come back tomorrow because their life feels worse without you, is the new moat. If you are operating in AI, this is the only question.

If you only revisit three pieces from the week, I'd suggest Mark Humphries on Gemini 3's Sugar Loaf Test for the cleanest read on AI reliability finally turning, signull's "day 0 is loud, day 2 is real" for the cleanest frame on what AI product actually requires now, and Matthew Klein on the quiet foreign-bank pullback for the macro story most of the market is not yet pricing.