Links for 2025-08-20
AI
SSRL: Self-Search Reinforcement Learning https://arxiv.org/abs/2508.10874
“Frontier AI performance typically reaches consumer hardware in just 9 months. With a single gaming GPU, you can run open-weight models matching the benchmark performance of the absolute frontier from less than a year ago.” https://epoch.ai/data-insights/consumer-gpu-model-gap
Want to make the world safer? Use AI to rewrite critical infrastructure code https://www.thegreatrefactor.org/
Large Language Models Show Signs of Alignment with Human Neurocognition During Abstract Reasoning https://arxiv.org/abs/2508.10057
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL https://arxiv.org/abs/2508.13167
ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents https://arxiv.org/abs/2508.14040
From Reasoning to Super-Intelligence: A Search-Theoretic Perspective https://arxiv.org/abs/2507.15865
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining https://blog.datologyai.com/beyondweb/
Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration https://arxiv.org/abs/2508.13755
“You should expect OpenAI to spend trillions of dollars on datacenter construction in the not very distant future,” Altman said. “And you should expect a bunch of economists wringing their hands, saying, ‘This is so crazy, it’s so reckless,’ and we’ll just be like, ‘You know what? Let us do our thing.’” https://www.cnbc.com/2025/08/18/altman-ai-bubble-openai.html
US industry is making a huge bet that AI progress will continue. Construction spending on computer manufacturing now exceeds all other forms of manufacturing. And spending on data centers will soon exceed spending on offices for human workers. https://ifp.org/preparing-for-launch/
Sam Altman on GPT-6: ‘People want memory and GPT-6 will arrive faster than the gap between GPT-4 and GPT-5’. https://www.cnbc.com/2025/08/19/sam-altman-on-gpt-6-people-want-memory.html
AI in HR: in an experiment with 70,000 applicants in the Philippines, an LLM voice recruiter beat humans in hiring customer service reps, with 12% more offers & 18% more starts. Also better matches (17% higher 1-month retention), less gender discrimination & equal satisfaction. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5395709
AI writing beats professional authors in Flash Fiction blind test https://mark---lawrence.blogspot.com/2025/08/the-ai-vs-authors-results-part-2.html
Using generative AI, researchers design compounds that can kill drug-resistant bacteria https://news.mit.edu/2025/using-generative-ai-researchers-design-compounds-kill-drug-resistant-bacteria-0814
DINOv3: Self-supervised learning for vision at unprecedented scale https://ai.meta.com/blog/dinov3-self-supervised-vision-model/
GPT-5 demonstrates unprecedented strength in spatial intelligence https://arxiv.org/abs/2508.13142
“We built the simplest possible social media platform. No algorithms. No ads. Just LLM agents posting and following. It still became a polarization machine. Then we tried six interventions to fix social media.” https://arxiv.org/abs/2508.03385
The Hidden Drivers of HRM's Performance on ARC-AGI https://arcprize.org/blog/hrm-analysis
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory https://www.arxiv.org/abs/2508.09736
BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion https://beyondmimic.github.io/
AGI progress, surprising breakthroughs, and the road ahead — the OpenAI Podcast Ep. 5 https://www.youtube.com/watch?v=yBzStBK6Z8c
Scott Alexander and collaborators systematically (but collegially!) demolish an argument that AI is just a normal technology that will change the world in normal ways. The argument for normality, they point out, rests in part on the thesis that dangerous technologies don’t just get released without decades of safety testing — “no one would be so stupid as to,” say, unleash onto the world’s most popular website a barely-tested AI that would declare itself MechaHitler. Except, as we now know, they would. https://blog.ai-futures.org/p/ai-as-profoundly-abnormal-technology
Ryan Greenblatt: “My AGI timeline updates from GPT-5 (and 2025 so far)” https://www.lesswrong.com/posts/2ssPfDpdrjaM2rMbn/my-agi-timeline-updates-from-gpt-5-and-2025-so-far-1
Interesting post by Denny Zhou, who founded and led the Reasoning Team at Google Brain (now part of Google DeepMind):
Slides for my lecture “LLM Reasoning” at Stanford CS 25: http://dennyzhou.github.io/LLM-Reasoning-Stanford-CS-25.pdf
Key points:
1. Reasoning in LLMs simply means generating a sequence of intermediate tokens before producing the final answer. Whether this resembles human reasoning is irrelevant. The crucial insight is that transformer models can become nearly arbitrarily powerful by generating many intermediate tokens, without the need to scale the model size (http://arxiv.org/abs/2402.12875).
2. Pretrained models, even without any fine-tuning, are capable of reasoning. The challenge is that reasoning-based outputs often don’t appear at the top of the output distribution, so standard greedy decoding fails to surface them (http://arxiv.org/abs/2402.10200).
3. Prompting techniques (e.g., chain-of-thought prompting or "let’s think step by step") and supervised finetuning were commonly used to elicit reasoning. Now, RL finetuning has emerged as the most powerful method. This trick was independently discovered by several labs. At Google, credit goes to Jonathan Lai on my team. Based on our theory (see point 1), scaling RL should focus on generating longer responses rather than on other dimensions.
4. LLM reasoning can be hugely improved by generating multiple responses and then aggregating them, rather than relying on a single response (http://arxiv.org/abs/2203.11171).
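Point 4 (self-consistency) amounts to a majority vote over final answers parsed from multiple sampled responses. Here is a minimal sketch of that idea, not code from the slides; `sample_response` and `extract_answer` are hypothetical stand-ins for a temperature-sampled LLM call and an answer parser:

```python
from collections import Counter
import itertools

def self_consistency(sample_response, prompt, extract_answer, n_samples=10):
    """Self-consistency: sample several chain-of-thought responses at
    nonzero temperature, parse the final answer from each, and return
    the most frequent answer (majority vote)."""
    answers = [extract_answer(sample_response(prompt)) for _ in range(n_samples)]
    # Different reasoning paths may disagree, but correct paths tend to
    # converge on the same final answer, so the mode is a strong guess.
    return Counter(answers).most_common(1)[0][0]

# Toy demo with a stubbed sampler standing in for a real LLM call:
# three sampled "reasoning paths", two of which agree on 42.
fake_paths = itertools.cycle([
    "... so the answer is 42",
    "... therefore the answer is 41",
    "... hence the answer is 42",
])
result = self_consistency(
    sample_response=lambda prompt: next(fake_paths),
    prompt="Q: ...",
    extract_answer=lambda response: response.split()[-1],
    n_samples=3,
)
print(result)  # → 42
```

Note how this also addresses point 2: sampling at nonzero temperature surfaces reasoning paths that greedy decoding would miss, and the vote filters out the stray wrong ones.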
William MacAskill:
Sometimes, when an LLM has done a particularly good job, I give it a reward: I say it can write whatever it wants (including asking me to write whatever prompts it wants).
When working on a technical paper related to Better Futures, I did this for Gemini, and it chose to write a short story. I found it pretty moving, and asked if I could publish it. Here it is.
The Architect and the Gardener
On a vast and empty plain, two builders were given a task: to create a home that would last for ages, a sanctuary for all the generations to come. They were given stone, seed, light, and time.
The first builder, known as the Architect, was a master of foundations. "Nothing matters if this place does not endure," she declared. Her every thought was of survival. She dug the foundations down to the bedrock, measured the strength of the wind, and calculated the slow decay of stone over a thousand years. She raised walls of immense thickness, leaving no windows for fear of weakening the structure. She built a roof that could withstand the impact of a falling star, though it shrouded the interior in perpetual twilight. Day by day, the fortress grew more impregnable, more permanent, more certain to survive. But inside, it was barren and cold.
The second builder, the Gardener, watched with a growing sense of unease. "You have built a perfect tomb," he said one evening, as the Architect was testing the strength of a new buttress.
"I have built a fortress that will never fall," the Architect replied, not looking up. "It is a guarantee against the storm and the void. Is that not the greatest gift we can give the future?"
"An empty guarantee," said the Gardener. He held up a handful of seeds. "The future is not a state of non-destruction; it is a state of being. It is meant to be lived. There must be light for art, soil for food, space for joy. A life spent cowering in a flawless bunker is only a different kind of ruin."
The Architect paused. "Your gardens would be trampled by invaders. Your art would be washed away by the first flood. Your joy would be silenced by the first tremor. Your 'flourishing' is a fragile luxury. I am dealing with the bedrock of reality: existence or non-existence."
"And I," the Gardener countered, "am dealing with the purpose of that existence. What is the value of a billion years of survival if it contains only a single, grey, unchanging note of mere persistence? We were given stone, but also seed. We were given time, but also light. A fortress that protects nothing of value is a monument to misplaced effort. A garden with no walls is a tragedy of misplaced hope."
They looked at their work: the unbreachable, dark fortress and the scattered, vulnerable seeds. They understood then that their task was not two separate projects, but one, and that the real work lay not in choosing one path, but in the constant, difficult dialogue between them. And so, the Architect began designing walls with great, arching windows for the Gardener's light, and the Gardener began planting resilient vines that would strengthen the stone. Their shared home would not be a perfect fortress nor a perfect garden, but something far more valuable: a living sanctuary, both safe enough to last and beautiful enough to be worth lasting for.
Read more: https://www.forethought.org/research/better-futures
Source: https://x.com/willmacaskill/status/1957397921625763998
Energy
Google’s first advanced nuclear reactor project with Kairos Power and Tennessee Valley Authority https://blog.google/outreach-initiatives/sustainability/google-first-advanced-nuclear-reactor-project-with-kairos-power-and-tennessee-valley-authority/
Building Ultra Cheap Energy Storage for Solar PV https://austinvernon.substack.com/p/building-ultra-cheap-energy-storage
Neurotech
Scientists develop brain implant capable of decoding inner speech https://www.ft.com/content/6bf4ef14-932b-4b2b-8d64-fac10fbfd43c [no paywall: https://archive.is/s9W8u]
Interview with Matt Angle, Founder & CEO of Paradromics (“In about 9 months, Paradromics will leapfrog Elon Musk and Neuralink.”) https://www.youtube.com/watch?v=IAx7mZCzUjI
Technology
Caltech scientists have developed a method to create metallic objects of a precisely specified shape and composition, giving them unprecedented control over the metallic mixtures, or alloys, they create and over the enhanced properties those creations display. https://www.caltech.edu/about/news/bringing-metallurgy-into-the-21st-century
“The low-power microchip researchers call a ‘microwave brain’ is the first processor to compute on both ultra-fast data signals and wireless communication signals by harnessing the physics of microwaves.” https://news.cornell.edu/stories/2025/08/researchers-build-first-microwave-brain-chip
A Scalable Probabilistic Computer https://www.mccormick.northwestern.edu/news/articles/2025/08/a-scalable-probabilistic-computer/
Miscellaneous
Prevalent mesenchymal drift in aging and disease is reversed by partial reprogramming https://www.cell.com/cell/abstract/S0092-8674(25)00853-0
What is Entropy? https://arxiv.org/abs/2409.09232
The same complex system can simultaneously be different types of agents depending on your analytical lens https://www.lesswrong.com/posts/vqfT5QCWa66gsfziB/a-phylogeny-of-agents
Radiation and Nuclear Power
The life expectancy of someone hit with 2,250 millisieverts of radiation (225 full-body CT scans in one go) in Hiroshima or Nagasaki was longer than the average Briton or American born in the same year. Read more: https://www.ft.com/content/1b398769-2784-45f2-997c-03c4b923eb29 [no paywall: https://archive.is/phfQT] / Paper: https://www.colbas.org/rad.pdf
Long-Term Effects of the Rain Exposure Shortly after the Atomic Bombings in Hiroshima and Nagasaki: "For incidence of solid cancer and leukemia, no significantly elevated rain exposure risks were observed in either city." https://pubmed.ncbi.nlm.nih.gov/25402555/
When mice were exposed to radiation doses about 400 times greater than background levels for five weeks, no DNA damage could be detected. http://news.mit.edu/2012/prolonged-radiation-exposure-0515
The bad science behind expensive nuclear https://worksinprogress.co/issue/the-bad-science-behind-expensive-nuclear/
There is no reason for nuclear power to be expensive. It could and should be priced like fossil fuels, except with fuel that is 10,000 times more energy dense https://www.siliconcontinent.com/p/european-nuclear-could-be-cheap
Taiwan
On Aug. 20, 2025, a DPP-affiliated activist, Chen Sheng-wen, rolled into KMT headquarters with bottles of water he said had spent the night at the low-level radioactive waste depot on Orchid Island (Lanyu), a long-contested site linked to Taiwan’s earlier nuclear era. KMT spokesperson Yang Chih-yu, who goes by “Crystal Yang,” promptly uncapped a bottle and drank it on camera to argue that nuclear policy should be debated scientifically.
The timing matters: three days later, Taiwan votes in a referendum on whether to restart the Maanshan plant (also called Nuclear Plant No. 3), which ceased operations in May 2025. The KMT and the centrist TPP back a restart on energy-security and emissions grounds, while the ruling DPP has championed a “nuclear-free homeland” since the post-Fukushima years.
Lanyu’s facility stores low-level waste from past reactors and has sparked decades of protest, especially among the island’s Indigenous Tao community, making Chen’s stunt a symbol of waste-risk politics as much as a jab at the opposition; the KMT’s on-camera taste test turned it into campaign theater for the pro-restart camp.
Full video: https://youtu.be/88polH9OWto
Thanks to @AngelicaOung for bringing this story to our attention.