Links for 2025-07-22
Hierarchical Reasoning Model
Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku.
With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using only 1000 training samples.
Paper: https://arxiv.org/abs/2506.21734
Code: https://github.com/sapientinc/HRM
A large-scale vision-language-action (VLA) model
Ever wondered what it takes for robots to handle real-world household tasks? Long-horizon execution, deformable-object dexterity, and unseen-object generalization. Meet GR-3, ByteDance Seed’s new Vision-Language-Action (VLA) model!
GR-3 is a generalizable Vision-Language-Action (VLA) model with strong capabilities in complex long-horizon tasks. It understands unseen abstract concepts, manipulates deformable objects robustly, and adapts to novel settings with minimal human data.
✨ Generalization: Generalizes well to unseen objects, environments, and even instructions with abstract concepts.
✨ Long-Horizon Manipulation: Completes long-horizon tasks with strong instruction-following capabilities.
✨ Deformable Object Manipulation: Manipulates deformable objects robustly.
Project Page: https://seed.bytedance.com/en/GR3
Arxiv: http://arxiv.org/abs/2507.15493
AI
Gemini has unlocked a new capability: conversational image segmentation. This enables new use cases that were previously not possible, furthering Gemini’s SOTA image understanding capabilities! https://developers.googleblog.com/en/conversational-image-segmentation-gemini-2-5/
Former Top Google Researchers Have Made a New Kind of AI Agent https://www.wired.com/story/former-top-google-researchers-have-made-a-new-kind-of-ai-agent/ [no paywall: https://archive.is/eSNn4]
DeepMind’s Quest for Self-Improving Table Tennis Agents https://spectrum.ieee.org/deepmind-table-tennis-robots
According to reporting by the WSJ, there are at least ten employees at OpenAI who have turned down $300 million offers from Mark Zuckerberg. https://www.wsj.com/tech/ai/meta-ai-recruiting-mark-zuckerberg-sam-altman-140d5861 [no paywall: https://archive.is/ubWkF]
Logan Kilpatrick: Windsurf Acquisition, Gemini 3, Agentic Browsing, Veo 4, and more! https://www.youtube.com/watch?v=3EQtzP92Z0U
How to Train Your Agent: Building Reliable Agents with RL https://www.youtube.com/watch?v=gEDl9C8s_-4
LLMs Can't See Pixels or Characters https://www.lesswrong.com/posts/uhTN8zqXD9rJam3b7/llms-can-t-see-pixels-or-characters
ARC unveils benchmark to test how quickly agents learn new tasks https://arcprize.org/arc-agi/3/
The geopolitics of artificial general intelligence: from the race to automate ML R&D and hyperscale manufacturing to counter-proliferation and nuclear deterrence. https://fourthoffset.ai/
Scale increases persuasion, +1.6pp per OOM. Post-training more so, +3.5pp. Increasing persuasion decreased factual accuracy. https://arxiv.org/abs/2507.13919
In an industry where 90 percent of drug candidates fail before reaching the market, a handful of startups are betting everything on AI to beat the odds. https://www.wired.com/story/artificial-intelligence-drug-discovery/ [no paywall: https://archive.is/LwDGa]
AI-Designed Drugs Can Now Target Previously ‘Undruggable’ Proteins in Cancer and Alzheimer’s https://singularityhub.com/2025/07/21/ai-designed-drugs-can-now-target-previously-undruggable-proteins-in-cancer-and-alzheimers/
MIT researchers found that special kinds of neural networks, called encoders or “tokenizers,” can do much more than previously realized. https://news.mit.edu/2025/new-way-edit-or-generate-images-0721
WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization https://arxiv.org/abs/2507.15061
The Serial Scaling Hypothesis https://arxiv.org/abs/2507.12549
China Is Spending Billions to Become an A.I. Superpower https://www.nytimes.com/2025/07/16/technology/china-ai.html [no paywall: https://archive.is/xqgpB]
IMO & AI
DeepMind IMO Gold:
Google DeepMind has confirmed that one of its experimental models has also achieved gold medal-level performance at the International Mathematical Olympiad.
Gemini was given the same problem statements and time limit - 4.5 hours - as human competitors, and still produced rigorous mathematical proofs.
Gemini solved the math problems end-to-end in natural language (English).
It was trained with RL techniques that use more multi-step reasoning, problem-solving, and theorem-proving data.
A version of this model with Deep Think will soon be available to trusted testers, before rolling out to Google AI Ultra subscribers.
The IMO has confirmed that DeepMind's submitted answers are complete and correct solutions.
IMO proofs P1-P5 generated by DeepMind's model: https://storage.googleapis.com/deepmind-media/gemini/IMO_2025.pdf
Regarding P6: “On IMO P6 (without going into too much detail about our setup), the model "knew" it didn't have a correct solution.” https://x.com/alexwei_/status/1947461238512095718
DeepMind says IMO is just a testbed for RL and search working on open-ended problems. The frontier is going to be pushed in all the non-verifiable domains in the upcoming months.
More about the OpenAI IMO gold model:
Some general caveats:
Were the questions particularly easy this year? https://x.com/rpeng233/status/1947383782455292387
Was IMO 2025 already within reach of current-gen frontier models? Gemini 2.5 Pro + simple self-verification prompt https://github.com/lyang36/IMO25/blob/main/IMO25.pdf
Math
A landmark proof brought a “grand unified theory of mathematics” closer to reality. https://www.nature.com/articles/d41586-025-02197-3 [no paywall: https://archive.is/lflLD]
Why Reality has a Well-Known Math Bias https://linch.substack.com/p/why-reality-has-a-well-known-math
Science and Technology
We Can, Must, and Will Simulate Nematode Brains https://asteriskmag.com/issues/09/we-can-must-and-will-simulate-nematode-brains
Nano-engineered thin-film thermoelectric materials enable practical solid-state refrigeration https://www.nature.com/articles/s41467-025-59698-y
Observation of charge–parity symmetry breaking in baryon decays https://www.nature.com/articles/s41586-025-09119-3
Blood Tests Can Spot Cancer DNA Years Before Actual Diagnosis https://www.sciencenews.org/article/cancer-tumor-dna-blood-test-screening
The use of prime editing to correct several mutations that cause alternating hemiplegia of childhood (AHC), a rare and devastating neurodevelopmental disorder, in patient-derived cells and in two mouse models. https://drive.google.com/file/d/1ibC50ttiMqRh6ijLBcPx8Hgjs_BvmkLk/view
FFmpeg Devs Hit 100x Performance Boost With Handwritten Assembly Code https://www.tomshardware.com/software/the-biggest-speedup-ive-seen-so-far-ffmpeg-devs-boast-of-another-100x-leap-thanks-to-handwritten-assembly-code
German DARPA is looking for up to ten scientists, technologists, and builders to solve a significant problem. The goal is to train individuals who can launch bold, coordinated programs that solve society-scale problems. Deadline 1 Aug.: https://sprind.org/en/design-your-challenge
Observation-selection effect
Water absorbs most wavelengths outside the range of 400-700 nm, leaving a "window" through which sunlight can penetrate tens of meters in the ocean, where early life originated. Photoreceptors that could exploit this relatively unattenuated band provided a clear fitness advantage, so natural selection favored sensitivity to these wavelengths. Later, land animals inherited the same pigments. Thus, what we call "visible light" is largely defined by water's transparency and the peak of the solar spectrum, an example of an observation-selection effect. Organisms can only evolve to sense radiation that reliably reaches them.
Fermi paradox explained
We're early because fast-spreading "grabby" aliens impose a cosmic deadline; you either evolve before they arrive or you never get the chance.
The Theory: Grabby Aliens Create a Deadline
The theory proposes a type of civilization called "grabby aliens". These are "loud" civilizations that have three key characteristics:
They expand at a very high speed of ~0.3-0.8 c.
They make visible, unmistakable changes to the volumes of space they control.
They expand continuously and don't die out, stopping only when they meet another grabby civilization.
The existence of such aliens creates a cosmic deadline. If grabby aliens are destined to eventually expand and take over all available space, then any quiet civilization like ours must appear before they arrive. We observe an empty universe precisely because we evolved early in an unclaimed region. Had we evolved later, we would have been born inside a grabby alien-controlled volume and would not be asking where everyone is. Our earliness, therefore, is not a coincidence but a selection effect: only early civilizations get to see an empty universe.
Prediction: We will encounter the expanding border of a grabby civilization in approximately 200 million to 2 billion years.
Paper: https://grabbyaliens.com/
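For a rough sense of scale, the expansion speeds and timescales quoted above can be turned into distances. This is only a back-of-the-envelope sketch, not the model from the paper: it assumes flat, non-expanding space (cosmic expansion is ignored), and the function and constant names are made up for illustration.

```python
# Rough scale check: how far does an expansion front travel?
# Assumption (not from the paper): flat, non-expanding space, so
# distance in light-years = (speed as a fraction of c) * (time in years).

def frontier_distance_ly(speed_fraction_c: float, years: float) -> float:
    """Distance in light-years covered at a constant fraction of light speed."""
    if not 0.0 < speed_fraction_c <= 1.0:
        raise ValueError("speed must be in (0, 1] as a fraction of c")
    if years < 0:
        raise ValueError("years must be non-negative")
    return speed_fraction_c * years

SPEEDS = (0.3, 0.8)    # fraction of c, the speed range quoted above
TIMES = (200e6, 2e9)   # years, the encounter window quoted above

def distance_table() -> dict:
    """Map each (speed, time) pair to the distance covered in light-years."""
    return {(v, t): frontier_distance_ly(v, t) for v in SPEEDS for t in TIMES}

if __name__ == "__main__":
    for (v, t), d in distance_table().items():
        print(f"{v:.1f} c for {t:.1e} yr -> {d:.2e} light-years")
```

Under these simplifying assumptions, the front covers somewhere between roughly 60 million and 1.6 billion light-years within the predicted window. A proper treatment would have to account for the expansion of the universe.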
Ukraine
Perspective.
This is the map from the end of Ukraine's counteroffensive at the end of 2022, vs now in July 2025. The dark red line is now, the lighter one was then. Look at the macro picture.
The situation would look much better if Ukraine had better allies. There are still no serious sanctions, as Russia has earned more from Western purchases of oil and gas than Ukraine received in aid. And Russia still receives critical Western tooling machinery to build its weapons. Meanwhile, Russia's allies, like North Korea, have shipped some 12 million artillery shells and sent thousands of soldiers to Russia in a year. In contrast, the entire West has sent a mere ~5 million shells since 2022, and zero troops. China is actively producing fiber-optic cables and other drone equipment for Russia, while Ukraine must produce most components domestically.
More:
The Hidden War Over Ukraine’s Lost Children https://time.com/7302345/ukraine-lost-children-russia-war/









