Links for 2025-02-10
Latent Reasoning:
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
The authors find evidence of fairly advanced structures in latent space, such as a tendency to use orbitals (see picture) when computing arithmetic tasks and reasoning about sentence structure.
Paper: https://arxiv.org/abs/2502.05171
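To make "iterating a recurrent block" concrete, here is a minimal toy sketch in plain Python. It is not the authors' architecture: the block is a single tanh layer with random weights, and the names make_block and latent_reason, the weight scale, and the zero initial state are all assumptions made for illustration. The only point is that the same parameters can be applied for any number of iterations chosen at test time, without emitting any tokens in between.

```python
import math
import random


def make_block(dim, seed=0):
    """Build a toy recurrent block (hypothetical, for illustration only).

    The block maps (state, embedded_input) -> new_state using one shared
    weight matrix, so running it more times adds compute, not parameters.
    """
    rng = random.Random(seed)
    scale = 0.5 / math.sqrt(2 * dim)  # small weights, an arbitrary choice
    weights = [
        [rng.uniform(-scale, scale) for _ in range(2 * dim)]
        for _ in range(dim)
    ]

    def block(state, embedded):
        # Concatenate the latent state with the embedded input, so the
        # input is re-injected at every iteration.
        x = state + embedded
        return [
            math.tanh(sum(w * v for w, v in zip(row, x)))
            for row in weights
        ]

    return block


def latent_reason(block, embedded, num_iterations):
    """Unroll the shared block num_iterations times from a zero state."""
    state = [0.0] * len(embedded)
    for _ in range(num_iterations):
        state = block(state, embedded)
    return state


if __name__ == "__main__":
    block = make_block(dim=4)
    embedded = [0.1, -0.2, 0.3, 0.4]
    for depth in (1, 4, 16, 64):
        # Same weights every time; only the test-time depth changes.
        print(depth, latent_reason(block, embedded, depth))
```

In a real model the final state would be decoded into next-token probabilities; here the loop stops at the latent state, since that is the part the abstract describes.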
New Altman Essay:
…we can now imagine a world where we cure all diseases, have much more time to enjoy with our families, and can fully realize our creative potential.
In a decade, perhaps everyone on earth will be capable of accomplishing more than the most impactful person can today.
1. The intelligence of an AI model roughly equals the log of the resources used to train and run it. These resources are chiefly training compute, data, and inference compute. It appears that you can spend arbitrary amounts of money and get continuous and predictable gains; the scaling laws that predict this are accurate over many orders of magnitude.
2. The cost to use a given level of AI falls about 10x every 12 months, and lower prices lead to much more use. You can see this in the token cost from GPT-4 in early 2023 to GPT-4o in mid-2024, where the price per token dropped about 150x in that time period. Moore’s law changed the world at 2x every 18 months; this is unbelievably stronger.
3. The socioeconomic value of linearly increasing intelligence is super-exponential in nature. A consequence of this is that we see no reason for exponentially increasing investment to stop in the near future.
Read more: https://blog.samaltman.com/three-observations
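The rates in the second observation are easier to compare once they are put on the same yearly basis. The sketch below does that arithmetic. The 18-month span used for "early 2023 to mid-2024" is an assumption (the essay gives no exact dates), and the function names are made up for this example.

```python
def annualized_factor(total_factor, months):
    """Convert an improvement of total_factor over `months` into a per-year factor."""
    return total_factor ** (12.0 / months)


def factor_over(total_factor, months, horizon_months):
    """Compound an improvement of total_factor per `months` over a longer horizon."""
    return total_factor ** (horizon_months / months)


if __name__ == "__main__":
    # Moore's law as stated in the essay: 2x every 18 months.
    print("Moore's law per year:", annualized_factor(2, 18))

    # The essay's headline rate: 10x every 12 months.
    print("AI cost decline per year:", annualized_factor(10, 12))

    # The GPT-4 to GPT-4o example: about 150x, assumed here to span 18 months.
    print("GPT-4 to GPT-4o per year:", annualized_factor(150, 18))

    # The same two rates compounded over a decade.
    print("Moore's law over 10 years:", factor_over(2, 18, 120))
    print("10x per year over 10 years:", factor_over(10, 12, 120))
```

Under the 18-month assumption, the quoted 150x drop works out to roughly 28x per year, which is faster than the 10x-per-year rule of thumb; a shorter assumed span would make it faster still.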
Politics, Tech Chiefs Double Down on AI Spending:
French President Emmanuel Macron has announced a €109 billion investment in AI for France in the coming years. The investment will be supported by the United Arab Emirates, major American and Canadian investment funds, and French companies. Macron announced the spending ahead of a two-day AI summit he is cohosting in Paris with Indian Prime Minister Narendra Modi, which will be attended by the US vice president, China’s vice premier, and the bosses of OpenAI and Google.
European Commission chief Ursula von der Leyen is expected to announce around 10 public supercomputers for researchers and startups.
Tech giants Amazon, Google, Microsoft, and Meta are significantly increasing their investments in AI. They plan to spend a combined total of at least $215 billion in the current fiscal year, an increase of over 45% from the previous year.
Sources:
https://www.france24.com/en/europe/20250210-government-tech-leaders-paris-ai
https://www.lemonde.fr/en/economy/article/2025/02/10/ai-with-the-announcement-of-a-109-billion-investment-macron-intends-to-take-on-the-us_6737985_19.html [no paywall: https://archive.is/JZm6I]
https://www.wsj.com/tech/ai/tech-giants-double-down-on-their-massive-ai-spending-b3040b33 [no paywall: https://archive.is/FeKCf]
Brain-to-Text Decoding: A Non-invasive Approach via Typing
Meta researchers used AI to predict the text a person was typing from non-invasive brain recordings alone!
With EEG, their "Brain2Qwerty" model gets 67% of the characters wrong, but magnetoencephalography (MEG) performs much better, getting only 32% of the characters wrong on average.
"For the best participants, the model achieves a CER of 19%, and can perfectly decode a variety of sentences outside of the training set."
Paper: https://ai.meta.com/research/publications/brain-to-text-decoding-a-non-invasive-approach-via-typing/
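For readers unfamiliar with the metric: character error rate (CER) is conventionally the character-level edit distance between the decoded text and the reference text, divided by the length of the reference. The sketch below implements that conventional definition. Whether the paper computes CER in exactly this way (for example, how it handles spaces or case) is an assumption, and the function names are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    typed = "the quick brown fox"
    decoded = "the quick brwon fax"
    print(f"CER: {character_error_rate(typed, decoded):.0%}")
```

Because insertions count as errors, CER computed this way can exceed 100% when the decoded text is much longer than the reference.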
More AI links:
Agency is fundamentally frame-dependent: Any measurement of a system's agency must be made relative to a reference frame. https://arxiv.org/abs/2502.04403
Generating Symbolic World Models via Test-time Scaling of Large Language Models https://arxiv.org/abs/2502.04728
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance https://arxiv.org/abs/2502.04350
“OpenAI o1 significantly outperforms other reasoning models that are on par on benchmarks that test specialized knowledge.” https://arxiv.org/abs/2502.01584
Exploring whether models can be trained to correct errors immediately after making them. https://arxiv.org/abs/2408.16293
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models https://arxiv.org/abs/2502.04404
DexterityGen (DexGen): A new system that helps robots use their hands better. It improves how they grip, move, and handle objects… from holding a pen to using a screwdriver. DexGen learns in simulation and refines its skills in the real world, making robotic hands much more useful. https://zhaohengyin.github.io/dexteritygen/
MedRAX: Medical Reasoning Agent for Chest X-ray https://arxiv.org/abs/2502.02673
Verifiable agents are the next meta in crypto x AI - agents that don't require trust. https://www.blog.eigenlayer.xyz/introducing-verifiable-agents-on-eigenlayer/
Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation https://arxiv.org/abs/2502.05151
Karina Nguyen, research & product at OpenAI, says pre-training was approaching a data wall, but now post-training scaling (o1 series) unlocks "infinite tasks." Says models were already "diverse and creative" from pre-training, but teaching AI real-world skills is paving the way to "extremely super intelligent" models. https://youtu.be/DeskgjrLxxs?si=kXjvn89Sdf5N-vF6&t=578
AI compute:
This AI chip is the size of a grain of salt https://www.popsci.com/technology/ai-fiber-optic-chip/
"How Intel ruined an Israeli startup it bought for $2b, Habana Labs—and lost the AI race" (the end of the Gaudi chips) https://www.calcalistech.com/ctechnews/article/s1tra0sfye
More AI politics:
How Sam Altman Sidestepped Elon Musk to Win Over Donald Trump https://www.nytimes.com/2025/02/08/technology/sam-altman-elon-musk-trump.html [no paywall: https://archive.is/5ERSg]
Human takeover might be worse than AI takeover https://www.lesswrong.com/posts/FEcw6JQ8surwxvRfr/human-takeover-might-be-worse-than-ai-takeover
Science:
Children’s arithmetic skills do not transfer between applied and academic mathematics https://www.nature.com/articles/s41586-024-08502-w
Three Years After Experimental Vaccine, These Patients Are Still Cancer-Free https://gizmodo.com/three-years-after-experimental-vaccine-these-patients-are-still-cancer-free-2000559585
“What is it like to live in a society with an estimated median IQ around 70? A Nigerian psychologist explains.” https://woodfromeden.substack.com/p/guest-post-the-global-iq-debate-a
Marriages in China:
Marriages in China fell by 20% in 2024. Since nearly all births in China are within marriage, this implies further large declines in fertility ahead.
China's total fertility rate (TFR) was just 1.02 in 2023.
Without advanced AI and robotics, we'll eventually face a global collapse of all welfare systems, followed by a collapse of advanced technologies like smartphones, which require a minimum population to be maintained.

The following report offers insights into s1 and DeepSeek-R1, both covered in your "Links for 2025-02-03" issue, that you may find valuable:
From Brute Force to Brain Power: How Stanford's s1 Surpasses DeepSeek-R1
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5130864