Links for 2025-02-06
Human-level sample efficiency?
LIMO: Less is More for Reasoning
LIMO achieves unprecedented performance in mathematical reasoning with only 1% of the training data used by previous approaches, showcasing remarkable data efficiency.
LIMO exhibits exceptional out-of-distribution generalization, outperforming models trained on 100x more data by 40.5 percentage points (absolute) across diverse benchmarks.
LIMO Hypothesis: In foundation models with comprehensively encoded domain knowledge (achieved through extensive pre-training), sophisticated reasoning can emerge through minimal, precisely orchestrated demonstrations of cognitive processes.
The core of LIMO's success lies in the meticulous curation of a small, high-quality dataset. The resulting dataset of 817 examples was carefully selected from millions of candidates.
LIMO fundamentally challenges the assumption that massive datasets are necessary for complex reasoning in LLMs. Quality of the examples, rather than just the number, is the key factor.
LIMO suggests that modern, well-pretrained models like Qwen already possess rich latent reasoning capabilities, and that these capabilities can be unlocked with the right "cognitive templates" provided by curated examples.
LIMO indicates that sophisticated reasoning, regardless of complexity, could potentially be activated with minimal samples given sufficient pre-trained domain knowledge and optimal cognitive reasoning chains for activation.
Further research is needed to validate the LIMO hypothesis across different model architectures and reasoning domains beyond mathematics. Some people hypothesize that these results have a lot to do with the Qwen base models.
Paper: https://arxiv.org/abs/2502.03387
Making robots truly helpful and safe in our everyday lives:
A clever new approach called "Latent Safety Filters" allows robots to understand and prevent complex "failures." Imagine teaching a robot to pick up a bag of Skittles. Traditional safety systems might stop the robot from bumping into the table, but they wouldn't understand that pulling the bag up too quickly will cause the candy to spill everywhere.
The researchers equip the robot with a kind of "imagination." They use "world models" that learn to understand how the world works just by watching videos and trying things out. These models create a "latent space," which is like a simplified, hidden representation of what the robot sees. Think of it as the robot building a mental picture of the scene.
This is where the "Safety Filter" comes in. It acts like a guardian angel for the robot's actions. It constantly monitors what the robot is about to do and checks if it's heading towards a "failure" in its "imagined world." If danger is detected, the safety filter gently steps in and adjusts the robot's actions to prevent the bad outcome, like a spill. Importantly, it does this without needing to be told exactly how to be safe in every situation beforehand. It learns from experience and its "imagination."
Project page: https://kensukenk.github.io/latent-safety/
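The filtering loop described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the authors' method: the one-dimensional "latent", the functions toy_world_model and toy_failure, the candidate-action list, and the rollout-based safety check are all hypothetical stand-ins for a learned world model, a learned failure classifier, and whatever safety analysis the project actually performs in latent space.

```python
from typing import Callable, Sequence

# Hypothetical stand-ins. A real system would use a learned world model and a
# learned failure classifier operating on high-dimensional latent vectors.
def toy_world_model(latent: float, action: float) -> float:
    """Imagine the next latent state. Here the latent is a scalar 'spill risk'
    that grows with lift speed (action) and decays slightly each step."""
    return max(0.0, latent + action - 0.1)

def toy_failure(latent: float) -> bool:
    """Flag imagined states that count as a failure (e.g. candy spilled)."""
    return latent >= 1.0

def is_safe(latent: float, action: float,
            world_model: Callable[[float, float], float],
            failure: Callable[[float], bool],
            fallback_action: float, horizon: int) -> bool:
    """Imagine taking `action` once, then the fallback action for the rest of
    the horizon, and report whether any imagined state is a failure."""
    z = world_model(latent, action)
    if failure(z):
        return False
    for _ in range(horizon - 1):
        z = world_model(z, fallback_action)
        if failure(z):
            return False
    return True

def safety_filter(latent: float, proposed: float, candidates: Sequence[float],
                  world_model: Callable[[float, float], float] = toy_world_model,
                  failure: Callable[[float], bool] = toy_failure,
                  fallback_action: float = 0.0, horizon: int = 5) -> float:
    """Pass the proposed action through if it is imagined to be safe;
    otherwise return the safe candidate closest to it, or the fallback
    action when no candidate is imagined to be safe."""
    def check(a: float) -> bool:
        return is_safe(latent, a, world_model, failure, fallback_action, horizon)

    if check(proposed):
        return proposed
    safe = [a for a in candidates if check(a)]
    if not safe:
        return fallback_action
    return min(safe, key=lambda a: abs(a - proposed))

# The task policy wants to lift fast (0.8); the filter slows it down.
print(safety_filter(0.5, 0.8, [0.0, 0.2, 0.4, 0.8]))  # 0.4
```

The design point the sketch tries to capture is that the filter only intervenes when the imagined outcome is a failure, and then changes the action as little as possible.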
Real-time speech translation that runs on your phone:
Hibiki produces spoken and text translations of the input speech in real-time, while preserving the speaker’s voice and optimally adapting its pace based on the semantic content of the source speech.
Sample: https://x.com/neilzegh/status/1887498102455869775
Paper: https://arxiv.org/abs/2502.03382
Inference code: https://github.com/kyutai-labs/hibiki
Models: https://huggingface.co/kyutai
UK government rips up rules to fire-up nuclear power:
More nuclear power plants will be approved across England and Wales as the Prime Minister slashes red tape to get Britain building - as part of his Plan for Change.
Reforms to planning rules will clear a path for smaller and easier-to-build nuclear reactors – known as Small Modular Reactors – to be built for the first time ever in the UK. This will create thousands of new highly skilled jobs while delivering clean, secure and more affordable energy for working people.
This is the latest refusal to accept the status quo, with the government ripping up archaic rules and saying no to the NIMBYs, to prioritise growth. It comes after recent changes to planning laws, the scrapping of the 3-strike rule for judicial reviews on infrastructure projects, and application of common-sense to environmental rules.
Read more: https://www.gov.uk/government/news/government-rips-up-rules-to-fire-up-nuclear-power
More AI links:
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search https://satori-reasoning.github.io/blog/satori/
Dynamic object goal pushing with mobile manipulators through constrained reinforcement learning https://www.youtube.com/watch?v=wGAdPGVf9Ws
SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations https://arxiv.org/abs/2502.02472
BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation https://www.arxiv.org/abs/2502.01697
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning https://arxiv.org/abs/2502.03275
Demystifying Long Chain-of-Thought Reasoning in LLMs https://arxiv.org/abs/2502.03373
Deep Dive into LLMs like ChatGPT: "This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their 'psychology', and how to get the best use of them in practical applications." https://www.youtube.com/watch?v=7xTGNNLPyMI
Science and Technology:
The brain calculates with waves: New insights into neural waves could revolutionize the development of energy-efficient AI systems https://www.mpg.de/24143275/oscillating-networks-in-the-brain
Google says commercial quantum computing applications arriving within five years https://www.reuters.com/technology/google-says-commercial-quantum-computing-applications-arriving-within-five-years-2025-02-05/ [no paywall: https://archive.is/iS7s4]
What is an Electron? How Times Have Changed https://profmattstrassler.com/2025/02/06/what-is-an-electron-how-times-have-changed/
A gene-editing technology called 'dual prime editing' was used in plants for the first time. This tool can precisely delete up to two million bases of DNA, or replace a 258,000 base stretch of DNA with a new sequence, in both wheat and tomatoes (so far). https://www.nature.com/articles/s41477-024-01898-3
A large study, performed on 960 female mice, suggests that genetics – and not diet or exercise – is the biggest predictor of which mice live longer than others. https://www.nature.com/articles/s41586-024-08026-3


