Links for 2025-01-24

Jan 24, 2025

AI:

Tech columnist Joanna Stern: [Yesterday at Davos] Anthropic CEO told me around 2027 for AI that's better than a human at everything—or almost everything. Today at Davos: OpenAI's [Kevin Weil] tells me "I don't even know if it'll be 2027. I think it could be earlier." https://www.youtube.com/watch?v=ge-rN5tDaC8
Google DeepMind CEO Demis Hassabis says AGI that is robust across all cognitive tasks and can invent its own hypotheses and conjectures about science is 3-5 years away https://www.youtube.com/watch?v=yr0GiSgUvPU
Anthropic CEO Dario Amodei says the reason he is speaking out about superhuman AI being 2-3 years away is to warn people that significant economic and societal disruptions are on the way https://www.youtube.com/live/lWuQWv-ef1I?si=uJZ1V66v9rcowgE6&t=4264
Anthropic CEO Dario Amodei says there will be "a great acceleration" due to AI, with 100 years of progress in biology over the next 5-10 years, resulting in a doubling of the human lifespan in that timeframe https://www.youtube.com/live/KdFH3uGerBg?si=NvWEz1bJGdJ4Tb3m&t=1521
Cohere CEO Aidan Gomez says breakthroughs are coming this year or next that will see AI models able to continually learn and improve from experience, unlocking dramatic improvements https://youtu.be/qO5i79dk6tE?si=3LX53yFOp7FOiHPA&t=789
Microsoft CEO Satya Nadella says the AI scaling laws multiply intelligence in both compute and algorithmic dimensions: for every 10x increase in compute, capability increases 100x https://youtu.be/lb_ZJylekWo?si=-fE93WEHfPzfZsWp&t=415
Google DeepMind CEO Demis Hassabis says their AlphaFold AI system has performed a billion years of PhD work and they will have their first drugs in the clinic by the end of the year, with cancer, heart disease and neurodegenerative diseases on the list of targets https://www.youtube.com/live/ICv03VysLaE?si=lDSEX3FZ3ijsXaBN&t=228
OpenAI's Brad Lightcap: "if you were to take our models away from most of our software engineers and our researchers and our research engineers, there'd be a mutiny... it's very clear that the models have now crossed the utility threshold for coding" https://x.com/tsarnick/status/1882310220489327027
Chain of Agents: Large language models collaborating on long-context tasks https://research.google/blog/chain-of-agents-large-language-models-collaborating-on-long-context-tasks/
OpenAI Operator: An agent that can use its own browser to perform tasks for you. https://openai.com/index/introducing-operator/
“We Tried OpenAI’s New Agent—Here’s What We Found” — it can do complex tasks that last as long as 20 minutes https://every.to/chain-of-thought/we-tried-openai-s-new-agent-here-s-what-we-found
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback https://arxiv.org/abs/2501.10799
Toward video generative models of the molecular world https://news.mit.edu/2025/toward-video-generative-models-molecular-world-0123
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step https://arxiv.org/abs/2501.13926
Humanity’s Last Exam [project page: https://lastexam.ai/]: A dataset with 3,000 questions developed with hundreds of subject matter experts to capture the human frontier of knowledge and reasoning. https://www.nytimes.com/2025/01/23/technology/ai-test-humanitys-last-exam.html [no paywall: https://archive.is/ovzhU]
Test-time regression: a unifying framework for designing sequence models with associative memory https://arxiv.org/abs/2501.12352
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding https://arxiv.org/abs/2501.13106

Good intuition pump on why the recent merger between LLMs and self-improving RL is a big deal: https://x.com/ptrschmdtnlsn/status/1882480473332736418

In a sentence: "we keep distilling what we conclude after a lot of thinking into what we conclude intuitively in a single step of thinking, which in turn improves what we conclude with a lot of thinking, and so on".
Note that the core trick above is part of what we've all been waiting to see someone figure out for LLMs. The analogy is quite strong. You make up ever harder problems, and your "policy" is just the answer your LLM gives in a single step. The analogue of tree search is letting your LLM do chain-of-thought (CoT) reasoning. The hope is that a model that produces "1500 Elo thoughts" shooting from the hip will, via CoT reasoning, produce "1600 Elo thoughts" or something, and you can distill those back into the model to get a model that thinks 1501 Elo thoughts to start with, and then you can iterate this over and over.
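To make the loop concrete, here is a toy sketch of the arithmetic in the paragraph above. It is only an illustration under loud assumptions: the function name, the fixed CoT boost of 100 Elo, and the idea that each distillation round recovers a fixed 1% of the gap are all hypothetical, chosen to reproduce the 1500 → 1600 → 1501 numbers. Nothing here models how any lab actually trains.

```python
def iterate_distillation(start_elo, cot_boost, distill_fraction, rounds):
    """Toy model: think longer (CoT), then distill part of the gain back.

    All parameters are illustrative assumptions, not measured quantities.
    """
    elo = start_elo
    history = [elo]
    for _ in range(rounds):
        # "Search" step: quality of answers reached with long chain-of-thought.
        elo_with_cot = elo + cot_boost
        # "Distill" step: the single-step policy closes a fraction of the gap.
        elo = elo + distill_fraction * (elo_with_cot - elo)
        history.append(elo)
    return history


# Numbers from the text: 1500 Elo single-step, 1600 Elo with CoT,
# and roughly +1 Elo per distillation round.
history = iterate_distillation(1500.0, 100.0, 0.01, 3)
print(history)
```

In this sketch the gain per round is constant, so progress is linear in the number of rounds. Whether the real CoT boost shrinks, holds, or grows as the base model improves is exactly the open question.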

The course of history now critically depends on whether distillations of o4 and o5 show better-than-linear scaling of financial returns.

https://x.com/yilongqin/status/1882507643669123230

https://x.com/__nmca__/status/1882563755806281986

https://x.com/sama/status/1882478782059327666

Note: According to OpenAI, o3-mini will outperform o1 but be worse than o1-pro.

https://x.com/ArtemisConsort/status/1882165554716570110

AI safety:

Sparks flying in this Davos debate over the safety of building AGI https://www.youtube.com/watch?v=w5iuHJh3_Gk
Initial evidence that reasoning models such as o1 become more robust to adversarial attacks as they think for longer. https://openai.com/index/trading-inference-time-compute-for-adversarial-robustness/
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking https://arxiv.org/abs/2501.13011
Donald Trump rescinds Biden-era executive order on AI safety https://www.theverge.com/2025/1/21/24348504/donald-trump-ai-safety-executive-order-rescind

AI politics:

Trump's AI executive order is out. It's short and to the point: It's the policy of the United States to sustain global AI dominance. https://www.whitehouse.gov/presidential-actions/2025/01/removing-barriers-to-american-leadership-in-artificial-intelligence/
President Trump says he has declared a national energy emergency to unlock the United States' energy resources and make the US "a manufacturing superpower and the world capital of artificial intelligence and crypto" https://www.youtube.com/live/iOWiuXhuaz4?si=GPY4HNpHIDxF8kHV&t=761
Trump shrugs off Musk’s criticism of AI project: ‘He hates 1 of the people in the deal’ https://thehill.com/policy/technology/5103498-trump-musk-clash-artificial-intelligence/
Altman showcases Stargate site 1, Texas, January 2025. https://x.com/sama/status/1882505650594611588
TSMC begins producing 4-nanometer chips in Arizona, Raimondo says https://www.reuters.com/technology/tsmc-begins-producing-4-nanometer-chips-arizona-raimondo-says-2025-01-10/
Pentagon Using AI to Speed Up Military Planning https://techcrunch.com/2025/01/19/the-pentagon-says-ai-is-speeding-up-its-kill-chain/
Anduril Building $1B Weapons Factory in Ohio https://techcrunch.com/2025/01/16/anduril-to-build-its-billion-dollar-weapons-megafactory-in-ohio/
A high-risk, high-reward investment strategy tied to optimism about AI-driven market growth https://www.lesswrong.com/posts/JotRZdWyAGnhjRAHt/tail-sp-500-call-options
"She Is in Love With ChatGPT: A 28-year-old woman with a busy social life spends hours on end talking to her A.I. boyfriend for advice and consolation. And yes, they do have sex." https://www.nytimes.com/2025/01/15/technology/ai-chatgpt-boyfriend-companion.html [no paywall: https://archive.is/YzftO]

Science and Technology:

The intuition behind the Free Energy Principle and what it means for the brain to "predict sensory observations using a generative model" https://www.youtube.com/watch?v=iPj9D9LgK2A
Physicists Discover Hidden Quantum Forces That Could Supercharge Your Devices https://attheu.utah.edu/facultystaff/brand-new-physics-for-next-generation-spintronics/
Nanotechnology Milestone: DNA Motors Reach 30 nm/s Speeds https://pmc.ncbi.nlm.nih.gov/articles/PMC11739693/
Curious blue rings in trees and shrubs reveal cold summers of the past — potentially caused by volcanic eruptions https://www.eurekalert.org/news-releases/1070601
Science Corp. has created its own line of neurons that it's fusing with an animal's existing neurons via an implant, and we may end up with monkeys playing poker. https://www.corememory.com/p/science-corp-aims-to-plant-ideas

Ukraine:

It's pretty crazy that Ukraine is now blowing up more infrastructure in Russia with its own domestically produced weapons than vice versa. Every few days an oil refinery or depot goes up in flames. There are so many videos on Telegram filmed by Russians of crazy night raids by Ukrainian drones that it has become boring normality. And Ukraine is still ramping up its drone production. https://x.com/NOELreports/status/1882566777219268964
NATO Secretary General announces that Europe will finance the purchase of American weapons for Ukraine. Mark Rutte stated: "We need the U.S. to continue supporting Ukraine, and if a new Trump administration is willing, Europe will cover the costs." https://www.reuters.com/world/europe/davos-nato-chief-rutte-reaffirms-need-step-up-support-ukraine-2025-01-23/
Trump calls for $1 trillion Saudi investment, lower oil prices: "If the price came down, the Russia-Ukraine war would end immediately. Right now, the price is high enough that that war will continue - you got to bring down the oil price," Trump said, speaking remotely by video link. https://www.reuters.com/business/energy/trump-calls-1-trillion-saudi-investment-lower-oil-prices-2025-01-23/

Axis of Ordinary
