Links for 2024-12-25

Dec 25, 2024

AI:

What are the strongest arguments for very short timelines? — Some snippets from the answers: Both long time-horizon tasks and self-directed learning are fairly easy to reach; Efficient decentralized training should be possible; Scaling is going to continue rapidly showing new results at least until 2026-2029, “after that the rate of change in capabilities goes down. Probably 10 more years of semiconductor and algorithmic progress after that are sufficient to wrap it up though, so 2040 without AGI seems unlikely.”; Many things might go wrong with this scenario, such as government intervention. https://www.lesswrong.com/posts/oC4wv4nTrs2yrP5hz/what-are-the-strongest-arguments-for-very-short-timelines
Why do pre-o3 LLMs struggle with generalization tasks like ARC Prize? It's not what you might think. LLMs are bad at ARC because they can’t perceive large text grids. Even if a model is capable of the reasoning and generalization required, it can still fail just because it can't handle enough tokens. When models can't understand the task format, the benchmark can mislead, introducing a hidden threshold effect. https://anokas.substack.com/p/llms-struggle-with-perception-not-reasoning-arcagi
New research from Meta FAIR: Large Concept Models (LCM) is a fundamentally different paradigm for language modeling that decouples reasoning from language representation, inspired by how humans can plan high-level thoughts to communicate. https://ai.meta.com/research/publications/large-concept-models-language-modeling-in-a-sentence-representation-space/ (Third-party presentation: https://www.youtube.com/watch?v=2ZLd0uZvwbU)
LANG-JEPA is an experimental language model architecture that operates in "concept space" rather than "token space." https://github.com/jerber/lang-jepa
Deliberation in Latent Space via Differentiable Cache Augmentation https://arxiv.org/abs/2412.17747
Meta presents Improving Factuality with Explicit Working Memory https://arxiv.org/abs/2412.18069
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought https://arxiv.org/abs/2412.17498
Training Large Language Models to Reason in a Continuous Latent Space https://arxiv.org/abs/2412.06769
LLM-driven genetic programming has turned out to be way more powerful than anybody expected. https://params.com/@jeremy-berman/arc-agi
Formal Mathematical Reasoning: A New Frontier in AI — “We advocate for formal mathematical reasoning, grounded in formal systems such as proof assistants…Feedback can serve as learning signals for RL, while verifiability enables LLMs to tackle tasks requiring rigorous reasoning, like theorem proving and software/hardware design.” https://arxiv.org/abs/2412.16075
Automating the Search for Artificial Life with Foundation Models https://pub.sakana.ai/asal/
Recurrent Drafter for Fast Speculative Decoding in Large Language Models https://arxiv.org/abs/2403.09919
"Maximum diffusion reinforcement learning", Berrueta et al 2023 https://arxiv.org/abs/2309.15293
Generative AI for Economic Research: LLMs Learn to Collaborate and Reason https://genaiforecon.substack.com/p/llms-learn-to-collaborate-and-reason
QVQ: Inference-time scaling for visual multimodal tasks. The first open multimodal o1-like model, which can be seen as the visual counterpart to QwQ. Much like QwQ, QVQ demonstrates intriguing thought processes and has achieved promising results on some challenging tasks. https://qwenlm.github.io/blog/qvq-72b-preview/
OpenAI's Sébastien Bubeck says that an AI model will "for sure" win a gold medal at the International Mathematical Olympiad next year https://www.youtube.com/live/H3TnTxVKIOQ?si=aYTMYTAmOEcJZFG1&t=667
Sam Altman: AI in April 2023 was primitive compared to what we have now and in another 18 months the gap between now and then will be even bigger and the rate of adoption and integration into society is unprecedented. The arrival of superintelligence will be heralded by a 10x increase in the rate of scientific discovery and technological advancement, where a decade's progress will compound every year https://youtu.be/DfOt_cqXCFI?si=HLqlSCtvMAieNbYB&t=434
Gemini 2.0 Flash: Multimodal understanding of audio/music, + in context understanding of complex music editing software UI to help Jon Taylor get the desired effects with their audio mixing. https://x.com/kwindla/status/1871268454826996121
“last night i demo’d @openai o1 pro to my most skeptical friend - the results were extraordinary. it was amazing to see o1 pro successfully tackle humanities research, and how much better it was than 4o / o1. i also had a ton of fun watching my friend come around to AI!” https://x.com/kaysorin/status/1871264540698239050
Chinese AI Companies Are Catching Up Despite U.S. Restrictions https://www.thewirechina.com/2024/12/08/chinese-ai-companies-are-catching-up-despite-u-s-restrictions-chinas-ai-models/
Don’t Look Now, but China’s AI Is Catching Up Fast https://www.wsj.com/tech/ai/china-ai-advances-us-chips-7838fd20 [no paywall: https://archive.is/cUSXo]

I expect the next logical thing to happen will be to both scale RL and the underlying base models and that will yield even more dramatic performance improvements. This is a big deal because it suggests AI progress in 2025 should speed up further relative to 2024…I think basically no one is pricing in just how drastic the progress will be from here.

— Jack Clark, co-founder of Anthropic

https://x.com/elonmusk/status/1870900539367752111

The reasons could be national security, fear of nuclear war (if, say, China feels threatened), or preventing existential risk to humanity.

Compute:

While Meta and Amazon are building multi-gigawatt data centers, Microsoft are spending billions on fiber to connect all their data centers into one high-bandwidth mega-cluster https://www.youtube.com/watch?v=QVcSBHhcFbg
“Colossus was fully operational in 122 days and started running workloads just 19 days after the first servers were delivered. Soon, xAI will double to 200K NVIDIA Hopper GPUs with NVIDIA Spectrum-X Ethernet networking.” https://x.ai/blog/series-c
Scott Aaronson explains how computation using closed timelike curves would be able to solve even NP-complete problems easily https://www.youtube.com/watch?v=A6iDHLMRvwg

Cosmology:

A revolution in cosmology? Supernovae evidence for foundational change to cosmological models? Dark energy 'doesn't exist' so can't be pushing 'lumpy' universe apart, physicists say. It seems the key innovation is to make time pass at varying speeds in different parts of the universe? https://phys.org/news/2024-12-dark-energy-doesnt-lumpy-universe.html
“In universes where this was true, you could place your computational megastructures inside a void. They would appear to accelerate through time more quickly. This would be useful if you had a Very Important Problem or were trying to outcompute someone living in a slow-well.” https://x.com/AndrewCurran_/status/1871266019316232246
From time to timescape - Einstein's unfinished revolution https://arxiv.org/abs/0912.4563
Conformally Friedmann-Lemaitre-Robertson-Walker cosmologies https://arxiv.org/abs/1502.02758

Technology:

A light-driven hybrid nanoreactor that merges natural efficiency with cutting-edge synthetic precision to produce hydrogen—a clean and sustainable energy source. https://news.liverpool.ac.uk/2024/12/17/significant-advancement-made-in-engineering-biology-and-clean-energy/
A solid-state DNA origami register that facilitates faster execution of molecular algorithms. https://pubs.acs.org/doi/full/10.1021/acscentsci.4c01557

Miscellaneous:

The next massive volcanic eruption is coming. It will cause chaos the world is not prepared for https://edition.cnn.com/2024/12/24/climate/massive-volcano-eruption-climate/index.html
Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility https://www.lesswrong.com/posts/H27mzmW6G5ywyrJBn/vegans-need-to-eat-just-enough-meat-emperically-evaluate-the

Politics:

Germany joins EU’s ‘ultra-low’ fertility club https://marginalrevolution.com/marginalrevolution/2024/12/eu-facts-of-the-day.html
Denmark to boost Greenland defence after Trump repeats desire for US control https://www.bbc.com/news/articles/ckgzl19n9eko

https://x.com/levie/status/1871586990598279266

Ukraine:

Ukrainian attack drones reportedly struck Russia's Millerovo air base in Rostov Oblast https://x.com/Osinttechnical/status/1871292186114822365
SBU drones have blown up an ammunition depot at the Kadamovskoye training ground in Rostov region. https://x.com/wartranslated/status/1871894793501937684
“Ukrainian forces, supported by tanks, conducted a successful assault on Russian positions in the Kharkiv region. During the operation, a significant number of Russian soldiers were taken captive.” https://x.com/NOELreports/status/1871886414239138010
Storming Ukrainian positions in pink "Zhiguli" cars is the new trend of the Russian army. https://x.com/wartranslated/status/1871587525741174834
On Christmas Eve, Russians launched a missile strike on a civilian multi-story building in Kryvyi Rih. https://x.com/bayraktar_1love/status/1871637816242999493

Axis of Ordinary
