Links for 2025-04-17

Apr 16, 2025

o3 and o4-mini

OpenAI releases o3 and o4-mini, and they feature new emergent capabilities.

“We didn't train the model to use certain strategies directly. We didn't say 'simplify your solution' or 'double check'. It just organically learns to do these things.”

Highlights:

The models can agentically use and combine every tool within ChatGPT—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images.
In evaluations by external experts, o3 makes 20 percent fewer major errors than OpenAI o1 on difficult, real-world tasks—especially excelling in areas like programming, business/consulting, and creative ideation. Early testers highlighted its analytical rigor as a thought partner and emphasized its ability to generate and critically evaluate novel hypotheses—particularly within biology, math, and engineering contexts.
At equal latency and cost with OpenAI o1, o3 delivers higher performance. For example, on the 2025 AIME math competition, the cost-performance frontier for o3 strictly improves over o1, and similarly, o4-mini's frontier strictly improves over o3‑mini. More generally, OpenAI expects that for most real-world usage, o3 and o4-mini will also be both smarter and cheaper than o1 and o3‑mini, respectively.
For the first time, these models can integrate images directly into their chain of thought. They don’t just see an image—they think with it.

Even more:

OpenAI Codex CLI: Lightweight coding agent that runs in your terminal https://github.com/openai/codex/ (video: https://www.youtube.com/watch?v=FUq9qRwrDrI)
Vibe Check: o3 Is Here—And It’s Great https://every.to/chain-of-thought/vibe-check-o3-is-out-and-it-s-great
Details about METR’s preliminary evaluation of o3 and o4-mini https://metr.github.io/autonomy-evals-guide/openai-o3-report/
Tyler Cowen: “I think it is AGI, seriously.” https://marginalrevolution.com/marginalrevolution/2025/04/o3-and-agi-is-april-16th-agi-day.html
“We tested a pre-release version of o3 and found that it frequently fabricates actions it never took, and then elaborately justifies these actions when confronted.” https://transluce.org/investigating-o3-truthfulness

AI

Google Is Winning on Every AI Front https://www.thealgorithmicbridge.com/p/google-is-winning-on-every-ai-front
“We put the latest top AI models—GPT-4.1, Gemini 2.5 Pro, Llama-4 Maverick, and more—to the test in Ace Attorney, to see if they could shout Objection! ⚖️, turn the case around, and uncover the truth behind the lies…When it comes to cost-efficiency, Gemini 2.5 Pro redefines the value.” https://x.com/haoailab/status/1912231343372812508
Asynchronous RL completely eliminates communication bottlenecks. INTELLECT-2: Launching the First Globally Distributed Reinforcement Learning Training of a 32B Parameter Model https://www.primeintellect.ai/blog/intellect-2
AI used for skin cancer checks at London hospital https://www.bbc.com/news/articles/czd3ygd7mrno
LLMs and Beyond: All Roads Lead to Latent Space https://aiprospects.substack.com/p/llms-and-beyond-all-roads-lead-to
Ctrl-Z: Controlling AI Agents via Resampling — “Our core findings: the control techniques that we’d previously explored in more toy settings generalize well to this more realistic setting, but we can do even better by developing novel techniques that exploit the multi-step nature of the setting.” https://www.lesswrong.com/posts/LPHMMMZFAWog6ty5x/ctrl-z-controlling-ai-agents-via-resampling
AI-Enabled Coups: How a Small Group Could Use AI to Seize Power https://www.lesswrong.com/posts/6kBMqrK9bREuGsrnd/ai-enabled-coups-a-small-group-could-use-ai-to-seize-power-1

Miscellaneous

A super clear map of the things that need solving in science and R&D https://www.gap-map.org/
OpenAI is Building a Social Network https://www.theverge.com/openai/648130/openai-social-network-x-competitor
Mystery Objects From Other Stars Are Visiting Our Solar System. These Missions Will Study Them Up Close https://singularityhub.com/2025/04/15/mystery-objects-from-other-stars-are-visiting-our-solar-system-these-missions-will-study-them-up-close/

Progress

1. Steel Production Efficiency: Labor required in steel manufacturing decreased roughly 1,000×—from over 3 man-hours per ton in 1920 to just 0.003 man-hours per ton by 2000.

2. Agricultural Productivity (Corn Yields): U.S. corn yields increased nearly 7× over the past century, from approximately 26 bushels per acre (~1924) to around 177 bushels per acre in 2023.

3. Solar Panels & Electricity: Solar PV module prices fell from about $106 per watt (1976, inflation‑adjusted) to roughly $0.38 per watt by 2019—a 99.6% reduction. Meanwhile, utility‑scale solar electricity’s levelized cost (LCOE) has dropped from around $496/MWh in 2009 to about $59/MWh by 2024 (roughly 5.9 cents/kWh).

4. Lithium-Ion Batteries: Battery cell costs declined from roughly $7,500/kWh (inflation-adjusted) in 1991 to about $181/kWh in 2018 (a ~41× decrease), with average pack prices reaching a new low of $115/kWh in 2024.

5. Energy Efficiency: Technologies like LED lighting (~100× more efficient than early incandescents) and improved appliances/vehicles drastically reduced energy consumption per unit of service.

6. Industrial Robotics: The median price of robotic arms has plunged from roughly $50,000 in 2016 to around $12,000–$13,000 in 2021, with trends suggesting prices may soon approach $10,000.

7. Air Travel: Flying has become ~90-95% cheaper (inflation-adjusted) since the 1940s, as well as significantly faster and over 1,000 times safer.

8. Space Launch: Launch costs have tumbled from roughly $20,000 per kilogram (typical of the Space Shuttle era) to about $2,000–$2,700 per kilogram for SpaceX’s Falcon 9—with future systems (such as Starship) targeting costs potentially as low as $10 per kilogram.

9. Data Storage: The cost per megabyte plummeted more than 300 billion-fold, from ~$5.2 million in 1960 (core memory) down to below $0.000015 by 2022 (HDDs), making mass data storage virtually free.

10. Internet Speeds: Home internet has surged from dial-up speeds (~56 kbps in the late 1990s) to US median fixed broadband speeds around 353 Mbps (late 2024), representing roughly a 6,300× increase in speed.

11. Communication Capacity: Data transmission capacity exploded, with transatlantic fiber optic cables carrying trillions of times more data per second than early telegraph cables.

12. Genome Sequencing: Cost per human genome fell from ~$95-100 million in 2001 to ~$500-600 by 2023—a ~160,000-200,000× reduction—with ongoing efforts pushing towards a $100-$200 genome.

13. Medical Imaging Resolution: The minimum detectable size in medical imaging has shrunk by roughly 100×, implying that the detectable volume has improved by about 1,000,000× compared to earlier methods.

14. Vaccine Development: Where it once took decades (such as with the polio vaccine), the advent of new technologies like mRNA platforms enabled the rapid design and mass-production of effective COVID-19 vaccines in under 12 months.

15. Algorithmic Progress: With today’s algorithms, an average 1994 desktop computer would have beaten the world chess champion.

16. Language Model Costs: The cost per million tokens for AI language models fell from about $20 (GPT-3) down to as little as $0.50 with models like Google Gemini 2.0 Flash.

17. AI Training Efficiency: Advances in hardware and algorithms have reduced training costs dramatically; what cost millions to train in 2020 (e.g., GPT-3) can now be achieved for a few hundred thousand dollars, and retraining a small GPT-2 can cost as little as $20.
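
As a quick sanity check on the arithmetic, the headline factors for several of the items above can be recomputed from their quoted start and end values. This is only a sketch: the numbers are copied from the list as given, and the names used here (quoted, fold_change, percent_reduction) are made up for illustration.

```python
# Start and end values as quoted in the list above.
# Each entry is (start, end) in the unit named in the label.
quoted = {
    "steel, man-hours per ton (1920 -> 2000)": (3.0, 0.003),
    "corn, bushels per acre (~1924 -> 2023)": (26.0, 177.0),
    "solar PV modules, $ per watt (1976 -> 2019)": (106.0, 0.38),
    "battery cells, $ per kWh (1991 -> 2018)": (7500.0, 181.0),
    "data storage, $ per MB (1960 -> 2022)": (5.2e6, 0.000015),
    "home internet, bits per second (late 1990s -> 2024)": (56e3, 353e6),
}


def fold_change(start, end):
    """Return how many times larger the bigger value is than the smaller one."""
    return max(start, end) / min(start, end)


def percent_reduction(start, end):
    """Return the percentage drop from start to end (assumes end <= start)."""
    return 100.0 * (1.0 - end / start)


for label, (start, end) in quoted.items():
    print(f"{label}: {fold_change(start, end):,.1f}x")

# The solar item also quotes a percentage, so recompute that as well.
solar_start, solar_end = quoted["solar PV modules, $ per watt (1976 -> 2019)"]
print(f"solar PV price reduction: {percent_reduction(solar_start, solar_end):.1f}%")
```

Each printed factor is just the ratio of the two quoted endpoints, so it is only as good as the figures in the list; items quoted as ranges (such as genome sequencing) are left out to keep the sketch simple.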

Why Human Intelligence May Be Just a Glimmer of What's Possible

The emergence of artificial intelligence capable of vastly exceeding human cognitive abilities – sometimes termed "superintelligence" – is often framed as an extraordinary claim requiring extraordinary evidence. However, examining the constraints on human intelligence imposed by its evolutionary origins, and contrasting them with the potential of artificial systems, suggests that the possibility of superhuman AI should perhaps be our default assumption.

1. The Constraints of Biological Evolution

Human intelligence is the product of natural selection, a powerful but fundamentally blind optimization process. Evolution operates without foresight, goals, or the ability to jump significant fitness gaps. It tinkers incrementally, selecting only for traits that offer immediate survival and reproductive advantages within specific environmental contexts.

No Goal of Maximal Intelligence: Evolution didn't aim to create the smartest possible entity; it favoured traits sufficient for thriving in ancestral environments. Human-level general intelligence, once achieved, may represent a point where the immediate selective pressures for further cognitive enhancement diminished or were balanced by costs.
Physical and Energetic Costs: Biological brains are metabolically expensive. Further increases in size or processing power face significant biological hurdles. For instance, larger infant head sizes correlate with increased risks during childbirth, potentially creating a strong selective pressure against significantly bigger brains, regardless of the potential cognitive benefits.
Path Dependence: Evolution builds upon existing structures. There might be vastly different and more efficient cognitive architectures achievable through design that are simply inaccessible via incremental biological mutation from our current state.
No 'Ceiling' Apparent: Observation of other species doesn't show them tightly clustered just below human intelligence, as if bumping against a natural cognitive ceiling. Instead, there's a vast spectrum, suggesting that the specific level humans reached isn't necessarily a universal limit.

2. The Advantages of Artificial Systems

Artificial intelligence, particularly future Artificial General Intelligence (AGI), is not bound by these biological constraints. Its potential stems from its nature as designed technology operating on a fundamentally different substrate.

Raw Speed (Processing & Communication): Biological neurons operate relatively slowly (firing rates on the order of 10^2 Hz, with signals travelling along axons at up to ~120 m/s). Digital processors already operate orders of magnitude faster (CPU clocks at ~10^9 Hz), and future substrates like optical computing could move signals at close to the speed of light. This speed advantage extends to communication; humans typically process language at rates equivalent to ~10-50 bits per second, whereas digital interconnects run at rates like 10^11 bps (e.g., InfiniBand), enabling vastly faster information exchange between AI instances.
Scalability & Duplicability: Creating a new human expert takes decades of development and learning. Once a highly capable AI exists, it can potentially be copied near-instantly millions or billions of times. This eliminates bottlenecks in acquiring talent. Entire expert teams, or even entire organizational structures proven to be effective, could be replicated on demand. This ability to turn capital into compute, and compute into top-tier "talent" or cognitive work, is transformative.
Editability & Recursive Self-Improvement: Humans lack 'root access' to their own cognitive hardware (the brain). Understanding and modifying our 'wetware' is incredibly difficult and risky. AIs, being software and hardware systems, can be designed for introspection and modification. They can possess their own source code, utilize version control, run controlled experiments on copies, debug, and implement improvements systematically. Risky modifications can be tested on backups. This allows for a potentially explosive cycle of recursive self-improvement that biological systems cannot match.
Memory (Working & Long-Term): Human working memory is famously limited (holding perhaps 3-7 'chunks' of information consciously). While humans use external aids, an AI's working memory could potentially hold vast datasets simultaneously (e.g., the entirety of Wikipedia, which is ~21 GB of text, easily fits within modern RAM). Furthermore, digital memory is high-fidelity and less prone to the decay, biases, and confabulation inherent in human recall. AIs could have near-perfect, instantly searchable access to all data they've processed.
Collective Intelligence & Knowledge Merging: Humans share knowledge slowly and imperfectly through language and demonstration. AIs can potentially merge knowledge directly. Insights gained by one instance could be integrated into a central model or shared perfectly with copies via high-bandwidth internal communication (potentially operating on latent representations rather than lossy natural language). This enables vastly superior organizational learning, akin to a "collective brain" where innovations and knowledge propagate instantly and without degradation, overcoming the limitations of human social learning and knowledge transmission across generations.
Architectural Freedom: AIs are not restricted to the specific neural architecture produced by primate evolution. Designers can explore radically different architectures optimized for different kinds of computation or goals, potentially leading to forms of intelligence qualitatively different from and superior to human cognition.
Focus & Endurance: AIs lack biological drives, needs (like sleep), and the inherent emotional/cognitive biases that affect human decision-making and consistency. They can potentially operate 24/7, maintain focus indefinitely ('unlimited willpower'), and pursue objectives with a level of rationality and consistency far exceeding human capabilities.
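
The speed and bandwidth gaps quoted in the list above can be expressed as simple ratios. This is a rough sketch using only the figures from the text; the constant names are illustrative, and comparing a CPU clock rate with a neuron firing rate is a loose analogy rather than a like-for-like measurement.

```python
# Figures as quoted in the "Raw Speed" item above.
NEURON_RATE_HZ = 1e2               # order-of-magnitude neuron firing rate
CPU_CLOCK_HZ = 1e9                 # order-of-magnitude CPU clock rate
HUMAN_LANGUAGE_BPS = (10.0, 50.0)  # quoted range for human language processing
LINK_BPS = 1e11                    # quoted rate for a fast digital interconnect

# Raw rate gap between a CPU clock and a neuron.
clock_ratio = CPU_CLOCK_HZ / NEURON_RATE_HZ

# Communication gap, using both ends of the quoted human range.
bandwidth_ratio_low = LINK_BPS / HUMAN_LANGUAGE_BPS[1]   # against 50 bps
bandwidth_ratio_high = LINK_BPS / HUMAN_LANGUAGE_BPS[0]  # against 10 bps

print(f"clock rate vs. firing rate: {clock_ratio:.0e}x")
print(f"interconnect vs. human language: "
      f"{bandwidth_ratio_low:.0e}x to {bandwidth_ratio_high:.0e}x")
```

On these figures the processing gap is about seven orders of magnitude and the communication gap about nine to ten, which is the scale the argument relies on.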

Conclusion: A New Default?

While human intelligence is a remarkable product of evolution, it carries the indelible marks of its contingent, constrained, and unguided origin. Artificial intelligence, freed from many of these biological limitations and benefiting from the unique advantages of digital systems – speed, scalability, editability, perfect recall, and superior collective coordination – possesses a fundamentally higher performance ceiling. Existing narrow AI already demonstrates superhuman capabilities in specific domains (e.g., chess, Go, protein folding). Considering these factors, the emergence of AI that surpasses human general intelligence across most or all domains appears not as a speculative fantasy, but as a plausible, perhaps even likely, consequence of continued technological development. The burden of proof might lie not in demonstrating that superhuman AI is possible, but in arguing why it wouldn't be the eventual outcome.

Ukraine

North Korea Sent Russia Over 15,000 Containers of Munitions: Analysis https://www.newsweek.com/north-korea-russia-munitions-shipments-ukraine-war-2059841
Ukraine has unveiled a new cruise missile called “Bars” — a domestically mass-producible "missile-drone" with a range of 700–800 km. Its key advantage: scalable production for frequent strikes deep inside Russia. https://www.bbc.com/ukrainian/articles/cn8v5qq3xe0o
“A Russian army supply route in the Belgorod region, numerous transport vehicles destroyed by drones. According to the Russian source who published the video, the number of destroyed vehicles on this stretch of road has doubled since the footage was recorded, which happened in just one day. The vehicle seen in the video was also destroyed.” https://x.com/bayraktar_1love/status/1912231131115663404

meika loofs samorzewski

Apr 17, 2025 (edited)

"surpasses human general intelligence" it does not do this, well, depends on the definitional framework, so more than an individual abilities, yes, but this is not because it is AGI but because it maps and generates from all human social learning in aggregate, which a single human cannot do, the best would be a librarian who has read all the books in their charge.

In a differing frame, the aggregate, it surpasses only where it actually adds to that knowledge that is more than a new instantiation of a genre or a mash-up. And yes, this is a grey zone, which may or may not include elements which are interpolated afresh or ride trajectories in the vector phase space which are as yet unexplored by humans (as individuals or in their social learning inter-individualness outcomes {groups/religions/cultures/institutions}).

Incompleteness theorems and complexity theory would suggest that model collapse is unavoidable (even before AI we had this in social learning but we called it metaphysics and/or paranoia).

Axis of Ordinary
