Links for 2026-02-22

Feb 22, 2026

AI

The inside view at the [frontier labs] of what’s going to happen... the world is not prepared. We’re going to have extremely capable models soon. It’s going to be a faster takeoff than I originally thought.

— Sam Altman [https://www.youtube.com/live/qH7thwrCluM?t=2313s]

We’re currently training a new model for which a primary focus is increasing the level of rigor in its thinking, with the goal that the model can think continuously for many hours and remain highly confident in its conclusions. When the First Proof problems were announced, it seemed like the perfect testbed, so over the weekend I tried it out. Already it was able to solve two of the problems (#9 and #10). As it trained, it became increasingly capable, eventually solving–in our estimation–at least three more. We were particularly pleased when it solved #6 and then, two days later, #4, as those problems were from fields familiar to many of us. It’s pretty incredible to watch a model get tangibly smarter day by day.

— James R. Lee (OpenAI Researcher, Reasoning) [https://openai.com/index/first-proof-submissions/]

Professor of mathematics Daniel Litt writes about the future of math and his evolving views of AI progress https://www.daniellitt.com/blog/2026/2/20/mathematics-in-the-library-of-babel
Did Claude 3 Opus align itself via gradient hacking? https://www.lesswrong.com/posts/ioZxrP7BhS5ArK59w/did-claude-3-opus-align-itself-via-gradient-hacking
DreamDojo: The first robot world model of its kind that demonstrates strong generalization to diverse objects and environments after post-training. https://dreamdojo-world.github.io/
From a handful of comments, LLMs can infer where you live, what you do, and your interests; then search for you on the web. https://arxiv.org/abs/2602.16800
Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens https://arxiv.org/abs/2602.13517
Claude Code Security: It scans codebases for vulnerabilities and suggests targeted software patches for human review, allowing teams to find and fix issues that traditional tools often miss. https://www.anthropic.com/news/claude-code-security
AI/ML, multiscale modeling, and emergence https://nanoscale.blogspot.com/2026/02/aiml-multiscale-modeling-and-emergence.html
The Country That’s Madly in Love With AI https://www.politico.com/news/magazine/2026/02/21/south-korea-ai-popular-why-00789618

Beware AI phase transitions

Something many people miss about AI progress is that there can be sudden jumps in usefulness despite only minor gains in a model’s intelligence. Incremental gains can be exponentially valuable.

Increasing the single-step success rate of a model from 99% to 99.9% can seem irrelevant, but for a task that requires 50 steps, and assuming each step succeeds or fails independently, it lifts the end-to-end success rate from about 61% to about 95%. That is the difference between something not much better than a coin flip and production-ready autonomy. Reducing the error rate from 1% to 0.1% might require exponentially more compute, but the payoff might be a system that crosses the threshold from brittle copilot to autonomous agent.
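The arithmetic behind the 50-step example can be checked with a short sketch. It assumes every step succeeds or fails independently and that a single failed step sinks the whole task, which is a simplification; the function name `task_success_rate` is made up for illustration.

```python
def task_success_rate(per_step: float, steps: int) -> float:
    """Probability that all `steps` steps succeed.

    Assumes steps are independent and that one failure fails the task.
    """
    return per_step ** steps


# Compare the two per-step reliabilities from the text on a 50-step task.
for p in (0.99, 0.999):
    print(f"per-step {p:.1%} -> 50-step task {task_success_rate(p, 50):.1%}")
```

Under these assumptions, a tenfold cut in the per-step error rate moves the 50-step task from roughly three successes in five to roughly nineteen in twenty. Real agents can sometimes notice and recover from mistakes, so actual numbers will differ.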

We’ve seen this with Claude Opus 4.5. It was an inflection point for adoption despite not being vastly smarter than the previous version. It just crossed a critical threshold.

Something very similar is true for human evolution. For hundreds of thousands of years, archaic humans were working with the same stone tools. Then a threshold was crossed. We stopped compounding errors in long-horizon tasks and started compounding correctness.

The gap between an average person and someone like John von Neumann does not require a dramatically new brain architecture or a vastly higher number of neurons. Yet this difference is what enables someone to contribute to the development of nuclear weapons instead of being a garbage collector.

Next time you wonder why AI labs would bother spending exponentially more compute on minimal absolute gains, remember that a small delta in per-step reliability could make the difference between a brittle tool and something that can recursively self-improve.

Science and Technology

Battery storage costs fell 25% in 2025. https://www.semafor.com/article/02/19/2026/battery-storage-prices-drop-to-record-low-report-finds
A fluid can store solar energy and then release it as heat months later https://arstechnica.com/science/2026/02/dna-inspired-molecule-breaks-records-for-storing-solar-heat/
Element Biosciences announced that its high-throughput benchtop sequencing device called VITARI can deliver a whole genome for $100. https://www.sandiegouniontribune.com/2026/02/19/scrappy-san-diego-startup-goes-toe-to-toe-with-gene-sequencing-giant-illumina/
Microsoft’s Glass Chip Holds Terabytes of Data for 10,000 Years https://gizmodo.com/microsofts-glass-chip-holds-terabytes-of-data-for-10000-years-2000723455
Bacteria Frozen Inside 5,000-Year-Old Ice Cave Is Crazy Resistant to Antibiotics https://gizmodo.com/bacteria-frozen-inside-5000-year-old-ice-cave-is-crazy-resistant-to-antibiotics-2000723002

Ukraine

In Ukraine, military ground robots reportedly carry out over 7,000 missions per month. https://www.pravda.com.ua/eng/news/2026/02/17/8021446/
Russian advances rank among the slowest in modern warfare: "CSIS said the rate of Russian advance was just 70 metres per day in its year-and-a-half-long offensive on Pokrovsk and Myrnohrad, a slower advance than any army in more than 100 years of warfare, including the Battle of the Somme." https://www.thetimes.com/world/russia-ukraine-war/article/ukraine-russia-military-army-news-latest-8gs2vjwtz [no paywall: https://archive.is/aLeDO]

It is a pretty amazing achievement for Ukraine to hit one of Russia's most important and best-protected defense industrial sites with a domestically developed missile produced by a startup.

The Flamingo missile blew a 30x24 meter hole through the roof of a workshop at the Votkinsk plant. The strike hit Building 19, the stamping and electroplating workshop, where missile body components are shaped and coated before final assembly. The nature of the destruction and the configuration of the collapse indicate that the epicenter of the explosion was inside the building, so the interior has probably burned out completely.

The plant produces ballistic missiles, including Iskander-M, Yars, Topol, and Bulava, as well as components for Oreshnik and Kinzhal.

Source: https://t.me/kiber_boroshno/12582

P.S. This is one of 621 medium- and long-range strikes I have logged since 21 October 2025. In recent days, many high-value targets have been hit, such as the Neftegorsk gas plant, where two stabilization columns were damaged, leading to a long outage.

Axis of Ordinary
