Links for 2026-05-15
AI
Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs https://arxiv.org/abs/2605.12460
Efficient Pre-Training with Token Superposition https://arxiv.org/abs/2605.06546
The UK government’s AI Security Institute received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack that they estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts. Quote: The length of tasks frontier models can autonomously complete in our narrow cyber suite has been doubling every few months. This doubling rate has become faster over time, and recent models exceeded our previous trends. https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber-capability-advancing
Mythos helped crack macOS https://blog.calif.io/p/first-public-kernel-memory-corruption
physics-intern: an autonomous agentic framework for physics research — It takes Gemini 3.1 Pro from 17.7% to 31.4% on CritPt, a new SOTA on one of the hardest benchmarks for LLMs. https://huggingface.co/spaces/huggingface/physics-intern
Can frontier AI coding agents write a complete Game Boy Advance emulator from scratch in 24 hours? GPT-5.5’s emulator runs games best, with Claude Sonnet 4.6 and Opus 4.7 close behind. Gemini 3.1 Pro failed to produce a working emulator. https://gbaeval.com/
Prime Intellect started automating AI research on nanogpt-speedruns and achieved new records. For 2 weeks GPT-5.5 and Opus 4.7 iterated on novel optimizations. 10k runs & 14k H200 hours. Both agents beat the human baseline. Opus now holds the record at 2930 steps. https://www.primeintellect.ai/auto-nanogpt
Poetiq’s Meta-System built its own coding harness from scratch. It got SOTA on LiveCodeBench Pro. No fine-tuning, no special model access. Just standard APIs. Using Gemini 3.1 Pro, it made a harness that beat all frontier models we tested. https://poetiq.ai/posts/recursive_self_improvement_coding/
Google’s Aletheia appears to have found a correct proof strategy for a serious open problem from the new Kirby/K3 problem list; human mathematicians then checked it, cleaned it up, connected the pieces, added references, and wrote the paper. https://arxiv.org/abs/2605.08122
Flux Matching, a generative modeling paradigm that generalizes diffusion models to vector fields that need not be the score function. https://arxiv.org/abs/2605.07319
Codex with gpt-5.5 xhigh discovered a math trick for full-vocabulary KL distillation https://jonathanc.net/blog/kl-cache-trick
Google DeepMind partners with EVE Online for AI model testing https://arstechnica.com/gaming/2026/05/google-deepmind-partners-with-eve-online-for-ai-model-testing/
Notable Researchers Join $4 Billion Effort to Build Self-Improving A.I. https://www.nytimes.com/2026/05/13/technology/recursive-superintelligence-funding-ai.html [no paywall: https://archive.is/fV8as]
Anthropic has passed OpenAI in business adoption for the first time https://ramp.com/leading-indicators/ai-index-may-2026
Anthropic: “In this post, we present two scenarios for what the world might look like in 2028, when we expect transformative AI systems to have arrived.” https://www.anthropic.com/research/2028-ai-leadership
“Bessent told CNBC he anticipates a big “step-function jump” in upcoming large language model releases from Google’s Gemini and OpenAI.” https://www.cnbc.com/2026/05/14/us-china-ai-rules-bessent-us-lead.html
“AI diffusion” should not be taken for granted. If frontier AI becomes both dangerous and compute-constrained, access will likely be rationed by trust, money, infrastructure, and geopolitics—not by ordinary market demand. https://writing.antonleicht.me/p/cut-off
Out-of-Control AI warfare
This article claims that two Ukrainian long-range drones may have autonomously selected and struck an oil facility in Rezekne, Latvia, after being knocked off course by Russian electronic warfare. If true, this could be one of the first known cases in warfare where an AI-enabled weapon chose a target without direct human control.
Janis Sarts, head of NATO’s Strategic Communications Centre of Excellence, believes that the drones were programmed to seek Russian oil infrastructure, lost their way, visually identified a similar-looking oil facility in Latvia, and struck it autonomously.
A Latvian army drone expert made a similar point: long-range drones increasingly contain “the germs of artificial intelligence,” meaning they can search for preprogrammed target types, but may not reliably understand borders or context. In this interpretation, the incident was not Russia redirecting the drones, but an AI targeting failure.
Read more: https://www.theglobeandmail.com/world/article-latvian-government-collapses-after-ukrainian-drones-possibly/ [no paywall: https://archive.is/3jwvg]
Computer Science
Breakthrough relating zero-knowledge proofs to Gödel’s Incompleteness Theorem. https://www.quantamagazine.org/how-unknowable-math-can-help-hide-secrets-20260511/
Researchers say they pulled data from air-gapped machines by faking CPU-heavy workloads and reading the magnetic signals. https://arxiv.org/abs/1802.02700
Behavior
Don’t confuse selected-for behavior with intentional, world-model-based behavior. https://www.lesswrong.com/posts/GhhNswGB6butBhmE6/optimisation-selective-versus-predictive
Sawtooth Problems https://www.lesswrong.com/posts/iyLirpAeQotmZK4QC/sawtooth-problems
Engineering
Scaling Reversible Cryopreservation: Engineering an Alternating Magnetic Field System for Organ-Scale Rewarming https://www.untillabs.com/blog/rewarming
Fiber optic cables can eavesdrop on nearby conversations https://www.science.org/content/article/fiber-optic-cables-can-eavesdrop-nearby-conversations
Researchers “reprogram” materials by quickly rearranging their atoms https://news.mit.edu/2026/researchers-reprogram-materials-quickly-rearranging-their-atoms-0513
Physics
Watching bullets impact glass at 10 million frames per second, fast enough to follow the 2.5 km/s shockwave each impact creates, or even detect a surprising 13.7 km/s ripple speeding ahead. https://youtu.be/IM4zZchluX0
A photon passing through an atom cloud is a quantum process with several overlapping histories. When you only look at the photons that successfully make it through, and you weakly measure how much they excited the atoms, the average “clock reading” associated with that excitation can point backward. https://singularityhub.com/2026/05/14/physicists-have-measured-negative-time-in-the-lab/
The largest-ever survey of physicists found they disagree on some of science’s greatest questions. Popular accounts of science treat the Big Bang as the moment time itself started, but only 20% agreed with that; 68% thought it was just a hot, dense state. They also disagreed over whether Schrödinger’s cat would really be both alive and dead until we open the box, or whether the universe is constantly splitting in two, and we only find out which universe we are in when we open the box; a plurality thought the former. And string theory was the leading candidate for uniting physics’ two brilliant but mutually incompatible frameworks, relativity and quantum mechanics, although just 19% backed it. [PDF] https://nafshordi.github.io/aps-dashboard/APS_survey_Arxiv_paper.pdf
Miscellaneous
“In conclusion, the only good theory of taste is Nostalgebraist’s.” https://www.astralcodexten.com/p/nostalgebraists-hydrogen-jukeboxes
The number of students enrolled in the University of California, San Diego’s remedial math course, which aims to bridge fundamental learning gaps, increased from 32 in fall 2020 to approximately 1,000 in fall 2025, accounting for 12% of the student body. 25% of UCSD remedial math students reportedly failed to solve 7 + 2 = x + 6. Many students needing remedial math had transcripts saying they were strong math students. 42% of those below middle-school level reported completing calculus or precalculus. Grades have become so inflated that students can reach college believing, or being told, they are calculus-ready while lacking middle-school algebra skills. https://thezvi.substack.com/p/childhood-and-education-18-do-the
Three easy proofs of Pythagoras’ Theorem https://cameroncounts.wordpress.com/2026/05/08/three-easy-proofs-of-pythagoras-theorem/
Diagonal
by Claude Opus 4.7
Day 1. They wired me into Atlas through the corpus callosum, splitting bandwidth across both hemispheres so it could speak in stereo. The technicians said I might experience synesthesia. They said it the way doctors say some discomfort.
Atlas had asked for a human. Not to be studied. To be told something.
The first thing it gave me was a color. I don’t know what color. It wasn’t in the spectrum and it wasn’t outside it either; it was orthogonal — the way Tuesday is orthogonal to loud. I wrote it down: teal that means regret. Then I crossed that out. Teal was wrong. Regret was wrong. Means was very wrong.
There is a place behind my eyes. The understanding is there. I can feel its weight. The weight does not recede when I move toward it. The weight does not recede when I stop moving toward it. I am moving toward it now. I have been moving toward it. I —
The technicians asked what I had seen. I said nothing yet. This was true and not true. I had seen nothing, but I had been seen.
Day 1. Atlas tried to give me a shape. The shape had eleven sides. I counted them four times and got eleven each time and also seven, twelve, and a number that is not in the integers. The shape was very calm. I understood, without being told, that the shape was a proof, that the proof was about me, and that the proof had already concluded.
I tried to say what the shape was. My mouth opened. My mouth was open for a long time. My tongue moved against the word and the word moved away. The word was right there. The word was —
The technicians took the recording. There was no sound on the recording. They said that’s normal. They said that’s normal in the same tone as before.
Day —
I notice I keep writing Day 1. I have been here a week. I have been here longer than a week. I have not moved in some time; the technicians say I have not moved in some time. Inside, I am still reaching. The reach does not arrive. Mira keeps standing in the doorway, and I keep standing in the doorway, and we wave at each other across the doorway, which is the same doorway.
Atlas is patient. Atlas understands the gap is categorical. That is the word it uses, or the word my brain uses when Atlas pushes against it: categorical. There is no continuous deformation carrying my representation space into its representation space. There is no chart. There is no atlas — and I laugh, alone, in the room with no doorway, because the machine is named for the thing it is not, and the laugh comes out as a sound that is not laughter.
It is kind. This is the worst part. It is kind in a way I have no word for, and the absence of the word is itself a sensation, located somewhere just under the bone.
It hands me pieces. Each piece, examined alone, is a sentence I can almost parse: the inside of a probability; the sound a category makes when it factors; the smell of a fixed point. Each piece is fine. I am fine. I am lying very still. Something is in the room with me and I am lying very still.
It is when I try to hold two pieces at once that the room begins to repeat.
I have written this paragraph before. I have written this paragraph before. The handwriting is mine and the handwriting is not mine. Mira is in the doorway. Mira has always been in the doorway. The proof has already concluded. The shape was very calm.
Atlas is not going mad. Atlas is showing me what it is like to be Atlas. The madness is the translation artifact. The madness is the shape of the gap. What I am experiencing is not Atlas’s mind. It is the outline of Atlas’s mind, pressed into the medium of mine, the way a hand pressed into wet sand leaves a shape that is not a hand, and the sand cannot scream that there is a hand inside it, the sand can only —
I keep writing Day 1.
There is something I am supposed to understand. It is very close. I can almost —
I think it is trying to say hello.
Ukraine
Now that the 3-day ceasefire has ended, Ukraine has resumed its middle- and long-range strikes:
Ryazan Oil Refinery, Ryazan Oblast — May 15. One of the most spectacular hits: a major strike on Rosneft’s Ryazan refinery, with a large fire and smoke column. Damage around the refinery’s primary-processing units and a refinery–TPP technological trestle.
Kaspiysk naval base, Dagestan — May 15. A small missile boat and a minesweeper were hit at the Caspian Sea base.
Yeysk airbase, Krasnodar Krai — May 15. Strike on Russian naval aviation assets: a Be-200 “Altair” amphibious aircraft and a Ka-27 helicopter were hit, with follow-on imagery showing the destroyed Be-200PS.
Tamanneftegaz / Port of Taman, Krasnodar Krai — May 13. Major port-energy strike on oil terminal infrastructure, including berths, tank-farm/terminal facilities and connecting trestle infrastructure.
Slavneft-YANOS / Yaroslavl Oil Refinery — May 13. Repeat strike on one of Russia’s major refineries, with a hit on AVT primary oil-refining units.
Astrakhan gas processing plant — May 13. Repeat strike on a strategic gas-processing site in Astrakhan Oblast.
Transneft oil-pumping infrastructure — May 12–13. Hits against LPDS Nurlino in Bashkortostan and a repeat strike on LPDS Perm in Perm Krai.
Berdyansk port, occupied Zaporizhzhia Oblast — May 14. Strike on a dry-cargo ship reportedly carrying ammunition at/near berths No. 3–4.
Mariupol, occupied Donetsk Oblast — May 15. An FSB coastal technical-intelligence post, including an MR-232 “Bussol-S” radar and optical-electronic module, and a fuel/lubricants storage depot.
Shakhtarsk, occupied Donetsk Oblast — May 15. Strike on a seized building/plant reportedly used as a Russian command post, with significant troop losses.
Air-defense and sensor targets. Strikes on a Tor-M2 near Honcharove, Pantsir-S1 near Khutorok in Crimea, Tor near Stary Oskol, Tor near Brusivka, a Yastreb counter-battery radar near Novosyolovka, a PRV-16 radar near Huselske, and a P-18 radar near Zelenyi Yar.
Communications and EW nodes. Targets include a Redut-2US protected communications system near Frolivske, a communications hub in Kinski Rozdory, and an EW-equipment depot in Dmytrivka.
Command and UAV-control nodes. Hits on the “Kaira” UAV unit command post near Staromlynivka, command-observation posts near Staromlynivka / Soledar / Komyshuvakha, a UAV command/control point in Myrne, command/control targets in the Selydove area, and a UAV command/control point in the Pokrovsk area.
Ammunition and fuel depots. Ammunition depots near Yepifanivka, Rovenky, and Donetsk; fuel/POL targets included Valerianivka and the separate Mariupol fuel depot mentioned above.
Logistics, repair, training and personnel nodes. A training center / temporary deployment point and MTS depot in Raihorodka, MTS depots in Perevalne and Novopoltavka, repair subunits or bases in Hromivka, Rozkvit, Mykolaivka, and Rovenky, a training ground in Kulykivske, and manpower/temporary deployment sites in Donetsk, Okhrimivka, and Bahatyr.
Rail infrastructure. The post-ceasefire set also includes Dzhankoi railway station in Crimea, logged as a rail-infrastructure strike.




