Links for 2024-10-24
The global AI arms race is heating up:
The White House issued a National Security Memorandum declaring that 'AI is likely to affect almost all domains with national security significance'. Attracting technical talent and building computational power are now official national security priorities.
It is now the official policy that the United States must lead the world in the ability to train new foundation models. All government agencies will work to promote these capabilities.
DoS, DoD and DHS 'shall each use all available legal authorities to assist in attracting and rapidly bringing to the United States individuals with relevant technical expertise who would improve United States competitiveness in AI and related fields'
DoS, DoD, DoE, and DoC shall, 'as appropriate and consistent with applicable law, use existing authorities to make public investments and encourage private investments in strategic domestic and foreign AI technologies and adjacent fields.'
Classified threat evaluations of frontier models going forward, delivered directly to the President. 'The United States Government shall advance classified evaluations of advanced AI models’ capacity to generate or exacerbate deliberate chemical and biological threats.'
Our competitors want to upend U.S. AI leadership and have employed economic and technological espionage in efforts to steal U.S. technology.
AI:
Anthropic upgraded Claude 3.5 Sonnet: On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models, including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It can also control your PC. https://www.anthropic.com/news/3-5-models-and-computer-use
Global power demand for AI data centers could grow by more than 130 GW by 2030. https://ifp.org/future-of-ai-compute/
New SOTA in theorem proving: InternLM2.5-StepProver and its critic model set a new miniF2F SOTA at 65.9%. No hallucinations: verified math proofs produced by LLM reasoning. https://arxiv.org/abs/2410.15700
miniCTX, a new benchmark that tests a model's ability to prove theorems from complex, real Lean projects https://cmu-l3.github.io/minictx/
New follow-up work on the effects of synthetic data on model pre-training. It’s becoming increasingly clear that the model collapse issues predicted by prior works are not panning out in theory or in practice. https://arxiv.org/abs/2410.16713
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model https://arxiv.org/abs/2410.13639
A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration https://arxiv.org/abs/2410.16540
Evaluating and enhancing probabilistic reasoning in language models https://research.google/blog/evaluating-and-enhancing-probabilistic-reasoning-in-language-models/
Multi-Agent AI and GPU-Powered Innovation in Sound-to-Text Technology https://developer.nvidia.com/blog/multi-agent-ai-and-gpu-powered-innovation-in-sound-to-text-technology/
"SimpleAutomation": Robo-arms helpers for $200, with the most advanced AI inside. (Figure and Tesla robots use the same algorithms). You spend $200 on a robot, use a special joystick to control it, and do the tasks you need. After 10 minutes of examples, your tireless autonomous assistant is ready. https://github.com/1g0rrr/SimpleAutomation
LLMD: A Large Language Model for Interpreting Longitudinal Medical Records https://arxiv.org/abs/2410.12860
Microsoft CEO Satya Nadella says AI development is being optimized by OpenAI's o1 model and has entered a recursive phase: "we are using AI to build AI tools to build better AI" https://youtu.be/kOkDTvsUuWA?si=PwvicSLriSR5nmbN&t=1865
Demis Hassabis says DeepMind's drug discovery spinoff Isomorphic will have drug treatments in the clinic in a couple of years tackling "six big areas of health" https://www.ft.com/content/72d2c2b1-493b-4520-ae10-41c1a7f3b7e4 [no paywall: https://archive.is/Nbx7M]
ChatGPT o1-preview can code Stan https://statmodeling.stat.columbia.edu/2024/10/22/chatgpt-o1-preview-can-code-stan/
“ChatGPT-O1 Changes Programming as a Profession. I really hated saying that.” https://www.youtube.com/watch?app=desktop&v=j0yKLumIbaM
Progress towards real-time generation from multimodal models https://openai.com/index/simplifying-stabilizing-and-scaling-continuous-time-consistency-models/
Musk and xAI pulled off a feat that usually takes four years, setting up a supercluster of 100,000 H200 GPUs in just 19 days. Nvidia's Jensen Huang called the effort "superhuman." https://www.tomshardware.com/pc-components/gpus/elon-musk-took-19-days-to-set-up-100-000-nvidia-h200-gpus-process-normally-takes-4-years
Generating Distinct AI Voice Performances By Prompt Engineering GPT-4o https://minimaxir.com/2024/10/speech-prompt-engineering/
ETH Zurich researchers showcase a method using YOLO models to bypass reCAPTCHAv2 with 100% accuracy https://arxiv.org/abs/2409.08831
Technology:
Science Corporation, a leader in brain-computer interface (“BCI”) technology, has announced preliminary clinical trial results for its PRIMA retinal implant. https://science.xyz/news/primavera-trial-preliminary-results/
Thermodynamic modeling to optimize the shape of a beer glass to keep it cold for the longest time. https://www.arxiv.org/abs/2410.12043
Solar transpiration–powered lithium extraction and storage https://techxplore.com/news/2024-10-solar-powered-lithium-brine.html
Science:
The insane capabilities of fish in tests of memory, learning, and problem-solving. https://80000hours.org/podcast/episodes/sebastien-moro-fish-cognition-senses-social-lives/
A substantial reduction in Alzheimer's disease associated with semaglutide (Ozempic) in > 1 million people with Type 2 diabetes from a nationwide data resource https://alz-journals.onlinelibrary.wiley.com/doi/10.1002/alz.14313
War:
How AI and autonomy create "precise mass": Militaries are beginning to realize that they don’t have to choose between precision and mass; they can have both. https://www.foreignaffairs.com/world/battles-precise-mass-technology-war-horowitz [no paywall: https://archive.is/oRGXh]
Why French military cryptanalysis failed to break Enigma: "Unlike the Polish, British and American services, French codebreakers did not have an academic background in mathematics, science or classics...More decisively, no French codebreaker seemed to have demonstrated genius, that combination of intelligence, imagination & adaptation." https://www.tandfonline.com/doi/full/10.1080/01611194.2023.2261121
Ukraine:
North Korea Joins Europe's War https://macspaunday.substack.com/p/north-korea-joins-europes-war
Inside the Drone War Arms Race in Ukraine — “I myself was able to find the location of a Patriot ...by simply looking at where the images are being taken. That's basically an intersection of a lot of interest. And I then asked Ukrainian Air Force, like, what is here?...And they say that it's where Patriot stands.” https://x.com/shashj/status/1849026729651671326
The United States has agreed to give Ukraine $800 million in military aid that will go toward manufacturing long-range drones https://www.nytimes.com/2024/10/22/world/europe/us-ukraine-aid-long-range-drones.html [no paywall: https://archive.is/bECji]
Footage shared by Russian media shows a Ka-29 multipurpose helicopter chasing a Ukrainian naval drone reportedly equipped with a converted R-73 short-range air-to-air missile. https://x.com/NOELreports/status/1849061153596465177
“A well-known Russian military medic and fascist, who called for the extermination of Ukrainians, has recorded an appeal to Defense Minister Belousov, complaining about the "serial killings of personnel in the form of meat-grinder assaults" that are becoming widespread.” https://x.com/wartranslated/status/1849420120600285422

