Links for 2023-10-26
Motif: An LLM-powered method for intrinsic motivation from AI feedback.
Extracts reward functions from Llama 2's preferences and uses them to train agents with reinforcement learning.
Solves previously unsolved tasks without the need for expert demonstrations by bootstrapping its knowledge from the LLM's common sense.
Solves extremely sparse-reward tasks on which other methods never find a solution.
Leads to a better game score than the one obtained by using the score itself as the reward.
First time an intrinsically motivated agent outperforms the task-driven baseline on such a complex environment.
The learned reward encourages a survival-oriented attitude, which earns a significantly higher score than an agent trained to maximize the score directly.
Discovers a sophisticated strategy to hack the reward: it learns to find hallucinogens so it can dream of the goal state instead of actually going there.
Small compute budget: the whole pipeline can take less than two GPU-days.
Paper: https://arxiv.org/abs/2310.00166
Code: https://github.com/facebookresearch/motif
Blog post: https://mila.quebec/en/article/motif/
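To make the Motif idea above a bit more concrete, here is a minimal sketch of fitting a reward function to pairwise preferences with a Bradley-Terry (logistic) loss. It is not the paper's implementation: the three-number feature vectors, the `hidden_w` "annotator", and every function name are hypothetical stand-ins, and the synthetic labels take the place of the preferences an LLM would give over pairs of game events.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def reward(w, x):
    # Linear reward model r(x) = w . x (an assumption made for brevity).
    return sum(wi * xi for wi, xi in zip(w, x))


def train_reward_model(pairs, labels, dim, lr=0.5, steps=200):
    # Gradient descent on the Bradley-Terry cross-entropy loss, where
    # P(a preferred over b) = sigmoid(r(a) - r(b)).
    w = [0.0] * dim
    n = len(pairs)
    for _ in range(steps):
        grad = [0.0] * dim
        for (a, b), y in zip(pairs, labels):
            p_a = sigmoid(reward(w, a) - reward(w, b))
            for i in range(dim):
                grad[i] += (p_a - y) * (a[i] - b[i]) / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def synthetic_annotations(n_pairs=500, seed=0):
    # Hypothetical stand-in for LLM annotations: a hidden weight vector
    # decides which element of each random pair is "preferred".
    rng = random.Random(seed)
    hidden_w = [2.0, -1.0, 0.5]
    pairs, labels = [], []
    for _ in range(n_pairs):
        a = [rng.gauss(0, 1) for _ in hidden_w]
        b = [rng.gauss(0, 1) for _ in hidden_w]
        pairs.append((a, b))
        labels.append(1.0 if reward(hidden_w, a) > reward(hidden_w, b) else 0.0)
    return pairs, labels


pairs, labels = synthetic_annotations()
w = train_reward_model(pairs, labels, dim=3)

# Fraction of pairs where the learned reward ranks the preferred item higher.
agreement = sum(
    (reward(w, a) > reward(w, b)) == (y == 1.0)
    for (a, b), y in zip(pairs, labels)
) / len(pairs)
print("learned weights:", [round(v, 2) for v in w])
print("agreement with annotations:", round(agreement, 2))
```

In an RL loop, `reward(w, features_of_observation)` would then serve as the intrinsic reward, either on its own or added to the environment's score.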
I think the right way to think of the models we create is as a reasoning engine, not a fact database. They can also act as a fact database, but that’s not really what is special about them.
— Sam Altman
Previously: Can GPT-4 teach a robot hand to do pen spinning tricks better than you do? Eureka, an open-ended agent that designs reward functions for robot dexterity at super-human level. https://eureka-research.github.io/
“We argue that Transformers will generalize to harder instances on algorithmic tasks iff the algorithm can be written in the RASP-L programming language (cf Weiss et al). By design, each line of RASP-L code can be compiled into 1 Transformer layer.” https://arxiv.org/abs/2310.16028
AgentTuning: Enabling Generalized Agent Abilities for LLMs https://arxiv.org/abs/2310.12823
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning https://sites.google.com/view/vlm-rm
3D-GPT: 3D Modeling With Large Language Models https://chuny1.github.io/3DGPT/3dgpt.html
ChatGPT Can 'Infer' Personal Details From Anonymous Text https://gizmodo.com/chatgpt-llm-infers-identifying-traits-in-anonymous-text-1850934318
California suspension is an existential threat to Cruise https://www.understandingai.org/p/california-suspension-is-an-existential
This pea-sized brain implant could be inserted in a 30-minute surgery. But the "real breakthrough" is the wireless power system. https://spectrum.ieee.org/neurostimulation
Cardiac regeneration becomes possible: In mice, reprogramming of energy metabolism restores cardiac function after infarction https://www.mpg.de/20981292/1020-pfor-cardiac-regeneration-becomes-possible-through-reprogramming-of-cell-metabolism-149770-x
A green laser of the Chinese Daqi-1/AEMS satellite scanning the surface of Earth at around 7.51 km/s. https://youtu.be/vn_PMiND4Yw
Atomic Vapor Meets Radio Waves: The Future of Antennas? https://www.otago.ac.nz/sciences/news/news/physicists-create-new-antenna
Political links:
"The European Union is falling behind on plans to provide Ukraine with a million artillery shells by March, people familiar with the matter said...With more than half of that time now gone, the initiative has so far delivered about 30% of the target" https://www.bloomberg.com/news/articles/2023-10-25/russia-ukraine-war-eu-is-falling-short-on-pledge-to-supply-kyiv-with-ammunition [https://archive.ph/YPn4R]
“Now that important UN Security Council restrictions have been lifted, will Iran begin exporting advanced missiles and UAVs to Russia for use against Ukraine?” https://www.iiss.org/online-analysis/missile-dialogue-initiative/2023/10/iiss-experts-on-the-expiry-of-un-limitations-on-irans-missile-exports/
“Russia is also innovating. They are scaling up production of FPV and Lancet loitering munitions, UMPK glide bombs, Orlan-30 UAVs, Krasnopol laser-guided artillery rounds, and other systems. They are modernizing many of these systems as well.” https://twitter.com/sambendett/status/1717158812593897643
“One of the most conspicuous failures of the Biden administration has been their refusal to aggressively reindustrialize to increase our arms manufacturing capacity. The necessity became clear in 2022, it has only grown more critical in the interim.” https://twitter.com/RealCynicalFox/status/1717223315662627033
Russia is mining the Ukrainian corridor by plane. https://twitter.com/The_Lookout_N/status/1717230038905688148
“Something powerful burst out of the ground in the Gaza Strip. Probably ammunition” https://twitter.com/ian_matveev/status/1717187116235432138
“…a Muslim soldier in the Israeli army, tells it how it is on Arabic TV. The presenter desperately tries to defend Hamas, looks stupid. Reader: this is the *BBC* Arabic service.” https://twitter.com/JakeWSimons/status/1716941441514234306