Links for 2024-01-20
AI:
This is huge! The LLM itself provides rewards on its own generations via LLM-as-a-Judge during Iterative DPO... does this open the door to superhuman feedback?
We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal...we study Self-Rewarding Language Models, where the language model itself is used via LLM-as-a-Judge prompting to provide its own rewards during training. We show that during Iterative DPO training that not only does instruction following ability improve, but also the ability to provide high-quality rewards to itself. Fine-tuning Llama 2 70B on three iterations of our approach yields a model that outperforms many existing systems on the AlpacaEval 2.0 leaderboard, including Claude 2, Gemini Pro, and GPT-4 0613. While only a preliminary study, this work opens the door to the possibility of models that can continually improve in both axes.
Paper: Meta presents Self-Rewarding Language Models https://arxiv.org/abs/2401.10020
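To make the idea above concrete, here is a rough sketch of one step of such a loop: turning self-judged scores into a preference pair and computing the standard DPO loss on it. The abstract quoted above does not spell out these mechanics, so the pair-selection rule (highest-scored vs. lowest-scored response, skipping ties) and every name below (PreferencePair, build_preference_pair, dpo_loss, beta) are assumptions for illustration, not the authors' code. The scores are assumed to come from the same model acting as judge.

```python
import math
from typing import NamedTuple, Optional, Sequence, Tuple


class PreferencePair(NamedTuple):
    """One training example for DPO (hypothetical container)."""
    prompt: str
    chosen: str
    rejected: str


def build_preference_pair(
    prompt: str, scored: Sequence[Tuple[str, float]]
) -> Optional[PreferencePair]:
    """Pick the best- and worst-scored responses as (chosen, rejected).

    `scored` holds (response, judge_score) tuples, where the score is
    assumed to be produced by the model judging its own generations.
    Returns None when no usable preference signal exists.
    """
    if len(scored) < 2:
        return None
    best = max(scored, key=lambda rs: rs[1])
    worst = min(scored, key=lambda rs: rs[1])
    if best[1] == worst[1]:  # all scores tied: nothing to prefer
        return None
    return PreferencePair(prompt, best[0], worst[0])


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    The margin is how much more the policy prefers the chosen response
    over the rejected one, relative to the frozen reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # Numerically stable form of log(1 + exp(-x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


if __name__ == "__main__":
    candidates = [("answer A", 2.0), ("answer B", 5.0), ("answer C", 3.5)]
    print(build_preference_pair("some prompt", candidates))
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

In an iterative setup, the model trained on these pairs would generate and judge the next round of responses, which is where the "improves in both axes" claim comes from.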
More AI links:
ANYmal Unleashed: Revolutionizing Search-and-Rescue with Legged Robots https://www.youtube.com/watch?v=_R29DrAx0Xs
ASPIRE: a framework that enhances the selective prediction capabilities of large language models, enabling them to output an answer paired with a confidence score. https://blog.research.google/2024/01/introducing-aspire-for-selective.html
INTERS: Significantly boosts the performance of various publicly available LLMs in search-related tasks https://arxiv.org/abs/2401.06532
Large language models as copilots for theorem proving in Lean https://www.youtube.com/watch?v=7NAIXBANSj4
For decades, the field of forensics has assumed that no two fingerprints are ever alike, even on different fingers from the same person. However, a new neural network-based analysis overturns this long-held assumption. https://www.science.org/doi/10.1126/sciadv.adi0329
Science and Technology:
A multidisciplinary team of researchers developed a brain implant that, when placed on the brain's surface, enables researchers to capture high-res information about activity deep inside the brain without damaging its delicate tissue. https://today.ucsd.edu/story/transparent-brain-implant-can-read-deep-neural-activity-from-the-surface
China’s Solar Dominance Faces New Rival: An Ultrathin Film https://www.wsj.com/business/energy-oil/chinas-solar-dominance-faces-new-rival-an-ultrathin-film-adbc2536 [https://archive.is/XkubK]
Dye-sensitized solar cells - an idea whose time has finally come? https://nanoscale.blogspot.com/2024/01/dye-sensitized-solar-cells-idea-whose.html
Why small-scale nuclear projects have struggled to get off the ground in the U.S. https://www.city-journal.org/article/where-now-for-nuclear-power
“The latest "killer virus" panic: In which I aim to bring some sense to the latest social media panic” https://www.writingruxandrabio.com/p/the-latest-killer-virus-panic
“Romans lacked one crucial ingredient: Efficient implementation of steam power which made industrial revolution possible relies on understanding of thermodynamics and concepts like latent heat and Boyle's law. Romans lacked the maths to grasp it. Song China utilized coal and had mass production of steel. Nonetheless an industrial revolution couldn't take place - they, likewise, didn't understand what they're sitting on due to lack of mathematical understanding. Trial and error doesn't scale!” https://twitter.com/n00rdung/status/1747929254237495550
Politics:
This is incredibly sad. It is a prime example of how much the Western world has changed for the worse in the last two decades:
Two Harvard Crimson articles, one from 2006 and the other from 2023, describing the legendary Math 55 class showcase how much college has changed in less than a generation.
'06: “This is probably the most difficult undergraduate math class in the country,” reads a page on the Mathematics Department Web site.
'23: “Our slogan is, if you’re reasonably good at math, you love it, and you have lots of time to devote to it, then Math 55 is completely fine for you.” -- article published by the math dept titled, “Demystifying Math 55.”
'06: Regardless of the course’s name brand value, Math 55 students face a single fact: It’s hard. Really hard.
'23: Zoe Shleifer ’26, another current Math 55 student, also doesn’t get the hype. “It’s fun,” she says. “It’s just like any other class. You know, we go to lecture, and then we leave lecture, and then we do the problem set.”
'06: Midway through October, the “Survivor”-like competition intensifies with the add/drop deadline looming frighteningly near, only five days away.
'23: “We wanted to avoid a situation where some students felt excluded because of the way the course was taught a particular year,” (Professor) Harris says.
'06: The class can’t stay this hard for this long, right? “I figure he’s just trying to get people to drop the class,” Litt says. He figured wrong. As class attendance steadily thins, the workload does not.
'23: “To be as inclusive as possible is one of the things I was told before I walked in the classroom the first day,” says Harris.
'06: Before the fifth Monday of the term, students who can’t seem to stay in the game start dropping like flies. “I thought it was completely unbelievable,” Harbater says. “Seventy started it, 20 finished it, and only 10 understood it.”
'23: “The math department has been working hard to foster a more inclusive culture around Math 55. The overall feedback we have received from students about the last few iterations of the course suggests that this is beginning to bear fruit,” she continued, referencing Math 55’s Q guide reviews, which are overwhelmingly positive.
'06 article: https://thecrimson.com/article/2006/12/6/burden-of-proof-at-1002-am/
'23 article: https://thecrimson.com/article/2023/3/26/behind-math-55/
via @JohnArnoldFndtn
Ukraine:
"We want to decompose a large warship into its functions - air defense, weapons, protection - and put these weapons on several drones," Hunter (SBU) explains. https://twitter.com/DAlperovitch/status/1748011703923585417
“Ukraine hit an oil depot in Russia in a drone attack on Friday, officials on both sides said, the latest in a series of recent assaults targeting Russian oil facilities as Kyiv increasingly seeks to strike critical infrastructure behind Russian lines” https://www.nytimes.com/2024/01/19/world/europe/ukraine-russia-oil-drone-attack.html [https://archive.is/KSwew]
“🛡️ The Baltic countries have agreed on the 🇪🇪🇱🇻🇱🇹 #BalticDefenceLine along their eastern border. It's crucial to use time wisely to increase #defence readiness. #NotAnInch” https://twitter.com/ColbyBadhwar/status/1748370256945406162
The European Union will have the capacity to produce at least 1.3 million shells by the end of the year, European Commissioner Thierry Breton said. https://www.rtbf.be/article/guerre-en-ukraine-l-union-europeenne-aura-la-capacite-de-produire-13-million-d-obus-d-ici-la-fin-de-l-annee-annonce-thierry-breton-11315950
“The SPD around Olaf Scholz have been caught within another embarrassing situation regarding the TAURUS missile.” https://twitter.com/Tendar/status/1748584094756188601
A more personal and detailed interview with the crew of the Ukrainian Bradley which took on a Russian T-90M. https://twitter.com/wartranslated/status/1748480083990372669