Links for 2022-12-30
5 years from now, you won't care how much time you spent following Twitter drama, and you will care about missing out on things like Dramatron or CICERO or U-PaLM or RT-1.
— gwern
LAMBADA: Backward Chaining for Automated Reasoning in Natural Language — Achieves massive accuracy boosts over SotA forward reasoning methods on two challenging logical reasoning datasets with a method inspired by backward reasoning. https://arxiv.org/abs/2212.13894
“The AI explosion is warping our sense of time. Can you believe Stable Diffusion is only 4 months old, and ChatGPT <4 weeks old 🤯? If you blink, you miss a whole new industry. Here are my TOP 10 AI spotlights, from a breathtaking 2022 in rewind” https://threadreaderapp.com/thread/1607746957753057280.html
Top machine learning tweets of 2022 https://threadreaderapp.com/thread/1607978738561552386.html
Can the AI driving ChatGPT help to detect early signs of Alzheimer's disease? https://www.eurekalert.org/news-releases/975246
How you can get basically any answer you want from regressions using arcsinh(y) or log(y+1) https://threadreaderapp.com/thread/1602486736567042048.html
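A minimal sketch (my own illustration, not from the linked thread) of the problem: with a binary regressor, the coefficient on arcsinh(y) is just the difference in group means of arcsinh(y), and since arcsinh is not scale-invariant, merely changing the units of y moves the estimated "effect" substantially when y has many zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
treated = rng.integers(0, 2, n)

# Outcome with many exact zeros (e.g. earnings): treatment raises the
# probability of a positive outcome but not its size.
positive = rng.random(n) < (0.3 + 0.1 * treated)
y = positive * rng.lognormal(mean=3.0, sigma=1.0, size=n)

def arcsinh_effect(y, treated):
    """OLS coefficient of arcsinh(y) on a binary treatment =
    difference in group means of arcsinh(y)."""
    z = np.arcsinh(y)
    return z[treated == 1].mean() - z[treated == 0].mean()

# Same data, three different choices of units for y:
for scale, label in [(1, "y"), (100, "100*y"), (1 / 100, "y/100")]:
    print(label, round(arcsinh_effect(scale * y, treated), 3))
# The "treatment effect" on arcsinh(y) changes by an order of magnitude
# purely from rescaling y, which is the thread's point.
```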
MIT researchers are discovering which parts of the brain are engaged when a person evaluates a computer program. https://news.mit.edu/2022/your-brain-your-brain-code-1221
Magnus Carlsen Checkmates Bill Gates in just 12 seconds https://www.youtube.com/watch?v=oH-txHzE4jA
Ultrafast Electronic Characterization of Proteins and Materials https://www.tsukuba.ac.jp/en/research-news/20221214000000.html
New study finds that people with higher cognitive ability are more supportive of free speech and less concerned about political correctness: https://econtent.hogrefe.com/doi/full/10.1027/1614-0001/a000385
Textbook Introduction to Homotopy Type Theory https://arxiv.org/abs/2212.11082
What do Computer Scientists Read? https://www.youtube.com/watch?v=dMYgY5FhO3M
Clever stuff ravens do with objects: Throw pine cones at trespassing scientists; Soak food to soften it; Put food in containers to move it; Cover food to hide it; Hold bark in claws to steer in wind; Use sticks to retrieve food; Use sticks to jab owls https://doi.org/10.1111/eth.13352
gwern on how to develop models that have taste/drives/preferences/aesthetics toward work that is useful or novel:
Links to comments:
1. https://www.lesswrong.com/posts/qy5dF7bQcFjSKaW58/bad-at-arithmetic-promising-at-math?commentId=MDu7XNhpFxyS6FKGv
2. https://www.lesswrong.com/posts/qy5dF7bQcFjSKaW58/bad-at-arithmetic-promising-at-math?commentId=XADsKyb6CvYB4ZD9k
by Matthew Barnett (@MatthewJBar)
Some improvements we might start to see more of in large language models within 2 years:
- Explicit memory that allows the model to retrieve documents and read them before answering questions (a toy retrieve-then-read sketch follows this list) https://arxiv.org/abs/2112.04426
- A context window of hundreds of thousands of tokens, allowing the model to read and write entire books https://arxiv.org/abs/2202.07765
- Dynamic inference computation that depends on the difficulty of the query, allowing the model to "think hard" about difficult questions before spitting out an answer https://arxiv.org/abs/2207.07061
- Alignment principles that help the model produce more reliable and more useful output than naive RLHF, such as Anthropic's "Constitutional AI" approach https://www.anthropic.com/constitutional.pdf
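As a loose illustration of the first item, here is a toy retrieve-then-read sketch: embed documents, pull the nearest neighbors for a query, and prepend them to the prompt. This is my simplification with assumed stand-ins (embed as a hashed bag-of-words, generate as a placeholder for an LLM call); the RETRO paper itself fuses retrieved chunks into the decoder via cross-attention rather than via the prompt.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: hashed bag-of-words, L2-normalized.
    vec = np.zeros(256)
    for word in text.lower().split():
        vec[hash(word) % 256] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def generate(prompt: str) -> str:
    # Placeholder for a real language-model call.
    return f"(model answer to a {len(prompt)}-char prompt)"

class DocumentMemory:
    """Tiny dense-retrieval index over a list of documents."""
    def __init__(self, documents):
        self.documents = documents
        self.vectors = np.stack([embed(d) for d in documents])

    def retrieve(self, query, k=2):
        scores = self.vectors @ embed(query)        # cosine similarity
        top = np.argsort(scores)[::-1][:k]
        return [self.documents[i] for i in top]

def answer(query: str, memory: DocumentMemory) -> str:
    # Retrieve-then-read: stuff the retrieved documents into the prompt.
    context = "\n".join(memory.retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

memory = DocumentMemory([
    "RETRO retrieves from a trillion-token database.",
    "Block-recurrent transformers carry state across segments.",
    "Ravens use sticks to retrieve food.",
])
print(answer("How do retrieval-augmented language models work?", memory))
```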
More:
- "an effectively infinite, precise context window allowing a continuous reading of the internet" Memorizing Transformers https://arxiv.org/abs/2203.08913
- Block-Recurrent Transformers https://arxiv.org/abs/2203.07852; I am biased, but to me they look better than Perceiver.
- "Compositional Attention: Disentangling Search and Retrieval" https://arxiv.org/abs/2110.09419
- Sam Altman mentioned even more here: https://www.youtube.com/watch?v=WHoWGNQRXb0
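A rough sketch of the kNN memory idea behind Memorizing Transformers, as a toy single-head numpy version of my own (a constant gate stands in for the per-head learned gate, and the top-k lookup is exact rather than approximate): each query attends to the local context and, separately, to its top-k nearest cached keys from an external memory, and the two results are mixed.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def knn_augmented_attention(q, local_k, local_v, mem_k, mem_v, top_k=4, gate=0.5):
    """Toy single-head sketch: combine local attention over the current
    window with attention over each query's top-k neighbors retrieved
    from a large cache of past (key, value) pairs."""
    d = q.shape[-1]

    # Local attention over the current context window.
    local_out = softmax(q @ local_k.T / np.sqrt(d)) @ local_v

    # kNN lookup into the external memory (exact top-k for simplicity).
    scores = q @ mem_k.T / np.sqrt(d)                    # (n_q, n_mem)
    idx = np.argpartition(-scores, top_k, axis=-1)[:, :top_k]
    mem_out = np.zeros_like(local_out)
    for i, row in enumerate(idx):
        w = softmax(scores[i, row])
        mem_out[i] = w @ mem_v[row]

    # Constant gate here; the paper learns this mixing weight per head.
    return gate * mem_out + (1 - gate) * local_out

# Tiny usage example with random tensors.
rng = np.random.default_rng(0)
d = 16
q = rng.normal(size=(8, d))                              # 8 queries
local_k, local_v = rng.normal(size=(8, d)), rng.normal(size=(8, d))
mem_k, mem_v = rng.normal(size=(1000, d)), rng.normal(size=(1000, d))
print(knn_augmented_attention(q, local_k, local_v, mem_k, mem_v).shape)  # (8, 16)
```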