Links for 2024-04-12
AI: Microsoft presents Rho-1: RHO-1-1B and 7B achieve SotA results of 40.6% and 51.8% on the MATH dataset, respectively — matching DeepSeekMath with only 3% of the pretraining tokens. https://github.com/microsoft/rho

From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples — "Our results demonstrate that large language models are capable of doing regression when given in-context examples of (input, output) pairs, despite not being explicitly trained to do so."
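The in-context regression setup quoted above amounts to showing the model numeric (input, output) pairs as plain text and asking it to complete the output for a new input. A minimal sketch of that prompt construction is below; the exact prompt template is an assumption for illustration, not the paper's format, and `build_regression_prompt` is a hypothetical helper.

```python
# Hypothetical sketch of in-context regression prompting: the LLM sees
# (x, y) pairs as text and is asked to complete the output for a new x.
# The template below is an assumed format, not the paper's own.

def build_regression_prompt(examples, query):
    """Format numeric (x, y) pairs as a few-shot completion prompt."""
    lines = [f"Input: {x:.2f}\nOutput: {y:.2f}" for x, y in examples]
    lines.append(f"Input: {query:.2f}\nOutput:")  # model completes this
    return "\n".join(lines)

# Example: noiseless linear data y = 2x + 1; a capable in-context
# regressor should complete the final output with roughly 9.00.
examples = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0]]
print(build_regression_prompt(examples, 4.0))
```

The resulting string would be sent as-is to a language model; the claim in the paper is that, despite no regression-specific training, the completion tracks the underlying function.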