Yes, data quality is definitely crucial. The LIMA paper, for example, showed that fine-tuning on just 1k high-quality instructions can give better performance than fine-tuning on the roughly 50k instructions from Alpaca (which were supposedly lower quality).

Jun 16, 2024 at 2:45 PM