New paper :mega: “Fine-Grained Human Feedback Gives Better Rewards for Language Model Training” is out. [Paper] [Code/Data] (NeurIPS 2023 Spotlight)