🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 20 items • Updated 12 days ago • 125
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published Jan 30 • 19
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22, 2024 • 13
🧠 Abliteration Collection Uncensored models using abliteration. See this article for more information: huggingface.co/blog/mlabonne/abliteration • 15 items • Updated 24 days ago • 55
The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines Paper • 2408.01050 • Published Aug 2, 2024 • 9