"Historical citations (PPO Schulman 1707.06347, InstructGPT 2203.02155, DPO Rafailov NeurIPS 2023, DeepSeekMath GRPO 2402.03300, DeepSeek-R1 2501.12948, KTO/IPO/SimPO/ORPO)", "Callout 'empty ...
"Historical citations verified (LLM.int8 NeurIPS 2022, GPTQ ICLR 2023, AWQ MLSys 2024, SmoothQuant ICML 2023, QuaRot NeurIPS 2024, SpinQuant ICLR 2025)", "reviewer ...