- formatting
- images
- links
- math
- code
- blockquotes
- external-services
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper review - Alignment research
- Understanding Emergent Abilities of Language Models from the Loss Perspective
  Paper review - Pre-training research
- Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
  Paper review - LM, LLM, Efficient Training, Pre-training research
- Preference-free Alignment Learning with Regularized Relevance Reward
  Paper review - Reinforcement Learning, Alignment research