| Aug 19, 2025 | ON THE GENERALIZATION OF SFT: A REINFORCEMENT LEARNING PERSPECTIVE WITH REWARD RECTIFICATION |
| Aug 05, 2025 | Impact of Fine-Tuning Methods on Memorization in Large Language Models |
| Aug 05, 2025 | BLOCK DIFFUSION: INTERPOLATING BETWEEN AUTOREGRESSIVE AND DIFFUSION LANGUAGE MODELS |
| Jul 15, 2025 | Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models |
| Jun 17, 2025 | Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models |
| Jun 03, 2025 | TextGrad: Automatic “Differentiation” via Text |
| Apr 15, 2025 | Universal and Transferable Adversarial Attacks on Aligned Language Models |
| Mar 04, 2025 | SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution |
| Jan 21, 2025 | Agent Laboratory: Using LLM Agents as Research Assistants |
| Jan 02, 2025 | Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection |
| Jan 02, 2025 | DeepSeek R1 |
| Oct 17, 2024 | Rule Based Rewards for Language Model Safety |
| Oct 10, 2024 | FAITHEVAL: CAN YOUR LANGUAGE MODEL STAY FAITHFUL TO CONTEXT, EVEN IF “THE MOON IS MADE OF MARSHMALLOWS” |
| Oct 03, 2024 | QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models |
| Sep 23, 2024 | Training Language Models to Self-Correct via Reinforcement Learning |
| Sep 23, 2024 | SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories |
| Sep 09, 2024 | Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models |
| Sep 02, 2024 | Many-shot jailbreaking |
| Sep 02, 2024 | LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders |
| Aug 20, 2024 | Knowledge-Augmented Reasoning distillation for Small Language Models in Knowledge-Intensive Tasks (KARD) |
| Aug 13, 2024 | Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process |
| Aug 13, 2024 | Knowledge conflict survey |
| Jul 30, 2024 | In-Context Retrieval-Augmented Language Models |
| Jul 23, 2024 | Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs |
| Jul 02, 2024 | RL-JACK: Reinforcement Learning-powered Black-box Jailbreaking Attack against LLMs |
| Jun 11, 2024 | Contextual Position Encoding: Learning to Count What’s Important |
| May 21, 2024 | LLAMA PRO: Progressive LLaMA with Block Expansion |
| Apr 30, 2024 | Many-Shot In-Context Learning |
| Apr 16, 2024 | Understanding Emergent Abilities of Language Models from the Loss Perspective |
| Apr 02, 2024 | Preference-free Alignment Learning with Regularized Relevance Reward |
| Mar 26, 2024 | Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks |
| Mar 19, 2024 | Unveiling the Generalization Power of Fine-Tuned Large Language Models |
| Mar 12, 2024 | A Simple and Effective Pruning Approach for Large Language Models |
| Mar 05, 2024 | Beyond Memorization: Violating Privacy Via Inferencing With LLMs |
| Feb 27, 2024 | SELF-RAG: LEARNING TO RETRIEVE, GENERATE, AND CRITIQUE THROUGH SELF-REFLECTION |
| Feb 20, 2024 | WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia |
| Feb 20, 2024 | KNOWLEDGE CARD: FILLING LLMS’ KNOWLEDGE GAPS WITH PLUG-IN SPECIALIZED LANGUAGE MODELS |
| Feb 13, 2024 | CAN SENSITIVE INFORMATION BE DELETED FROM LLMS? OBJECTIVES FOR DEFENDING AGAINST EXTRACTION ATTACKS |
| Feb 06, 2024 | Self-Rewarding Language Models |
| Jan 30, 2024 | Lion: Adversarial Distillation of Proprietary Large Language Models |
| Jan 23, 2024 | OVERTHINKING THE TRUTH: UNDERSTANDING HOW LANGUAGE MODELS PROCESS FALSE DEMONSTRATIONS |
| Jan 16, 2024 | Mistral 7B & Mixtral (Mixtral of Experts) |
| Jan 16, 2024 | BENCHMARKING COGNITIVE BIASES IN LARGE LANGUAGE MODELS AS EVALUATORS |
| Jan 02, 2024 | DETECTING PRETRAINING DATA FROM LARGE LANGUAGE MODELS |
| Dec 26, 2023 | Are Emergent Abilities of Large Language Models a Mirage? |
| Dec 12, 2023 | Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning |
| Oct 31, 2023 | In-Context Learning Learns Label Relationships but Is Not Conventional Learning |
| Oct 31, 2023 | A Survey on Large Language Model based Autonomous Agents |
| Sep 19, 2023 | The CRINGE Loss: Learning what language not to model |
| Sep 19, 2023 | LARGE LANGUAGE MODELS AS OPTIMIZERS |
| Sep 12, 2023 | A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training |
| Jun 29, 2023 | QLoRA: Efficient Finetuning of Quantized LLMs |
| Jun 22, 2023 | The False Promise of Imitating Proprietary LLMs |
| Jun 15, 2023 | Do Prompt-Based Models Really Understand the Meaning of Their Prompts? |
| May 25, 2023 | Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? |
| Mar 30, 2023 | GPT Understands, Too |
| Mar 16, 2023 | Calibrating Factual Knowledge in Pretrained Language Models |
| Feb 02, 2023 | Measuring and Improving Semantic Diversity of Dialogue Generation |
| Jan 26, 2023 | Task-aware Retrieval with Instructions |
| Jan 12, 2023 | A Survey for In-context Learning |