| Apr 22, 2025 | Fine-tuning Vision-Language-Action Models: Optimizing Speed and Success |
| Mar 04, 2025 | Contextual Document Embeddings |
| Feb 18, 2025 | DeepSeek v3 |
| Feb 04, 2025 | SSM → HIPPO → LSSL → S4 → Mamba → Mamba2 |
| Jan 02, 2025 | Diffusion Language Model - Mathematical foundations & inference optimization |
| Sep 23, 2024 | SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories |
| Sep 09, 2024 | Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models |
| Sep 02, 2024 | LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders |
| Aug 20, 2024 | Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks (KARD) |
| Jul 30, 2024 | In-Context Retrieval-Augmented Language Models |
| Jun 04, 2024 | Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training |
| May 07, 2024 | How to Train LLM? - From Data Parallel To Fully Sharded Data Parallel |
| Apr 30, 2024 | Training Diffusion Models with Reinforcement Learning |
| Mar 26, 2024 | Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks |
| Feb 20, 2024 | WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia |
| Jan 09, 2024 | Making Large Language Models A Better Foundation For Dense Retrieval |
| Dec 19, 2023 | Learning to Tokenize for Generative Retrieval |
| Sep 12, 2023 | A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training |
| Jun 29, 2023 | QLoRA: Efficient Finetuning of Quantized LLMs |
| Jun 15, 2023 | Do Prompt-Based Models Really Understand the Meaning of Their Prompts? |
| Apr 20, 2023 | FALSESUM: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization |
| Apr 13, 2023 | P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks |
| Mar 30, 2023 | GPT Understands, Too |
| Jan 26, 2023 | Task-aware Retrieval with Instructions |
| Jan 19, 2023 | KALA: Knowledge-Augmented Language Model Adaptation |