classification

an archive of posts with this tag

Apr 08, 2025 On the Biology of a Large Language Model
Mar 11, 2025 When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Jan 02, 2025 Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection
Oct 17, 2024 Rule Based Rewards for Language Model Safety
Sep 02, 2024 LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
May 27, 2024 Understanding the performance gap between online and offline alignment algorithms
May 07, 2024 How to Train LLM? - From Data Parallel To Fully Sharded Data Parallel
Apr 30, 2024 Many-Shot In-Context Learning
Apr 23, 2024 Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
Apr 13, 2024 Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
Apr 02, 2024 Preference-free Alignment Learning with Regularized Relevance Reward
Mar 19, 2024 Unveiling the Generalization Power of Fine-Tuned Large Language Models
Mar 12, 2024 A Simple and Effective Pruning Approach for Large Language Models
Jan 30, 2024 Lion: Adversarial Distillation of Proprietary Large Language Models
Jan 23, 2024 Overthinking the Truth: Understanding How Language Models Process False Demonstrations
Jan 23, 2024 In-Context Pretraining: Language Modeling Beyond Document Boundaries
Jan 16, 2024 Mistral 7B & Mixtral (Mixtral of Experts)
Jan 09, 2024 Making Large Language Models A Better Foundation For Dense Retrieval
Dec 12, 2023 Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Sep 19, 2023 The CRINGE Loss: Learning what language not to model
May 25, 2023 Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Apr 20, 2023 FALSESUM: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
Apr 13, 2023 P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks