2023

an archive of posts from this year

Dec 26, 2023 Are Emergent Abilities of Large Language Models a Mirage?
Dec 19, 2023 Learning to Tokenize for Generative Retrieval
Dec 19, 2023 Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Dec 12, 2023 Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Oct 31, 2023 In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Oct 31, 2023 Efficient Streaming Language Models with Attention Sinks
Oct 31, 2023 A Survey on Large Language Model based Autonomous Agents
Oct 17, 2023 Resolving Interference When Merging Models
Oct 10, 2023 LongLoRA: Efficient Fine-Tuning of Long-Context Large Language Models
Oct 03, 2023 DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Sep 19, 2023 The CRINGE Loss: Learning what language not to model
Sep 19, 2023 Large Language Models as Optimizers
Sep 12, 2023 SILO Language Models: Isolating Legal Risk in a Nonparametric Datastore
Sep 12, 2023 A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Aug 29, 2023 Code Llama: Open Foundation Models for Code
Jun 29, 2023 QLoRA: Efficient Finetuning of Quantized LLMs
Jun 22, 2023 The False Promise of Imitating Proprietary LLMs
Jun 15, 2023 Do Prompt-Based Models Really Understand the Meaning of Their Prompts?
May 25, 2023 Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
May 11, 2023 Measuring Association Between Labels and Free-Text Rationales
Apr 27, 2023 Automatic Chain of Thought Prompting in Large Language Models
Apr 20, 2023 Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
Apr 13, 2023 P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Apr 13, 2023 AdapterDrop: On the Efficiency of Adapters in Transformers
Mar 30, 2023 GPT Understands, Too
Mar 16, 2023 Calibrating Factual Knowledge in Pretrained Language Models
Feb 09, 2023 AdapterHub: A Framework for Adapting Transformers, Parameter-Efficient Transfer Learning for NLP
Feb 02, 2023 Measuring and Improving Semantic Diversity of Dialogue Generation
Jan 26, 2023 Task-aware Retrieval with Instructions
Jan 19, 2023 KALA: Knowledge-Augmented Language Model Adaptation
Jan 12, 2023 A Survey for In-context Learning