bert | Unknown NLP Lab

Apr 22, 2025	Fine-tuning Vision-Language-Action Models: Optimizing Speed and Success
Mar 04, 2025	Contextual Document Embeddings
Feb 18, 2025	DeepSeek v3
Feb 04, 2025	SSM → HIPPO → LSSL → S4 → Mamba → Mamba2
Jan 02, 2025	Diffusion Language Model: Mathematical foundations & inference optimization
Sep 23, 2024	SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
Sep 09, 2024	Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
Sep 02, 2024	LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Aug 20, 2024	Knowledge-Augmented Reasoning distillation for Small Language Models in Knowledge-Intensive Tasks (KARD)
Jul 30, 2024	In-Context Retrieval-Augmented Language Models
Jun 04, 2024	Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
May 07, 2024	How to Train LLM? - From Data Parallel To Fully Sharded Data Parallel
Apr 30, 2024	Training diffusion models with reinforcement learning
Mar 26, 2024	Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
Feb 20, 2024	WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Jan 09, 2024	Making Large Language Models A Better Foundation For Dense Retrieval
Dec 19, 2023	Learning to Tokenize for Generative Retrieval
Sep 12, 2023	A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Jun 29, 2023	QLoRA: Efficient Finetuning of Quantized LLMs
Jun 15, 2023	Do Prompt-Based Models Really Understand the Meaning of Their Prompts?
Apr 20, 2023	FALSESUM: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization
Apr 13, 2023	P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Mar 30, 2023	GPT Understands, Too
Jan 26, 2023	Task-aware Retrieval with Instructions
Jan 19, 2023	KALA: Knowledge-Augmented Language Model Adaptation