News

Aug 27, 2024: The missing papers are now available in the Anthology. 🎉

Aug 22, 2024: A list of the missing papers has been compiled; see below.

Aug 20, 2024: ACL noticed that more than 100 accepted papers are missing from the proceedings (announced on X).

Aug 12, 2024: The proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, as well as Findings: ACL and associated workshops, are now available online.

List of the missing papers

[arXiv] DeVAn: Dense Video Annotation for Video-Language Models
[arXiv] How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
[arXiv] The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models
[arXiv] Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
[arXiv] L-Eval: Instituting Standardized Evaluation for Long Context Language Models
[arXiv] DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages
[Not Found] Causal-Guided Active Learning for Debiasing Large Language Models
[Not Found] PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents
[arXiv] Towards Better Understanding of Contrastive Sentence Representation Learning: A Unified Paradigm for Gradient
[arXiv] Emergent Word Order Universals from Cognitively-Motivated Language Models
[arXiv] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
[arXiv] MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin
[arXiv] Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics
[arXiv] CausalGym: Benchmarking causal interpretability methods on linguistic tasks
[arXiv] Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
[arXiv] Mission: Impossible Language Models
[arXiv] Semisupervised Neural Proto-Language Reconstruction
[arXiv] Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
[arXiv] Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?
[arXiv] D2LLM: Decomposed and Distilled Large Language Models for Semantic Search
[arXiv] Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization
[arXiv] Disinformation Capabilities of Large Language Models
[arXiv] Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
[arXiv] How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study
[arXiv] Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages
[Not Found] Must NLP be Extractive?
[arXiv] Spiral of Silence: How is Large Language Model Killing Information Retrieval?—A Case Study on Open Domain Question Answering
[arXiv] Latxa: An Open Language Model and Evaluation Suite for Basque
[arXiv] Why are Sensitive Functions Hard for Transformers?
[arXiv] Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction
[arXiv] IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
[arXiv] The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning
[arXiv] MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
[arXiv] MultiLegalPile: A 689GB Multilingual Legal Corpus
[arXiv] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
[arXiv] What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
[arXiv] Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing
[arXiv] ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
[arXiv] ChatDev: Communicative Agents for Software Development
[Not Found] Disentangled Learning with Synthetic Parallel Data for Text Style Transfer
[arXiv] PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
[arXiv] Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation
[arXiv] $\infty$Bench: Extending Long Context Evaluation Beyond 100K Tokens
[Not Found] Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language Models
[arXiv] Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
[arXiv] AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
[arXiv] Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
[Not Found] Disambiguate Words like Composing Them: A Morphology-Informed Approach to Enhance Chinese Word Sense Disambiguation
[arXiv] Do Llamas Work in English? On the Latent Language of Multilingual Transformers
[arXiv] G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation
[Not Found] Media Framing: A typology and Survey of Computational Approaches Across Disciplines
[Not Found] SPZ: A Semantic Perturbation-based Data Augmentation Method with Zonal-Mixing for Alzheimer’s Disease Detection
[arXiv] Calibrating Large Language Models Using Their Generations Only
[arXiv] Iterative Forward Tuning Boosts In-Context Learning in Language Models
[arXiv] Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
[arXiv] Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn’t
[arXiv] Steering Llama 2 via Contrastive Activation Addition
[arXiv] EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities
[arXiv] SafetyBench: Evaluating the Safety of Large Language Models
[arXiv] Deciphering Oracle Bone Language with Diffusion Models
[arXiv] M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models
[arXiv] RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization
[arXiv] Causal Estimation of Memorisation Profiles
[arXiv] CHECKWHY: Causal Fact Verification via Argument Structure
[arXiv] Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model
[arXiv] On Efficient and Statistical Quality Estimation for Data Annotation
[Not Found] EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection
[arXiv] American Sign Language Handshapes Reflect Pressures for Communicative Efficiency
[arXiv] Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
[arXiv] OLMo: Accelerating the Science of Language Models
[arXiv] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
[arXiv] IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
[arXiv] Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models
[arXiv] Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
[arXiv] BatchEval: Towards Human-like Text Evaluation
[arXiv] ToMBench: Benchmarking Theory of Mind in Large Language Models
[arXiv] COKE: A Cognitive Knowledge Graph for Machine Theory of Mind
[Not Found] MultiPICo: Multilingual Perspectivist Irony Corpus
[Not Found] AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
[arXiv] MMToM-QA: Multimodal Theory of Mind Question Answering
[arXiv] DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents
[arXiv] Unintended Impacts of LLM Alignment on Global Representation
[arXiv] ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer
[arXiv] MAP’s not dead yet: Uncovering true language model modes by conditioning away degeneracy
[Not Found] Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!
[Not Found] NounAtlas: Filling the Gap in Nominal Semantic Role Labeling
[arXiv] The Earth is Flat because…: Investigating LLMs’ Belief towards Misinformation via Persuasive Conversation
[arXiv] LooGLE: Can Long-Context Language Models Understand Long Contexts?
[arXiv] Let’s Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation
[arXiv] ECBD: Evidence-Centered Benchmark Design for NLP
[arXiv] Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
[arXiv] Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
[arXiv] Word Embeddings Are Steers for Language Models
[arXiv] Cleaner Pretraining Corpus Curation with Neural Web Scraping
[arXiv] Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
[arXiv] What Do Dialect Speakers Want? A Survey of Attitudes Towards Language Technology for German Dialects
[arXiv] Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models
[arXiv] Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets
[arXiv] Greed is All You Need: An Evaluation of Tokenizer Inference Methods
[arXiv] Don’t Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models
[arXiv] SeeGULL Multilingual: a Dataset of Geo-Culturally Situated Stereotypes