Resources

Group highlights

From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning

David Dinucu-Jianu, Jakub Macina, Nico Daheim, Ido Hakimi, Iryna Gurevych, Mrinmaya Sachan

EMNLP 2025 Models and code

MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors

Jakub Macina, Nico Daheim, Ido Hakimi, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

EMNLP 2025 Website (with code)

MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs

Andreas Opedal, Haruki Shirakami, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan

ICLR 2025

Language Model Alignment in Multilingual Trolley Problems

Zhijing Jin, Max Kleiman-Weiner, Giorgio Piatti, Sydney Levine, Jiarui Liu, Fernando Gonzalez, Francesco Ortu, András Strausz, Mrinmaya Sachan, Rada Mihalcea, Yejin Choi, Bernhard Schölkopf

ICLR 2025

Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation

Tianyu Liu, Jirui Qi, Paul He, Arianna Bisazza, Mrinmaya Sachan, Ryan Cotterell

NAACL 2025

DIRAS: Efficient LLM Annotation of Document Relevance for Retrieval Augmented Generation

Jingwei Ni, Tobias Schimanski, Meihong Lin, Mrinmaya Sachan, Elliott Ash, Markus Leippold

NAACL 2025

Implicit Personalization in Language Models

Zhijing Jin, Nils Heil, Jiarui Liu, Shehzaad Dhuliawala, Yahang Qi, Bernhard Schölkopf, Rada Mihalcea, Mrinmaya Sachan

EMNLP 2024 (Findings)

How to Select Datapoints for Efficient Human Evaluation of NLG Models?

Vilém Zouhar, Peng Cui, Mrinmaya Sachan

preprint 2025

Grammar Control in Dialogue Response Generation for Language Learning Chatbots

Dominik Glandorf, Peng Cui, Detmar Meurers, Mrinmaya Sachan

NAACL 2025

Investigating the Zone of Proximal Development of Language Models for In-Context Learning

Peng Cui, Mrinmaya Sachan

NAACL 2025 (Findings)

AI-Assisted Human Evaluation of Machine Translation

Vilém Zouhar, Tom Kocmi, Mrinmaya Sachan

preprint 2024

Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors

Nico Daheim, Jakub Macina, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

EMNLP 2024 Code

RELIC: Investigating Large Language Model Responses using Self-Consistency

Furui Cheng, Vilém Zouhar, Simran Arora, Mrinmaya Sachan, Hendrik Strobelt, Mennatallah El-Assady

CHI 2024

Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation

Tom Kocmi, Vilém Zouhar, Eleftherios Avramidis, Roman Grundkiewicz, Marzena Karpinska, Maja Popović, Mrinmaya Sachan, Mariya Shmatova

WMT 2024

Towards Aligning Language Models with Textual Feedback

Saüc Abadal Lloret, Shehzaad Dhuliawala, Keerthiram Murugesan, Mrinmaya Sachan

ICML 2024 Workshop MHFAIA

Book2Dial: Generating Teacher Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots

Junling Wang, Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Mrinmaya Sachan

ACL 2024 (Findings)

How to Engage your Readers? Generating Guiding Questions to Promote Active Reading

Peng Cui, Vilém Zouhar, Xiaoyu Zhang, Mrinmaya Sachan

ACL 2024

AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails

Sankalan Pal Chowdhury, Vilém Zouhar, Mrinmaya Sachan

Learning at Scale 2024

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan

ICML 2024

PWESuite: Phonetic Word Embeddings and Tasks They Facilitate

Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan and David Mortensen

LREC-COLING 2024

A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

Alessandro Stolfo, Yonatan Belinkov and Mrinmaya Sachan

EMNLP 2023

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Tanmay Sinha, Manu Kapur, Iryna Gurevych and Mrinmaya Sachan

EMNLP 2023 (Findings) Code

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut and Mrinmaya Sachan

EMNLP 2023

Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Ruida Wang, Wangchunshu Zhou and Mrinmaya Sachan

EMNLP 2023 (Findings)

Re-visiting Automated Topic Model Evaluation with Large Language Models

Dominik Stammbach, Vilém Zouhar, Alexander Hoyle, Mrinmaya Sachan and Elliott Ash

EMNLP 2023 (Short)

A Diachronic Perspective on User Trust in AI under Uncertainty

Shehzaad Dhuliawala, Vilém Zouhar, Mennatallah El-Assady and Mrinmaya Sachan

EMNLP 2023

Can Large Language Models Infer Causation from Correlation?

Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab and Bernhard Schölkopf

arXiv 2023

CLadder: A Benchmark to Assess Causal Reasoning Capabilities of Language Models

Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan and Bernhard Schölkopf

NeurIPS 2023

Order-Theoretic Structured Prediction: Partially Ordering Tokens within a String

Tianyu Liu, Afra Amini, Mrinmaya Sachan and Ryan Cotterell

EMNLP 2023 (Outstanding Paper Award)

RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell and Mrinmaya Sachan

arXiv 2023

Efficient Prompting via Dynamic In-Context Learning

Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell and Mrinmaya Sachan

arXiv 2023

Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good

Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan and Rada Mihalcea

EMNLP 2023 (Findings)

Enhancing Textbooks with Visuals from the Web for Improved Learning

Janvijay Singh, Vilém Zouhar and Mrinmaya Sachan

EMNLP 2023

Elastic Weight Removal for Faithful and Abstractive Dialogue Generation

Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych and Edoardo M Ponti

arXiv:2303.17574

Adaptive and Personalized Exercise Generation for Online Language Learning

Peng Cui and Mrinmaya Sachan

ACL 2023

A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schölkopf and Mrinmaya Sachan

ACL 2023 (also at MATHAI Workshop at NeurIPS 22)

Distilling Reasoning Capabilities into Smaller Language Models

Kumar Shridhar, Alessandro Stolfo and Mrinmaya Sachan

ACL 2023 (Findings)

World Models for Math Story Problems

Andreas Opedal, Niklas Stoehr, Abulhair Saparov and Mrinmaya Sachan

ACL 2023 (Findings)

Byte-Pair Encoding is Approximately Optimal

Vilém Zouhar, Tim Vieira, Clara Meister, Juan Luis Gastaldi, Mrinmaya Sachan and Ryan Cotterell

ACL 2023 (Findings)

Controlled Text Generation with Natural Language Instructions

Wangchunshu Zhou, Yuchen Eleanor Jiang, Ethan Wilcox, Ryan Cotterell and Mrinmaya Sachan

ICML 2023

Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning

Mattia Atzeni, Mrinmaya Sachan and Andreas Loukas

ICML 2023

Educational Question Generation with Difficulty Level Controls

Ying Jiao, Kumar Shridhar, Peng Cui, Wangchunshu Zhou and Mrinmaya Sachan

AIED 2023

Opportunities and Challenges in Neural Dialog Tutoring

Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych and Mrinmaya Sachan

EACL 2023 Code

Strategize Before Teaching: A Conversational Tutoring System with Pedagogy Self-Distillation

Lingzhi Wang, Mrinmaya Sachan, Xingshan Zeng and Kam-Fai Wong

EACL 2023 (Short, Findings)

LongtoNotes: OntoNotes with Longer Coreference Chains

Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum and Mrinmaya Sachan

EACL 2023 (Findings)

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur and Mrinmaya Sachan

EMNLP 2022 (also at MATHAI Workshop at NeurIPS 22) Code

Beyond prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

Yu Fei, Zhao Meng, Ping Nie, Roger Wattenhofer and Mrinmaya Sachan

EMNLP 2022

Differentially Private Language Models for Secure Data Sharing

Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schölkopf and Mrinmaya Sachan

EMNLP 2022

Autoregressive Structured Prediction with Language Models

Tianyu Liu, Yuchen Eleanor Jiang, Nicholas Monath, Ryan Cotterell and Mrinmaya Sachan

EMNLP 2022 (Findings, Short paper)

Adapters for Enhanced Modeling of Multilingual Knowledge and Text

Yifan Hou, Wenxiang Jiao, Meizhen Liu, Carl Allen, Zhaopeng Tu and Mrinmaya Sachan

EMNLP 2022 (Findings) / Best paper at the Multilingual Representation Learning (MRL) Workshop

Logical Fallacy Detection

Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea and Bernhard Schölkopf

EMNLP 2022 (Findings)

What has been Enhanced in my Knowledge-Enhanced Language Model?

Yifan Hou, Guoji Fu and Mrinmaya Sachan

EMNLP 2022 (Findings) Code

Rule-Based but Flexible? Evaluating and Improving Language Models as Accounts of Human Moral Judgment

Zhijing Jin, Sydney Levine, Fernando Gonzalez Adauto, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Joshua B. Tenenbaum and Bernhard Schölkopf

NeurIPS 2022 (Oral) / CogSci 2022 (Disciplinary Diversity and Integration Award)

Learning the Transformer Kernel

Sankalan Pal Chowdhury, Adamos Solomou, Avinava Dubey and Mrinmaya Sachan

TMLR 2022

Probing via Prompting

Jiaoda Li, Ryan Cotterell and Mrinmaya Sachan

NAACL 2022

BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation

Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Rico Sennrich, Mrinmaya Sachan, Ryan Cotterell and Ming Zhou

NAACL 2022

A Structured Span Selector

Tianyu Liu, Yuchen Eleanor Jiang, Ryan D Cotterell and Mrinmaya Sachan

NAACL 2022

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan and Bernhard Schölkopf

NAACL 2022

Self-Supervised Contrastive Learning with Adversarial Perturbations for Robust Pretrained Language Models

Zhao Meng, Yihan Dong, Mrinmaya Sachan and Roger Wattenhofer

NAACL 2022 (Findings)

Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang

Daphna Keidar, Andreas Opedal, Zhijing Jin and Mrinmaya Sachan

ACL 2022

Calibration of Machine Reading Systems at Scale

Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das and Mrinmaya Sachan

ACL 2022 (Findings)

Case-based Reasoning for Better Generalization in Text-Adventure Games

Mattia Atzeni, Shehzaad Dhuliawala, Keerthiram Murugesan and Mrinmaya Sachan

ICLR 2022

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

Vikram Gupta, Haoyue Shi, Kevin Gimpel and Mrinmaya Sachan

AAAI 2022

Causal Direction in Data Matters: Implications of Causal and Anticausal Learning in NLP

Zhijing Jin, Julius von Kügelgen, Jingwei Ni, Tejas Vaidhya, Ayush Kaushal, Mrinmaya Sachan and Bernhard Schölkopf

EMNLP 2021

Let Your Characters Tell Their Story: A Dataset for Character-Centric Narrative Understanding

Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan and Snigdha Chaturvedi

EMNLP 2021 (Findings)

Differentiable Subset Pruning of Transformer Heads

Jiaoda Li, Ryan Cotterell and Mrinmaya Sachan

TACL 2021 Code

Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach

Yifan Hou and Mrinmaya Sachan

ACL 2021 Code

Scaling Within Document Coreference for Long Texts

Raghuveer Thirukovalluru, Nicholas Monath, Kumar Shridhar, Manzil Zaheer, Mrinmaya Sachan and Andrew McCallum

ACL 2021 (Findings)

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan and Rada Mihalcea

ACL 2021 (Findings) MIT News Article

Efficient Text-based Reinforcement Learning by Jointly Leveraging State and Commonsense Graph Representations

Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Kartik Talamadupula, Mrinmaya Sachan and Murray Campbell

ACL 2021 (Short paper)

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan and Murray Campbell

AAAI 2021

Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks

Mrinmaya Sachan, Avinava Dubey, Eduard Hovy, Tom Mitchell, Dan Roth and Eric P. Xing

Computational Linguistics (CL) journal - Dec 2019 issue