Publications

Note: We have omitted some authors' names to avoid any legal issues. 

Please find the full author lists on the publishers' websites.

2025


JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community

Yunze Xiao, Tingyu He, Lionel Z. Wang, Yiming Ma, Xingyu Song, Xiaohang Xu, Irene Li, Ka Chung Ng;

Disclaimer: This paper describes human self-destructive content and potentially harmful behaviors, including drug overdose, eating disorders, and self-harming actions, that may be disturbing to some readers.

arXiv, 2025, data request 


MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao, Yun Xing, Junjue Wang, Huitao Li, Xin Li, Kunyu Yu, Nan Liu, Qingyu Chen, Douglas Teodoro, Edison Marrese-Taylor, Shijian Lu, Yusuke Iwasawa, Yutaka Matsuo, Irene Li; 

arXiv, 2025, data, website


ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA

Zhao Xinjie, Fan Gao, Rui Yang, Yingjian Chen, Yuyang Wang, Ying Zhu, Jiacheng Tang, Irene Li;

arXiv, 2025


GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking

Yingjian Chen, Haoran Liu, Yinhong Liu, Rui Yang, Han Yuan, Yanran Fu, Pengyuan Zhou, Qingyu Chen, James Caverlee, Irene Li;

arXiv, 2025


Graphusion: A RAG Framework for Scientific Knowledge Graph Construction with a Global Perspective 

Rui Yang, Boming Yang, Xinjie Zhao, Fan Gao, Aosong Feng, Sixun Ouyang, Moritz Blum, Tianwei She, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li;

NLP4KGC, WWW 2025, code 


Rethinking Domain-specific Pre-training by Supervised or Self-supervised Learning for Chest Radiograph Classification: A Comparative Study against ImageNet Counterparts in Cold-start Active Learning 

Han Yuan, Mingcheng Zhu, Rui Yang, Han Liu, Irene Li, Chuan Hong;

Health Care Science, 2025 (to appear)


Large Language Models Struggle to Encode Medical Concepts-A Multilingual Benchmarking and Comparative Analysis

Hossein Rouhizadeh, Anthony Yazdani, Boya Zhang, David Vicente Alvarez, Matthias Hueser, Alexandre Vanobberghen, Rui Yang, Irene Li, Andreas Walter, Douglas Teodoro;

medRxiv, 2025


2024

AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data

Xinjie Zhao, Moritz Blum, Rui Yang, Boming Yang, Luis Márquez Carpintero, Mónica Pina-Navarro, Tony Wang, Xin Li, Huitao Li, Yanran Fu, Rongrong Wang, Juntao Zhang and Irene Li;

System Demo, arXiv, 2024


EARL: Workshop on Evaluating and Applying Recommendation Systems with Large Language Models

Irene Li, Ruihai Dong, Lei Li and Li Chen;

Extended Abstract, Proceedings of the 18th ACM Conference on Recommender Systems


Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias

Yu He Ke, Rui Yang, Sui An Lie, Taylor Xin Yi Lim, Yilin Ning, Irene Li, Hairil Rizal Abdullah, Daniel Shu Wei Ting and Nan Liu;

Accepted by JMIR 2024


RecPrompt: A Prompt Tuning Framework for News Recommendation Using Large Language Models

Dairui Liu, Boming Yang, Honghui Du, Derek Greene, Aonghus Lawlor, Ruihai Dong and Irene Li;

CIKM 2024


⚕️Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation

Rui Yang, Qingcheng Zeng, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser, Amisha D Dave, Tiarnan D.L. Keenan, Emily Y Chew, Dragomir Radev, Zhiyong Lu, Hua Xu, Qingyu Chen and Irene Li;

Accepted by JMIR 2024, Github, Models


Topic-Centric Explanations for News Recommendation

Dairui Liu, Derek Greene, Irene Li and Ruihai Dong;

ACM Transactions on Recommender Systems, 2024


KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques

Rui Yang, Haoran Liu, Edison Marrese-Taylor, Qingcheng Zeng, Yu He Ke, Wanxin Li, Lechao Cheng, Qingyu Chen, James Caverlee, Yutaka Matsuo and Irene Li;

BioNLP @ ACL 2024


Evaluating Large Language Models on Wikipedia-Style Survey Generation

Fan Gao, Hang Jiang, Rui Yang, Qingcheng Zeng, Jinghui Lu, Moritz Blum, Dairui Liu, Tianwei She, Yuang Jiang and Irene Li;

Findings, ACL 2024, demo, data


Leveraging Large Language Models for Learning Complex Legal Concepts Through Storytelling 

Hang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex Pentland, Yoon Kim, Jad Kabbara and Deb Roy;

ACL 2024


Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education

Rui Yang, Boming Yang, Sixun Ouyang, Tianwei She, Aosong Feng, Yuang Jiang, Freddy Lecue, Jinghui Lu and Irene Li;

2024, Github


Better Explain Transformers by Illuminating Important Information

Linxin Song, Yan Cui, Ao Luo, Freddy Lecue and Irene Li;

Findings, EACL 2024


2023

Sequence-to-sequence Text Generation with Coupled Diffusion Process

Boming Yang, Aosong Feng and Zihui Li;

2023


A UMLS-Augmented Framework for Improving Factuality in Large Language Models within Healthcare

Rui Yang, Edison Marrese-Taylor, Yuhe Ke, Lechao Cheng, Qingyu Chen and Irene Li;

arXiv 2023


NLPBench: Evaluating Large Language Models on Solving NLP Problems

Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou and Irene Li;

Instruction Workshop @ NeurIPS 2023


NNKGC: Improving Knowledge Graph Completion with Node Neighborhoods 

Irene Li and Boming Yang; 

DL4KG Workshop @ ISWC 2023


EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts

Irene Li, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser and Dragomir Radev;

arXiv, 2023


XDLM: Cross-lingual Diffusion Language Model for Machine Translation

Linyao Chen, Aosong Feng, Boming Yang and Zihui Li;

arXiv, 2023


Going Beyond Local: Global Graph-Enhanced Personalized News Recommendations

Boming Yang, Dairui Liu, Ruihai Dong and Irene Li;

RecSys 2023 (Best Student Paper Award! ⭐️)


A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation

Irene Li, Thomas George, Alexander Fabbri, Tammy Liao, Benjamin Chen, Rina Kawamura, Richard Zhou, Vanessa Yan, Swapnil Hingmire and Dragomir Radev;

BEA workshop @ ACL 2023


HiPool: Modeling Long Documents Using Graph Neural Networks

Irene Li, Aosong Feng, Dragomir Radev and Rex Ying;

ACL 2023


Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences

Aosong Feng, Irene Li, Yuang Jiang and Rex Ying;

AAAI 2023


2022

Neural Natural Language Processing for Unstructured Data in Electronic Health Records: a Review
Irene Li, Jessica Pan, Jeremy Goldwasser, Neha Verma, Wai Pan Wong, Muhammed Yavuz Nuzumlalı, Benjamin Rosand, Yixin Li, Matthew Zhang, David Chang, R. Andrew Taylor, Harlan M. Krumholz and Dragomir Radev;
Computer Science Review, volume 46, 2022


Surfer100: Generating Surveys From Web Resources, Wikipedia-style
Irene Li, Alexander Fabbri, Rina Kawamura, Yixin Liu, Xiangru Tang, Jaesung Tae, Chang Shen, Sally Ma, Tomoe Mizutani, Dragomir Radev;
LREC 2022


LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification
Irene Li, Aosong Feng, Tianxiao Li, Hao Wu and Ruihai Dong;
DLG4NLP Workshop, NAACL 2022


Variational Graph Autoencoding as Cheap Supervision for AMR Coreference Resolution
Irene Li, Linfeng Song, Kun Xu and Dong Yu;
ACL 2022