MINGYU DEREK MA
Generative Language Model
New!
Memorize and Rank: Enabling Large Language Models for Medical Event Prediction
We introduce Mera, a clinical event prediction model that bridges pre-trained natural language knowledge with medical codes. We apply contrastive learning over the predicted ranking list for task-specialized optimization. Through concept-memorization fine-tuning, we equip the LLM with an in-depth understanding that allows it to recall the natural language definitions of medical codes during inference.
Mingyu Derek Ma, Yijia Xiao, Anthony Cuturrufo, Xiaoxuan Wang, Wei Wang
PDF
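A minimal sketch of the listwise contrastive idea mentioned above (illustrative only, with hypothetical names and toy scores; not the paper's implementation): treat the model's scores over a ranked list of candidate medical codes as logits, and push probability mass toward the code of the true next event.

```python
import math

def listwise_contrastive_loss(scores, positive_index):
    """Cross-entropy over a ranked candidate list: the score of the
    true medical code is contrasted against all other candidates."""
    m = max(scores)                                   # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return -math.log(exps[positive_index] / z)

# Toy example: three candidate codes, the correct one scored highest,
# so the loss is small but nonzero.
loss = listwise_contrastive_loss([2.0, 0.5, -1.0], positive_index=0)
```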
New!
Mitigating Bias for Question Answering Models by Tracking Bias Influence
We propose BMBI, an approach to mitigating the bias of multiple-choice QA models. Based on the intuition that a model tends to become more biased if it learns from a biased example, we measure the bias level of a query instance by observing its influence on another instance. We then use the detected bias level as an additional optimization objective alongside the original QA task, forming a multi-task learning setup.
Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng
PDF
Cite
Poster
DOI
New!
Improving Event Definition Following For Zero-Shot Event Detection
We aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of event types and definitions is the key for models to learn to follow event definitions, whereas existing event extraction datasets focus on annotating many high-quality examples for only a few event types. Our experiments verify this hypothesis.
Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng
PDF
Cite
DOI
New!
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
We propose STAR, a structure-to-text data generation method for complex structure prediction tasks: it first generates event structures (Y) and then generates input passages (X), all with large language models. We further reduce errors and improve data quality through self-reflective error identification and self-refinement with iterative revision. We show that the data generated by STAR significantly improves the performance of low-resource event extraction and relation extraction tasks, even surpassing the effectiveness of human-curated data.
Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, Wei Wang
PDF
Cite
Code
Poster
DOI
DICE: Data-Efficient Clinical Event Extraction with Generative Models
We introduce DICE, a robust and data-efficient generative model for clinical event extraction, which specializes in clinical mention identification, and MACCROBAT-EE, the first clinical event extraction dataset with event argument annotation.
Mingyu Derek Ma, Alexander K. Taylor, Wei Wang, Nanyun Peng
PDF
Cite
Code
Dataset
Poster
Slides
Video
DOI
ACL Anthology
Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning
We use soft prompt tokens to learn task properties, incorporate segment information, and reiterate the task before predicting slot values. Our method drastically reduces the number of parameters needed to less than 0.5% of prior work while achieving better low-resource dialogue state tracking performance.
Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, Nanyun Peng
PDF
Cite
Poster
Slides
Video
DOI
ISCA Archive
ISCA PDF
Amazon Science
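A minimal sketch of the soft-prompt idea behind this entry (shapes and names are illustrative, not the paper's code): trainable prompt vectors are prepended to the frozen model's input embeddings, so only the tiny prompt parameters are updated during tuning.

```python
import random

EMBED_DIM = 8          # toy embedding width
NUM_PROMPT_TOKENS = 4  # number of trainable soft prompt tokens

# Trainable soft prompt: the only parameters updated during tuning.
soft_prompt = [[random.gauss(0.0, 0.02) for _ in range(EMBED_DIM)]
               for _ in range(NUM_PROMPT_TOKENS)]

def prepend_prompt(token_embeddings):
    """Concatenate soft prompt vectors before the (frozen) token embeddings."""
    return soft_prompt + token_embeddings

utterance = [[0.0] * EMBED_DIM for _ in range(10)]  # 10 dummy token embeddings
inputs = prepend_prompt(utterance)
```

Only `soft_prompt` would receive gradient updates; the backbone model stays frozen, which is where the sub-0.5% parameter count comes from.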