Scopus

Fine-Tuning Mini Language Models for Legal Multiple-Choice Question Answering: A Comparative Study of Phi-3.5, Qwen 2.5 and Llama 3.2

Journal / Conference: The 14th Conference on Information Technology and its Applications (CITA 2025), volume 1581, pp. 671–682
Affiliation: Đại học Thái Nguyên (Thai Nguyen University)
DOI / Link:

Authors

Huu-Khanh Nguyen ; Van-Viet Nguyen ; Kim-Son Nguyen ; Thi Minh-Hue Luong ; The-Vinh Nguyen ; Duc-Quang Vu ; Huu-Cong Nguyen

Corresponding author

Abstract

In this study, we explore applications of mini language models in the legal domain, specifically Phi-3.5 Mini, Qwen 2.5 3B, and Llama 3.2 3B, for legal multiple-choice question answering. We fine-tuned these models on the CaseHOLD dataset to adapt them to the structural and semantic nuances of legal language and reasoning. The results show that fine-tuning significantly improves the performance of these models, with Phi-3.5 Mini achieving a Micro F1 score of 76.93%, exceeding previous bests among miniaturised models. Qwen 2.5 3B and Llama 3.2 3B also achieved similarly competitive scores of 74.27% and 75.40%, respectively, reinforcing their viability as resource-efficient alternatives to larger models. Mini language models offer performance competitive with specialized models such as Legal-BERT and Caselaw-BERT, while requiring lower computational resources and retaining the ability of natural language …
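The Micro F1 metric cited above pools true positives, false positives, and false negatives across all answer classes before computing a single precision and recall. A minimal sketch of that computation (the function and variable names are illustrative, not taken from the paper):

```python
def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: aggregate TP/FP/FN over all classes first."""
    tp = fp = fn = 0
    for c in labels:
        tp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Note that in a single-label multiple-choice setting, where each question receives exactly one predicted answer, every false positive for one class is a false negative for another, so micro F1 coincides with plain accuracy.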