Indexed in: Scopus

Fine-Tuning Tiny Language Models for Legal Question Answering: A Comparative Study of Gemma 2, Qwen 2.5 and Llama 3.2

Journal / Conference: International Conference on Advances in Information and Communication Technology (ICTA2025), Vol. 2, pp. 336–345
Institution: Thai Nguyen University (Đại học Thái Nguyên)
DOI / Link:

Authors

Huu-Khanh Nguyen; Huu-Cong Nguyen; Van-Viet Nguyen; Thi Minh-Hue Luong; Kim-Son Nguyen; Dinh-Cuong Do; The-Vinh Nguyen

Corresponding author

Abstract

The domain of legal question answering presents significant challenges for natural language processing due to the complexity and nuance of legal texts. While large language models have shown promise, the efficacy of the latest generation of “tiny” language models (TLMs) remains underexplored. This paper presents a comparative study of three prominent open-weight TLMs (Google’s Gemma 2 2B, Alibaba’s Qwen 2.5 1.5B, and Meta’s Llama 3.2 1B) fine-tuned for legal QA on the CaseHOLD dataset. Employing parameter-efficient fine-tuning (LoRA, QLoRA) on a limited subset of 10,000 training examples to simulate resource-constrained conditions, we evaluate their performance. Our results demonstrate that fine-tuning yields substantial improvements over the base models, with Gemma 2 2B achieving the top Micro F1 score of 74.47, followed by Qwen 2.5 at 72.17 and Llama 3.2 at 71.17. Notably, Gemma 2’s performance …
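As a rough illustration of the low-rank adaptation idea behind the parameter-efficient fine-tuning the abstract mentions (not the paper's actual training code, and with all dimensions chosen arbitrarily for the sketch), LoRA leaves the base weight W frozen and learns a low-rank update B·A, scaled by alpha/r:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=8):
    """LoRA-adapted linear layer: h = W x + (alpha / r) * B (A x).
    W is frozen; only the small matrices A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 128, 8                 # toy sizes, not the models' real dims
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

x = rng.standard_normal(d_in)
h = lora_forward(x, W, A, B)

# With B zero-initialized, the adapter starts as an exact no-op:
assert np.allclose(h, W @ x)

# Trainable parameters per layer: r*(d_in + d_out) for LoRA
# versus d_in*d_out for full fine-tuning.
print(r * (d_in + d_out), "vs", d_in * d_out)
```

The parameter count r·(d_in + d_out) versus d_in·d_out is what makes fine-tuning 1–2B-parameter models feasible under the resource-constrained conditions the study simulates; QLoRA applies the same update on top of a quantized base model.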