Scopus

RuleAugment: A Hybrid Framework Combining Rule-Based Systems and Large Language Models for Natural Language to Visualization Tasks

Publication year: 2026
Journal / Conference: The 14th Conference on Information Technology and its Applications (CITA 2025), volume 1581, pp. 559–571
Unit: Information Technology (CNTT)
DOI / Link: https://doi.org/10.1007/978-3-032-00972-2_41

Authors

Thi Minh-Hue Luong; Van-Viet Nguyen; Huu-Khanh Nguyen; Xuan-Truong Quach; The-Vinh Nguyen ✉

Abstract

Data visualization plays an important role in conveying large amounts of information. Recent approaches use prompting techniques to have large language models (LLMs) generate code, which may not run correctly. To mitigate this problem, this paper presents RuleAugment, a hybrid framework that combines a rule-based system with LLMs to simplify the task of converting natural language queries into visualizations. RuleAugment handles query normalization and mapping, complexity classification, and Python code generation. Its performance is evaluated on five datasets, focusing on query mapping accuracy, code generation accuracy, and graph quality. The framework achieves high query mapping accuracy (up to 98.5%, with an F1-score of 98.2%), accurate code generation (Exact Match Ratio of 94.5%), and high-quality graphs (average score of 4.8/5 for visual accuracy). While effective with simple data …

