Multimodal Validation in UML Synthesis: A Dual-Case Study on Sequence and Class Diagram Generation Pipelines
Authors
Abstract
The increasing complexity of modern software systems calls for automated modeling methods that ensure clarity, consistency, and maintainability. The Unified Modeling Language (UML) remains the standard for system design, with Class Diagrams representing static architecture and Sequence Diagrams representing dynamic behavior. Constructing these diagrams manually, however, is time-consuming and error-prone. This paper presents a unified pipeline for the automated synthesis and multimodal validation of UML diagrams, evaluated on both Class Diagrams and Sequence Diagrams. The framework follows a three-phase approach: a small language model (LLaMA 3.2 1B-Instruct) generates the specifications; a reasoning-enhanced model (DeepSeek-R1-Distill-Qwen-32B) translates them into PlantUML code; and three vision-language models (Qwen2.5-VL-3B, LLaMA3.2-VL-11B, AyaVision-8B) validate the resulting diagrams for semantic and structural fidelity, with their quality scores aggregated using weights based on the MMMU benchmark. Two datasets are introduced, consisting of 5,000 Class Diagram samples and 1,000 Sequence Diagram samples. Experimental results indicate that the framework generates structurally complex and behaviorally accurate diagrams, establishing a scalable benchmark for AI-based software engineering and supporting automated validation of system designs.
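The weighted aggregation of validator scores mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each vision-language model returns one numeric quality score per diagram and that each model's weight is proportional to some benchmark-derived value. The function name `aggregate_quality`, the score scale, and all numeric values below are hypothetical placeholders, not the paper's actual weights, scores, or implementation.

```python
from typing import Dict


def aggregate_quality(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted mean of per-model quality scores.

    Weights are normalized over the models present in `scores`,
    so they do not need to sum to 1 beforehand.
    """
    missing = set(scores) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    total_weight = sum(weights[model] for model in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[model] * weights[model] for model in scores)
    return weighted_sum / total_weight


# Placeholder weights (assumption): NOT real MMMU results.
example_weights = {
    "Qwen2.5-VL-3B": 5.0,
    "LLaMA3.2-VL-11B": 3.0,
    "AyaVision-8B": 2.0,
}

# Placeholder per-diagram scores on an assumed 0-10 scale.
example_scores = {
    "Qwen2.5-VL-3B": 8.0,
    "LLaMA3.2-VL-11B": 6.0,
    "AyaVision-8B": 7.0,
}

if __name__ == "__main__":
    print(aggregate_quality(example_scores, example_weights))
```

Normalizing inside the function keeps the sketch usable when only a subset of validators scores a given diagram; whether the framework itself handles missing validators this way is not stated in the abstract.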