*Result*: MENTOR: Fixing introductory programming assignments with formula-based fault localization and LLM-driven program repair.
*Further Information*
*• MENTOR addresses the Automated Program Repair (APR) problem for the C programming language through an LLM-driven Counterexample-Guided Inductive Synthesis (CEGIS) approach.
• MENTOR employs MaxSAT-based fault localization to guide and minimize LLMs' patches to incorrect programs by feeding them bug-free program sketches.
• MENTOR combines state-of-the-art modules for program clustering, variable alignment, and MaxSAT-based fault localization within an LLM-driven program fixer that orchestrates the repair process.
• Experimental results show that our approach enables all six evaluated LLMs to fix more programs and produce smaller patches compared to alternative configurations and symbolic tools.
• All code and experiments are publicly available on Zenodo (Orvalho et al., 2025c).
The increasing demand for programming education has led to online courses such as MOOCs, which rely on introductory programming assignments (IPAs). A major challenge in these courses is providing personalized feedback at scale. This paper introduces MENTOR, a semantic automated program repair (APR) framework designed to fix faulty student programs. MENTOR validates repairs through execution on a test suite, and returns the repaired program or highlights faulty statements. Unlike symbolic repair tools like Clara and Verifix, which require correct implementations with identical control flow graphs (CFGs), MENTOR's LLM-based approach enables flexible repairs without strict structural alignment. MENTOR clusters successful submissions regardless of CFGs, and employs a Graph Neural Network (GNN)-based variable alignment module for enhanced accuracy. Next, MENTOR's fault localization module leverages MaxSAT techniques to pinpoint buggy code segments precisely. Finally, MENTOR's program fixer integrates Formal Methods (FM) and Large Language Models (LLMs) through a Counterexample-Guided Inductive Synthesis (CEGIS) loop, iteratively refining repairs.
Experimental results show that MENTOR significantly improves repair success rates, achieving 64.4%, far surpassing Verifix (6.3%) and Clara (34.6%). By merging formula-based fault localization and LLM-driven repair, MENTOR provides an innovative, scalable framework for programming education. [ABSTRACT FROM AUTHOR]
Copyright of Journal of Systems & Software is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*
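The abstract describes a CEGIS loop in which candidate repairs are validated against a test suite and refined iteratively. The Python sketch below illustrates only that general loop shape and is not MENTOR's implementation: the names `cegis_repair`, `make_stub_proposer`, and `first_failing_test` are hypothetical, the "programs" are plain Python functions rather than C submissions, and the proposer is a fixed candidate list standing in for an LLM guided by fault localization.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Test = Tuple[int, int]            # (input, expected output)
Program = Callable[[int], int]    # toy stand-in for a student program
Proposer = Callable[[List[Test]], Optional[Program]]


def first_failing_test(program: Program, tests: Sequence[Test]) -> Optional[Test]:
    """Return the first test the program fails, or None if it passes all."""
    for test in tests:
        arg, expected = test
        if program(arg) != expected:
            return test
    return None


def cegis_repair(propose: Proposer, tests: Sequence[Test],
                 max_iterations: int = 10) -> Optional[Program]:
    """Generic CEGIS loop: propose a candidate, check it, feed back a counterexample."""
    counterexamples: List[Test] = []
    for _ in range(max_iterations):
        candidate = propose(counterexamples)
        if candidate is None:          # proposer has nothing left to offer
            return None
        failing = first_failing_test(candidate, tests)
        if failing is None:            # candidate passes the whole test suite
            return candidate
        counterexamples.append(failing)
    return None


def make_stub_proposer(candidates: Sequence[Program]) -> Proposer:
    """Hypothetical stand-in for the LLM: returns the first candidate that is
    consistent with every counterexample collected so far."""
    def propose(counterexamples: List[Test]) -> Optional[Program]:
        for candidate in candidates:
            if all(candidate(arg) == expected for arg, expected in counterexamples):
                return candidate
        return None
    return propose


# Toy assignment: compute the absolute value of an integer.
tests = [(3, 3), (-2, 2), (0, 0)]
candidates = [
    lambda x: x,                      # buggy submission
    lambda x: -x,                     # wrong patch
    lambda x: x if x >= 0 else -x,    # correct patch
]
repaired = cegis_repair(make_stub_proposer(candidates), tests)
```

In this toy run, each rejected candidate contributes one failing test as a counterexample, which narrows what the proposer may return next. In a real system the proposer would be an LLM call and the counterexamples would be part of its prompt.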