AI4SE: AI for Software Engineering
Learning & Reasoning -> Trustworthy Code!
We build neurosymbolic techniques that combine program analysis and machine learning to improve developers' productivity and software quality. Our AI models and agents automate code generation, bug detection, and program repair for robust and trustworthy software development.
Some of our recent efforts include:

Code Language Models
Post-training and fine-tuning code models for diverse SE tasks.
- EditLord: Learning Code Transformation Rules for Code Editing. (ICML'25)
- SemCoder: Training Code Language Models with Comprehensive Semantics. (NeurIPS'24)
- LEDEX: Training LLMs to Better Self-Debug and Explain Code. (NeurIPS'24)
- TRACED: Execution-aware Pre-training for Source Code. (ICSE'24)
- CYCLE: Learning to Self-Refine Code Generation. (SPLASH/OOPSLA'24)
- Towards Causal Deep Learning for Vulnerability Detection. (ICSE'24)

Coding Agents
Neurosymbolic agents that perform complex software engineering tasks.
- Test Generation Agent
- FaultLine: Automated Proof-of-Vulnerability Generation Using LLM Agents (preprint)
- UTFix: Change aware unit test repairing using LLM (OOPSLA'25)
- Code-Aware Prompting: A study of Coverage-guided Test Generation in Regression Setting using LLM (FSE'24)

Benchmarking
Creating benchmarks for diverse software engineering tasks.

Empirical Evaluation
Gaining deep insight into the behavior of models and agents at different stages of the SE cycle.
