Research

Ongoing projects

Since 2025

Design Pattern Identification and Summarisation

A design pattern summarisation system that combines feature-based analysis with LLMs: it parses Java projects with JavaParser, produces JSON knowledge graphs, and turns them into readable English narratives.

Captures both the structural context and usage intent for every detected pattern.
Exports enriched JSON artifacts, enabling downstream reasoning pipelines.
Automates summary text generation so reviewers can skim complex codebases quickly.

AI4SE Empirical SE MSR

GitHub GitHub

Completed projects

2022-2024

Feature-based Design Pattern Detection

A machine-learning model that detects Gang-of-Four patterns inside Java projects, combining structural fingerprints with semantic cues so teams can inventory reuse opportunities.

Ships both the original Python 2 pipeline and a Python 3 refactor for modern toolchains.
Provides a reproducible corpus for benchmarking new detection heuristics.
Feeds summaries and diagrams into the Design Pattern Summariser pipeline.

Empirical SE MSR

Python 2 Edition Python 3 Edition

2023-2024

CodeLabeller

A web-based annotation environment where researchers and practitioners label Java design pattern instances and summaries to bootstrap supervised learning datasets.

Streamlines the end-to-end labelling workflow for machine-learning-ready corpora.
Supports collaborative review cycles so multiple experts can converge on gold data.
Exports datasets compatible with the design pattern detection and summarisation pipelines.

Empirical SE MSR

GitHub

2016-2017

Source Code Fragment Summarisation (CFS)

The first effort to blend supervised learning with crowdsourcing for summarising source code fragments harvested from Eclipse and NetBeans FAQs.

Builds a 127-fragment corpus plus feature lists curated by nine expert volunteers.
Applies SVM and Naive Bayes classifiers trained on crowdsourced features.
Makes both the corpus and feature sets openly available for replication.

MSR Empirical SE

Corpus Features Classifiers

2017

PRST: PageRank-based Bug Report Summaries

A PageRank-inspired approach for summarising duplicate-heavy bug reports drawn from Eclipse, Mozilla, KDE, and GNOME tracker conversations.

Constructs the modified BRC corpus (28 reports) and OSCAR corpus (59 reports).
Engages human annotators to label sentence-level extracts for evaluation.
Releases both the raw bug corpus and the manually annotated summaries.

Empirical SE MSR

Bug Corpus Annotated Corpus

Najam Nazar

Research areas

Empirical Software Engineering

Mining Software Repositories

AI4SE

SE4AI
