Anomaly Detection based on Simplified Syntax Trees

Seminar Thesis

Past work has proposed various approaches to detecting potential errors in source code using anomaly detection. The idea is that errors often show up as unlikely or atypical patterns in the code, which allows machine learning models to find them.

The problem with the existing approaches is that their implementations are unavailable, so researchers can neither reproduce or replicate the studies nor compare new approaches against the existing ones.

The goal of this work is to reimplement such approaches on the open Simplified Syntax Tree (SST) format and to publish the implementation as part of an open-source library. This will form the foundation of a publicly available reference implementation of Recommender Systems for Software Engineering (RSSEs).