Theses

Open Theses

  • Bachelor Thesis, Master Thesis

    Queryella is a project to extend the current state of data flow analysis for apps. To be able to combine the different types of analytics, we are building a platform that allows arbitrary compositions of such analytics. The platform will support the development of static, dynamic and hybrid analytics to evaluate apps' handling of sensitive data against current security and privacy criteria. In the future, the platform should be able to evaluate not only Android and iOS apps but any application or web interface.

    In the context of this project, we are always in search of students who have a strong interest in data flow analysis. Potential topics may include the identification of sources and sinks for sensitive data, the development of pre-analyses to reduce the runtime of the actual data flow analysis, the development of evaluation criteria for security or privacy issues, or the visualization of analysis results.
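
    To give a rough idea of what sources and sinks mean in a data flow analysis, the following is a minimal sketch and not part of Queryella. It tracks taint through straight-line code only; the statement format and all source, sink and function names are hypothetical and chosen purely for illustration.

```python
# Hypothetical source and sink names, chosen only for illustration.
SOURCES = {"get_device_id", "get_location"}
SINKS = {"send_sms", "http_post"}


def find_leaks(program):
    """Flow-sensitive taint tracking over straight-line code.

    Each statement is (target, function, args), read as
    `target = function(*args)`; target is None if the result is discarded.
    Returns (statement index, sink name, tainted argument) triples.
    """
    tainted = set()
    leaks = []
    for index, (target, function, args) in enumerate(program):
        tainted_args = [a for a in args if a in tainted]
        if function in SINKS:
            leaks.extend((index, function, a) for a in tainted_args)
        if target is not None:
            if function in SOURCES or tainted_args:
                tainted.add(target)      # result carries sensitive data
            else:
                tainted.discard(target)  # overwritten with clean data
    return leaks


program = [
    ("imei", "get_device_id", []),          # source: imei becomes tainted
    ("msg", "concat", ["prefix", "imei"]),  # taint propagates to msg
    ("imei", "constant", []),               # imei is overwritten, now clean
    (None, "send_sms", ["number", "msg"]),  # sink receives tainted msg
    (None, "http_post", ["url", "imei"]),   # imei is clean here, no leak
]
leaks = find_leaks(program)
print(leaks)  # [(3, 'send_sms', 'msg')]
```

    A real analysis additionally has to handle branches, loops, calls, aliasing and framework callbacks, which is where most of the runtime cost mentioned above comes from.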

    If you are looking for a topic that will have an impact on the future handling of sensitive data, do not hesitate to contact the supervisor listed below.

    Examiner: Prof. Dr.-Ing. Mira Mezini

    Supervisor: Dr.-Ing. Leonid Glanz

  • Master Thesis

    Current code models use text-based signals (such as next token prediction or masked language modelling) in an unsupervised setting to pretrain the models, followed by fine-tuning on a labelled dataset. While this has produced good performance, it has its limitations. CodeRL, on the other hand, uses reinforcement learning with feedback from unit tests to train an agent to generate code given a prompt. However, unit tests don’t capture all the information about code and are not always available.

    The thesis will focus on developing an environment that gives feedback through program analysis and within which an agent can learn to perform different tasks. Prior experience with program analysis is required to work on this thesis.
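
    As a rough illustration of feedback that comes from program analysis rather than unit tests, the following minimal sketch scores generated Python code with the standard `ast` module. The function name `analysis_reward` and the reward values are assumptions made for this example; the check is flow-insensitive and ignores scoping, so it only approximates "names that are read but never bound".

```python
import ast
import builtins


def analysis_reward(source: str) -> float:
    """Hypothetical reward: -1.0 for unparsable code, otherwise 1.0 minus
    0.25 per name that is read but never bound anywhere (floored at 0.0)."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return -1.0
    bound = set(dir(builtins))
    loaded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            elif isinstance(node.ctx, ast.Store):
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            # `import a.b` binds `a`; `import a.b as c` binds `c`.
            bound.add((node.asname or node.name).split(".")[0])
    unbound = loaded - bound
    return max(0.0, 1.0 - 0.25 * len(unbound))


print(analysis_reward("def f(x):\n    return x + 1\n"))  # 1.0
print(analysis_reward("def f(x):\n    return x + y\n"))  # 0.75
print(analysis_reward("def f(:"))                        # -1.0
```

    Unlike a unit test, this signal is available for any generated snippet, but it says nothing about functional correctness; an environment would likely combine several such analyses.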

    Supervisor: Abhinav Ananad, M.Sc.

  • Bachelor Thesis, Master Thesis

    State-of-the-art models for code generation and understanding, such as CodeBERT and CodeT5, are essentially LLMs designed for natural language (BERT, T5) but trained on a large corpus of code. However, LLMs are known to hallucinate, i.e., they often produce sentences and information that are factually incorrect. One way to prevent, or at least reduce the impact of, hallucination in LLMs is to augment them with information retrieval.

    It is currently not known whether these models also hallucinate when used for code intelligence tasks. The thesis would focus on developing methods to understand hallucinations in these models. It would further explore retrieval-based augmentation as a way to reduce the effect of hallucinations.

    Supervisor: Abhinav Ananad, M.Sc.

  • Master Thesis

    In Probabilistic Programming, dedicated programming languages provide the means to describe the structure of a stochastic process and to then use machine learning to learn the parameters of that process from data. Usually, these languages use approximate methods to ensure tractability. However, exact tractable models can also be used for this task.

    To this end, the set of legal programs needs to be restricted, preferably at compile time. This restriction can be seen as a type inference problem, where types represent the permitted operations and algorithms of an expression.

    The topic of this thesis is to find a way to express these restrictions as a type inference problem, for example as a Hindley-Milner type system.

    Examiner: Prof. Dr.-Ing. Mira Mezini

    Supervisor: David Richter, M.Sc.

  • Bachelor Thesis, Master Thesis

    OPAL is a comprehensive library for static analyses that is developed in Scala to facilitate the writing of a wide range of different kinds of analyses. OPAL supports the development of analyses ranging from bug/bug pattern detection up to full-scale data-flow analyses.

    In the context of this project, we are always searching for students who are interested in static analysis and want to implement analyses in Scala. Topics of interest include developing required base analyses such as call graph algorithms, analyses that find security issues, and software visualizations.

    If you are interested in OPAL, do not hesitate to contact Dominik Helm. For further information, you can also go to The OPAL Project.

    Examiner: Prof. Dr.-Ing. Mira Mezini

    Supervisors: Dr.-Ing. Dominik Helm, Tobias Roth, M.Sc.

  • Bachelor Thesis, Master Thesis

    Software-based systems already play a major role in industrial production, and this role will only grow in the context of Industrie 4.0. To solve the new challenges that arise in this context, existing software has to be adapted in different directions, e.g. to enable the addition of new sensors or the creation of a digital twin. For this purpose, we want to uncover Software Product Line features and models that are already present implicitly and make them explicit. This requires the development of corresponding analyses and automatic refactorings.

    Candidates would work on different topics that enable software reuse of industrial controllers written in C.

    These topics include (but are not limited to):

    • automatic identification and localization of features

    • automatic code slicing of identified features

    • adaptation of analyses to the presence of C preprocessor macros

    • automatic module extraction
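
    As a rough illustration of the first and third topics above, the following minimal sketch locates code regions guarded by `#ifdef`/`#ifndef` in C source text. Treating each guarding macro as a feature is an assumption made for this example, the function name `locate_features` and the macro names are hypothetical, and general `#if` expressions, `#else` and `#elif` are deliberately not analyzed.

```python
import re

# Matches the conditional-compilation directives this sketch understands.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|endif)\b\s*(\w+)?")


def locate_features(source: str):
    """Map each macro used in #ifdef/#ifndef to the (start, end) line ranges
    (1-based, inclusive) of the blocks it guards."""
    regions = {}
    stack = []  # entries are (macro or None, start line)
    for number, line in enumerate(source.splitlines(), start=1):
        match = DIRECTIVE.match(line)
        if not match:
            continue
        kind, macro = match.groups()
        if kind in ("ifdef", "ifndef"):
            stack.append((macro, number))
        elif kind == "if":
            stack.append((None, number))  # keeps nesting balanced only
        elif kind == "endif" and stack:
            opened_macro, start = stack.pop()
            if opened_macro is not None:
                regions.setdefault(opened_macro, []).append((start, number))
    return regions


source = "\n".join([
    "#ifdef HAS_TEMP_SENSOR",
    "int read_temp(void);",
    "#ifdef DEBUG",
    "void dump(void);",
    "#endif",
    "#endif",
    "int main(void) { return 0; }",
])
regions = locate_features(source)
print(regions)  # {'DEBUG': [(3, 5)], 'HAS_TEMP_SENSOR': [(1, 6)]}
```

    Features in real controller code are rarely marked this cleanly, which is why dedicated analyses for identifying and slicing them are needed.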

    Examiner: Prof. Dr.-Ing. Mira Mezini

    Supervisor: Patrick Müller, M.Sc.

  • Bachelor Thesis, Master Thesis

    Today, many applications rely on cryptographic components for security. This only works if developers know how to use these components correctly and securely. Recent studies have shown that developers struggle with this. As a result, applications that are intended to be trustworthy become insecure.

    Within our research project “Secure Integration of Cryptographic Software” of the SFB CROSSING, we want to support developers when they integrate cryptographic components into an application. To achieve this aim, we have developed an Eclipse plugin that can generate secure cryptographic code and a static analysis that identifies insecure usages. So far, all rules checked by the analysis have been written by hand. One of our next steps is to determine how rules for correct and secure usages can be generated automatically.
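
    To illustrate what a hand-written usage rule can look like, the following minimal sketch flags insecure transformation strings passed to Java's `Cipher.getInstance`. It is not the project's analysis or rule format: it matches source text with regular expressions instead of analyzing the program, and the small deny-list (DES, DESede, RC4, ECB mode) and the name `find_misuses` are assumptions made for this example.

```python
import re

# Hand-written, illustrative rule: reject legacy ciphers and ECB mode.
INSECURE = re.compile(r"^(DES|DESede|RC4)\b|/ECB/", re.IGNORECASE)
# Finds string literals passed directly to Cipher.getInstance(...).
CALL = re.compile(r'Cipher\.getInstance\(\s*"([^"]+)"')


def find_misuses(java_source: str):
    """Return (line number, transformation) pairs that violate the rule."""
    findings = []
    for number, line in enumerate(java_source.splitlines(), start=1):
        for transformation in CALL.findall(line):
            if INSECURE.search(transformation):
                findings.append((number, transformation))
    return findings


java_source = "\n".join([
    'Cipher a = Cipher.getInstance("AES/GCM/NoPadding");',
    'Cipher b = Cipher.getInstance("AES/ECB/PKCS5Padding");',
    'Cipher c = Cipher.getInstance("DES/CBC/PKCS5Padding");',
])
findings = find_misuses(java_source)
print(findings)  # [(2, 'AES/ECB/PKCS5Padding'), (3, 'DES/CBC/PKCS5Padding')]
```

    Every entry in such a rule has to be maintained by an expert as libraries and recommendations change, which motivates generating rules automatically.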

    Examiner: Prof. Dr.-Ing. Mira Mezini

    Supervisor: Anna-Katharina Wickert, M.Sc.