Open Theses
Bachelor Thesis, Master Thesis
Queryella is a project to extend the current state of data flow analysis for apps. To combine different types of analytics, we are building a platform that allows arbitrary compositions of such analytics. The platform will support the development of static, dynamic and hybrid analytics that evaluate apps' handling of sensitive data against current security and privacy criteria. The goal is to cover not only Android and iOS apps but, in the future, any application or web interface.
In the context of this project, we are always in search of students who have a strong interest in data flow analysis. Potential topics may include the identification of sources and sinks for sensitive data, the development of pre-analyses to reduce the runtime of the actual data flow analysis, the development of evaluation criteria for security or privacy issues, or the visualization of analysis results.
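To illustrate what "sources" and "sinks" mean here, the following is a minimal sketch of a taint check over a straight-line program. It is not part of Queryella; the function names in `SOURCES` and `SINKS`, the tuple-based program representation, and the `find_leaks` helper are all assumptions made for illustration.

```python
# Hypothetical names: real source and sink lists depend on the platform API.
SOURCES = {"get_device_id", "get_location"}
SINKS = {"send_http", "write_log"}


def find_leaks(program):
    """Report sink calls that receive data derived from a source.

    `program` is a list of (target, callee, args) tuples in execution order,
    where `target` is the assigned variable name or None.
    """
    tainted = set()
    leaks = []
    for index, (target, callee, args) in enumerate(program):
        uses_taint = any(arg in tainted for arg in args)
        if callee in SINKS and uses_taint:
            leaks.append((index, callee))
        if target is not None:
            if callee in SOURCES or uses_taint:
                tainted.add(target)
            else:
                # The variable is overwritten with an untainted value.
                tainted.discard(target)
    return leaks


program = [
    ("id", "get_device_id", []),
    ("msg", "concat", ["prefix", "id"]),
    (None, "send_http", ["msg"]),
    (None, "write_log", ["prefix"]),
]
leaks = find_leaks(program)
```

In this toy program, only the `send_http` call is reported, because `msg` is derived from the device identifier while `prefix` is not. A real analysis additionally has to handle branches, loops, aliasing and calls across components.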
If you are looking for a topic that will have an impact on the future handling of sensitive data, do not hesitate to contact Leonid Glanz.
Examiner: Prof. Dr.-Ing. Mira Mezini
Supervisor: Dr.-Ing. Leonid Glanz
Master Thesis
Current code models use text-based signals (such as next token prediction or masked language modelling) in an unsupervised setting to pretrain the models, followed by fine-tuning on a labelled dataset. While this has produced good performance, it has its limitations. CodeRL, on the other hand, uses reinforcement learning with feedback from unit tests to train an agent to generate code given a prompt. However, unit tests don’t capture all the information about code and are not always available.
The thesis will focus on developing an environment that gives feedback through program analysis and within which an agent can learn to perform different tasks. Prior experience with program analysis is required to work on this thesis.
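As a rough idea of what feedback from program analysis could look like, the sketch below computes a reward from a very simple static check using Python's standard `ast` module. The reward values, the `analysis_reward` function and the `AnalysisEnv` class are assumptions for illustration only; the thesis would define its own environment and analyses.

```python
import ast
import builtins


def analysis_reward(source: str) -> float:
    """Hypothetical reward: -1.0 for code that does not parse, otherwise
    1.0 minus 0.25 for each name that is read but never bound anywhere."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return -1.0

    bound = set(dir(builtins))
    loaded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:  # Store or Del context
                bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            bound.add((node.asname or node.name).split(".")[0])

    unbound = loaded - bound
    return max(0.0, 1.0 - 0.25 * len(unbound))


class AnalysisEnv:
    """Single-step environment: the agent's action is a complete program."""

    def __init__(self, prompt: str):
        self.prompt = prompt

    def step(self, generated_code: str):
        # Returns (reward, done); every episode ends after one action.
        return analysis_reward(generated_code), True


env = AnalysisEnv("Write a function that increments its argument.")
good_reward, _ = env.step("def f(x):\n    return x + 1\n")
bad_reward, _ = env.step("def f(x):\n    return x + y\n")
```

The check is flow-insensitive and ignores scoping, so it is only a stand-in for the richer analyses (for example data flow or type information) that such an environment could use. Unlike unit tests, it needs no test suite to produce a signal.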
Supervisor: Abhinav Ananad, M.Sc.
Bachelor Thesis, Master Thesis
State-of-the-art models for code generation and understanding, such as CodeBERT and CodeT5, are essentially LLMs designed for natural language (BERT, T5) but trained on a large corpus of code. However, LLMs are known to hallucinate, i.e., they often produce sentences and information that are factually incorrect. One way to prevent, or at least reduce the impact of, hallucination in LLMs is to augment them with information retrieval.
It is currently not known whether these models also hallucinate when used for code intelligence tasks. The thesis will focus on developing methods to understand hallucinations in these models. It will further explore the possibilities of retrieval-based augmentation to reduce the effect of hallucinations.
Supervisor: Abhinav Ananad, M.Sc.
Bachelor Thesis, Master Thesis
Opportunistic and disruption-tolerant networks enable resilient communication in local communities by epidemically distributing messages. This helps to resist censorship and to operate in scenarios where global network communication is impossible or disrupted, e.g., during a disaster or in rural areas. Because epidemic data dissemination increases resource usage on all devices, it is very important to find strategies that reduce bandwidth and storage usage or increase resilience against local adversarial network nodes.
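The following minimal sketch shows the basic idea of epidemic dissemination: when two nodes meet, they compare which message IDs they hold and copy whatever the other side is missing. The `Node` class and `contact` function are hypothetical and ignore everything a real system needs, such as buffer limits, message lifetimes and authentication.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Maps message ID to payload; unbounded here, which real devices cannot afford.
    buffer: dict = field(default_factory=dict)

    def summary(self) -> set:
        """IDs of all messages this node currently carries."""
        return set(self.buffer)


def contact(a: Node, b: Node) -> int:
    """Exchange missing messages in both directions; return the number of copies."""
    missing_at_b = a.summary() - b.summary()
    missing_at_a = b.summary() - a.summary()
    for message_id in missing_at_b:
        b.buffer[message_id] = a.buffer[message_id]
    for message_id in missing_at_a:
        a.buffer[message_id] = b.buffer[message_id]
    return len(missing_at_b) + len(missing_at_a)


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.buffer["m1"] = b"meet at the shelter"

copies = contact(alice, bob)   # bob receives m1
copies += contact(bob, carol)  # carol receives m1 without ever meeting alice
```

After two contacts, all three nodes store a copy of the same message. This is what makes the scheme robust and also why bandwidth and storage usage grow with every encounter.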
Supervisor: Dr. rer. nat. Lars Baumgärtner
Bachelor Thesis, Master Thesis
Unmanned ground vehicles (UGVs) are becoming increasingly important, especially in areas such as disaster recovery. While these robots are highly dependent on communication links, such links are often disrupted during disasters. Therefore, it is vital to cope with fluctuating connectivity on the protocol level and to provide ways to receive telemetry data and send commands even without direct connections. Intermediate hops can be used to route data, but where even this is not possible, store, carry and forward architectures provide an alternative for such use cases.
Supervisor: Dr. rer. nat. Lars Baumgärtner
Bachelor Thesis, Master Thesis
Unmanned aerial vehicles (UAVs) have multiple uses in daily life. In disaster response, they are commonly used for tasks such as taking aerial pictures, collecting sensor readings or providing means of communication. Standard protocols for UAV control such as MAVLink are designed for direct real-time links. This limits the operating range of such a system, especially if the communication infrastructure is damaged, e.g., during a disaster. Thus, alternative extensions built on disruption-tolerant networking, which is also used for deep space communication, should be explored to increase usability in challenging network conditions.
Supervisor: Dr. rer. nat. Lars Baumgärtner
Bachelor Thesis
LoRa, as a free, long-range communication technology, enables many new use cases. Besides its main use in building IoT sensor networks, it can also be used to build alternative communication infrastructure, e.g., for disaster response. As LoRa has very limited bandwidth and is subject to duty-cycle restrictions, existing communication patterns and protocols from the internet cannot be applied directly. Different approaches can be explored to maximize the usefulness of such a communication system, e.g., by combining mesh networks with store, carry and forward approaches.
Supervisor: Dr. rer. nat. Lars Baumgärtner
Master Thesis
In probabilistic programming, dedicated programming languages provide the means to describe the structure of a stochastic process and then to use machine learning to learn the parameters of that process from data. Usually, these languages use approximate methods to ensure tractability. However, exact tractable models can also be used for this task.
To this end, the set of legal programs needs to be restricted, preferably at compile time. This restriction can be seen as a type inference problem, where types represent the permitted operations and algorithms of an expression.
The topic of this thesis is to find a way to express these restrictions as a type inference problem, for example as a Hindley-Milner type system.
Examiner: Prof. Dr.-Ing. Mira Mezini
Supervisor: David Richter, M.Sc.
Bachelor Thesis, Master Thesis
OPAL is a comprehensive library for static analyses that is developed in Scala to facilitate the writing of a wide range of different kinds of analyses. OPAL supports the development of analyses ranging from bug/bug pattern detection up to full-scale data-flow analyses.
In the context of this project, we are always looking for students who are interested in static analysis and want to implement analyses in Scala. Topics of interest include, e.g., developing required base analyses such as call graph algorithms, analyses that find security issues, and software visualization.
If you are interested in OPAL, do not hesitate to contact Dominik Helm. For further information, see The OPAL Project.
Examiner: Prof. Dr.-Ing. Mira Mezini
Supervisors: Dr.-Ing. Dominik Helm, Tobias Roth, M.Sc.
Bachelor Thesis, Master Thesis
Software-based systems already play a major role in industrial production, and this role will only grow in the context of Industrie 4.0. To solve the new challenges that arise in this context, existing software has to be adapted in different directions, e.g., to enable the addition of new sensors or the creation of a digital twin. For this purpose, we want to uncover Software Product Line features and models that are already implicitly present and make them explicit. This requires the development of corresponding analyses and automatic refactorings.
Candidates would work on different topics that enable software reuse of industrial controllers written in C.
These topics include (but are not limited to):
• automatic identification and localization of features
• automatic code slicing of identified features
• adaptation of analyses to the presence of C preprocessor macros
• automatic module extraction
Examiner: Prof. Dr.-Ing. Mira Mezini
Supervisor: Patrick Müller, M.Sc.
Bachelor Thesis, Master Thesis
Today, many applications use cryptographic components to provide security. It is therefore essential that developers know how to use these components correctly and securely. Recent studies have shown that developers struggle with this. As a result, applications that are intended to be trustworthy become insecure.
Within our research project “Secure Integration of Cryptographic Software” of the SFB CROSSING, we want to support developers when they integrate cryptographic components into an application. To achieve this aim, we have developed an Eclipse plugin that can generate secure cryptographic code and a static analysis that identifies insecure usages. Currently, all rules checked by the analysis are written by hand. One of our next steps is to determine how rules for correct and secure usages can be generated automatically.
Examiner: Prof. Dr.-Ing. Mira Mezini
Supervisor: Anna-Katharina Wickert, M.Sc.