Research

The following is an overview of all my publicly released research reports. Clicking a panel will take you to the full document.

Indexing Conjunctive Path Queries for Accelerated Query Evaluation

This is my Master’s Thesis for the Computer Science and Engineering program at the Eindhoven University of Technology, written abroad at Osaka University. The goal of my thesis is to explore how we can index conjunctive path queries (CPQ) in a graph database such that we can accelerate query evaluation. Within this thesis I successfully developed new theory and algorithms for index construction, graph core computation, and tree decompositions. The final result is a promising index that can be used to accelerate queries in any domain that makes use of graph databases.

The codebase for the CPQ-native Index is available as open source at RoanH/CPQ-native-index and RoanH/gMark.

CPQ Keys: Conjunctive Path Query Key Generation

The goal of this research project is to investigate and compare algorithms that can be used to compute a canonical form of a conjunctive path query (CPQ) for use as a key in a language-aware index for a graph database. Such a key has to be a unique representation for the entire isomorphic equivalence class of a CPQ. This project was carried out as an internship at the University of Osaka.

The codebase used to evaluate various algorithms is available as open source at RoanH/CPQKeys and notes for the project can be found at cpqkeys.roanh.dev.

ConvexMerger: Algorithmic Optimisations & Challenges

ConvexMerger is an area maximisation game based on the idea of merging convex shapes. The goal of the game is to claim as large an area as possible while competing against your opponents that try to do the same. The game was originally created for a course on geometric algorithms and extended during a capita selecta, this research report describes the optimisations, data structures and algorithms implemented during the capita selecta project.

The codebase for the game is available as open source at RoanH/ConvexMerger.

NotPetya Malware Analysis

In this project we performed a static analysis of the NotPetya malware using Ghidra. The NotPetya malware gets its name from being almost identical to the GoldenEye variant of the Petya ransomware, with some tweaks to make it unique. During the project we performed a static analysis where we tried to analyze as many components of the malware binary as we could within the given time frame. The files used in and resulting from the analysis, along with a much more detailed log file keeping track of our findings and actions can be found in a GitHub repository at RoanH/NotPetya.

During the project we were able to reverse engineering the majority of the NotPetya malware and discover most of its characteristic features. Though some of the extra embedded binaries could not be fully explored due to a lack of time.

Graph Database & Query Evaluation Terminology

This report is intended as a short introduction to the terminology and concepts used in databases and query evaluation. It is primarily intended as a quick reference when working on database internals and was originally written for use with RoanH/gMark.

Conjunctive Path Query Generation for Benchmarking

Conjunctive path queries (CPQ) are one of the most frequently used queries for complex graph analysis. However tailored support for them in various domains is lacking. One such domain is benchmark generation. The goal of this project is to propose a scalable method for generating CPQs for benchmarking purposes, while also allowing the user to control the selectivity of the generated queries. To accomplish this we will build upon the query generation framework presented in gbagan/gmark, an open-source query workload generator that makes use of regular path queries (RPQ) instead of CPQs.

An open-source implementation of the algorithm presented is provided as part of a rewrite of the gMark software at RoanH/gMark. Documentation (javadoc) can be found at: gmark.docs.roanh.dev and more details on the original version of gMark can be found in the technical report arxiv.org/abs/1511.08386.