Department of Computer Science

Projects

We are or were involved in the following officially funded projects.

The aim of this project is to make general-purpose text indices fit for data sizes in the multi-terabyte range. We want to make use of all aspects of parallel computing: local shared memory and global distributed computing. The local computations will also exploit techniques for succinct and external memory data structures. These techniques were previously only considered in isolation; this project is the first in the stringology community that integrates all of them and is thereby able to index really large texts.

This project is part of the DFG-SPP 1736 priority programme on Algorithms for BIG DATA and has been funded from 2014 to 2022.

Visit Project Website

We want to develop practical algorithms for compressing highly repetitive data that overcome the shortcomings of currently common compressors such as gzip or bzip2. These were established in the 1990s and targeted hardware that was standard in those days; their main disadvantage is that they do not capture repetitions of substrings that are far apart. The first goal is to design and engineer a compression tool that also benefits from such long-range repetitions, but still has only moderate memory requirements.

As a second goal, we want to exploit the shared-memory parallelism present in virtually any CPU in order to speed up compression without losing too much compression ratio. Here, we want to take a broader look at compression algorithms, and in particular include grammar compressors, which offer excellent opportunities for parallelization.

In the ideal case, both ideas for making better use of modern resources will be integrated into production-ready software repositories (like Linux distributions) so that end consumers can benefit easily from our algorithm engineering efforts.
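The effect of far-apart repetitions can be illustrated with a small experiment. The following is only a sketch and not part of the project's software: it builds a random block followed by an exact copy of itself and compresses it with Python's standard `zlib` module (DEFLATE, the format used by gzip, whose match window is limited to 32 KiB) and, for comparison, with the standard `lzma` module, which uses a much larger dictionary. The block size, the seed, and the function names are assumptions chosen for illustration.

```python
import lzma
import random
import zlib


def make_far_apart_repeat(block_size=100_000, seed=0):
    """Return a random (incompressible) block followed by an exact copy of itself.

    block_size and seed are illustrative assumptions; any block larger than
    DEFLATE's 32 KiB window shows the same effect.
    """
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_size))
    return block + block


def compressed_sizes(data):
    """Compress data with zlib (DEFLATE) and lzma and report the sizes in bytes."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),  # 32 KiB window: the copy is out of reach
        "lzma": len(lzma.compress(data)),     # larger dictionary: the copy is found
    }


sizes = compressed_sizes(make_far_apart_repeat())
print(sizes)
```

Since the second copy starts 100,000 bytes after the first, DEFLATE cannot reference it and the output stays close to the raw size, whereas the compressor with the larger dictionary needs roughly the space of a single block.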

This project has been funded by the DFG since 2022 (project No. 501086801).


The internet of things (IoT) has already started to generate huge amounts of data. Infrastructures, machines, vehicles, and everyday objects such as smartphones or TVs are equipped with intelligent functions that are linked to each other. These objects contain sensors, RFID chips, and cameras that continuously produce data and communicate within these cyber-physical systems (CPSs). A natural representation of a linked data set is provided by a graph, where entities are represented as vertices and their relationships are encoded by edges. Compared to the classical representation of objects as feature vectors, the graph structure additionally allows the representation of the complex relationships between these objects. Project A6 deals with the development of new methods for analysing large-scale graphs, or large numbers of graphs, in resource-constrained environments.
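The difference between the two representations can be sketched in a few lines. The device names, feature values, and links below are hypothetical and serve only as an illustration: the feature vectors describe each object in isolation, while the adjacency structure additionally records which objects are related.

```python
from collections import defaultdict

# Hypothetical devices with per-object feature vectors (illustrative values only).
features = {
    "phone": [0.9, 0.1],
    "tv": [0.2, 0.7],
    "camera": [0.4, 0.4],
}

# Hypothetical communication links between the devices (undirected edges).
links = [("phone", "tv"), ("phone", "camera")]


def build_adjacency(edges):
    """Build an undirected adjacency structure: vertex -> set of neighbours."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(links)

# A simple structural property that the feature vectors alone cannot express:
# the number of neighbours (degree) of each device.
degree = {device: len(adjacency[device]) for device in features}
print(degree)
```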

This project is part of the SFB 876: Providing Information by Resource-Constrained Data Analysis (project A6) and has been funded from 2010 to 2022.

Visit Project Website