Research
We pursue research across a range of topics often exploring surprising or interesting connections between domains. Longstanding research themes include program generation for performance, FPGA design and generators, signal processing, and theory. Recently there has been also a focus on machine learning and program analysis. Most projects have a salient mathematical aspect.
Current research
Program generation for performance. For functionality of mathematical nature we aim to automatically generate highest performance code from a high level mathematical description, an area we started with Spiral. The project combines techniques from mathematics, programming languages, symbolic computation, and compilers. Recent examples:
- Spiral in Scala & Generic Programming in Space and Time (Ofenbeck et al.)
- Program generation for small-scale linear algebra (Spampinato et al.)
Fast abstract domains (with Martin Vechev). We have developed a new methodolgy to speed up certain types of program analysis by orders of magnitude:
Signal processing on new index domains. We have developed a new theory of discrete signal processing that makes it possible to generalize it to signals (or data) with various index domains including power sets, meet/join lattices, and hypergraphs:
- Discrete Signal Processing with Set Functions, Lattices, Hypergraphs
- Algebraic Signal Processing Theory
Machine learning on new index domains. We devise methods for learning functions indexed by digraphs, powersets, lattices and posets. Furthermore, we port deep learning architectures to these index domains. In order to do so, we draw inspiration from our new signal processing frameworks:
SIMD Instructions in managed runtimes. Using generative programming techniques, SIMD intrinsics can be made available in managed languages like Java:
Fast quantized arithmetic. Low-precision quantized arithmetic can be used in various iterative algorithms from optimization and machine learning while still guaranteeing convergence. We provide an efficient library including for 4-bit arithmetic that offers speedups:
Generators and design of streaming IP cores. We study the design and computer generation of streaming IP cores on FPGAs:
- Memory-Efficient Fast Fourier Transform on Streaming Data by Fusing Permutations (Serre et al.)
- Optimal Circuits for Streamed Linear Permutations using RAM (Serre et al.)
- Sorting network generator
Heterogeneous tasks on Homogeneous cores. We implement a heterogeneous algorithm for optimization of GMLs on a single machine with many homogeneous cores:
Compilers for sound floating-point. We implement a compiler to automatically transform numerical computations using floating-point to equivalent sound computations using interval arithmetic: