
Michel Steuwer
I am a professor leading the chair of Compilers and Programming Languages at TU Berlin.
I am a professor at Technische Universität Berlin and lead the chair of Compilers and Programming Languages. I am a member of the Institute of Software Engineering and Theoretical Computer Science in the Faculty IV - Electrical Engineering and Computer Science at TU Berlin.
Before joining TU Berlin, I was a lecturer (assistant professor) in the School of Informatics at the University of Edinburgh, a lecturer at the School of Computing Science at the University of Glasgow and a postdoctoral researcher at the University of Edinburgh. I received my PhD from the University of Münster in Germany.
You can download my CV here.
Research
I am interested in programming languages and compilers.
I have a particular interest in compiler design, optimization techniques, and programming languages for parallel, heterogeneous, and GPU computing.
Highlights
-
Our ICFP 2020 paper was selected as an ACM SIGPLAN Research Highlight in September 2021 and was published as a Communications of the ACM Research Highlight in March 2023
-
Our POPL 2024 paper was selected as one of nine papers for the MIT Programming Languages Review 2024, which highlights papers believed to have significant potential to shape the future direction of PL research
-
Best Paper Award Winner at CGO 2018 and SLE 2022
-
6 HiPEAC Paper Awards for our ASPLOS 2018, 2023, 2024, POPL 2024 (2x), and PLDI 2024, 2025 papers
-
Most cited papers at ICFP 2015 and CGO 2017
-
Most downloaded research paper of the Proceedings of the ACM on Programming Languages (PACMPL) with over 15,000 downloads from the ACM Digital Library
Team
PhD Students
-
Serkan Muhcu
Topics: Algebraic Effects, Functional Programming
-
Nicole Heinimann
Topics: E-Graphs, Machine Learning
-
Rudi Schneider
Topics: E-Graphs, Program Analysis
-
Johannes Lenfers
Topics: Rewrite-based Optimizations, Autotuning
supervised with Sergei Gorlatch
-
Bastian Köpcke
Topics: Safe GPU Computing
supervised with Sergei Gorlatch
Former PhD Students
-
Dr. Martin Lücke
PhD Thesis: Precise Control of Compilers: A Practical Approach to Principled Optimization
Now: Research Scientist at Brium
-
Dr. Xueying Qin
PhD Thesis: Studies Concerning the Meaning of Computer Programs
Now: Postdoctoral Researcher at the University of Southern Denmark
-
Dr. Rongxiao Fu
PhD Thesis: Type Systems for Safe Strategic Rewriting
Now: Researcher at Huawei Research, China
-
Dr. Federico Pizzuti
PhD Thesis: Efficient Code Generation for Irregular Applications with Lift
main supervisor Christophe Dubach
Now: Senior Software Engineer at Huawei Research, Edinburgh, UK
-
Dr. Thomas Kœhler
PhD Thesis: Domain-Extensible Optimizing Compilers
Now: Researcher at CNRS, Strasbourg, France
-
Dr. Larisa Stoltzfus
PhD Thesis: Stencil-based HPC Applications in Lift
main supervisor Christophe Dubach
Now: HPC Benchmark Specialist at Eviden, UK
-
Dr. Toomas Remmelg
PhD Thesis: Automatic Performance Optimisation of Parallel Programs for GPUs via Rewrite Rules
main supervisor Christophe Dubach
Now: Senior Compiler Engineer at ARM, Norway
-
Dr. Bastian Hagedorn
PhD Thesis: High-Performance Domain-Specific Compilation without Domain-Specific Compilers
supervised with Sergei Gorlatch
Now: Senior Compiler Engineer at NVIDIA, Germany
-
Dr. Michael Haidl
PhD Thesis: PACXX: a Unified Programming Model for Programming Accelerators
main supervisor Sergei Gorlatch
Now: Senior Systems Software Manager at NVIDIA, Germany
-
Dr. Juan Jose Fumero
PhD Thesis: Accelerating Interpreted Programming Languages on GPUs with Just-In-Time Compilation and Runtime Optimizations
main supervisor Christophe Dubach
Now: Research Fellow at the University of Manchester, UK
Publications
Here you can find my journal and conference papers, workshop papers, technical reports, book chapters, and my PhD thesis.
You can also find my publications on my dblp profile and my Google Scholar profile.
Conference and Journal Papers
2025
-
Slotted E-Graphs - First-Class Support for (Bound) Variables in E-Graphs
PLDI
PLDI '25: 45th ACM SIGPLAN International Conference on Programming Language Design and Implementation, Seoul, South Korea, June 16-20, 2025
(accepted for publication)
-
xDSL: Sidekick compilation for SSA-Based Compilers
CGO
-
The MLIR Transform Dialect - Your compiler is more powerful than you think
CGO
2024
-
Descend: A Safe GPU Systems Programming Language
PLDI
-
A shared compilation stack for distributed-memory parallelism in stencil DSLs
ASPLOS
-
Guided Equality Saturation
POPL
-
Shoggoth - A Formal Foundation for Strategic Rewriting
POPL
-
Collection skeletons: Declarative abstractions for data collections
JoSS
2023
-
BaCO: A Fast and Portable Bayesian Compiler Optimization Framework
ASPLOS
-
Structural Subtyping as Parametric Polymorphism
OOPSLA
-
Achieving High Performance the Functional Way: Expressing High-Performance Optimizations as Rewrite Strategies
CACM
-
Primrose: Selecting Container Data Types by Their Properties
<Programming>
2022
-
Collection Skeletons: Declarative Abstractions for Data Collections
SLE
-
Investigating magic numbers: improving the inlining heuristic in the Glasgow Haskell Compiler
Haskell
-
Generating Work Efficient Scan Implementations for GPUs the Functional Way
Euro-Par
2021
-
Code Generation for Room Acoustics Simulations with Complex Boundary Conditions
IPDPS
35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021, Portland, OR, USA, May 17-21, 2021
-
Integrating a functional pattern-based IR into MLIR
CC
-
Towards a Domain-Extensible Compiler: Optimizing an Image Processing Pipeline on Mobile CPUs
CGO
-
Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)
TACO
2020
-
DelayRepay: delayed execution for kernel fusion in Python
DLS
-
Achieving high-performance the functional way: a functional pearl on expressing high-performance optimizations as rewrite strategies
ICFP
Proceedings of the 25th ACM SIGPLAN International Conference on Functional Programming, ICFP 2020, Virtual Event, USA, August 23-26, 2020
83 citations on Google Scholar, selected as one of only four ACM SIGPLAN Research Highlights from 2020, HiPEAC Paper Award, selected for publication as a Communications of the ACM Research Highlight.
-
Tiling Optimizations for Stencil Computations Using Rewrite Rules in Lift
TACO
-
Generating fast sparse matrix vector multiplication from a high level generic functional IR
CC
2018
-
Automatic Matching of Legacy Code to Heterogeneous APIs: An Idiomatic Approach
ASPLOS
-
High performance stencil code generation with Lift
CGO
2017
-
A Transformation-Based Approach to Developing High-Performance GPU Programs
PSI
-
Just-In-Time GPU Compilation for Interpreted Languages with Partial Evaluation
VEE
-
Lift: a functional data-parallel IR for high-performance GPU code generation
CGO
2016
-
Matrix multiplication beyond auto-tuning: rewrite-based GPU code generation
CASES
2015
-
Generating performance portable code using rewrite rules: from high-level functional expressions to high-performance OpenCL code
ICFP
-
Runtime Code Generation and Data Management for Heterogeneous Computing in Java
PPPJ
2014
-
High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library
PPL
-
gCUP: Rapid GPU-based HIV-1 Coreceptor Usage Prediction for Next-Generation Sequencing
Bioinformatics
-
SkelCL: A High-Level Extension of OpenCL for Multi-GPU Systems
JoS
-
Introducing and Implementing the Allpairs Skeleton for Programming Multi-GPU Systems
IJPP
-
Towards High-Level Programming for Systems with Many Cores
PSI
2013
-
dOpenCL: Towards uniform programming of distributed heterogeneous multi-/many-core systems
JPDC
-
High-Level Programming for Medical Imaging on Multi-GPU Systems Using the SkelCL Library
ICCS
-
SkelCL: Enhancing OpenCL for High-Level Programming of Multi-GPU Systems
PaCT
2012
-
A High-Level Programming Approach for Distributed Systems with Accelerators
SoMeT
Workshop Papers
2022
-
Systematically extending a high-level code generator with support for tensor cores
2021
-
Generating high performance code for irregular data structures using dependent types
2020
-
High-level hardware feature extraction for GPU performance prediction of stencils
-
A functional pattern-based language in MLIR
2019
-
Generating efficient FFT GPU code with Lift
-
Position-dependent arrays and their application for high performance code generation
-
Generating Fast FFT Code for GPU from High-Level Pattern-Based Abstractions
Proceedings of the International Symposium on High-Level Parallel Programming and Applications, HLPP 2019, Linköping, Sweden, July 3-5, 2019
-
High-level synthesis of functional patterns with Lift
-
Towards Mapping Lift to Deep Neural Network Accelerators
2018
-
Introducing Parallelism to the Ranges TS
2017
-
A Modular Approach to Performance, Portability and Productivity for 3D Wave Models
-
OpenCL JIT Compilation for Dynamic Programming Languages
-
Towards Composable GPU Programming: Programming GPUs with Eager Actions and Lazy Views
2016
-
Performance portable GPU code generation for matrix multiplication
-
Multi-stage programming for GPUs in C++ using PACXX
-
Compositional Compilation for Sparse, Irregular Data Parallelism
Proceedings of the Workshop on High-Level Programming for Heterogeneous and Hierarchical Parallel Systems, HLPGPGPU@HiPEAC 2016, Prague, Czech Republic, January 19, 2016
-
Towards Collaborative Performance Tuning of Algorithmic Skeletons
Proceedings of the Workshop on High-Level Programming for Heterogeneous and Hierarchical Parallel Systems, HLPGPGPU@HiPEAC 2016, Prague, Czech Republic, January 19, 2016
-
Autotuning OpenCL Workgroup Size for Stencil Patterns
Proceedings of the 2016 International Workshop on Adaptive Self-tuning Computing Systems, ADAPT@HiPEAC 2016, Prague, Czech Republic, January 18, 2016
36 citations on Google Scholar.
2014
-
A Composable Array Function Interface for Heterogeneous Computing in Java
-
Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems
Proceedings of the 1st International Workshop on High-Performance Stencil Computations, HiStencils@HiPEAC 2014, Vienna, Austria, January 22, 2014
21 citations on Google Scholar.
2012
-
Using the SkelCL Library for High-Level GPU Programming of 2D Applications
-
Uniform High-Level Programming of Many-Core and Multi-GPU Systems
-
Towards High-Level Programming of Multi-GPU Systems Using the SkelCL Library
-
dOpenCL: Towards a Uniform Programming Approach for Distributed Heterogeneous Multi-/Many-Core Systems
2011
-
SkelCL - A Portable Skeleton Library for High-Level GPU Programming
Technical Reports
2024
-
A shared compilation stack for distributed-memory parallelism in stencil DSLs
-
The MLIR Transform Dialect. Your compiler is more powerful than you think
2023
-
Sidekick compilation with xDSL
-
Descend: A Safe GPU Systems Programming Language
-
Traced Types for Safe Strategic Rewriting
-
Structural Subtyping as Parametric Polymorphism
2022
-
BaCO: A Fast and Portable Bayesian Compiler Optimization Framework
-
Primrose: Selecting Container Data Types by their Properties
-
RISE & Shine: Language-Oriented Compiler Design
2021
-
Sketch-Guided Equality Saturation: Scaling Equality Saturation to Complex Optimizations in Languages with Bindings
-
Row-Polymorphic Types for Strategic Rewriting
2018
-
P0836R0 Introduce Parallelism to the Ranges TS
C++ Standards Committee Papers.
2017
-
Strategy Preserving Compilation for Parallel Functional Code
2015
-
Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code)
-
Autotuning OpenCL Workgroup Size for Stencil Patterns
Book Chapters
2015
-
Verbesserung der Programmierbarkeit und Performance-Portabilität von Manycore-Prozessoren (Improving Programmability and Performance Portability on Many-Core Processors)
2014
-
Skeleton Programming for Portable Many-Core Computing
Programming Multi-core and Many-core Computing Systems
PhD Thesis
Talks
2025
-
Invited Talk: Slotted E-Graphs
04/2025, System Seminar at the University of Glasgow, Glasgow, UK.
2024
-
Invited Talk: Scaling Equality Saturation
10/2024, Shanghai Huawei Research Center, Shanghai, China.
-
Invited Talk: Descend: A Safe GPU Systems Programming Language
08/2024, Bayes Coffee House Tech Talk Series, Edinburgh, UK.
2023
-
Invited Talk: Guided Equality Saturation
10/2023, Seminar of the Chair of Compiler Construction at TU Dresden, Dresden, Germany.
-
Invited Talk: On bringing a functional pearl into practice: An MLIR-based implementation of the strategy language ELEVATE
03/2023, LAIV/DSG seminar at Heriot-Watt University, Edinburgh, UK.
-
Invited Talk: On bringing a functional pearl into practice: An MLIR-based implementation of the strategy language ELEVATE
01/2023, Programming Languages at Glasgow (PLUG) seminar at the University of Glasgow, Glasgow, UK.
2022
-
Invited Talk: Modern DSL Compiler Development with MLIR
11/2022, Huawei TRC Innovation Summit 2022, Tel Aviv, Israel.
-
Invited Talk: How to Design the Next 700 Optimizing Compilers
06/2022, High-efficiency computer graphics group at MIT CSAIL, Cambridge, MA, USA.
-
Talk: Achieving High-Performance the Functional Way: Expressing High-Performance Optimizations as Rewrite Strategies
06/2022, SIGPLAN Track at the SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Diego, CA, USA.
-
Invited Talk: RISE & Shine: Language-Oriented Compiler Design
06/2022, Compiler Design Lab Seminar at Saarland University, Saarland, Germany.
-
Talk: Systematically Extending a High-Level Code Generator with Support for Tensor Cores
04/2022, Workshop on General Purpose Processing using GPU (GPGPU), virtual.
2021
-
Talk: FHPNC Community Update
09/2021, Workshop on Functional High-Performance and Numerical Computing (FHPNC), virtual.
2020
-
Invited Talk: Achieving High-Performance the Functional Way - Expressing High-Performance Optimizations as Rewrite Strategies
12/2020, Programming Languages and Systems Research Group (PLAS) group seminar at the University of Kent, virtual.
-
Invited Talk: Compiler Intermediate Representations
08/2020, Scottish Programming Languages and Verification Summer School 2020 (SPLV 2020), virtual.
-
Talk: Achieving High-Performance the Functional Way - Expressing High-Performance Optimizations as Rewrite Strategies
17/2020, Scottish Programming Languages Seminar (SPLS), virtual.
2019
-
Invited Talk: ELEVATE: a language to write composable program optimizations
09/2019, Google DeepMind, London, UK.
-
Invited Talk: Lift: Generating High Performance Code with Rewrite Rules
02/2019, Programming Languages and Software Engineering Group seminar at the University of Washington, Seattle, WA, USA.
-
Invited Talk: Lift: Generating High Performance Code with Rewrite Rules
02/2019, Microsoft Research, Redmond, WA, USA.
2018
-
Talk: Implementing lambda calculus in Python and C++
12/2018, Programming Languages at Glasgow (PLUG) seminar at the University of Glasgow, Glasgow, UK.
-
Talk: High-level Features - Low-level Performance: GPU Performance Prediction of Stencils
11/2018, System Seminar at the University of Glasgow, Glasgow, UK.
-
Invited Talk: Generating Performance Portable Code with Lift
09/2018, Shonan Meeting No.134: Advances in Heterogeneous Computing from Hardware to Software, Shōnan, Japan.
-
Invited Talk: Lift: Code Generation by Rewriting Algorithmic Skeletons
03/2018, Dagstuhl Seminar 18111 on Loop Optimizations, Schloss Dagstuhl, Germany.
-
Invited Talk: Programming GPUs with Eager Actions and Lazy Views
03/2018, Compiler and Architecture Design Group Seminar at the University of Edinburgh, Edinburgh, UK.
-
Talk: The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules
02/2018, Formal Analysis, Theory and Algorithms Seminar at the University of Glasgow, Glasgow, UK.
2017
-
Talk: Programming GPUs with Eager Actions and Lazy Views
11/2017, System Seminar at the University of Glasgow, Glasgow, UK.
-
Talk: The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules
11/2017, System Seminar at the University of Glasgow, Glasgow, UK.
-
Invited Talk: The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules
10/2017, Microsoft Research, Cambridge, UK.
-
Talk: The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules
09/2017, University of Hull HPC Symposium 2017, Hull, UK.
-
Invited Talk: The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules
07/2017, University of Münster, Münster, Germany.
-
Talk: Programming GPUs with Eager Actions and Lazy Views
06/2017, Scottish Programming Languages Seminar (SPLS) at the University of the West of Scotland, Paisley, UK.
-
Talk: Programming GPUs with Eager Actions and Lazy Views
04/2017, C++ Edinburgh Meetup, Edinburgh, UK.
-
Talk: Lift: A Functional Data-Parallel IR for High-Performance GPU Code Generation
02/2017, International Symposium on Code Generation and Optimization (CGO) 2017, Austin, TX, USA.
-
Talk: Programming GPUs with Eager Actions and Lazy Views
02/2017, International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM) 2017, Austin, TX, USA.
2016
-
Invited Talk: The Lift Project: Performance Portable GPU Code Generation via Rewrite Rules
12/2016, Computer Laboratory Systems Research Group Seminar at the University of Cambridge, Cambridge, UK.
-
Invited Talk: Structured Parallel Programming - From High-Level Functional Expressions to High-Performance OpenCL Code
08/2016, Center for Advanced Electronics at TU Dresden, Dresden, Germany.
-
Invited Talk: Improving Programmability and Performance Portability on Many-Core Processors
05/2016, Colloquium of candidates nominated for the prize for best dissertation awarded by the German Informatics Society, Schloss Dagstuhl, Germany.
-
Invited Talk: The Lift Project: Performance Portability via Rewrite Rules
04/2016, Compiler Design Lab Seminar at Saarland University, Saarland, Germany.
-
Invited Talk: Performance Portable GPU Code Generation
01/2016, Multicore Programming Group seminar at Imperial College, London, UK.
2015
-
Talk: Functional Programming in C++
12/2015, Programming Language Interest Group at the University of Edinburgh, Edinburgh, UK.
-
Invited Talk: Generating Performance Portable Code using Rewrite Rules
10/2015, Multicore Programming Group seminar at Imperial College, London, UK.
-
Talk: Generating Performance Portable Code using Rewrite Rules: From High-Level Functional Expressions to High-Performance OpenCL Code
09/2015, International Conference on Functional Programming (ICFP) 2015, Vancouver, Canada.
-
Talk: Generating Performance Portable Code using Rewrite Rules
06/2015, Scottish Programming Languages Seminar (SPLS) at the University of St. Andrews, St. Andrews, UK.
2014
-
Invited Talk: SkelCL: High-Level Programming of Multi-GPU Systems
05/2014, Institute for Computational and Applied Mathematics at the University of Münster, Münster, Germany.
-
Invited Talk: SkelCL: High-Level Programming of Multi-GPU Systems
05/2014, Workshop on Fast Data Processing on GPUs, Dresden, Germany.
-
Talk: Extending the SkelCL Library for Stencil Computations on Multi-GPU Systems
01/2014, HiStencils 2014 workshop, Vienna, Austria.
2013
-
Invited Talk: SkelCL: High-Level Programming of Multi-GPU Systems
12/2013, Research group on elementary particle physics at the University of Wuppertal, Wuppertal, Germany.
-
Talk: Introducing and Implementing the Allpairs Skeleton for GPU Systems
07/2013, HLPP 2013 workshop, Paris, France.
-
Talk: High-Level Programming for Medical Imaging on Multi-GPU Systems using the SkelCL Library
06/2013, ICCS 2013 conference, Barcelona, Spain.
2012
-
Talk: Using the SkelCL Library for High-Level GPU Programming of 2D Applications
08/2012, ParaPhrase 2012 workshop, Rhodes, Greece.
-
Talk: High-Level Programming for Heterogeneous Systems with Accelerators
06/2012, PDESoft 2012 workshop, Münster, Germany.
-
Talk: Towards High-Level Programming of Multi-GPU Systems Using the SkelCL Library
05/2012, AsHES 2012 workshop, Shanghai, China.
-
Invited Talk: A Skeleton Library for Heterogeneous Multi-/Many-Core Systems
04/2012, NAIS workshop, Edinburgh, UK.
-
Talk: Towards a High-Level Approach for Programming Distributed Systems with GPUs
01/2012, COST Action IC0805 (“ComplexHPC”) meeting, Timisoara, Romania.
2011
-
Invited Talk: SkelCL - A High-Level Programming Library for GPU Programming
12/2011, Jülich Supercomputing Centre (JSC), Jülich, Germany.
-
Talk: SkelCL - A Portable Skeleton Library for High-Level GPU Programming
05/2011, HIPS 2011 workshop, Anchorage, AK, USA.
2008
-
Invited Talk: Development of an Online Game as a Student Project
09/2008, ITSoft-TEAM workshop, Chernihiv, Ukraine.
Community Activities
Organization Committees
- General Chair of PPoPP 2024
- Steering Committee Chair of CGO (since 2025)
- Steering Committee Member of CGO (since 2021) and PPoPP (since 2024)
- Local Organization Co-Chair of HiPEAC Computer Systems Week April 2019, Scottish Programming Languages Seminar March 2018 and October 2019, and UK Many-Core Developer Conference May 2016.
Program Committees
- Program Committee Co-Chair of CGO 2024
- Program Committee Member of OOPSLA 2026, PLDI 2025, GPCE 2025, 2024, 2020, 2019, SLE 2024, Haskell 2023, Euro-Par 2023, CGO 2022, 2020, 2019, CC 2020, ICPP 2020, FHPNC 2021, 2020, HLPP 2020, 2019, 2018, 2017, 2016, LCTES 2019, 2018, DHPC++ Workshop 2019, 2018, and IEEE ScalCom 2016.
Artifact Evaluation Committees
- Artifact Evaluation Co-Chair of CGO 2021, 2020, 2019, 2018, CC 2021, 2020, and LCTES 2019, 2018
- Artifact Evaluation Committee Member of ICFP 2017, CGO 2017, PACT 2016.
Other Reviewing Activities
- External reviewer for journals: Communications of the ACM, ACM TODS, ACM TACO, ACM Computing Surveys, Science of Computer Programming Journal (Elsevier), The Journal of Supercomputing (Springer), and Software: Practice and Experience (Wiley).
- External reviewer for conferences: MLSys, CC, CGO, Euro-Par, EuroMPI, CC-Grid, and ParCo.
- Reviewer for funding bodies: German Research Foundation (DFG), German Federal Ministry of Education and Research (BMBF), UK Engineering and Physical Sciences Research Council (EPSRC), Netherlands Organisation for Scientific Research (NWO), and Natural Sciences and Engineering Research Council of Canada (NSERC).