George Papamakarios

I'm a research scientist at DeepMind London. Previously, I did a PhD in Data Science at the University of Edinburgh and an MSc in Advanced Computing at Imperial College London.

I'm interested in probabilistic machine learning, Bayesian inference, deep learning, generative modelling, and reinforcement learning.


Education

PhD in Data Science, University of Edinburgh.
Thesis: Neural density estimation and likelihood-free inference. Supervised by Iain Murray.

MSc by Research in Data Science, University of Edinburgh.
Grade 92%, Distinction. Won the MSc by Research in Data Science class prize.

MSc in Advanced Computing, Imperial College London.
Grade 90%, Distinction. Won the Corporate Partnership Programme award for academic excellence and the Winton Capital applied computing MSc project prize.

MEng in Electrical and Computer Engineering, Aristotle University of Thessaloniki.
Grade 89.6%, Distinction.

Previous work

Research intern, DeepMind London.
I worked with Theo Weber on reinforcement learning for partially observed environments.

Research intern, Microsoft Research Cambridge.
I worked with John Winn on Bayesian inference for computer-vision models using Infer.NET.

Teaching assistant, University of Edinburgh.
I tutored (and sometimes marked) the following courses: Machine Learning & Pattern Recognition; Introductory Applied Machine Learning; Probabilistic Modelling & Reasoning; Informatics 2B - Algorithms, Data Structures & Learning; Introduction to Theoretical Computer Science.

Research assistant, Information Technologies Institute, Centre for Research & Technology Hellas.
I participated in the EU-funded project Adapt4EE and the Greek-funded project EnNoisis. Most of my work focused on automatic activity recognition in smart homes with ambient sensors and Kinect cameras, which involved quite a lot of machine learning and computer vision.

Research assistant, Aristotle University of Thessaloniki.
I participated in the EU-funded project AutoGPU, where I developed software for fast parallel low-level image processing on GPUs. I used to write a lot of CUDA back then.

Sequential Neural Likelihood

Sequential Neural Likelihood (SNL) is a fast and robust algorithm for inference in simulator models, that is, models we can simulate but whose likelihood we can't compute. SNL works by training a Masked Autoregressive Flow on simulated data to learn the simulator model's intractable likelihood. By guiding simulations during training, we can reduce the simulation cost dramatically. SNL brings together ideas from likelihood-free inference and neural density estimation, and is a more robust alternative to related methods that learn the posterior directly.

For more information, see the paper and the code.

Masked Autoregressive Flow

Autoregressive models and normalizing flows are types of neural networks that achieve good performance in density estimation. We developed Masked Autoregressive Flow, a normalizing flow whose layers are autoregressive. Masked Autoregressive Flow is closely related to Inverse Autoregressive Flow and RealNVP, and performs well as a general-purpose density estimator.

For more information, see the paper and the code. There's also the oral presentation at NeurIPS 2017, and my interview with TWIMLAI. Since then, Masked Autoregressive Flow has become a standard part of TensorFlow Probability.
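
To give a feel for what "a normalizing flow whose layers are autoregressive" means, here is a minimal NumPy sketch of a single affine autoregressive layer. It is an illustration under simplifying assumptions, not the implementation from the paper: the conditioner is a strictly lower-triangular (masked) linear map rather than a masked neural network, there is only one layer, and the names make_layer, to_noise, log_prob and sample are made up for this example.

```python
import numpy as np

def make_layer(dim, rng):
    """Random parameters for one affine autoregressive layer (hypothetical helper)."""
    # The strictly lower-triangular mask makes output i depend only on inputs 0..i-1.
    mask = np.tril(np.ones((dim, dim)), k=-1)
    return {
        "W_mu": rng.normal(size=(dim, dim)) * mask,
        "b_mu": rng.normal(size=dim),
        "W_alpha": 0.1 * rng.normal(size=(dim, dim)) * mask,
        "b_alpha": 0.1 * rng.normal(size=dim),
    }

def to_noise(layer, x):
    """Map data x to noise u in one parallel pass; also return log|det du/dx|."""
    mu = x @ layer["W_mu"].T + layer["b_mu"]
    alpha = x @ layer["W_alpha"].T + layer["b_alpha"]
    u = (x - mu) * np.exp(-alpha)
    # The Jacobian is triangular, so its log-determinant is the sum of the log-scales.
    log_det = -alpha.sum(axis=-1)
    return u, log_det

def log_prob(layer, x):
    """Log-density of x under the flow with a standard normal base density."""
    u, log_det = to_noise(layer, x)
    dim = x.shape[-1]
    log_base = -0.5 * (u ** 2).sum(axis=-1) - 0.5 * dim * np.log(2 * np.pi)
    return log_base + log_det

def sample(layer, u):
    """Map noise u to data x; needs one sequential step per dimension."""
    x = np.zeros_like(u)
    for i in range(u.shape[-1]):
        # Row i of each masked matrix only touches x[..., :i], which is already filled in.
        mu_i = x @ layer["W_mu"][i] + layer["b_mu"][i]
        alpha_i = x @ layer["W_alpha"][i] + layer["b_alpha"][i]
        x[..., i] = u[..., i] * np.exp(alpha_i) + mu_i
    return x
```

In this sketch, density evaluation takes a single pass over all dimensions, whereas sampling loops over them one at a time. That asymmetry is the reason such a layer suits density estimation.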

Fast ε-free Inference of Simulation Models

Suppose we have a probabilistic model which we can simulate data from, but whose likelihood we can't evaluate. How can we do Bayesian inference in such a model? We propose learning the posterior with a Bayesian neural network trained on simulated data. By guiding future simulations, we can dramatically speed up the process.

For more information, see the paper and the code. Dennis Prangle wrote a nice blog post about this work.

Distilling Model Knowledge

In machine learning, many good models are large, expensive or intractable. Knowledge distillation refers to training a convenient model to match the performance of a good but cumbersome model. We apply knowledge distillation in: (a) model compression, where we compress large ensembles into small neural networks; (b) Bayesian inference, where we distil MCMC chains into closed-form predictive distributions; (c) intractable generative models, where we distil unnormalizable RBMs into tractable NADEs.

For more information, see my MSc thesis and the code.

Robust Low-Rank Modelling on Matrices and Tensors

When represented as matrices, real-world data often have low-rank structure, whereas corruptions are often sparse. Based on this observation, several methods that aim to separate the low-rank from the sparse component have been developed. In this work, we extend existing matrix-based methods to tensors, and apply them to computer-vision problems.

For more information, see my MSc thesis. This thesis won the Winton Capital applied computing MSc project prize.
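
As a rough illustration of separating a low-rank component from a sparse one, here is a NumPy sketch of the matrix case only, using principal component pursuit solved with an inexact augmented Lagrangian scheme. This is a standard baseline and not necessarily the algorithm used in the thesis, it does not cover the tensor extensions, and the function names and default parameters are assumptions made for this example.

```python
import numpy as np

def soft_threshold(X, tau):
    """Shrink every entry of X towards zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink the singular values of X by tau, which lowers its rank."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

def robust_pca(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L plus sparse S by minimising ||L||_* + lam * ||S||_1."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # common default weight on the sparse term
    spectral_norm = np.linalg.norm(M, 2)
    Y = M / max(spectral_norm, np.abs(M).max() / lam)  # Lagrange multipliers
    mu = 1.25 / spectral_norm                          # penalty weight, grown each step
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        mu *= 1.5
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S
```

A quick way to try it is to build a synthetic matrix as a rank-2 product plus a few large random corruptions, then check how close the returned L is to the clean low-rank part.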

Comparison of Modern Stochastic Optimization Algorithms

Stochastic gradient descent is the standard algorithm for training deep-learning models, because it scales well to large datasets. However, it converges more slowly than batch gradient descent. Semi-stochastic algorithms, such as S2GD and SAG, combine fast convergence with scalability. In this project, we compare semi-stochastic gradient descent to stochastic and batch gradient descent. We find that semi-stochastic gradient descent converges faster, but it doesn't necessarily lead to better generalization.

This is a small project I did with Peter Richtárik. For more information, see the technical report and the code.
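
To make the difference between the two families concrete, here is a small NumPy sketch of plain stochastic gradient descent next to SAG on a least-squares problem. It is only an illustration and does not reproduce the experiments in the report: the least-squares objective, the function names and the fixed step size are assumptions for this example, and S2GD is not implemented.

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error objective, 0.5 * mean((Xw - y)^2)."""
    return 0.5 * np.mean((X @ w - y) ** 2)

def sgd(X, y, lr, n_steps, rng):
    """Plain SGD: each step follows the gradient of one random example."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        i = rng.integers(len(y))
        w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

def sag(X, y, lr, n_steps, rng):
    """SAG: each step follows the average of the most recent gradient of every example."""
    n, dim = X.shape
    w = np.zeros(dim)
    table = np.zeros((n, dim))  # last gradient computed for each example
    total = np.zeros(dim)       # running sum of the rows of the table
    for _ in range(n_steps):
        i = rng.integers(n)
        g = (X[i] @ w - y[i]) * X[i]
        total += g - table[i]   # swap the stale gradient of example i for the fresh one
        table[i] = g
        w -= lr * total / n
    return w
```

Both methods touch one example per step, so their per-step cost is similar. SAG pays for its lower-variance update with memory for one stored gradient per example. In this sketch a conservative step size such as 1 / (16 * max_i ||x_i||^2) is a reasonable starting point for both.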

Fast Convolution and Local Correlation Coefficients on GPUs

Convolution and correlation are fundamental low-level operations in image processing. In this project, we developed algorithms and software for their fast computation, based on (a) using the Fourier domain for large filters, and (b) parallelizing on GPUs. The result is the FLCC Library, a software tool that automatically determines which algorithm-device combination is the fastest for a given problem.

For more information, see my MEng thesis. Code is hosted on the FLCC Library website and the AutoGPU website.
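
The idea of switching to the Fourier domain for large filters can be sketched on the CPU with NumPy. This is not code from the FLCC Library and has no GPU part; both function names are made up for this example. The direct version costs on the order of H * W * kh * kw operations, while the FFT version's cost grows roughly with the padded image size times its logarithm, so it wins once the filter is large enough.

```python
import numpy as np

def convolve2d_direct(image, kernel):
    """Full 2D convolution by shifting and accumulating the image once per kernel entry."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for k in range(kh):
        for l in range(kw):
            out[k:k + H, l:l + W] += kernel[k, l] * image
    return out

def convolve2d_fft(image, kernel):
    """Full 2D convolution as a pointwise product in the Fourier domain."""
    H, W = image.shape
    kh, kw = kernel.shape
    # Zero-pad both inputs to the full output size so circular wrap-around cannot occur.
    shape = (H + kh - 1, W + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(spectrum, s=shape)
```

The two functions should agree up to floating-point error, which makes it easy to time them against each other for different filter sizes.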


Conference papers

Neural Spline Flows
C. Durkan, A. Bekasov, I. Murray, G. Papamakarios
Advances in Neural Information Processing Systems, 2019
arXiv bibtex code

Temporal Difference Variational Auto-Encoder
K. Gregor, G. Papamakarios, F. Besse, L. Buesing, T. Weber
International Conference on Learning Representations, 2019
arXiv bibtex

Sequential Neural Likelihood: Fast Likelihood-free Inference with Autoregressive Flows
G. Papamakarios, D. C. Sterratt, I. Murray
International Conference on Artificial Intelligence and Statistics, 2019
arXiv bibtex code

Masked Autoregressive Flow for Density Estimation
G. Papamakarios, T. Pavlakou, I. Murray
Advances in Neural Information Processing Systems, 2017
arXiv bibtex code

Robust low-rank tensor modelling using Tucker and CP decomposition
N. Xue, G. Papamakarios, M. Bahri, Y. Panagakis, S. Zafeiriou
European Signal Processing Conference, 2017
pdf bibtex

Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation
G. Papamakarios, I. Murray
Advances in Neural Information Processing Systems, 2016
arXiv bibtex code

Generalised Scalable Robust Principal Component Analysis
G. Papamakarios, Y. Panagakis, S. Zafeiriou
British Machine Vision Conference, 2014
pdf bibtex code

Synthetic Ground Truth Data Generation for Automatic Trajectory-based ADL Detection
G. Papamakarios, D. Giakoumis, K. Votis, S. Segouli, D. Tzovaras, C. Karagiannidis
IEEE-EMBS International Conference on Biomedical and Health Informatics, 2014
web bibtex

Fast Computation of Local Correlation Coefficients on Graphics Processing Units
G. Papamakarios, G. Rizos, N. P. Pitsianis, X. Sun
Proceedings of SPIE, 2009
pdf bibtex


Workshop papers

Cubic-Spline Flows
C. Durkan, A. Bekasov, I. Murray, G. Papamakarios
Workshop on Invertible Neural Networks and Normalizing Flows at International Conference on Machine Learning, 2019
arXiv bibtex

Neural belief states for partially observed domains
P. Moreno, J. Humplik, G. Papamakarios, B. Á. Pires, L. Buesing, N. Heess, T. Weber
Reinforcement Learning under Partial Observability Workshop at Neural Information Processing Systems, 2018
pdf bibtex

Sequential Neural Methods for Likelihood-free Inference
C. Durkan, G. Papamakarios, I. Murray
Bayesian Deep Learning Workshop at Neural Information Processing Systems, 2018
arXiv bibtex

Distilling Intractable Generative Models
G. Papamakarios, I. Murray
Probabilistic Integration Workshop at Neural Information Processing Systems, 2015
pdf bibtex code

A Tool to Monitor and Support Physical Exercise Interventions for MCI and AD Patients
G. Papamakarios, D. Giakoumis, M. Vasileiadis, K. Votis, D. Tzovaras, S. Segouli, C. Karagiannidis
Patient Rehabilitation Techniques Workshop at International Conference on Pervasive Computing Technologies for Healthcare, 2014
web bibtex

Book chapters

Human Computer Confluence in the Smart Home Paradigm: Detecting Human States and Behaviours for 24/7 Support of Mild-Cognitive Impairments
G. Papamakarios, D. Giakoumis, M. Vasileiadis, A. Drosou, D. Tzovaras
Human Computer Confluence: Transforming Human Experience Through Symbiotic Technologies, De Gruyter Open, 2016
pdf bibtex


Theses

Neural Density Estimation and Likelihood-free Inference
G. Papamakarios
PhD Thesis, University of Edinburgh, 2019
arXiv bibtex

Distilling Model Knowledge
G. Papamakarios
MSc by Research Thesis, University of Edinburgh, 2015
arXiv bibtex code

Robust Low-Rank Modelling on Matrices and Tensors
G. Papamakarios
MSc Thesis, Imperial College London, 2014
pdf bibtex

FLCC: A Library for Fast Computation of Convolution and Local Correlation Coefficients
G. Papamakarios, G. Rizos
MEng Thesis, Aristotle University of Thessaloniki, 2011
pdf bibtex code