Lectures at ETH and UZH

All about transformers and Stable Diffusion: a step-by-step guide

Learn to implement from scratch a transformer-based translator and a small version of Stable Diffusion. Only basic knowledge of PyTorch is required. We explain everything in simple terms and provide the necessary intuitions behind each component. We start with an overview of how NLP works, then explain how attention mechanisms arise, and finally arrive at the transformer architecture. Afterwards, we introduce diffusion processes, UNets, and cross-attention mechanisms. At the end, we implement a simplified yet complete Stable Diffusion model that can be trained in a few minutes on a normal laptop.

Throughout the tutorial we work with a simple fictitious dataset consisting of pairs of texts and images. The texts are English sentences built only from the words "circle", "square", "triangle" (and their plurals), "after", "one", "two", and "and". Each sentence describes at most four figures; an example is "two triangles and one square after one triangle." We call this English fragment Shape English. Each text comes with an image that illustrates the sentence.
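To make the dataset concrete, here is a minimal sketch of how Shape English sentences could be generated. It follows the vocabulary and the at-most-four-figures rule described above, but the function names and sampling choices are our own illustration, not the tutorial's actual code. Pairing each sentence with a rendered image of the described shapes then yields the text-image pairs used for training.

```python
import random

SHAPES = ["circle", "square", "triangle"]
NUMBERS = {"one": 1, "two": 2}

def noun_phrase(rng):
    # A counted shape, e.g. "two triangles"; plural "s" for counts above one.
    number = rng.choice(list(NUMBERS))
    shape = rng.choice(SHAPES)
    count = NUMBERS[number]
    return f"{number} {shape}{'s' if count > 1 else ''}", count

def shape_english_sentence(rng, max_figures=4):
    # Chain noun phrases with "and"/"after" while the figure budget allows.
    phrase, total = noun_phrase(rng)
    parts = [phrase]
    while rng.random() < 0.5:
        phrase, count = noun_phrase(rng)
        if total + count > max_figures:
            break
        parts += [rng.choice(["and", "after"]), phrase]
        total += count
    return " ".join(parts)

rng = random.Random(0)
for _ in range(5):
    print(shape_english_sentence(rng))
```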

Statistical learning theory 2020-2021

[Course website | Script]

Advanced machine learning 2020-2022, 2024

[Course website]

An introduction to machine learning for medicine 2020-2021

[Course description | Slides]

Other

An elementary introduction to quantum computing (VMI retreat 2022)

We present some of the basics of quantum computation, using only linear algebra over the real numbers. No previous knowledge of quantum mechanics or of complex numbers is required.
[Notes | Presentation]
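To give a flavor of the real-numbers-only approach, here is a small worked example (our own illustration, not taken from the notes): states are unit vectors in R^2, gates are orthogonal matrices, and measurement probabilities are squared amplitudes.

```python
import numpy as np

# A "real qubit": a unit vector in R^2, with |0> = (1, 0) and |1> = (0, 1).
ket0 = np.array([1.0, 0.0])

# The Hadamard gate happens to be a real orthogonal matrix.
H = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)

state = H @ ket0            # equal superposition: (1/sqrt(2), 1/sqrt(2))
probabilities = state ** 2  # Born rule: probabilities are squared amplitudes

print(state)          # [0.70710678 0.70710678]
print(probabilities)  # [0.5 0.5]
```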

Algorithm validation via information theory (Statistical learning theory 2020)

How can you tell whether your algorithm is learning correctly from your data? These notes present a method for this, based on information theory and originally proposed by Prof. Joachim Buhmann.
[Notes | Slides part 1 | Slides part 2 | Slides part 3]

Support vector machines (Advanced machine learning 2019)

Derivation of SVMs, with a preface on Lagrange multipliers, illustrated with a petrel, a cat, and a fish.
[Video]
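As a companion to the video, here is a minimal sketch of the optimization it derives, fitted with scikit-learn rather than by hand (the data and the library choice are our own illustration). A large C approximates the hard-margin SVM, and the support vectors are the points that pin down the margin.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in the plane.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin SVM from the derivation.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

print(clf.coef_, clf.intercept_)  # separating hyperplane (w, b)
print(clf.support_vectors_)       # the points that determine the margin
```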

Bayesianism, frequentism, and maximum-likelihood estimators (Advanced machine learning 2019)

An overview of Bayesian inference and maximum-likelihood estimation, using a shoe shop as an example.
[Video]
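To illustrate the contrast the video draws, here is a toy computation in the shoe-shop spirit; the numbers and the choice of a Beta prior are our own made-up example, not taken from the lecture.

```python
# Estimate the probability p that a visitor buys a pair of shoes.
buys, visitors = 7, 10

# Maximum-likelihood estimate: the empirical frequency.
p_mle = buys / visitors

# Bayesian estimate: a Beta(a, b) prior on p, updated with the counts.
a, b = 2, 2  # a mildly informative prior centered at 0.5
p_posterior_mean = (a + buys) / (a + b + visitors)

print(p_mle)             # 0.7
print(p_posterior_mean)  # 9/14, roughly 0.643: pulled toward the prior
```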

A dog, a vegan flea, and the EM algorithm (Statistical learning theory 2019)

A simple but rigorous derivation of the expectation-maximization algorithm, using a two-dimensional dog and a vegan flea.
[Notes | Presentation | GIF]
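For readers who want to see the algorithm run, here is a minimal sketch of EM for a one-dimensional mixture of two Gaussians. This is our own illustration; the dog-and-flea derivation itself is in the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 1-D data from two Gaussians: the blobs EM has to untangle.
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 200)])

# Initial guesses for mixture weights, means, and standard deviations.
pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibility of each component for each point
    # (the 1/sqrt(2*pi) constant cancels in the normalization).
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the weighted data.
    n_k = r.sum(axis=0)
    pi = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)

print(pi, mu, sigma)  # weights near [0.6, 0.4], means near [-2, 3]
```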

The essentials of machine and deep learning (Software crafters 2018)

A half-day workshop introducing classification with machine and deep learning. Only basic programming knowledge in Python is required.
[Presentation ML | Presentation DL | Docker image ML | Docker image DL]
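In the spirit of the workshop, here is a minimal classification example with scikit-learn; the dataset and model choice are our own illustration, while the workshop's actual material lives in the Docker images linked above.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A classic starter task: classify 8x8 images of handwritten digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))  # typically around 0.96 on held-out digits
```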