The biggest European conference on ML, AI and Deep Learning applications,
running in person in Prague and online.

Machine Learning Prague 2022

, 2022


World-class expertise and practical content packed into 3 days!

You can look forward to an excellent lineup of 45 international experts in ML and AI business and academic applications at ML Prague 2022. They will present advanced practical talks, hands-on workshops and other forms of interactive content to you.

Stay tuned. We will publish our full program with talks and 1-day, hands-on workshops soon!

What to expect

  • 500+ Attendees
  • 3 Days
  • 45 Speakers
  • 8 Workshops
  • 2 Parties

Phenomenal Speakers

Radovan Kavicky

Principal Data Scientist & President, GapData Institute

Radovan Kavicky was among Datacamp's first employees. He was the first Data Science Instructor from the CEE region and remains the only person worldwide to have made the transition from regular student to instructor and employee, after ranking #1 worldwide on the Datacamp platform for nearly a year back in 2017.

Radovan is a Data Science polyglot (R, Python, Julia and more) and a Data Science veteran with over 10 years of experience in Data Science and Applied AI/ML consulting. He has extensive knowledge of the field (Data Science consulting, education and community building) and has cooperated successfully with global industry leaders such as Anaconda and Tableau. Radovan is also a co-founder of Slovak.AI (Slovak Research Center for Artificial Intelligence) and a member of various international professional societies in the Data Science and AI/ML industry, including the IEEE Computer Society, CLAIRE (Confederation of Laboratories for Artificial Intelligence Research in Europe), European AI Alliance (European Commission/Futurium), TAILOR network (Trustworthy AI - Integrating Learning, Optimisation and Reasoning), UDSC (United Data Science Communities), PyData Global Network, Global Tableau #DataLeader network and The Python Software Foundation (PSF).

Radovan is also the founder of PyData Slovakia/Bratislava (#PyDataSK #PyDataBA), R <- Slovakia (#RSlovakia), Julia Users Group Slovakia (#JUGSlovakia), SK/CZ Tableau User Group (#skczTUG) and Effective Altruism Slovakia (#EASlovakia), all of which you are welcome to join.

Practical & Inspiring Program


at CEVRO Institut, Jungmannova 28/17, Prague 1 (workshops won't be streamed)


Room 103 Room 106 Room 203 Room 205
coffee break

Text analysis with Apache Spark 3.x and Python

Room 103

David Vrba, Emplifi

Apache Spark has become a standard for data processing in big data environments. It is well integrated with the Python programming language, and the 3.x releases have strengthened this integration further. In this hands-on workshop we will see how Spark can be used to analyze textual data using Spark SQL along with Spark ML, its native package for machine learning. We will also explore Spark NLP, a state-of-the-art library for natural language processing that provides machine learning and deep learning capabilities for text analysis on top of Spark.

Synthetic Data Generation for Computer Vision

Room 106

Frederick Bednar, EBCONT
Julian-Thomas Erdödy, EBCONT
Clifford Bednar, EBCONT

Collecting reliable and properly labeled image and video data in sufficient quantities remains one of the major challenges in computer vision, still preventing many projects in both research and industry from seeing the light of day. In this workshop we would like to show you how to generate and use synthetic datasets with the help of game engines in order to accelerate the image annotation process. We will augment our datasets using domain randomization techniques to simulate possible variations and scenarios in the real data. Finally, we will use these datasets to train a neural network and demonstrate the benefit of this approach by measuring the network’s performance against real data.

Language Model Essentials: Pre-training, Metrics, and Community

Room 203

Nick Doiron, Hewlett Packard Enterprise

Learn the essentials of training, fine-tuning, and patching language models with the Transformers library. In this workshop we will compare the accuracy of masked language models on select tasks using architectures such as BERT and T5. For generative models (such as GPT-2), we will explore the options for generating text through greedy search and beam search. Finally, we will cover how to participate in the open-source NLP community, including sharing language models on HuggingFace and/or AdapterHub.
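To make the difference between greedy search and beam search concrete, here is a minimal sketch in plain Python. It does not use the Transformers library; `TOY_MODEL` is a hypothetical hand-written table of next-token probabilities, chosen only so that the two decoding strategies disagree.

```python
import math

# Hypothetical toy "language model": next-token probabilities keyed by the
# previous token only. A real model conditions on the whole prefix.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.3, "dog": 0.3, "bird": 0.4},
    "a": {"cat": 0.9, "dog": 0.1},
}


def greedy_search(model, start, steps):
    """Pick the single most probable next token at every step."""
    seq, logp = [start], 0.0
    for _ in range(steps):
        probs = model[seq[-1]]
        token = max(probs, key=probs.get)
        logp += math.log(probs[token])
        seq.append(token)
    return seq, logp


def beam_search(model, start, steps, beam_width):
    """Keep the beam_width best partial sequences at every step."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for token, p in model[seq[-1]].items():
                candidates.append((seq + [token], logp + math.log(p)))
        # Rank all expansions by total log-probability and prune.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


greedy_seq, greedy_logp = greedy_search(TOY_MODEL, "<s>", steps=2)
beam_seq, beam_logp = beam_search(TOY_MODEL, "<s>", steps=2, beam_width=2)
print(greedy_seq, math.exp(greedy_logp))  # greedy commits to "the" early
print(beam_seq, math.exp(beam_logp))      # beam keeps "a" alive and wins
```

In this toy setup greedy search commits to the locally best first token and ends with a less probable sequence, while a beam of width 2 recovers the more probable one. With `beam_width=1` the two procedures coincide.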

ML in live data processing

Room 205

Tomáš Neubauer, Quix
Javier Blanco Cordero, Quix

In this workshop you will learn how to use machine learning in real-time systems. You will process data live with a trained ML model with almost no latency. In this 3-hour workshop you will get a chance to build a PoC in Python from scratch with a team that worked in F1 racing, processing car telemetry at a massive scale.

coffee break

Explainable AI/ML (XAI) in Python

Room 103

Radovan Kavicky, GapData Institute

In this workshop led by Radovan Kavicky from Datacamp & GapData Institute, you will get familiar with Explainable AI (XAI) and how to implement its principles in Python. Together we will open the "black box" of machine learning, where sometimes even the designers cannot fully explain why an AI/ML system arrived at a specific decision, and also point out differences from statistical learning. We will learn how to better design systems that imitate intelligence in a transparent way, and you will also get an overview of current trends in Explainable AI/ML.

Practical aspects of reinforcement learning: Build your own elements of contextual bandits in TF-Agents

Room 106

Michal Kubišta,
Petr Stanislav,

Reinforcement learning (RL) models are a new type of intelligent machine that can help you drive your car or beat you in Starcraft. There are many RL libraries, and they implement a wide range of policies (models), environments and other elements of RL. However, they can never cover all use cases, and you might quickly find out that you need to build your own pieces to make the package work for your project. This decision likely leads to scarcely documented protocols and interfaces you need to fulfil, and this is where we want to help. Since implementing full reinforcement learning solutions in a business setup (outside of the typical use cases with simulated environments) leads to additional complexities, this workshop will focus on contextual multi-armed bandits (CMAB), a middle step between supervised and reinforcement learning. We will first review all building blocks of the RL / CMAB framework and then walk you through building a custom implementation of those elements, which will include a lot of code running on tf.Graph. After this session you should understand the (dis)advantages of using CMAB and be ready to start using TF-Agents in your projects.

Reverse Image Search

Room 203

Jan Rus, Emplifi
Peter Jung, Emplifi

Find the most similar images in a dataset given a reference image. We start with a simple baseline using ImageHash, then show its limitations and proceed to a more robust solution using the latest DL models, with an adjustable threshold specifying how large a difference is allowed. We also cover fixes for edge cases such as completely black or white images.
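The hashing baseline can be sketched in a few lines of plain Python. This is an illustrative hand-rolled average hash, not the ImageHash library's API and not the workshop's actual code; images are assumed to be small 2D lists of grayscale values, and all function names and the tolerance value are hypothetical.

```python
def flatten(pixels):
    """Turn a 2D list of grayscale values (0..255) into a flat list."""
    return [value for row in pixels for value in row]


def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = flatten(pixels)
    mean = sum(flat) / len(flat)
    return tuple(int(value > mean) for value in flat)


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_flat(pixels, tolerance=2):
    """True for (nearly) uniform images, e.g. completely black or white."""
    flat = flatten(pixels)
    return max(flat) - min(flat) <= tolerance


def find_similar(reference, candidates, max_distance=1):
    """Return names of candidates within max_distance bits of the reference."""
    ref_hash = average_hash(reference)
    matches = []
    for name, image in candidates.items():
        if is_flat(reference) or is_flat(image):
            # Edge case: every uniform image hashes to all zeros, so the hash
            # cannot tell black from white. Compare raw brightness instead.
            both_flat = is_flat(reference) and is_flat(image)
            if both_flat and abs(reference[0][0] - image[0][0]) <= 2:
                matches.append(name)
            continue
        if hamming_distance(ref_hash, average_hash(image)) <= max_distance:
            matches.append(name)
    return matches


reference = [[10, 200, 10], [200, 10, 200], [10, 200, 10]]
candidates = {
    "brighter": [[30, 220, 30], [220, 30, 220], [30, 220, 30]],
    "inverted": [[200, 10, 200], [10, 200, 10], [200, 10, 200]],
    "black": [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
}
print(find_similar(reference, candidates))
```

The `max_distance` argument plays the role of the adjustable threshold: raising it tolerates larger differences at the cost of more false matches. The `is_flat` branch shows why uniform images need special handling with this kind of hash.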

Recommendation systems and user representations

Room 205

Tomáš Nováčik,
Adam Jurčík,
Václav Blahut,
Radek Tomšů,
Vít Líbal,

The popularity of deep neural networks and embeddings in machine learning is spreading into the realm of recommender systems and is gaining traction within industry. Recommendation systems are used in many industries, such as e-commerce, social networks, content providers and many more, and they are radically improving user experience. In the theoretical part of the workshop we will go through different neural network architectures that are currently the state of the art in the recommendation domain. In the practical part we will train deep neural networks on our internal datasets and demonstrate the benefits of various architectures and user features. In particular, we will show how to employ a variety of user features to address the cold-start problem.



La Fabrika, Komunardů 30, Praha 7 (and on-line)

Registration from 9:00

Welcome to ML Prague 2022

The high-dimensional geometry of deep neural network loss landscapes

Stanislav Fort, Anthropic, Stanford University

Large deep neural networks trained with gradient descent have been extremely successful at learning solutions to a broad suite of difficult problems across a wide range of domains. Despite their tremendous success, we still do not have a detailed, predictive understanding of how they work and what makes them so effective. In this talk, I will describe recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modeling their large-scale structure using high-dimensional geometry, the role of their nonlinear nature in the early phases of training, and its effects on ensembling, calibration, and approximate Bayesian techniques.

organic search going semantic

Jakub Náplava,

In 2021, the organic search of was enhanced with semantic vectors. These form a separate branch to the traditional term search and allow retrieving documents that do not have a textual match with the query but are semantically relevant to it. The so-called vector branch comprises new vector indices that allow for fast retrieval of tens of thousands of potentially semantically-relevant documents which are further filtered using features computed by more computationally expensive Siamese BERT-based models. The deployment of the vector branch was the biggest technology change of the organic search of the last 10 years and improved overall search quality significantly. In the talk, I will describe the main components of the overall architecture and discuss the models behind it more thoroughly.

Thinking outside of the Euclidean Space: An introduction study to Graph Machine Learning and its Applications

Sachin Sharma, ArangoDB

So far we have read a lot about Convolutional Neural Networks, a well-known method for handling Euclidean data structures (like images, text, etc.). However, in the real world we are also surrounded by non-Euclidean data structures such as graphs, and the machine learning method for handling this type of data is known as Graph Neural Networks. In this session we will therefore dive deep into the concepts of Graph ML and its applications in a number of domains.


Towards human-like synthetic voice

Petr Fousek, The Mama ai

Synthetic speech is helping to replace humans in all sorts of dialogue systems in phones, voice services or at public places for its good intelligibility and availability. However, artificial voice is typically not good enough to read aloud books or replace actors' voices in dubbed movies. At we work towards the goal of building customizable voices which would carry personality traits of real humans. We use open-source technologies and data. In the talk we will show where we are, how we build voice models and share the lessons learned on our way. We will also discuss the challenges when synthesizing audio from a given text. And we will play some audio examples.

Lessons learned while training GANs

Jan Maly, STRV

Over the past few years, applications of Generative Adversarial Networks have seen astounding growth, even though many challenges in training remain. We will share some generalizable techniques that helped us overcome those challenges when separating music tracks from audio recorded during live events such as sports matches.

Living in Perfect Harmony - Where Music and Machine Learning Meet

Yama Anin Aminof, Meta (formerly Facebook)

The revolution of machine learning is reaching every aspect of our lives - including art and music.
In this talk, we will dive into the world of song analysis and the extraction of lyrical and musical features. We will discuss existing approaches, both in machine learning - Natural Language Processing, Digital Signal Processing - and in music theory & linguistics. Next, we will see how we can use these features in different kinds of machine learning models, and how these models can be used to solve problems in the music industry, such as song tags and song similarity.
Attend this talk to learn how your technical skills can also be useful for your hobbies.


Rule induction and reasoning in knowledge graphs

Daria Stepanova, Bosch Center for AI

Advances in information extraction have enabled the automatic construction of large knowledge graphs (KGs) like DBpedia, YAGO, Wikidata or Google Knowledge Graph. Learning rules from KGs is a crucial task for KG completion, cleaning and curation. This talk presents recent rule induction methods, research opportunities as well as open challenges along this avenue. A particular emphasis is put on the problem of learning exception-enriched and numerical rules from highly biased and incomplete data. We discuss possible extensions of classical rule induction techniques to account for unstructured resources (e.g., text) along with the structured ones.

Multi-modal question answering on text and tables

Timo Möller, Deepset

Previous Question Answering systems mostly worked on plain text data alone. In this talk, I will describe how you can use the open-source framework Haystack for searching inside both text and tables with the latest NLP technology. For Question Answering to work on tables you have to extract tables from PDFs, find the table that might contain the wanted information, and finally pick the answer from the table itself. To give a practical example we will showcase a prototype that answers questions on a pilot's manual, a project we developed in collaboration with Airbus.

Alquist, the social bot

Jan Šedivý, CIIRC, Czech Technical University

The presentation will introduce Alquist, the social bot developed by a group of doctoral students working in Conversational AI at CIIRC CTU. Competing against more than a hundred academic teams, the Alquist team made it to the finals of the Alexa Prize four times and won the competition last year. Alquist carries on an engaging and entertaining dialog about popular topics such as sports, celebrities, movies, etc. The presentation will introduce and explain the basic architecture and the NLP algorithms.


Combating drift in production ML

Ashley Scillitoe,

Deployed machine learning models can fail spectacularly in response to seemingly benign changes to the underlying process being modelled. In this talk, we give a practical overview of drift detection, the discipline focused on detecting such changes. We will start by building an understanding of the ways in which drift can occur, why it pays to detect it, and how it can be detected in a principled manner. A range of drift detection strategies will be introduced, and we will examine how they can be applied to realistic high-dimensional datasets. We will then discuss specific considerations for deploying drift detection in production environments, where data often arrives continuously. To finish, we will demonstrate how the theory can be put into practice using the open-source alibi-detect Python library.
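As a rough illustration of what a principled drift check can look like, the sketch below applies a two-sample Kolmogorov-Smirnov test to a single numeric feature in plain Python. It is not the alibi-detect API and not necessarily a strategy covered in the talk; the function names are hypothetical, and the critical value uses the standard large-sample approximation.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, new):
    """Largest gap between the empirical CDFs of the two samples."""
    ref_sorted, new_sorted = sorted(reference), sorted(new)
    largest_gap = 0.0
    # The gap between two step functions peaks at one of the sample points.
    for x in ref_sorted + new_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / len(ref_sorted)
        cdf_new = bisect_right(new_sorted, x) / len(new_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_new))
    return largest_gap


def drift_detected(reference, new, alpha=0.05):
    """Flag drift when the KS statistic exceeds the approximate critical value."""
    n, m = len(reference), len(new)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, new) > critical


# Illustrative data: a reference window and two incoming windows of a feature.
reference_window = list(range(100))
slightly_shifted = [x + 1 for x in reference_window]
strongly_shifted = [x + 50 for x in reference_window]

print(drift_detected(reference_window, slightly_shifted))
print(drift_detected(reference_window, strongly_shifted))
```

A univariate test like this only looks at one feature at a time. High-dimensional data usually needs a dimensionality-reduction step or a multivariate test first, and repeated testing on continuously arriving data requires care with false-alarm rates.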

The changing EU legal landscape on AI – challenges and opportunities

Christina Hitrova, PwC

This talk will offer an overview and actionable guidance on the challenges and opportunities offered by key legal developments in the EU that will affect AI and data-driven innovation, touching on the AI Act, the Digital Services Act, and the Data Act. The goal is to help companies understand how to adapt to the changing legal landscape and leverage it strategically to bolster their competitiveness. We will look at how legislative trends affect the work of organisations that design, deploy, use, and maintain AI systems and the challenges for compliance. In addition, we will look at the opportunities offered by these changes, such as greater access to data, increased trust in AI, and competitiveness of AI systems produced in the EU. Participants can take away concrete insights on what legal changes they should be aware of and what that means for their innovation strategy going forward.


La Fabrika, Komunardů 30, Praha 7

Conference day 1

La Fabrika, Komunardů 30, Praha 7 (and on-line)

Doors open at 08:30

Deep Learning Discovery of New Exoplanets

Hamed Valizadegan, NASA

Understanding ML via exactly solvable models

Lenka Zdeborová, École Polytechnique Fédérale de Lausanne

Bayesian Modeling in Industry

Thomas Wiecki, PyMC Labs

Bayesian Modeling is being widely adopted across various industries to solve data science problems. In this talk, we will look at why Bayesian modeling is so powerful and effective at solving problems ranging from marketing to biotech.


Image-to-lidar self-supervised distillation for autonomous driving data

Gilles Puy,

Lidar sensors deliver rich information about the 3D world, and making sense of this kind of information is crucial for an autonomous driving vehicle to properly act in its environment. I will briefly describe some of the 3D perception tasks we tackle at and then concentrate on one of our latest contributions which improves the annotation efficiency for semantic segmentation and object detection in sparse Lidar point clouds. This technique leverages synchronized and calibrated image and Lidar sensors in autonomous driving setups to distill self-supervised pre-trained 2D image representations into 3D models.

Zero to Hero: AI based assistance in industrial machine operation

Timo Leitritz, Fraunhofer Institute for Manufacturing Engineering and Automation

Training new users on a production machine is a time-intensive and expensive task. In this talk we want to discuss the possibilities of leveraging AI technologies in computer vision, machine analysis and language to create a system that assists users in learning and executing machine operation. The focus lies on user-centered human-machine interaction and a flexible way to fit the system to new scenarios and machines as well as different user seniorities. We will outline the unique challenges involved and present our own experiences from previous projects.

[...] sequences of activities that represent a task and text generation models to generate a human-understandable step-by-step guideline for inexperienced users. This guideline can then be played back to this user accordingly. SLEM learns by watching an experienced user working on the machine. This combined approach is designed to be adaptable to a variety of different machines for multiple tasks like maintenance and operation. It leverages the recent advancements in AI and implements them in an industrial setting.

Kaggle competitions in object detection

Yauhen Babakhin,

In this talk, we will start by discussing what Kaggle is and its pros and cons for Data Scientists / ML Engineers of different levels. Then we will review the recent Great Barrier Reef competition, where my team took 3rd place. We will cover Object Detection approaches, from the initial ideas to the current State-of-the-Art models. Finally, we will discuss the top solutions of the Great Barrier Reef competition and their real-world applications.


Deploying transformers at scale: Addressing challenges and increasing performance

Pieter Luitjens, Private AI

Transformer networks have taken the NLP world by storm, powering everything from sentiment analysis to chatbots. However, the sheer size of these networks presents new challenges for deployment, such as how to provide acceptable latency and unit economics. The de-identification tasks Private AI services rely heavily on Transformer networks and involve processing large amounts of data. In this talk, I will go over the challenges we faced and how we managed to improve the latency and throughput of our Transformer networks, allowing our system to process Terabytes of data easily and cost-effectively.

Building complex ML pipelines to tackle business document understanding

Milan Šulc,
Petr Baudis,

Solving real-world problems at scale often requires more than direct application of straightforward ML models. Let's journey together through architecting a complex ML pipeline and show how a challenging high-level task can be decomposed into a series of trainable sub-tasks while not compromising on a pure machine learning approach. We will demonstrate this on the problem of document information extraction that we are solving at Rossum, and look at how it can be decomposed into (still hard, but attackable) sub-tasks such as named entity (field) localisation, table recognition, key-value detection, few-shot learning via similar document retrieval, and, of course, OCR. And perhaps we will manage to show why building an AI system capable of understanding document content is so much more than “That’s just an OCR / NER problem, resolved a loong time ago...”.

Using machine learning to accelerate drug discovery

Aisling O’Sullivan, Dataclair



Lenka Zdeborová, École Polytechnique Fédérale de Lausanne
Daria Stepanova, Bosch Center for AI
Thomas Wiecki, PyMC Labs
Petr Baudis,


Have a great time

Prague, the city that never sleeps

You can feel centuries of history at every corner in this unique capital. We'll invite you to get a taste of our best pivo (that’s beer in Czech) and then bring you back to the present day to party at one of the local clubs all night long!


Venue

ML Prague 2022 will run hybrid, in person and online!

We are happy to announce that ML Prague is back as an in-person event in 2022. The main conference will be held at La Fabrika while our workshops will take place at CEVRO Institute. After 3 years, we can finally enjoy the conference together in one place.

We will also livestream the talks for all those participants who prefer to attend the conference online. Our platform will allow interaction with speakers and other participants too. Workshops require intensive interaction and won't be streamed.

Conference Hall

La Fabrika
Komunardů 30, Praha 7


CEVRO Institut
Jungmannova 28/17, Prague 1

Now or never

Tickets

Early Bird

Sold Out

  • Conference days € 195
  • Only workshops € 150
  • Conference + workshops € 330

Standard Ticket

Sold Out

  • Conference days € 240
  • Only workshops € 170
  • Conference + workshops € 390

Hybrid Ticket

Sold Out

  • Conference days € 280
  • Only workshops € 195
  • Conference + workshops € 450

What You Get

  • Practical and advanced level talks led by top experts
  • 2 parties in the city with people from around the world. Let’s go wild!
  • Delicious food and snacks throughout the conference

They’re among us

We are in The ML Revolution age

Machines can learn. Incredibly fast. Faster than you. They are getting smarter and smarter every single day, changing the world we’re living in, our business and our life. The artificial intelligence revolution is here. Come, learn and make this threat your biggest advantage.

Happy to help

Contact

If you have any questions about Machine Learning Prague, please e-mail us at


Jiří Materna
Scientific program & Co-Founder

Teresa Caklova
Event production

Natalija Slavkovska
Social media

Jona Azizaj
Communities and partnerships

Gonzalo V. Fernández