The biggest European conference about ML, AI and Deep Learning applications
running in person in Prague and online.
Machine Learning Prague 2022
In cooperation with Kiwi.com
World class expertise and practical content packed in 3 days!
You can look forward to an excellent lineup of 45 international experts in ML and AI business and academic applications at ML Prague 2022. They will present advanced practical talks, hands-on workshops and other forms of interactive content to you.
Stay tuned. We will publish our full program with talks and 1-day, hands-on workshops soon!
What to expect
- 500+ Attendees
- 3 Days
- 45 Speakers
- 8 Workshops
- 2 Parties
Hamed Valizadegan, Senior Machine Learning Scientist, NASA
Holder of a PhD in computer science with a focus on machine learning and data mining, Hamed Valizadegan joined NASA Ames Research Center (USRA) as a machine learning research scientist in 2013. At Ames, he has been involved with multiple projects including Automatic Planet Discovery (Kepler and TESS missions), Vascular Image Segmentation (Space Biology), Display Verification (Orion mission), and data-driven prognostics (Hubble Space Telescope). Before joining NASA Ames, he spent three years at the University of Pittsburgh conducting research in Medical Informatics. He has published more than 25 peer-reviewed papers and has been invited to many industry-level conferences as a speaker and keynote speaker.
Daria Stepanova, Research Scientist, Bosch Center for AI
Daria Stepanova is a research scientist at Bosch Center for Artificial Intelligence. Her research interests include knowledge representation and reasoning, machine learning and neuro-symbolic AI. Previously, Daria was a senior researcher at Max Planck Institute for Informatics (Germany), where she headed a group on semantic data. Daria got her diploma degree in Applied Computer Science from the Department of Mathematics and Mechanics of St. Petersburg State University (Russia) in 2010 and a PhD in Computational Logic from Vienna University of Technology (Austria) in 2015. Before starting her PhD she worked as a visiting researcher at the School of Computing Science at Newcastle University (UK) in an industrially-oriented project.
Stanislav Fort, Research Scientist, Anthropic, Stanford University
Stanislav Fort is a research scientist at Anthropic where he focuses on large language models and trying to build a safe general AI. He has a PhD from Stanford University where he was advised by Prof. Surya Ganguli. His research focuses on developing the "Science of Deep Learning" -- a principled, scientific understanding of what makes deep learning as successful as it is. His secondary focus is on applications of machine learning and artificial intelligence in the physical sciences, in domains spanning from X-ray astrophysics to quantum computing. In the past, Stanislav did research at Google Brain, DeepMind, and Salesforce AI. He received his Bachelor’s and Master’s degrees in Physics at Trinity College, University of Cambridge, and a Master’s degree at Stanford University.
Lenka Zdeborová, Professor, École Polytechnique Fédérale de Lausanne
Lenka Zdeborová is a Professor of Physics and of Computer Science in École Polytechnique Fédérale de Lausanne where she leads the Statistical Physics of Computation Laboratory. She received a PhD in physics from University Paris-Sud and from Charles University in Prague in 2008. She spent two years in the Los Alamos National Laboratory as the Director's Postdoctoral Fellow. Between 2010 and 2020 she was a researcher at CNRS working in the Institute of Theoretical Physics in CEA Saclay, France. In 2014, she was awarded the CNRS bronze medal, in 2016 Philippe Meyer prize in theoretical physics and an ERC Starting Grant, in 2018 the Irène Joliot-Curie prize, in 2021 the Gibbs lectureship of AMS, and the Neuron Fund award. She is an editorial board member for Journal of Physics A, Physical Review E, Physical Review X, SIMODS, Machine Learning: Science and Technology, and Information and Inference. Lenka's expertise is in applications of concepts from statistical physics, such as advanced mean field methods, replica method and related message-passing algorithms, to problems in machine learning, signal processing, inference and optimization. She enjoys erasing the boundaries between theoretical physics, mathematics and computer science.
Jan Šedivý, Researcher, CIIRC, Czech Technical University
Jan Šedivý has three decades of experience in the IT industry. He has led numerous global research and development projects and is the holder of 19 US patents. He began as a researcher and research manager at the IBM Thomas J. Watson Research Center (1992-2008), later moving to Google as a Technical Lead Manager (2008-2010). He subsequently returned to CTU-CIIRC, where he leads the NLP group, winning the Amazon Alexa Prize 4 in 2021 and two second places and one third place in previous contests.
Timo Möller, Head of ML, Deepset
Timo Möller is Co-Founder of deepset and Head of Field Engineering, where he works at the intersection of the latest NLP technology and production use. He studied computational neuroscience, is an open-source fan and a passionate NLP engineer. Currently he researches better evaluation metrics for Question Answering and continually improves the out-of-domain performance of deepset's open-source framework Haystack in customer projects.
Yama Anin Aminof, Data Scientist, Meta (formerly Facebook)
Yama Anin Aminof is a Data Scientist at Meta. In her previous role, she worked at MyPart, an Israeli startup in the music industry, developing algorithms and researching lyrical and musical song features. Yama is an activist both in the social world, fighting the violence against women and children, and in the technological world, giving tech talks and mentoring female developers through their first steps in the data science world. Yama has a B.Sc in Mathematics and Physics from Tel Aviv University where she also expresses her passion for music by playing the saxophone in the TAU Wind Band.
Aisling O’Sullivan, Senior Data Scientist, Dataclair
Aisling O’Sullivan is a Senior Data Scientist at O2 AICentre/Dataclair where she uses machine learning to help accelerate medical research. She has worked with pharmaceutical companies to help discover novel targets for cancer immunotherapy and other applications. She previously received her PhD in Computational Neuroscience from Trinity College Dublin where she used machine learning models to understand how the brain processes speech and language. She also spent time as a visiting PhD researcher at the University of Rochester, USA and received her degree and master’s in Biomedical Engineering from Trinity College Dublin.
Gilles Puy, Research Scientist, Valeo.ai
Gilles Puy has been a research scientist at valeo.ai since 2019, working mainly on 3D perception for automotive applications, assisted and autonomous driving. Before, he was a researcher at Technicolor R&I (2016-2019), and a postdoc at INRIA (2014-2016). He obtained his PhD in signal/image processing from EPFL in 2014.
Thomas Wiecki, CEO & Founder, PyMC Labs
Dr. Thomas Wiecki is a PyMC author who holds a Ph.D. in computational cognitive neuroscience from Brown University. He is a former VP of data science and head of research at Quantopian Inc., where he built a team of data scientists to build a hedge fund from a pool of 300k crowd researchers. Thomas is also a recognized public speaker with keynotes at the Open Data Science Conference (ODSC) & TACC as well as talks at Strata Hadoop & AI, Newsweek AI conference, and various PyData conferences around the world.
He runs a data science podcast, blogs about Bayesian statistics, and is an avid Twitter personality.
He has also been recognized as a data scientist to follow in 2015 and with the ODSC Open Source award in 2018.
Ashley Scillitoe, Data Science Research Engineer, Seldon.io
Ashley is a data science research engineer at Seldon, where he works on developing production-ready tools for drift, adversarial and outlier detection. Prior to joining Seldon, he spent a number of years as a Research Fellow at The Alan Turing Institute. Here, he explored the use of machine learning for tackling aerospace engineering problems, with a focus on explainability and uncertainty quantification. Ashley also completed a PhD at the University of Cambridge, and is a keen proponent of open-source software, regularly contributing to a number of libraries as well as mentoring for programs such as Google Summer of Code.
Yauhen Babakhin, Senior Data Scientist, H2O.ai
Yauhen is a Kaggle competitions Grandmaster with a total of 9 gold medals in classic ML, NLP and CV competitions. He holds a Master’s Degree in Applied Data Analysis from the Belarusian State University. Yauhen has 7 years of experience in Data Science having worked in the Banking, Gaming and eCommerce industries. At H2O.ai his focus area is Deep Learning and Computer Vision, in particular. Yauhen has developed Computer Vision AutoML functionality for H2O.ai’s Driverless AI.
Pieter Luitjens, CTO & Co-Founder, Private AI
Pieter Luitjens has a Bachelor of Science in Physics and Mathematics and a Bachelor of Engineering from the University of Western Australia, as well as a Masters from the University of Toronto. He worked on software for Mercedes-Benz and developed the first deep learning algorithms for traffic sign recognition deployed in cars made by one of the most prestigious car manufacturers in the world. He has over 10 years of engineering experience, with code deployed in multi-billion dollar industrial projects. Pieter specializes in ML edge deployment & model optimization for resource-constrained environments.
Christina Hitrova, Digital Ethics and Compliance Consultant, PwC
Christina is a Digital Ethics and Compliance Consultant at PwC Czech Republic where she works on responsible AI, data governance, and digital transformation projects. Previously Christina researched law and technology design at the Technical University in Munich, consulted UK public authorities on responsible and ethical data-driven innovation at The Alan Turing Institute in London, and worked with international research consortia to implement privacy- and ethics-by-design practices in the creation of cutting edge technologies at Trilateral Research. She has experience working with diverse technologies, including machine learning, civil drones, OSINT, blockchain, and self-sovereign identity. She holds two Master degrees in International and European Law from the University of Zurich and the Catholic University of Leuven (KU Leuven) and spent time working at the European Commission on international trade disputes.
Timo Leitritz, Computer Vision Researcher, Fraunhofer Institute for Manufacturing Engineering and Automation
Timo Leitritz is a Computer Vision Researcher at Fraunhofer Institute for Manufacturing Engineering and Automation IPA in Germany, focusing on worker assistance in industrial applications using AI. Before working at Fraunhofer IPA he graduated at the Karlsruhe Institute of Technology with a Master’s Degree in Mechatronics and Information Technology, gaining interdisciplinary knowledge in the fields of Robotics, Computer Vision and Deep Learning. Together with industrial partners he investigates new ways of improving user experience in manufacturing environments with the help of human activity recognition.
Sachin Sharma, ML Research Engineer, ArangoDB
Sachin is a Machine Learning Research Engineer at ArangoDB whose aim is to build intelligent products using thorough research and engineering in the area of Graph Machine Learning. He completed his Master’s degree in Computer Science with a specialization in Intelligent Systems. He is an AI enthusiast who has conducted research in the areas of Computer Vision, NLP, and Graph Neural Networks at DFKI (German Research Centre for AI) during his academic career. Sachin also worked on building Machine Learning pipelines at Define Media GmbH, where he worked as a Machine Learning Engineer and Scientist.
Jakub Náplava, Researcher, Seznam.cz
Jakub is a researcher at Seznam.cz with current focus on improving web search relevance ranking. As a Ph.D. candidate at Institute of Formal and Applied Linguistics at Charles University in Prague, he developed several state-of-the-art systems for grammatical error correction. He also created a large and diverse grammatical error correction corpus for Czech.
Petr Fousek, Head of Speech Research, The Mama ai
Petr is the head of speech research at The Mama AI. During over ten years at IBM he built speech recognition and speaker diarization engines which currently power IBM Watson speech services. He holds several US patents. His background is in acoustic modeling using neural networks which he studied at IDIAP, LIMSI and CTU Prague. Petr's current goal at Mama AI is to build human-like expressive voice synthesis which can adapt to fit individual customer's desires.
Jan Maly, Machine Learning Lead, STRV
Jan Maly leads the Data Science Team at STRV. He is passionate about AI and its potential to transform business. He has been in charge of multiple data-driven projects involving Machine Learning and Advanced Data Analytics. He enjoys solving business challenges through better data understanding and problem formulation. He got his master's degree in Artificial Intelligence at the Faculty of Electrical Engineering, Czech Technical University. Currently, he is on a quest to help STRV develop its platform for Data Science.
Jan has also recently become a father. Besides work and parenthood, he is an avid traveler and swimmer who enjoys sci-fi and fantasy.
Milan Šulc, Head of AI Lab, Rossum.ai
Milan Šulc leads the machine learning research in Rossum AI Labs, focusing on problems related to document understanding and information extraction. He holds a Ph.D. from the Visual Recognition Group at CTU in Prague and worked with Toyota, Google, Electrolux and Xerox Research on applied research projects including fine-grained image classification, object detection and monocular 3D detection, image retrieval, and domain adaptation. Milan won several ML competitions, was awarded university prizes for his theses and the Prize of Josef Hlávka for the best students and graduates, and has been supervising successful theses and projects of his students.
Frederick Bednar, Data Analyst, EBCONT
Having an economic & statistical background (Master degree in Management Science at WU Wien), Frederick Bednar has also gathered technical experience as a “data aficionado” for about two decades now. At EBCONT, he works as a Data Analytics and Data Science Consultant and is responsible for projects especially in Data Science, ML/DL, NLP & NER, but also BI consulting and IoT projects.
Julian-Thomas Erdödy, Expert in Deep Learning, EBCONT
Julian Erdödy is considered an expert in the design of Deep Learning Models for smart applications in security as well as retail industries. Before joining EBCONT, he co-founded and led a Startup in Vienna for more than five years that successfully developed and deployed a computer vision engine creating situational knowledge from image and video data.
Clifford Bednar, Data Analyst, EBCONT
Starting from Software Development & Engineering as well as Data Analytics, Clifford Bednar has evolved his interest in various Machine Learning & Deep Learning techniques. He is now successfully combining his experience at EBCONT in different projects.
Jan Rus, Senior Researcher, Emplifi
Jan Rus graduated in Computer Graphics at the Faculty of Applied Sciences, University of West Bohemia, Pilsen where he also worked as a Scientific Researcher for 5 years. For data compression research, he received the Best Suitable Commercial Application award in 2010.
After leaving academia, Jan became a founding member of the research team at Emplifi (formerly Socialbakers), where he currently works as a Senior Researcher.
At Emplifi, Jan is mostly responsible for the design, research and development of core product features exploiting big data analysis and machine learning techniques, from creating concepts to bringing them to working prototypes and implementations.
When not working for Emplifi, Jan cooperates with various startups helping them to solve data-related problems. In his free time, Jan enjoys movies and virtual reality.
Peter Jung, Junior Researcher, Emplifi
Peter Jung got his master’s degree in artificial intelligence at the Faculty of Electrical Engineering, Czech Technical University in Prague. During that time, he worked at Heureka as a Python engineer and helped them with their first larger machine learning project as part of his diploma thesis. He continues as a part-time PhD student at the same university and works at Emplifi as a Junior Researcher, where his main responsibility is to deliver solutions oriented toward natural language processing and computer vision. At Emplifi, he also hosts an advanced Python education group where people gather monthly and share their knowledge.
When he isn’t training models, he’s training Tobias, the chihuahua. He likes to travel and would like to speak French at some time.
Radovan Kavicky, Principal Data Scientist & President, GapData Institute
Radovan Kavicky joined Datacamp among its first employees. He was the first Data Science Instructor from the CEE region and remains the only person worldwide to have made the transition from regular student to instructor and employee, after being #1 worldwide on the Datacamp platform for nearly a year back in 2017.
Radovan is a Data Science Polyglot (R, Python, Julia and more) and a Data Science Veteran with over 10 years of experience in Data Science and Applied AI/ML Consulting and extensive knowledge in the area (Data Science consulting, education and community building, including successful cooperation with global industry leaders such as H2O.ai, Anaconda or Tableau). Radovan is also a co-founder of Slovak.AI (Slovak Research Center for Artificial Intelligence) and a member of various international professional societies in the Data Science & AI/ML industry, such as the IEEE Computer Society, CLAIRE (Confederation of Laboratories for Artificial Intelligence Research in Europe), European AI Alliance (European Commission/Futurium), TAILOR network (Trustworthy AI - Integrating Learning, Optimisation and Reasoning), UDSC (United Data Science Communities), PyData Global Network, Global Tableau #DataLeader network & The Python Software Foundation (PSF).
Radovan is also Founder of PyData Slovakia/Bratislava (#PyDataSK #PyDataBA), R <- Slovakia (#RSlovakia), Julia Users Group Slovakia (#JUGSlovakia), SK/CZ Tableau User Group (#skczTUG) & Effective Altruism Slovakia (#EASlovakia) that you are all welcome to join.
Tomáš Neubauer, CTO, Quix
Tomáš Neubauer is cofounder and CTO at Quix, responsible for the technical direction of the company across the full technical stack, and working as a technical authority for the engineering team. He was previously technical lead at McLaren, where he led architecture uplift for Formula One racing real-time telemetry acquisition. He later led platform development outside motorsport, reusing the knowhow he gained from racing.
Javier Blanco Cordero, Senior Data Scientist, Quix
Javier Blanco Cordero is a Senior Data Scientist at Quix, where he helps customers get the most out of their data science projects. He was previously a Senior Data Scientist at Orange, developing churn prediction, marketing mix modelling, propensity to purchase models and more. Javier is a masters’ lecturer and speaker, specializing in pragmatic data science and causality.
Michal Kubišta, Senior Data Scientist, Dataclair.ai
Michal is a Senior Data Scientist at O2 AICentre/Dataclair, where he works on optimising the customer lifetime value. He is a major advocate of TF-Agents and has facilitated migration to this framework with his team. He also lectures master's courses on machine learning and mathematical methods at CERGE-EI and Charles University.
Petr Stanislav, Head of AI Technology, Dataclair.ai
Petr's mission is to make the data scientist's life easier. He is responsible for developing the Data and Machine Learning platform in O2 AICentre/Dataclair. Currently, he's working on the implementation of a feature store. He also leads the data and machine learning engineering team. Machine learning and data is his passion.
Until the end of 2019, he also served as a researcher for the Department of Cybernetics of the Faculty of Applied Science and as a Teacher. There he worked for more than eight years on research and development in artificial intelligence, speech technologies, natural language processing, and web technologies.
In June 2020, he successfully defended his PhD in Artificial Intelligence.
Tomáš Nováčik, Data Scientist, Seznam.cz
Tomáš is a Data Scientist at Seznam. He focuses on the personalization of machine learning models, and his main goal is to enable it across Seznam's ecosystem. Tomáš likes climbing and running.
Adam Jurčík, Data Scientist, Seznam.cz
Adam did his master's degree in Computer Science and then focused on Visual Data Analysis while working as a researcher at visitlab, Masaryk University. Later, Adam moved from academia to Seznam, where he currently helps to turn the data of millions of users into Seznam's products. At Seznam, Adam moved from data analysis to data modeling.
Václav Blahut, Machine Learning Researcher, Seznam.cz
Václav mastered AI & NLP course at FI MUNI and is now teaching machines how to recommend articles and other content to Seznam users. He's been doing it for more than three years now. Currently, he is trying to figure out how to combine multiple recommender systems and also how to use his NLP knowledge to improve recommendations in general.
Radek Tomšů, Machine Learning Researcher, Seznam.cz
Radek started as a Machine Learning Engineer at Seznam.cz four years ago and over time moved into Data Science area. His main focus in the past year was to find new models and approaches in our content recommendation system to improve user satisfaction and performance of the system.
Vít Líbal, Machine Learning Researcher, Seznam.cz
Vít has been a Machine Learning Researcher at Seznam.cz since 2021. His work at Seznam.cz focuses on reinforcing the Seznam Advertising System with Data Science and Machine Learning methods, with the aim of making it an efficient data-centric system that provides the best experience for Seznam customers, publishers and Internet users. Vít received a PhD in Electrical Engineering and Informatics in 2001 from Czech Technical University in Prague. His past career includes research positions at research labs in IBM and Honeywell.
Practical & Inspiring Program
at CEVRO Institut, Jungmannova 28/17, Prague 1 (workshops won't be streamed)
Rooms: Room 103, Room 106, Room 203, Room 205
Text analysis with Apache Spark 3.x and Python
David Vrba, Emplifi
Apache Spark became a standard for data processing in a big data environment. It is well integrated with the Python programming language and the integration became even more emphasized in the 3.x releases. In this hands-on workshop we will see how Spark can be used for analyzing textual data using Spark SQL along with the native package for machine learning - Spark ML. We will also explore Spark NLP which is a state-of-the-art library for natural language processing that provides machine learning and deep learning capabilities for text analysis on top of Spark.
Synthetic Data Generation for Computer Vision
Frederick Bednar, EBCONT
Collecting reliable and properly labeled image and video data in sufficient quantities denotes one of the major challenges in computer vision still preventing many projects both in research and industrial domains from seeing the light of day. In this workshop we would like to show you how to generate and use synthetic datasets with the help of game engines in order to accelerate the image annotation process. We will augment our datasets using domain randomization techniques to simulate possible variations and scenarios in the real data. Finally we will use these datasets to train a neural network and demonstrate the benefit of this approach by measuring the network’s performance against real data.
Language Model Essentials: Pre-training, Metrics, and Community
Nick Doiron, Hewlett Packard Enterprise
Learn the essentials to train, fine-tune, and patch language models with the Transformers library. In this workshop we will compare the accuracy of masked language models on select tasks using architectures such as BERT and T5. For generative models (such as GPT-2) we explore the options to generate text through greedy search and beam search. Finally, we will cover how to participate in the open-source NLP community, including sharing language models on HuggingFace and/or AdapterHub.
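To make the difference between greedy search and beam search concrete, here is a minimal sketch in plain Python. It does not use the Transformers library: the "language model" is a hypothetical hand-written table of next-token probabilities (`TOY_MODEL`), and the function names are illustrative assumptions rather than anything from the workshop materials.

```python
import math

# Hypothetical toy "language model": next-token probabilities given the
# previous token. A real model would condition on the whole prefix.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.4, "dog": 0.35, "</s>": 0.25},
    "a": {"cat": 0.1, "dog": 0.9},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def greedy_search(model, start="<s>", end="</s>", max_len=10):
    """Always pick the single most probable next token."""
    tokens, log_prob = [start], 0.0
    while tokens[-1] != end and len(tokens) < max_len:
        nxt, p = max(model[tokens[-1]].items(), key=lambda kv: kv[1])
        tokens.append(nxt)
        log_prob += math.log(p)
    return tokens, log_prob


def beam_search(model, beam_width=2, start="<s>", end="</s>", max_len=10):
    """Keep the `beam_width` best partial sequences at every step."""
    beams = [([start], 0.0)]
    for _ in range(max_len - 1):
        candidates = []
        for tokens, log_prob in beams:
            if tokens[-1] == end:
                # Finished hypotheses are carried over unchanged.
                candidates.append((tokens, log_prob))
                continue
            for nxt, p in model[tokens[-1]].items():
                candidates.append((tokens + [nxt], log_prob + math.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(tokens[-1] == end for tokens, _ in beams):
            break
    return beams[0]


greedy_tokens, greedy_lp = greedy_search(TOY_MODEL)
beam_tokens, beam_lp = beam_search(TOY_MODEL, beam_width=2)
print("greedy:", " ".join(greedy_tokens), round(math.exp(greedy_lp), 2))
print("beam:  ", " ".join(beam_tokens), round(math.exp(beam_lp), 2))
```

In this toy table the greedy decoder commits to "the" because it is the likeliest first token, while the beam keeps "a" alive long enough to find a sequence with higher overall probability.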
ML in live data processing
Tomáš Neubauer, Quix
In this workshop you will learn how to use machine learning in real-time systems. You will process data live with a trained ML model with almost no latency. In this 3-hour workshop you will get a chance to build a PoC in Python from scratch with a team that worked in F1 racing, processing car telemetry at a massive scale.
Recommendation systems and user representations
Tomáš Nováčik, Seznam.cz
The popularity of deep neural networks and embeddings in machine learning is spreading into the realm of recommender systems and is gaining traction within industry. Recommendation systems are used in many industries such as eCommerce, social networks, content providers and many more. They are improving user experience radically. In the theoretical part of the workshop, we will go through different architectures of neural networks that are currently the state of the art in the recommendation domain. In the practical part, we will train deep neural networks on our internal datasets and demonstrate the benefits of various architectures and user features. In particular, we will show how to employ a variety of user features to address the cold-start problem.
Practical aspects of reinforcement learning: Build your own elements of contextual bandits in TF-Agents
Michal Kubišta, Dataclair.ai
Reinforcement learning (RL) models are a new type of intelligent machine that can help you drive your car or beat you in Starcraft. There are many RL libraries, and they implement a wide range of policies (models), environments and other elements of RL. However, they can never cover all use cases, and you might quickly find out that you need to build your own pieces to make the package work for your project. This decision likely leads to scarcely documented protocols and interfaces you need to fulfil, and this is where we want to help. Since implementing full reinforcement learning solutions in a business setup (outside of the typical use cases with simulated environments) leads to additional complexities, this workshop will focus on contextual multi-armed bandits (CMAB), a middle step between supervised and reinforcement learning. We will first review all building blocks of the RL / CMAB framework and then walk you through building a custom implementation of those elements, which will include a lot of code running on tf.Graph. After this session you should understand the (dis)advantages of using CMAB and be ready to start using TF-Agents in your projects.
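As a rough illustration of what a contextual multi-armed bandit does, the sketch below implements a tabular epsilon-greedy agent in plain Python. It deliberately does not use TF-Agents; the class name, the two contexts ("mobile", "desktop") and the reward probabilities are all hypothetical and chosen only to make the example self-contained.

```python
import random


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy agent: one running mean reward per (context, arm)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {}  # context -> number of pulls per arm
        self.values = {}  # context -> running mean reward per arm

    def _ensure(self, context):
        if context not in self.counts:
            self.counts[context] = [0] * self.n_arms
            self.values[context] = [0.0] * self.n_arms

    def best_arm(self, context):
        """Greedy choice without exploration."""
        self._ensure(context)
        values = self.values[context]
        return max(range(self.n_arms), key=lambda arm: values[arm])

    def select_arm(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        return self.best_arm(context)

    def update(self, context, arm, reward):
        """Incrementally update the mean reward of the pulled arm."""
        self._ensure(context)
        self.counts[context][arm] += 1
        n = self.counts[context][arm]
        self.values[context][arm] += (reward - self.values[context][arm]) / n


def simulate(steps=5000, seed=1):
    # Hypothetical environment: the best arm depends on the context.
    reward_prob = {"mobile": [0.8, 0.2], "desktop": [0.1, 0.9]}
    env_rng = random.Random(seed)
    agent = EpsilonGreedyContextualBandit(n_arms=2, epsilon=0.1, seed=seed)
    for _ in range(steps):
        context = env_rng.choice(list(reward_prob))
        arm = agent.select_arm(context)
        reward = 1.0 if env_rng.random() < reward_prob[context][arm] else 0.0
        agent.update(context, arm, reward)
    return agent


agent = simulate()
for ctx in ("mobile", "desktop"):
    print(ctx, "-> preferred arm:", agent.best_arm(ctx))
```

Unlike full reinforcement learning, each decision here is independent: the action affects only the immediate reward, not the next context. That is what places contextual bandits between supervised learning and RL.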
Reverse Image Search
Jan Rus, Emplifi
Find the most similar images in a dataset given a reference image. We start with a simple baseline using ImageHash, then show its limitations and proceed to a more robust solution using the latest DL models, with an adjustable threshold specifying how large a difference is allowed. We also cover fixes for edge cases such as completely black or white images.
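The hashing baseline and the black/white edge case can be sketched in a few lines of plain Python. This is a hand-written average hash in the spirit of perceptual hashing, not the ImageHash library's API; images are assumed to be already downscaled to small grayscale grids, and the function names and tolerance values are illustrative assumptions.

```python
def mean_brightness(pixels):
    flat = [v for row in pixels for v in row]
    return sum(flat) / len(flat)


def average_hash(pixels):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    mean = mean_brightness(pixels)
    return [1 if v > mean else 0 for row in pixels for v in row]


def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))


def is_flat(pixels, tolerance=2):
    """True for (nearly) uniform images such as all-black or all-white."""
    flat = [v for row in pixels for v in row]
    return max(flat) - min(flat) <= tolerance


def is_similar(a, b, threshold=2, flat_brightness_tolerance=10):
    # Edge case: every uniform image hashes to all zeros, so black and
    # white would look identical. Compare brightness directly instead.
    if is_flat(a) or is_flat(b):
        if not (is_flat(a) and is_flat(b)):
            return False
        return abs(mean_brightness(a) - mean_brightness(b)) <= flat_brightness_tolerance
    return hamming(average_hash(a), average_hash(b)) <= threshold


reference = [[10, 10, 200, 200]] * 4
brighter = [[30, 30, 220, 220]] * 4          # same picture, brighter
different = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]
black = [[0] * 4] * 4
white = [[255] * 4] * 4

print(is_similar(reference, brighter))
print(is_similar(reference, different))
print(is_similar(black, white))
```

The `threshold` parameter plays the role of the adjustable difference limit: raising it accepts more distant matches, lowering it makes the search stricter.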
Explainable AI/ML (XAI) in Python
Radovan Kavicky, GapData Institute
In this workshop led by Radovan Kavicky from Datacamp, Basecamp.ai & GapData Institute, you will get familiar with Explainable AI (XAI) and how to implement these principles in Python. Together we will open the "black box" of machine learning, where sometimes even its designers cannot fully explain why an AI/ML system arrived at a specific decision, and also point out differences from statistical learning. We will learn how to better design systems that imitate intelligence in a transparent way, and you will also get an overview of current trends in Explainable AI/ML.
La Fabrika, Komunardů 30, Praha 7 (and on-line)
Registration from 9:00
Welcome to ML Prague 2022
Deep Learning Discovery of New Exoplanets
Hamed Valizadegan, NASA
Seznam.cz organic search going semantic
Jakub Náplava, Seznam.cz
In 2021, the organic search of Seznam.cz was enhanced with semantic vectors. These form a separate branch to the traditional term search and allow retrieving documents that do not have a textual match with the query but are semantically relevant to it. The so-called vector branch comprises new vector indices that allow for fast retrieval of tens of thousands of potentially semantically-relevant documents which are further filtered using features computed by more computationally expensive Siamese BERT-based models. The deployment of the vector branch was the biggest technology change of the organic search of the last 10 years and improved overall search quality significantly. In the talk, I will describe the main components of the overall architecture and discuss the models behind it more thoroughly.
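The two-stage idea described above, fast vector retrieval followed by filtering with a more expensive model, can be sketched as follows. This is only an illustration under stated assumptions, not Seznam.cz's implementation: the brute-force cosine scan stands in for a real vector index, the dictionary of `expensive_scores` stands in for a Siamese BERT-based model, and all document names, vectors and scores are made up.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_candidates(query_vec, index, k):
    """Stage 1: cheap similarity search over all document vectors."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


def filter_candidates(candidates, expensive_score, min_score):
    """Stage 2: rescore only the candidates with a costlier model and filter."""
    rescored = [(doc_id, expensive_score(doc_id)) for doc_id, _ in candidates]
    kept = [(doc_id, s) for doc_id, s in rescored if s >= min_score]
    return sorted(kept, key=lambda s: s[1], reverse=True)


# Hypothetical document vectors and query vector.
index = {
    "doc_cars": [0.9, 0.1, 0.0],
    "doc_autos": [0.8, 0.2, 0.1],
    "doc_cooking": [0.0, 0.1, 0.9],
    "doc_trucks": [0.7, 0.0, 0.3],
}
query = [1.0, 0.0, 0.0]

# Hypothetical stand-in for the expensive relevance model.
expensive_scores = {"doc_cars": 0.95, "doc_autos": 0.97, "doc_trucks": 0.40, "doc_cooking": 0.05}

candidates = retrieve_candidates(query, index, k=3)
final = filter_candidates(candidates, expensive_scores.get, min_score=0.5)
print([doc_id for doc_id, _ in final])
```

The design point is that the expensive scorer runs only on the `k` candidates returned by the first stage, never on the whole collection.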
Thinking outside of the Euclidean Space: An introduction study to Graph Machine Learning and its Applications
Sachin Sharma, ArangoDB
So far we have read a lot about Convolutional Neural Networks, a well-known method for handling Euclidean data structures (like images, text, etc.). However, in the real world we are also surrounded by non-Euclidean data structures like graphs, and a machine learning method for handling this type of data is known as the Graph Neural Network. In this session we will therefore take a deep dive into the concepts of Graph ML and its applications in a number of domains.
Towards human-like synthetic voice
Petr Fousek, The Mama ai
Synthetic speech is helping to replace humans in all sorts of dialogue systems in phones, voice services or at public places for its good intelligibility and availability. However, artificial voice is typically not good enough to read aloud books or replace actors' voices in dubbed movies. At MAMA.ai we work towards the goal of building customizable voices which would carry personality traits of real humans. We use open-source technologies and data. In the talk we will show where we are, how we build voice models and share the lessons learned on our way. We will also discuss the challenges when synthesizing audio from a given text. And we will play some audio examples.
Lessons learned while training GANs
Jan Maly, STRV
Over the past few years, the Generative Adversarial Networks applications have seen astounding growth even though there are still many challenges in training. We will share some generalizable techniques that helped us overcome those challenges when separating music tracks from audio recorded during live events such as sports matches.
Living in Perfect Harmony - Where Music and Machine Learning Meet
Yama Anin Aminof, Meta (formerly Facebook)
The revolution of machine learning is reaching every aspect of our lives - including art and music.
In this talk, we will dive into the world of song analysis and the extraction of lyrical and musical features. We will discuss existing approaches, both in machine learning - Natural Language Processing, Digital Signal Processing - and in music theory & linguistics. Next, we will see how we can use these features in different kinds of machine learning models, and how these models can be used to solve problems in the music industry, such as song tags and song similarity.
Attend this talk to learn how your technical skills can be useful also to your hobbies.
Rule induction and reasoning in knowledge graphs
Daria Stepanova, Bosch Center for AI
Advances in information extraction have enabled the automatic construction of large knowledge graphs (KGs) like DBpedia, YAGO, Wikidata or Google Knowledge Graph. Learning rules from KGs is a crucial task for KG completion, cleaning and curation. This tutorial presents state-of-the-art rule induction methods, recent advances, research opportunities as well as open challenges along this avenue. We put a particular emphasis on the problems of learning exception-enriched and numerical rules from highly biased and incomplete data. Finally, we discuss possible extensions of classical rule induction techniques to account for unstructured resources (e.g., text) along with the structured ones.
Multi-modal question answering on text and tables
Timo Möller, Deepset
Previous Question Answering systems mostly worked on plain text data alone. In this talk, I will describe how you can use the open-source framework Haystack for searching inside both text and tables with the latest NLP technology. For Question Answering to work on tables you have to extract tables from PDFs, find the table that might contain the wanted information, and finally pick the answer from the table itself. To give a practical example we will showcase a prototype that answers questions on a pilot's manual, a project we developed in collaboration with Airbus.
Alquist, the social bot
Jan Šedivý, CIIRC, Czech Technical University
The presentation will introduce Alquist, the social bot developed by a group of doctoral students working in Conversational AI at CIIRC CTU. Competing against more than a hundred academic teams, the Alquist team made it to the finals of the Alexa Prize four times and won the competition last year. Alquist carries on an engaging and entertaining dialog about popular topics such as sports, celebrities, movies, etc. The presentation will introduce and explain the basic architecture and the NLP algorithms.
Combating drift in production ML
Ashley Scillitoe, Seldon.io
Deployed machine learning models can fail spectacularly in response to seemingly benign changes to the underlying process being modelled. In this talk, we give a practical overview of drift detection, the discipline focused on detecting such changes. We will start by building an understanding of the ways in which drift can occur, why it pays to detect it, and how it can be detected in a principled manner. A range of drift detection strategies will be introduced, and we will examine how they can be applied to realistic high-dimensional datasets. We will then discuss specific considerations for deploying drift detection in production environments, where data often arrives continuously. To finish, we will demonstrate how the theory can be put into practice using the open-source alibi-detect Python library.
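As a rough illustration of the idea behind drift detection, the sketch below compares a reference sample with a newly arrived window of a single feature using a two-sample Kolmogorov-Smirnov statistic. It is a minimal, univariate example written with the Python standard library only; it is not the alibi-detect API and not necessarily the method presented in the talk. The function names, the sample sizes and the threshold coefficient (the usual large-sample value for a 5% significance level) are assumptions made for this example.

```python
import bisect
import math
import random


def ks_statistic(reference, window):
    """Largest absolute gap between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    win = sorted(window)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is reached at one of the observed data points.
    for x in ref + win:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_win = bisect.bisect_right(win, x) / len(win)
        gap = max(gap, abs(cdf_ref - cdf_win))
    return gap


def drift_detected(reference, window, c_alpha=1.358):
    """Flag drift when the KS statistic exceeds the large-sample threshold.

    c_alpha=1.358 is the commonly tabulated coefficient for alpha=0.05
    (an assumption of this sketch; pick it to suit your false-alarm budget).
    """
    n, m = len(reference), len(window)
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, window) > threshold


# Hypothetical data: a reference sample and a window whose mean has shifted.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]

print("drift on shifted window:", drift_detected(reference, shifted))
```

In a production setting the window would be refreshed as data arrives, and high-dimensional inputs would typically be reduced to a few dimensions or tested feature by feature before a check like this is applied.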
The changing EU legal landscape on AI – challenges and opportunities
Christina Hitrova, PwC
This talk will offer an overview and actionable guidance on the challenges and opportunities offered by key legal developments in the EU that will affect AI and data-driven innovation, touching on the AI Act, the Digital Services Act, and the Data Act. The goal is to help companies understand how to adapt to the changing legal landscape and leverage it strategically to bolster their competitiveness. We will look at how legislative trends affect the work of organisations that design, deploy, use, and maintain AI systems and the challenges for compliance. In addition, we will look at the opportunities offered by these changes, such as greater access to data, increased trust in AI, and competitiveness of AI systems produced in the EU. Participants can take away concrete insights on what legal changes they should be aware of and what that means for their innovation strategy going forward.
Conference day 1
La Fabrika, Komunardů 30, Praha 7 (and on-line)
Doors open at 08:30
The high-dimensional geometry of deep neural network loss landscapes
Stanislav Fort, Anthropic, Stanford University
Large deep neural networks trained with gradient descent have been extremely successful at learning solutions to a broad suite of difficult problems across a wide range of domains. Despite their tremendous success, we still do not have a detailed, predictive understanding of how they work and what makes them so effective. In this talk, I will describe recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training. In particular, I will discuss a phenomenological approach to modeling their large-scale structure using high-dimensional geometry, the role of their nonlinear nature in the early phases of training, and its effects on ensembling, calibration, and approximate Bayesian techniques.
Understanding ML via exactly solvable models
Lenka Zdeborová, École Polytechnique Fédérale de Lausanne
Bayesian Modeling in Industry
Thomas Wiecki, PyMC Labs
Bayesian Modeling is being widely adopted across various industries to solve data science problems. In this talk, we will look at why Bayesian modeling is so powerful and effective at solving problems ranging from marketing to biotech.
Image-to-lidar self-supervised distillation for autonomous driving data
Gilles Puy, Valeo.ai
Lidar sensors deliver rich information about the 3D world, and making sense of this kind of information is crucial for an autonomous driving vehicle to properly act in its environment. I will briefly describe some of the 3D perception tasks we tackle at valeo.ai and then concentrate on one of our latest contributions which improves the annotation efficiency for semantic segmentation and object detection in sparse Lidar point clouds. This technique leverages synchronized and calibrated image and Lidar sensors in autonomous driving setups to distill self-supervised pre-trained 2D image representations into 3D models.
Zero to Hero: AI based assistance in industrial machine operation
Timo Leitritz, Fraunhofer Institute for Manufacturing Engineering and Automation
Training new users on a production machine is a time-intensive and expensive task. In this talk we discuss the possibilities of leveraging AI technologies in computer vision, machine analysis and language to create a system that assists users in learning and executing machine operation. The focus lies on user-centered human-machine interaction and on a flexible way to fit the system to new scenarios and machines as well as to different levels of user seniority. We will outline the unique challenges involved and present our own experiences from previous projects.
... sequences of activities that represent a task and text generation models to generate a human-understandable step-by-step guideline for inexperienced users. This guideline can then be played back to the user. SLEM learns by watching an experienced user working on the machine. This combined approach is designed to be adaptable to a variety of machines and to multiple tasks such as maintenance and operation. It leverages recent advancements in AI and implements them in an industrial setting.
Kaggle competitions in object detection
Yauhen Babakhin, H2O.ai
In this talk, we will start by discussing what Kaggle is and its pros and cons for Data Scientists and ML Engineers of different levels. Then we will review the recent Great Barrier Reef competition, where my team took 3rd place. We will cover Object Detection approaches, from the initial ideas to the current state-of-the-art models. Finally, we will discuss the top solutions of the Great Barrier Reef competition and their real-world applications.
Deploying transformers at scale: Addressing challenges and increasing performance
Pieter Luitjens, Private AI
Transformer networks have taken the NLP world by storm, powering everything from sentiment analysis to chatbots. However, the sheer size of these networks presents new challenges for deployment, such as how to provide acceptable latency and unit economics. The de-identification tasks that Private AI handles rely heavily on Transformer networks and involve processing large amounts of data. In this talk, I will go over the challenges we faced and how we managed to improve the latency and throughput of our Transformer networks, allowing our system to process terabytes of data easily and cost-effectively.
Building complex ML pipelines to tackle business document understanding
Milan Šulc, Rossum.ai
Solving real-world problems at scale often requires more than direct application of straightforward ML models. Let's journey together through architecting a complex ML pipeline and see how a challenging high-level task can be decomposed into a series of trainable sub-tasks without compromising on a pure machine learning approach. We will demonstrate this on the problem of document information extraction that we are solving at Rossum, and look at how it can be decomposed into (still hard, but attackable) sub-tasks such as named entity (field) localisation, table recognition, key-value detection, few-shot learning via similar document retrieval, and, of course, OCR. And perhaps we will manage to show why building an AI system capable of understanding document content is so much more than “That’s just an OCR / NER problem, resolved a loong time ago...”.
Using machine learning to accelerate drug discovery
Aisling O’Sullivan, Dataclair
Have a great time
Prague, the city that never sleeps
You can feel centuries of history at every corner in this unique capital. We'll invite you to get a taste of our best pivo (that’s beer in Czech) and then bring you back to the present day to party at one of the local clubs all night long!
Venue
ML Prague 2022 will run hybrid, in person and online!
We are happy to announce that ML Prague is back as an in-person event in 2022. The main conference will be held at La Fabrika while our workshops will take place at CEVRO Institute. After 3 years, we can finally enjoy the conference together in one place.
We will also livestream the talks for all those participants who prefer to attend the conference online. Our platform will allow interaction with speakers and other participants too. Workshops require intensive interaction and won't be streamed.
La Fabrika: Komunardů 30, Praha 7
CEVRO Institute: Jungmannova 28/17, Prague 1
Now or never
Tickets
What You Get
- Practical and advanced level talks led by top experts
- 2 parties in the city with people from around the world. Let’s go wild!
- Delicious food and snacks throughout the conference
They’re among us
We are in The ML Revolution age
Machines can learn. Incredibly fast. Faster than you. They are getting smarter and smarter every single day, changing the world we’re living in, our business and our life. The artificial intelligence revolution is here. Come, learn and make this threat your biggest advantage.
Our Attendees
What they say about ML Prague
Are you attending too? Do you have tips for what not to miss? February 27, 2021
Guys, job more than well done 👍 thanks for great conference🙂— Ivan Kasanický (@IvanKasanicky) February 28, 2021
Thank you to Our Partners
Communities and Further support
Happy to help
Contact
If you have any questions about Machine Learning Prague, please e-mail us at
Scientific program & Co-Founder
Communities and partnerships
Gonzalo V. Fernández