Christian Thygesen: Efficient generative modelling of protein structure fragments using a Deep Markov Model (Part 2 of 3)

Machine Learning Seminar presentation

Topic: Efficient generative modelling of protein structure fragments using a Deep Markov Model (Part 2 of 3)

Speaker: Christian Thygesen, Industrial PhD student with Evaxion / University of Copenhagen

Time: Wednesday, 2022.02.09, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Fragment libraries are often used in protein structure prediction, simulation and design as a means to significantly reduce the vast conformational search space. Current state-of-the-art methods for fragment library generation do not properly account for aleatory and epistemic uncertainty, respectively due to the dynamic nature of proteins and experimental errors in protein structures. Additionally, they typically rely on information that is not generally or readily available, such as homologous sequences, related protein structures and other complementary information.
To address these issues, we developed BIFROST, a novel take on the fragment library problem based on a Deep Markov Model architecture combined with directional statistics for angular degrees of freedom, implemented in the deep probabilistic programming language Pyro. BIFROST is a probabilistic, generative model of the protein backbone dihedral angles conditioned solely on the amino acid sequence. BIFROST generates fragment libraries with a quality on par with current state-of-the-art methods at a fraction of the run-time, while requiring considerably less information and allowing efficient evaluation of probabilities.
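BIFROST itself is implemented in Pyro; purely as an illustrative, standard-library sketch of the underlying idea (a first-order Markov chain over hidden states that emits backbone dihedral angles from von Mises distributions), one might write something like the following. All states, transition probabilities and angle parameters below are invented for illustration and are not BIFROST's.

```python
import math
import random

def sample_fragment(length, n_states, trans, emit, seed=None):
    """Sample (phi, psi) dihedral-angle pairs from a toy hidden Markov
    chain: latent state -> von Mises-distributed angles.
    trans[s] lists next-state probabilities; emit[s] holds
    ((mu_phi, kappa_phi), (mu_psi, kappa_psi))."""
    rng = random.Random(seed)
    s = rng.randrange(n_states)            # uniform initial state
    angles = []
    for _ in range(length):
        (mu_phi, k_phi), (mu_psi, k_psi) = emit[s]
        # vonmisesvariate returns an angle in [0, 2*pi); shift to [-pi, pi)
        phi = rng.vonmisesvariate(mu_phi, k_phi) - math.pi
        psi = rng.vonmisesvariate(mu_psi, k_psi) - math.pi
        angles.append((phi, psi))
        # advance the Markov chain by inverse-CDF sampling
        u, acc = rng.random(), 0.0
        for nxt, p in enumerate(trans[s]):
            acc += p
            if u <= acc:
                s = nxt
                break
    return angles

# Two toy states with loosely "helix-like" and "strand-like" angle regions.
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [((math.pi - 1.0, 8.0), (math.pi - 0.8, 8.0)),
        ((math.pi - 2.3, 4.0), (math.pi + 2.4, 4.0))]
fragment = sample_fragment(9, 2, trans, emit, seed=0)
```

In the actual model, the transition and emission distributions are parametrised by neural networks and conditioned on the amino acid sequence, which is what makes the architecture a Deep Markov Model rather than a plain HMM.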

Additional material:

Video recording: https://youtu.be/hydwEaq_Uuo

Thomas Hamelryck: Deep probabilistic programming and the protein folding problem (Part 1 of 3)

Machine Learning Seminar presentation

Topic: Deep probabilistic programming and the protein folding problem (Part 1 of 3)

Speaker: Thomas Hamelryck, Head of the Probabilistic Programming Group, Department of Computer Science, University of Copenhagen

Time: Wednesday, 2022.02.02, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

For many decades, the protein structure prediction problem has been a major open problem in science, medicine and biotechnology. It has now morphed into a paradigm problem for machine learning and computational statistics. The success of DeepMind’s AlphaFold in predicting protein structures from sequence caused the organisers of the biennial CASP contest (the “Olympic games of protein structure prediction”) to declare on November 30th, 2020: “an artificial intelligence (AI) solution to the challenge has been found”. But several practical and principled challenges remain, including the accurate modelling of the folding process and representing aleatory and epistemic uncertainty (respectively due to protein dynamics and experimental noise) with a bona fide Bayesian model. In a series of three talks, we will provide an introduction to the protein structure prediction problem, its current status and our ongoing work in this area using deep probabilistic programming, directional statistics and Stein variational inference. This work is done in collaboration with Christophe Ley at the Université du Luxembourg and Kanti Mardia at the University of Leeds.

Additional material:

Video recording: https://youtu.be/Xi89LCMdlZI


Ioannis Kalogeris: Accelerating the solution of parametrized partial differential equations using machine learning tools

Machine Learning Seminar presentation

Topic: Accelerating the solution of parametrized partial differential equations using machine learning tools

Speaker: Ioannis Kalogeris, Computational Science and Engineering Laboratory, ETH Zurich, Switzerland

Time: Wednesday, 2022.01.26, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Simulation of complex physical systems described by nonlinear partial differential equations is central to engineering and physical sciences, with applications ranging from engineering design of vehicles or buildings to weather and climate. From a practical perspective, however, the computational cost of carrying out direct numerical simulation of detailed complex systems may be too large, reaching up to several days for a single model evaluation in certain cases. In addition, many important applications such as uncertainty quantification, optimization or sensitivity analysis rely on performing a massive number of model simulations, which renders them computationally intractable. In this regard, the field of machine learning, with its recent advances, offers a promising new avenue for solving this type of problem and has already attracted major interest from the research community.

This presentation consists of two parts. In the first part a novel non-intrusive surrogate modeling scheme is introduced for predictive modeling of complex systems described by parametrized time-dependent PDEs. The proposed methodology utilizes a Convolutional Autoencoder in conjunction with a feed forward Neural Network to establish a mapping from the problem’s parametric space to its solution space, thus delivering a cheap-to-evaluate and highly accurate emulator of the complex model under investigation. The second part of the presentation addresses the issue of increasing the predictive capabilities of machine learning-based surrogate models by combining them with iterative solvers and, especially, the Algebraic Multigrid Method.

For this purpose, training data are collected by solving the high-fidelity model via finite elements for a reduced set of parameter values. Then, by applying the Convolutional Autoencoder, a low-dimensional vector representation of the high dimensional solution matrices is provided by the encoder, while the reconstruction map is obtained by the decoder. Using the latent vectors given by the encoder, a feed forward Neural Network is efficiently trained to map points from the parametric space to the compressed version of the respective solution matrices. This way, the proposed surrogate model is capable of predicting the entire time history response simultaneously with remarkable computational gains and very high accuracy. The elaborated methodology is demonstrated on the stochastic analysis of time-dependent partial differential equations solved with the Monte Carlo method.
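As a rough, standard-library-only sketch of that pipeline, the snippet below uses a one-component linear projection as a stand-in for the convolutional autoencoder and a one-parameter least-squares fit as a stand-in for the feed-forward network. The "snapshots", grid and parameter values are invented for illustration; this is a conceptual sketch, not the speaker's code.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# "High-fidelity" snapshots: u(x; p) = p * sin(pi x) on a coarse grid.
grid = [i / 10 for i in range(11)]
base = [math.sin(math.pi * x) for x in grid]
train_params = [0.5, 1.0, 1.5, 2.0]
snapshots = [[p * b for b in base] for p in train_params]

# "Encoder": project each snapshot onto the dominant mode,
# giving a one-dimensional latent code per snapshot.
norm2 = dot(base, base)
latents = [dot(u, base) / norm2 for u in snapshots]

# "Network": least-squares fit latent ~ w * p (no intercept).
w = dot(train_params, latents) / dot(train_params, train_params)

# Surrogate prediction for an unseen parameter, then "decode"
# back to the full solution field and check against the truth.
p_new = 1.25
u_pred = [w * p_new * b for b in base]
u_true = [p_new * b for b in base]
err = max(abs(a - b) for a, b in zip(u_pred, u_true))
```

In the method described in the talk, the encoder/decoder are learned convolutional networks and the parameter-to-latent map is a trained neural network, so nonlinear parameter dependence and whole time histories can be captured; the structure of the pipeline is the same.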

Additional material:

Video recording: https://youtu.be/ac3-6K7RyMQ

Niccolo Gentile: Understanding the determinants of well-being: a case study in interpretable machine learning

Machine Learning Seminar presentation

Topic: Understanding the determinants of well-being: a case study in interpretable machine learning

Speaker: Niccolo Gentile, Faculty of Humanities, Education and Social Sciences, University of Luxembourg

Time: Wednesday, 2022.01.19, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

A key objective of empirical research in subjective well-being is understanding the relationship between the features and the target, in particular in terms of marginal effects. This necessity explains why, to this day, linear methods remain the mainstream in this field of research. In this presentation, we explore the application of novel model-agnostic interpretability tools in nonlinear contexts, in particular techniques like SHAP Feature Importances in Random Forests, including a comparison with other methods like Feature and Permutation Importance. Differences, advantages and disadvantages of each method are presented in the context of understanding the determinants of subjective well-being.
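As a minimal illustration of one of the compared methods, permutation importance can be computed with nothing but the standard library: shuffle one feature column, re-score the model, and record the increase in error. The data and model below are invented for illustration (a target that depends on feature 0 only, and a model that knows this).

```python
import random

def mse(model, X, y):
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, col, rng, repeats=10):
    """Average increase in MSE when feature `col` is randomly
    shuffled: the classic model-agnostic permutation importance."""
    base = mse(model, X, y)
    deltas = []
    for _ in range(repeats):
        perm = [row[col] for row in X]
        rng.shuffle(perm)
        X_perm = [row[:col] + [v] + row[col + 1:]
                  for row, v in zip(X, perm)]
        deltas.append(mse(model, X_perm, y) - base)
    return sum(deltas) / len(deltas)

rng = random.Random(0)
# Synthetic data: the target depends on feature 0 only.
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]
model = lambda row: 3.0 * row[0]   # a perfect model of this data

imp0 = permutation_importance(model, X, y, 0, rng)  # large
imp1 = permutation_importance(model, X, y, 1, rng)  # exactly zero
```

SHAP values, by contrast, attribute each individual prediction to the features via Shapley-value averaging over feature coalitions, which is what allows the per-observation (rather than only global) interpretation discussed in the talk.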

Additional material:

Video recording: https://youtu.be/pDbJ55C8c1c

Tjeerd V. olde Scheper: Criticality Analysis for Non-linear Data Representation

Machine Learning Seminar presentation

Topic: Criticality Analysis for Non-linear Data Representation

Speaker: Tjeerd V. olde Scheper, School of Engineering, Computing and Mathematics, Oxford Brookes University, United Kingdom

Time: Wednesday, 2022.01.12, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Within this seminar, I will provide a short introduction to the concept of Criticality Analysis and expand on the application and relevance of this method to generate nonlinear representation spaces. These can be used for deterministic representation of multimodal data for categorisation, as a regularisation method, or for controllable self-organised reservoir computing.
Criticality Analysis is based on the concept of a Self-Organised Critical (SOC) system. Such a system is in an apparently stable state, but can change rapidly to another critical state when perturbed. SOC systems are commonly observed within large aggregates of smaller nonlinear systems, such as sand particles or snowflakes forming SOC piles. They seem to have a specific affinity with scale-free power-law relations, and have been proposed to underpin biological systems.
To generate a deterministic SOC, a network of controlled nonlinear oscillators is used which shares key properties with SOC systems, namely non-trivial scaling due to the external perturbation, spatiotemporal power-law correlation with respect to the total behaviour of the network, and self-tuning to the critical point, where the network self-selects the periodic orbit. The underlying control is based on the method of Rate Control of Chaos, a nonlinear control method that can stabilise chaotic systems.
Because these networks are critical and deterministic, they can be used to create stable global states when perturbed with data, and this property is exploited to allow ready classification of the data, irrespective of its modality. The method has been used to classify gait patterns, and seems especially useful for the categorisation of dynamic biological data.

Additional material:

Video recording: https://youtu.be/jvNNRx4WeYc

Ayan Chakraborty: Deep neural network on PDEs

Machine Learning Seminar presentation

Topic: Deep neural network on PDEs

Speaker: Ayan Chakraborty, Faculty of Mathematics and Physics, Leibniz University Hannover

Time: Wednesday, 2021.12.08, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

In this seminar, we will focus on deep learning-based approximation methods for partial differential equations (PDEs). We will briefly discuss the mathematical analysis of such deep learning-based approximation algorithms and demonstrate their design and performance in the context of PDEs.

Additional material:

Presentation slides: https://legato-team.eu/wp-content/uploads/2021/12/Ayan-Chakraborty-Deep-neural-network-on-PDEs.pdf

Marharyta Aleksandrova: Causal Inference & Causal Learning: Towards Causal ML (Part 3 of 3)

Machine Learning Seminar presentation

Topic: Causal Inference & Causal Learning: Towards Causal ML (Part 3 of 3)

Speaker: Marharyta Aleksandrova, Faculty of Science, Technology and Medicine; University of Luxembourg

Time: Wednesday, 2021.11.24, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

In this seminar, we will discuss the basics of causality theory: Causal Inference (how to measure the strength of a causal effect) and Causal Learning (how to learn causal structure, that is, how to identify which variable is the cause and which is the effect). We will talk about what analysts should consider if they want to draw causal conclusions from their data analysis. The lectures will be accompanied by a series of practical demonstrations.
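As a small illustration of the first ingredient (measuring the strength of a causal effect), the following standard-library sketch contrasts a naive difference of outcome means with a backdoor-adjusted estimate on invented, confounded count data; it is not taken from the lecture materials. A confounder Z drives both treatment T and outcome Y, so the naive contrast overstates the effect.

```python
# Each row: (z, t, n, mean_y) -- stratum counts and outcome means.
# Within every Z-stratum the treatment effect is +0.1, but Z both
# raises Y by +0.5 and makes treatment much more likely.
data = [
    (0, 0, 80, 0.2), (0, 1, 20, 0.3),
    (1, 0, 20, 0.7), (1, 1, 80, 0.8),
]

def naive_effect(data):
    """E[Y | T=1] - E[Y | T=0], ignoring the confounder Z."""
    tot = {0: 0, 1: 0}
    ysum = {0: 0.0, 1: 0.0}
    for z, t, n, m in data:
        tot[t] += n
        ysum[t] += n * m
    return ysum[1] / tot[1] - ysum[0] / tot[0]

def adjusted_effect(data):
    """Backdoor adjustment: sum_z P(z) * (E[Y|T=1,z] - E[Y|T=0,z])."""
    n_total = sum(n for _, _, n, _ in data)
    effect = 0.0
    for z in {row[0] for row in data}:
        rows = {t: (n, m) for zz, t, n, m in data if zz == z}
        p_z = sum(n for n, _ in rows.values()) / n_total
        effect += p_z * (rows[1][1] - rows[0][1])
    return effect

naive = naive_effect(data)        # 0.4: inflated by confounding
adjusted = adjusted_effect(data)  # 0.1: the per-stratum causal effect
```

Adjusting for Z recovers the within-stratum effect of 0.1, while the naive contrast of 0.4 mostly reflects the confounder, which is the kind of pitfall the lectures address.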

Additional material:

Presentation and Colab: https://drive.google.com/file/d/1xGsyCEwaCGsTiiqXfgohi1FEy66zOc0n/view?usp=sharing

Further reading: https://docs.google.com/document/d/15_YfYsQrRQHaJ62FLIeuG91aSI6-lqiXz8Rqvbl9UmA/edit

Video recording: https://youtu.be/7T9JO6Qvyuc


Fabio Cuzzolin: The epistemic artificial intelligence project

Machine Learning Seminar presentation

Topic: The epistemic artificial intelligence project

Speaker: Fabio Cuzzolin, School of Engineering, Computing and Mathematics, Oxford Brookes University

Time: Wednesday, 2021.11.17, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Although artificial intelligence (AI) has improved remarkably in recent years, its inability to deal with uncertainty severely limits its future applications. In its current form, AI cannot confidently make predictions robust enough to stand the test of data generated by processes different (even in tiny details, as shown by ‘adversarial’ results) from those seen at training time. While recognising this issue under different names (e.g. ‘overfitting’ or ‘domain adaptation’), traditional machine learning seems unable to address it in non-incremental ways. As a result, even state-of-the-art AI systems suffer from brittle behaviour, and find it difficult to operate in new situations. The epistemic AI project re-imagines AI from the foundations, through a proper treatment of the “epistemic” uncertainty stemming from our forcibly partial knowledge of the world. Its overall objective is to create a new learning paradigm designed to provide worst-case guarantees on its predictions, thanks to a proper modelling of real-world uncertainties. The project aims to formulate a novel mathematical framework for optimisation under epistemic uncertainty, radically departing from current approaches that only focus on aleatory uncertainty. This new optimisation framework will in turn allow the creation of new ‘epistemic’ learning settings, spanning all the major areas of machine learning: unsupervised learning, supervised learning and reinforcement learning. Last but not least, the project aims to foster an ecosystem of academic, research, industry and societal partners throughout Europe able to drive and sustain the EU’s leadership ambition in the search for a next-generation AI.

Additional material:

Video recording: https://youtu.be/GNCKqoQODR0

Marharyta Aleksandrova: Causal Inference & Causal Learning: Towards Causal ML (Part 2 of 3)

Machine Learning Seminar presentation

Topic: Causal Inference & Causal Learning: Towards Causal ML (Part 2 of 3)

Speaker: Marharyta Aleksandrova, Faculty of Science, Technology and Medicine; University of Luxembourg

Time: Wednesday, 2021.11.10, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

In this seminar, we will discuss the basics of causality theory: Causal Inference (how to measure the strength of a causal effect) and Causal Learning (how to learn causal structure, that is, how to identify which variable is the cause and which is the effect). We will talk about what analysts should consider if they want to draw causal conclusions from their data analysis. The lectures will be accompanied by a series of practical demonstrations.

Additional material:

Presentation and Colab: https://drive.google.com/file/d/1xGsyCEwaCGsTiiqXfgohi1FEy66zOc0n/view?usp=sharing

Further reading: https://docs.google.com/document/d/15_YfYsQrRQHaJ62FLIeuG91aSI6-lqiXz8Rqvbl9UmA/edit

Video recording: https://youtu.be/9Qp2Lb0FaQE

Marharyta Aleksandrova: Causal Inference & Causal Learning: Towards Causal ML (Part 1 of 3)

Machine Learning Seminar presentation

Topic: Causal Inference & Causal Learning: Towards Causal ML (Part 1 of 3)

Speaker: Marharyta Aleksandrova, Faculty of Science, Technology and Medicine; University of Luxembourg

Time: Wednesday, 2021.11.03, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

In this seminar, we will discuss the basics of causality theory: Causal Inference (how to measure the strength of a causal effect) and Causal Learning (how to learn causal structure, that is, how to identify which variable is the cause and which is the effect). We will talk about what analysts should consider if they want to draw causal conclusions from their data analysis. The lectures will be accompanied by a series of practical demonstrations.

Additional material:

Presentation and Colab: https://drive.google.com/file/d/1xGsyCEwaCGsTiiqXfgohi1FEy66zOc0n/view?usp=sharing

Video recording: https://youtu.be/K8L5WEFA3hA