Lorella Viola: The problem with GPT for the Humanities (and for humanity)

Machine Learning Seminar presentation

Topic: The problem with GPT for the Humanities (and for humanity).

Speaker: Lorella Viola, Luxembourg Centre for Contemporary and Digital History (C2DH), University of Luxembourg

Time: Wednesday, 2022.05.11, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

GPT models (Generative Pre-trained Transformer) have increasingly become a popular choice among researchers and practitioners. Their success is mostly due to the technology’s ability to move beyond single-word predictions. Indeed, unlike traditional neural network language models, GPT generates text by looking at the entirety of the input text: rather than determining relevance sequentially from the most recent segment of input, GPT models weigh each word’s relevance selectively. However, while on the one hand this ability allows the machine to ‘learn’ faster, on the other the datasets used for training have to be fed as one single document, meaning that all metadata is inevitably lost (e.g., date, author, original source). Moreover, as GPT models are trained on crawled, English-language web material from 2016, these models are not only ignorant of the world prior to this date, but they also reproduce the language as used exclusively by English-speaking users (mostly white, young males). They also expect data of pristine quality, in the sense that these models have been trained on digitally-born material which does not present the typical problems of digitized, historical content (e.g., OCR mistakes, unusual fonts). Although GPT is a powerful technology, these issues seriously hinder its application to humanistic enquiry, particularly historical enquiry. In this presentation, I discuss these and other problematic aspects of GPT, and I present the specific challenges I encountered while working on a historical archive of Italian American immigrant newspapers.
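For readers unfamiliar with the mechanics, the following minimal Python sketch (using the publicly available GPT-2 model via the HuggingFace transformers library; the newspaper snippets and metadata fields are invented for illustration) shows both points at once: a training corpus must be flattened into a single text stream, discarding metadata, and generation conditions on the whole input at once via self-attention:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# A toy "archive": each document carries metadata that a GPT training
# pipeline would discard when concatenating texts into one stream.
docs = [
    {"date": "1905-03-12", "source": "Il Progresso (hypothetical)", "text": "..."},
    {"date": "1921-07-02", "source": "La Tribuna (hypothetical)", "text": "..."},
]
corpus = "\n".join(d["text"] for d in docs)  # date and source are lost here

# Generation attends to the entire input text at once (self-attention),
# not only to the most recent segment.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer("The Italian American press of the early twentieth century",
                   return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0]))
```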

Additional material:

Video recording: https://youtu.be/J1zkS1mKGlE

Aubin Geoffre: Gaussian Process Regression with anisotropic kernel for supervised feature selection.

Machine Learning Seminar presentation

Topic: Gaussian Process Regression with anisotropic kernel for supervised feature selection.

Speaker: Aubin Geoffre, École des Mines de Saint-Étienne

Time: Wednesday, 2022.05.04, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Studying flows in random porous media leads one to consider a permeability tensor which directly depends on the pore geometry. The latter can be characterised through the computation of various morphological parameters: Delaunay triangulation characteristics, nearest-neighbour distance, etc. The natural question is: which morphological parameters provide the best statistical description of permeability? This question can be difficult to answer for several reasons: non-linear correlation between input parameters, non-linear correlation between inputs and outputs, small datasets, variability, etc.
A feature selection method based on Gaussian Process Regression is proposed. It can be applied to a wide range of applications where the parameters that best explain a given output are sought among a set of correlated features. The method uses an anisotropic kernel that associates a hyperparameter with each feature. These hyperparameters can be interpreted as correlation lengths, providing an estimate of the weight of each feature with respect to the output.
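As a rough illustration of the idea (a minimal scikit-learn sketch on synthetic data, not the speaker's implementation), an anisotropic ARD-style RBF kernel learns one length scale per feature; a short learned length scale means the output varies strongly along that feature, i.e. the feature is relevant:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 4))  # 4 candidate morphological features
# Only features 0 and 1 actually drive the output in this toy example
y = np.sin(X[:, 0]) + 0.1 * X[:, 1] + 0.01 * rng.normal(size=80)

# Anisotropic RBF: an array of length scales gives one hyperparameter per feature
kernel = RBF(length_scale=np.ones(4), length_scale_bounds=(1e-2, 1e3)) + WhiteKernel()
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Small optimized length scale -> high relevance of that feature w.r.t. the output
length_scales = np.asarray(gpr.kernel_.k1.length_scale)
print({i: 1.0 / ls for i, ls in enumerate(length_scales)})
```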

Additional material:

Video recording: https://youtu.be/S7_RcE2WmGk

Vu Chau: Non-parametric data-driven constitutive modelling using artificial neural networks.

Machine Learning Seminar presentation

Topic: Non-parametric data-driven constitutive modelling using artificial neural networks.

Speaker: Vu Chau, Department of Engineering, FSTM, University of Luxembourg

Time: Wednesday, 2022.04.13, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

This seminar talk addresses certain challenges associated with data-driven modelling of advanced materials, with special interest in the non-linear deformation response of rubber-like materials, soft polymers and biological tissue. The underlying (isotropic) hyperelastic deformation problem is formulated in the principal space, using principal stretches and principal stresses. The sought data-driven constitutive relation is expressed in terms of these principal quantities and is captured by a non-parametric representation using a trained artificial neural network (ANN).
The presentation investigates certain physics-motivated consistency requirements (e.g. limit behaviour, monotonicity) for the ANN-based prediction of principal stresses for given principal stretches, and discusses their implications for the architecture of such constitutive ANNs. As an example, the neural network is constructed, trained and tested using PyTorch.
The computational embedding of the data-driven material descriptor is demonstrated for the open-source finite element framework FEniCS, which builds on the symbolic representation of the constitutive ANN operator in the Unified Form Language (UFL). We discuss the performance of the overall formulation within the non-linear solution process and outline some future directions of research.
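A minimal PyTorch sketch of the idea follows (the toy uniaxial neo-Hookean data and the tiny architecture are illustrative assumptions, not the speaker's model); it fits stress as a function of stretch while penalizing violations of two consistency requirements, zero stress in the undeformed state and monotonicity:

```python
import torch
import torch.nn as nn

# Toy data: uniaxial incompressible neo-Hookean response, sigma(lam) = mu*(lam^2 - 1/lam)
mu = 1.0
lam = torch.linspace(0.5, 2.0, 200, requires_grad=True).unsqueeze(1)
sigma = mu * (lam.detach() ** 2 - 1.0 / lam.detach())

# Smooth activations keep the stress prediction differentiable in the stretch
net = nn.Sequential(nn.Linear(1, 32), nn.Softplus(),
                    nn.Linear(32, 32), nn.Softplus(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(2000):
    opt.zero_grad()
    pred = net(lam)
    loss = nn.functional.mse_loss(pred, sigma)
    # Consistency: zero stress in the undeformed state, sigma(1) = 0
    loss = loss + 10.0 * net(torch.ones(1, 1)).pow(2).sum()
    # Monotonicity: penalize negative slope d(sigma)/d(lam) via autograd
    dsig = torch.autograd.grad(pred.sum(), lam, create_graph=True)[0]
    loss = loss + 10.0 * torch.relu(-dsig).mean()
    loss.backward()
    opt.step()
```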

Additional material:

Video recording: https://youtu.be/Vn0fLGHAkbk

Alban Odot: Deformation approximation: Improve your Artificial Neural Network training using the Finite Element Method formulation. A case for static deformations.

Machine Learning Seminar presentation

Topic: Deformation approximation: Improve your Artificial Neural Network training using the Finite Element Method formulation. A case for static deformations.

Speaker: Alban Odot, Ph.D. student at INRIA Strasbourg (MIMESIS-Inria)

Time: Wednesday, 2022.03.16, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Recently, with the increase in GPU computational power, deep learning started to revolutionise several fields, in particular computer vision, language processing, and image processing. Deep learning engineering can be split into three main categories: Dataset, Network Architecture and Learning policy. Setting the network architecture to its simplest form, we will modify the dataset and learning policy formulation using the Finite Element Method to improve the training.
The Finite Element Method is often used as the reference numerical method for solving the PDEs associated with non-linear object deformations. In order to solve the resulting energy minimisation equations, root-finding algorithms such as the Newton-Raphson method are used. During its iterative process, the Newton-Raphson method reveals important information about the state of the system, which can be used in both the dataset formulation and the training policy.
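The talk exploits byproducts of the Newton-Raphson solve; as a simpler, related illustration (a hedged sketch on a linear 1-D bar with made-up sizes, not the presented method), one can already use the FEM equilibrium residual itself as the training loss, so no precomputed solutions are needed:

```python
import torch
import torch.nn as nn

n = 16  # interior nodes of a 1-D bar fixed at both ends
# Tridiagonal stiffness matrix of a uniform linear-elastic 1-D FEM discretization
K = 2 * torch.eye(n) - torch.diag(torch.ones(n - 1), 1) - torch.diag(torch.ones(n - 1), -1)

net = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(3000):
    f = torch.randn(32, n)         # random nodal load vectors as training inputs
    u = net(f)                     # predicted nodal displacements
    residual = u @ K.T - f         # FEM equilibrium residual K u - f
    loss = residual.pow(2).mean()  # physics loss: zero exactly at the FEM solution
    opt.zero_grad()
    loss.backward()
    opt.step()
```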

Additional material:

Video recording: https://youtu.be/FNIX0XCix8c

Onkar Jadhav: Parametric model order reduction with an adaptive greedy sampling approach based on surrogate modeling. An application of the pMOR in financial risk analysis.

Machine Learning Seminar presentation

Topic: Parametric model order reduction with an adaptive greedy sampling approach based on surrogate modeling. An application of the pMOR in financial risk analysis.

Speaker: Onkar Jadhav, Research Associate, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg.

Time: Wednesday, 2022.03.02, 15:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

The numerical simulation of physical systems typically involves the solution of large-scale systems of equations resulting from the discretization of PDEs. Model order reduction techniques are very advantageous for overcoming this computational hurdle. Based on the proper orthogonal decomposition approach, the talk will present a model order reduction method for parametric, high-dimensional convection-diffusion-reaction partial differential equations. The proper orthogonal decomposition requires solving the high-dimensional model for some training parameters to obtain the reduced basis. In this work, the training parameters are chosen based on greedy sampling approaches. We propose an adaptive greedy sampling approach that utilizes an optimized search based on surrogate modeling for the selection of the training parameter set. The work also presents an elegant approach for monitoring the convergence of the developed greedy sampling approach, along with its error and sensitivity analysis.
The developed techniques are analyzed, implemented, and tested on industrial data of a floater with caps and floors under the one-factor Hull-White model. The results illustrate that the model order reduction approach provides a significant speedup with excellent accuracy for short-rate models.
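A minimal NumPy sketch of the proper orthogonal decomposition building block follows (with a cheap analytical stand-in for the expensive PDE solve; the adaptive, surrogate-based greedy selection of training parameters is the talk's contribution and is not reproduced here):

```python
import numpy as np

def solve_full_order(p, n=500):
    # Stand-in for an expensive high-dimensional PDE solve (illustrative only)
    x = np.linspace(0, 1, n)
    return np.exp(-p * x) * np.sin(np.pi * x)

# Snapshot matrix: columns are full-order solutions at the training parameters
params = np.linspace(0.1, 5.0, 20)
S = np.column_stack([solve_full_order(p) for p in params])

# POD: a truncated SVD of the snapshot matrix yields the reduced basis
U, svals, _ = np.linalg.svd(S, full_matrices=False)
energy = np.cumsum(svals**2) / np.sum(svals**2)
r = int(np.searchsorted(energy, 0.9999)) + 1
V = U[:, :r]  # reduced basis capturing 99.99% of the snapshot energy

# A new solution is approximated in the span of the reduced basis
u_new = solve_full_order(2.3)
u_rb = V @ (V.T @ u_new)
print(r, np.linalg.norm(u_new - u_rb) / np.linalg.norm(u_new))
```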

Additional material:

Video recording: https://youtu.be/rJAUDG6Y0Pc

Lester Mackey: Kernel Thinning and Stein Thinning

Machine Learning Seminar presentation

Topic: Kernel Thinning and Stein Thinning

Speaker: Lester Mackey, researcher at Microsoft Research New England and adjunct professor at Stanford University

Time: Wednesday, 2022.02.23, 15:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

This talk will introduce two new tools for summarizing a probability distribution more effectively than independent sampling or standard Markov chain Monte Carlo thinning:

1. Given an initial n-point summary (for example, from independent sampling or a Markov chain), kernel thinning finds a subset of only √n points with comparable worst-case integration error across a reproducing kernel Hilbert space.

2. If the initial summary suffers from biases due to off-target sampling, tempering, or burn-in, Stein thinning simultaneously compresses the summary and improves the accuracy by correcting for these biases.

These tools are especially well-suited for tasks that incur substantial downstream computation costs per summary point, such as organ and tissue modeling, in which each simulation consumes thousands of CPU hours.
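To make "worst-case integration error across a reproducing kernel Hilbert space" concrete: over the unit ball of an RKHS it equals the maximum mean discrepancy (MMD) between the empirical measures. The NumPy sketch below (a conceptual illustration, not the kernel thinning algorithm itself, which works by recursive halving) computes the error of an i.i.d. √n-point subset, the baseline that kernel thinning improves on:

```python
import numpy as np

def mmd(X, Y, bandwidth=1.0):
    """MMD between empirical measures on X and Y under a Gaussian kernel:
    the worst-case integration error over the unit ball of that RKHS."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth**2))
    return np.sqrt(k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean())

rng = np.random.default_rng(0)
n = 1024
X = rng.normal(size=(n, 2))  # initial n-point summary
iid_subset = X[rng.choice(n, int(np.sqrt(n)), replace=False)]
print(mmd(X, iid_subset))    # a thinning method aims to beat this baseline
```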

Additional material:

Video recording: https://youtu.be/91p4Octzc6E

Ola Rønning: ELBO-within-Stein: General and integrated Stein Variational Inference (Part 3 of 3)

Machine Learning Seminar presentation

Topic: ELBO-within-Stein: General and integrated Stein Variational Inference (Part 3 of 3)

Speaker: Ola Rønning, PhD student in the Probabilistic programming group of Prof. Thomas Hamelryck, Dept. of Computer Science, University of Copenhagen

Time: Wednesday, 2022.02.16, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Bayesian inference provides a unified framework for quantifying uncertainty in probabilistic models with latent variables. However, exact inference algorithms generally scale poorly with the dimensionality of the model and the size of the data. To overcome the issue of scaling, the ML community has turned to approximate inference. For the Big Data case, the most prominent method is variational inference (VI), which uses a simpler parametric model to approximate the target distribution of the latent variables. In recent years, Stein’s method has caught the attention of the ML community as a way to formulate new schemes for performing variational inference. Stein’s method provides a fundamental technique for approximating and bounding distances between probability distributions. The kernel Stein discrepancy underlies Stein Variational Gradient Descent (SVGD), which works by iteratively transporting particles sampled from a simple distribution to the target distribution. We introduce the ELBO-within-Stein algorithm that combines SVGD and VI to alleviate issues due to high-dimensional models and large data sets. The ELBO-within-Stein algorithm is available in our computational framework EinStein, distributed with the deep probabilistic programming language NumPyro. We will draw upon our framework to illustrate key concepts with examples. EinStein is currently freely available on GitHub and will be available in NumPyro from the next release. The framework is an efficient inference tool for practitioners and a flexible and unified codebase for researchers.
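For intuition, here is a toy NumPy implementation of the basic SVGD update in its standard form (the RBF bandwidth, step size and Gaussian target are arbitrary illustrative choices; this is not the EinStein/NumPyro code):

```python
import numpy as np

def svgd_step(x, grad_logp, h=1.0, eps=0.5):
    """One SVGD update transporting particles x (n, d) toward the target p."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]           # diff[i, j] = x_i - x_j
    k = np.exp(-(diff**2).sum(-1) / (2 * h**2))    # RBF kernel matrix
    # grad wrt x_j of k(x_j, x_i) equals k(x_j, x_i) * (x_i - x_j) / h^2
    grad_k = (k[:, :, None] * diff) / h**2
    # Driving term pulls particles uphill on log p; kernel term repels them apart
    phi = (k @ grad_logp(x) + grad_k.sum(axis=1)) / n
    return x + eps * phi

# Target: standard 2-D Gaussian, so grad log p(x) = -x
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, size=(100, 2))  # particles start far from the target
for _ in range(1000):
    x = svgd_step(x, lambda z: -z)
print(x.mean(axis=0), x.std(axis=0))    # should drift toward mean 0, std 1
```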

Additional material:

Video recording: https://youtu.be/vsI7J5pgTv0

Google Colab: https://colab.research.google.com/drive/17vAcyDXbcuDRj8IUym8zdP0OxN5kpLmD?usp=sharing

Christian Thygesen: Efficient generative modelling of protein structure fragments using a Deep Markov Model (Part 2 of 3)

Machine Learning Seminar presentation

Topic: Efficient generative modelling of protein structure fragments using a Deep Markov Model (Part 2 of 3)

Speaker: Christian Thygesen, Industrial PhD student with Evaxion / University of Copenhagen

Time: Wednesday, 2022.02.09, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Fragment libraries are often used in protein structure prediction, simulation and design as a means to significantly reduce the vast conformational search space. Current state-of-the-art methods for fragment library generation do not properly account for aleatory and epistemic uncertainty, respectively due to the dynamic nature of proteins and experimental errors in protein structures. Additionally, they typically rely on information that is not generally or readily available, such as homologous sequences, related protein structures and other complementary information.
To address these issues, we developed BIFROST, a novel take on the fragment library problem based on a Deep Markov Model architecture combined with directional statistics for angular degrees of freedom, implemented in the deep probabilistic programming language Pyro. BIFROST is a probabilistic, generative model of the protein backbone dihedral angles conditioned solely on the amino acid sequence. BIFROST generates fragment libraries with a quality on par with current state-of-the-art methods at a fraction of the run-time, while requiring considerably less information and allowing efficient evaluation of probabilities.
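As a flavour of the ingredients (not BIFROST itself), a minimal deep Markov model skeleton in Pyro with von Mises emissions for the two backbone dihedral angles might look as follows; the layer sizes and fixed concentration are invented, and the inference guide is omitted:

```python
import torch
import torch.nn as nn
import pyro
import pyro.distributions as dist

# A latent chain z_t with neural transition and emission nets; the dihedral
# angles (phi, psi) get a von Mises likelihood (directional statistics).
z_dim, hidden = 8, 32
trans = nn.Sequential(nn.Linear(z_dim, hidden), nn.Tanh(), nn.Linear(hidden, 2 * z_dim))
emit = nn.Sequential(nn.Linear(z_dim, hidden), nn.Tanh(), nn.Linear(hidden, 2))

def model(angles):  # angles: (T, 2) observed dihedrals in radians
    pyro.module("trans", trans)
    pyro.module("emit", emit)
    z = torch.zeros(z_dim)
    for t in pyro.markov(range(angles.shape[0])):
        loc_scale = trans(z)  # transition net parameterizes p(z_t | z_{t-1})
        z = pyro.sample(f"z_{t}", dist.Normal(loc_scale[:z_dim],
                                              loc_scale[z_dim:].exp()).to_event(1))
        mu = emit(z)          # emission net predicts mean angles
        pyro.sample(f"x_{t}", dist.VonMises(mu, torch.tensor(10.0)).to_event(1),
                    obs=angles[t])
```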

Additional material:

Video recording: https://youtu.be/hydwEaq_Uuo

Thomas Hamelryck: Deep probabilistic programming and the protein folding problem (Part 1 of 3)

Machine Learning Seminar presentation

Topic: Deep probabilistic programming and the protein folding problem (Part 1 of 3)

Speaker: Thomas Hamelryck, Head of the Probabilistic programming group, Dept. of Computer Science, University of Copenhagen

Time: Wednesday, 2022.02.02, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

For many decades, the protein structure prediction problem has been a major open problem in science, medicine and biotechnology. It has now morphed into a paradigm problem for machine learning and computational statistics. The success of DeepMind’s AlphaFold in predicting protein structures from sequence caused the organisers of the biennial CASP contest (the “Olympic games of protein structure prediction”) to declare on November 30th, 2020: “an artificial intelligence (AI) solution to the challenge has been found”. But several practical and principled challenges remain, including the accurate modelling of the folding process and representing aleatory and epistemic uncertainty (respectively due to protein dynamics and experimental noise) with a bona fide Bayesian model. In a series of three talks, we will provide an introduction to the protein structure prediction problem, its current status and our ongoing work in this area using deep probabilistic programming, directional statistics and Stein variational inference. This work is done in collaboration with Christophe Ley at the University of Luxembourg and Kanti Mardia at the University of Leeds.

Additional material:

Video recording: https://youtu.be/Xi89LCMdlZI

Ioannis Kalogeris: Accelerating the solution of parametrized partial differential equations using machine learning tools

Machine Learning Seminar presentation

Topic: Accelerating the solution of parametrized partial differential equations using machine learning tools

Speaker: Ioannis Kalogeris, Computational Science and Engineering Laboratory, ETH Zurich, Switzerland

Time: Wednesday, 2022.01.26, 10:00 CET

How to join: Please contact Jakub Lengiewicz

Abstract:

Simulation of complex physical systems described by nonlinear partial differential equations is central to engineering and the physical sciences, with applications ranging from the engineering design of vehicles or buildings to weather and climate. From a practical perspective, however, the computational cost of carrying out direct numerical simulation of detailed complex systems may be too large, reaching up to several days for a single model evaluation in certain cases. In addition, many important applications such as uncertainty quantification, optimization or sensitivity analysis rely on performing a massive number of model simulations, which renders them computationally intractable. In this regard, the field of machine learning, with its recent advances, offers a promising new avenue for solving this type of problem and has already attracted major interest from the research community.

This presentation consists of two parts. In the first part, a novel non-intrusive surrogate modeling scheme is introduced for predictive modeling of complex systems described by parametrized time-dependent PDEs. The proposed methodology utilizes a Convolutional Autoencoder in conjunction with a feed-forward Neural Network to establish a mapping from the problem’s parametric space to its solution space, thus delivering a cheap-to-evaluate and highly accurate emulator of the complex model under investigation. The second part of the presentation addresses the issue of increasing the predictive capabilities of machine learning-based surrogate models by combining them with iterative solvers and, especially, the Algebraic Multigrid Method.

For this purpose, training data are collected by solving the high-fidelity model via finite elements for a reduced set of parameter values. Then, by applying the Convolutional Autoencoder, a low-dimensional vector representation of the high-dimensional solution matrices is provided by the encoder, while the reconstruction map is obtained by the decoder. Using the latent vectors given by the encoder, a feed-forward Neural Network is efficiently trained to map points from the parametric space to the compressed version of the respective solution matrices. This way, the proposed surrogate model is capable of predicting the entire time history response simultaneously, with remarkable computational gains and very high accuracy. The elaborated methodology is demonstrated on the stochastic analysis of time-dependent partial differential equations solved with the Monte Carlo method.
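A condensed PyTorch sketch of the two-stage surrogate follows (the 64x64 grid, layer sizes and three-dimensional parameter vector are illustrative assumptions, not the presented architecture):

```python
import torch
import torch.nn as nn

# Stage 1: a convolutional autoencoder compresses full-field solutions
# (here 64x64 grids) to a low-dimensional latent vector.
# Stage 2: an MLP maps PDE parameters to that latent vector, so the
# decoder can emulate the expensive solver.
latent = 16

encoder = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
    nn.Flatten(), nn.Linear(16 * 16 * 16, latent))
decoder = nn.Sequential(
    nn.Linear(latent, 16 * 16 * 16), nn.Unflatten(1, (16, 16, 16)),
    nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
    nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1))              # 32 -> 64
param_to_latent = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, latent))

# Training (sketch): first minimize ||decoder(encoder(u)) - u||^2 on solution
# snapshots u, then fit param_to_latent to the frozen encoder's latent codes.
def surrogate(p):                       # p: (batch, 3) parameter vectors
    return decoder(param_to_latent(p))  # cheap emulator of the full solver
```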

Additional material:

Video recording: https://youtu.be/ac3-6K7RyMQ