Our Research

Our Research in pictures!

Atomistic simulations passed through an MLCG model

Atomistic simulations of proteins with diverse sequences and secondary-structure arrangements are used to train a machine-learned coarse-grained (MLCG) model. Once trained, the model can run efficient simulations and explore the conformational landscape of new sequences.

Reference

Conceptual overview of MLCG

Pipeline for building and testing a transferable, bottom-up, machine-learned CG protein force field from a diverse dataset of all-atom simulations, a chosen CG resolution, and a set of basic physical prior energy terms (bonds, angles, dihedrals and purely repulsive interactions). The CG atom types z and CG coordinates x are transformed into pairwise distances d_ij, which are fed into the neural network architecture to predict the CG effective potential energy U and the corresponding CG forces F. The trained neural network can subsequently be used to simulate new sequences and predict observables such as root-mean-square deviations (RMSD), radii of gyration (R_g) or secondary-structure assignments (DSSP).
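To make the step from coordinates to distances, energy and forces concrete, here is a minimal sketch in plain Python. It is not the MLCG code: the function names, the `sigma` and `power` parameters, and the inverse-power form of the repulsion are assumptions chosen for illustration. It only shows how coordinates x map to pairwise distances d_ij, how a purely repulsive prior energy U can be built from them, and how forces follow as F = -dU/dx.

```python
import math
from itertools import combinations

def pairwise_distances(x):
    """Map CG coordinates x (a list of 3D points) to distances d_ij for i < j."""
    return {(i, j): math.dist(x[i], x[j])
            for i, j in combinations(range(len(x)), 2)}

def repulsive_energy(x, sigma=1.0, power=6):
    """Toy purely repulsive prior (assumed form): U = sum_{i<j} (sigma / d_ij)^power."""
    return sum((sigma / d) ** power for d in pairwise_distances(x).values())

def repulsive_forces(x, sigma=1.0, power=6):
    """Analytical forces F = -dU/dx for the toy prior above."""
    forces = [[0.0, 0.0, 0.0] for _ in x]
    for (i, j), d in pairwise_distances(x).items():
        # -dU/dd_ij projected on (x_i - x_j) / d_ij gives this coefficient
        coeff = power * sigma ** power / d ** (power + 2)
        for k in range(3):
            f = coeff * (x[i][k] - x[j][k])
            forces[i][k] += f   # bead i is pushed away from bead j
            forces[j][k] -= f   # equal and opposite force on bead j
    return forces

# Hypothetical four-bead configuration
beads = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.3, 0.2, 2.0)]
d_ij = pairwise_distances(beads)
U = repulsive_energy(beads)
F = repulsive_forces(beads)
```

In the pipeline described above, a neural network learns the energy as a function of the distances and atom types, with forces typically obtained by automatic differentiation; the hand-written gradient here only stands in for that step.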

Reference

Conceptual overview of RANGE

In (c) and (d), the aggregation and broadcast blocks project senders and receivers onto key and query spaces, respectively, via the linear layers AK, AQ and BK, BQ. A positional encoding projected onto the edge space via AE and BE is included in the calculation of the attention weights. During the broadcast phase (d), a memory effect, modeled by self-loops, is introduced for balancing local and global information content inside each graph node. Lastly, the attention weights are used to modulate the contributions of the receivers projected via the AV and BV linear layers.

Reference

Some of the long-range tasks that RANGE can handle

RANGE-extended message-passing neural networks (SchNet, PaiNN, SO3krates, and MACE) are compared to their baseline versions in a set of tasks designed to assess the ability to model long-range effects. (a) Relative energies of Na and Na⁺ systems (Na in purple, Cl in green) as a function of Na–Na distance, as highlighted in the structure figure. As the furthest Na atom is added or removed, the charge is kept constant. Offsets are added to distinguish the different models. (b) Relative energy of periodic MgO as the Au₂ dimer approaches the surface of the Al-doped and undoped crystal (Mg in red, O in green, Al in blue, and Au in yellow) as a function of the Au–O distance, as highlighted in the structure figure. Offsets are added to distinguish the different models. Structures and reference energies in (a) and (b) are taken from ref. 48. (c) Mean absolute error (MAE) of energy and forces of different organic dimers in an extrapolation task beyond the distance cutoff explored during training. Results are divided by the electronic distribution of the molecules in the dimers: (A)polar, (P)olar, (C)harged. The depicted molecule is extracted from the CC test set. The reported legend is shared among all panels. Source data are provided as a Source Data file.

Reference

Learning data-efficient coarse-grained molecular dynamics from forces and noise

Training strategy for CG from noisy data

Training data consisting of “real” particles with associated forces (blue) are first combined with noise to add additional sites (red); these new particles interact with the original particles, changing forces throughout the system. The “real” particles (green dotted circles) are then systematically coarse-grained out, and the remaining sites are associated with a linear combination of “real” and noise-derived force information (purple), providing data for subsequent force-field training.

Reference

Breaking the Barriers of Molecular Dynamics With Deep-Learning: Opportunities, Pitfalls, and How to Navigate Them

The scientist is hiking towards the treasure of accurate and predictive simulations of relevant phenomena. Molecular Dynamics shows a path riddled with obstacles such as accuracy, speed or sampling issues. Deep-learning offers a way around these obstacles, but runs into hurdles of its own.

Reference

Peering inside the black box by learning the relevance of many-body functions in neural network potentials

(a) In graph neural networks, the input graph is defined by a cutoff radius (illustrated in shaded blue) that determines the direct neighbors of each input node. Through iterative message-passing (MP) steps in multiple layers (layer index t ∈ {0, 1, 2}), information can be exchanged between more distant nodes (outside the cutoff region), updating the representation of each node. The model output (in our case the potential energy E) is obtained by passing the learned feature representations through a multilayer perceptron, followed by a final pooling over the individual bead energies e_i. Obtaining the relevance R_walk involves propagating the output back through the network by considering the connections between each node in one layer and the nodes in the previous layer. This procedure defines “walks” across the network layers. (b) The walks involving the same subset of n nodes are aggregated to obtain a decomposition of the output into n-body contributions.
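The aggregation step in (b) can be illustrated with a small sketch. It assumes the walk relevances have already been computed and are given as a mapping from a walk (a tuple of node indices, one per layer) to a relevance value. The function names and the numbers below are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def aggregate_walks(walk_relevance):
    """Sum relevances of all walks that visit the same set of distinct nodes."""
    by_subset = defaultdict(float)
    for walk, relevance in walk_relevance.items():
        by_subset[frozenset(walk)] += relevance
    return dict(by_subset)

def n_body_contributions(walk_relevance):
    """Sum subset relevances by the number n of distinct nodes involved."""
    by_order = defaultdict(float)
    for subset, relevance in aggregate_walks(walk_relevance).items():
        by_order[len(subset)] += relevance
    return dict(by_order)

# Hypothetical relevances for walks across three layers of a 3-node graph
walks = {
    (0, 0, 0): 0.5,   # stays on node 0: 1-body
    (0, 1, 0): 0.2,   # visits nodes {0, 1}: 2-body
    (1, 0, 1): 0.1,   # same subset {0, 1}: aggregated with the walk above
    (0, 1, 2): -0.3,  # visits nodes {0, 1, 2}: 3-body
    (2, 1, 0): 0.4,   # same subset {0, 1, 2}
}

subsets = aggregate_walks(walks)
orders = n_body_contributions(walks)
```

Because every walk is assigned to exactly one subset, the n-body terms in this sketch sum to the total of the walk relevances.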

Reference

Funding Agencies

DFG

BMBF

Einstein Foundation Berlin

European Research Council

Heibrids

European Commission

NHR ZIB, which gives us access to the Lise cluster

Juelich, JUPITER cluster (top-5 supercomputer worldwide)