Catalogue Search | MBRL
Explore the vast range of titles available.
1,042 result(s) for "Kovacs, David"
Quantitative interpretation explains machine learning models for chemical reaction prediction and uncovers bias
by Kovács, Dávid Péter; Lee, Alpha A.; McCorkindale, William
in 639/638/549/973; 639/638/563/606; 639/638/630
2021
Organic synthesis remains a major challenge in drug discovery. Although a plethora of machine learning models have been proposed as solutions in the literature, they suffer from being opaque black-boxes. It is neither clear if the models are making correct predictions because they inferred the salient chemistry, nor is it clear which training data they are relying on to reach a prediction. This opaqueness hinders both model developers and users. In this paper, we quantitatively interpret the Molecular Transformer, the state-of-the-art model for reaction prediction. We develop a framework to attribute predicted reaction outcomes both to specific parts of reactants, and to reactions in the training set. Furthermore, we demonstrate how to retrieve evidence for predicted reaction outcomes, and understand counterintuitive predictions by scrutinising the data. Additionally, we identify Clever Hans predictions where the correct prediction is reached for the wrong reason due to dataset bias. We present a new debiased dataset that provides a more realistic assessment of model performance, which we propose as the new standard benchmark for comparing reaction prediction models.
Machine learning algorithms offer new possibilities for automating reaction procedures. The present paper investigates automated reaction prediction with the Molecular Transformer, the state-of-the-art model for reaction prediction, and proposes a new debiased dataset for a realistic assessment of the model’s performance.
Journal Article
Cloud-Free Global Maps of Essential Vegetation Traits Processed from the TOA Sentinel-3 Catalogue in Google Earth Engine
by Salinero-Delgado, Matías; Verrelst, Jochem; Kovács, Dávid D.
in Algorithms; Atmosphere; Chlorophyll
2023
Global mapping of essential vegetation traits (EVTs) through data acquired by Earth-observing satellites provides a spatially explicit way to analyze the current vegetation states and dynamics of our planet. Although significant efforts have been made, there is still a lack of global and consistently derived multi-temporal trait maps that are cloud-free. Here we present the processing chain for the spatiotemporally continuous production of four EVTs at a global scale: (1) fraction of absorbed photosynthetically active radiation (FAPAR), (2) leaf area index (LAI), (3) fractional vegetation cover (FVC), and (4) leaf chlorophyll content (LCC). The proposed workflow presents a scalable processing approach to the global cloud-free mapping of the EVTs. Hybrid retrieval models, named S3-TOA-GPR-1.0-WS, were implemented in Google Earth Engine (GEE) using Sentinel-3 Ocean and Land Color Instrument (OLCI) Level-1B data for the mapping of the four EVTs along with associated uncertainty estimates. We used the Whittaker smoother (WS) for the temporal reconstruction of the four EVTs, which led to continuous data streams, here applied to the year 2019. Cloud-free maps were produced at 5 km spatial resolution at 10-day time intervals. The consistency and plausibility of the EVT estimates for the resulting annual profiles were evaluated by per-pixel intra-annual correlation against corresponding vegetation products of both MODIS and Copernicus Global Land Service (CGLS). The most consistent results were obtained for LAI, which showed intra-annual correlations with an average Pearson correlation coefficient (R) of 0.57 against the CGLS LAI product. Globally, the EVT products showed consistent results, specifically obtaining correlations of R > 0.5 with reference products between 30 and 60° latitude in the Northern Hemisphere. Additionally, intra-annual goodness-of-fit statistics were calculated locally against reference products over four distinct vegetated land covers.
As a general trend, vegetated land covers with pronounced phenological dynamics led to high correlations between the different products. However, sparsely vegetated fields, as well as areas near the equator with weaker seasonality, led to lower correlations. We conclude that the global gap-free mapping of the four EVTs was overall consistent. Thanks to GEE, the entire OLCI L1B catalogue can be processed efficiently into the EVT products on a global scale and made cloud-free with the WS temporal reconstruction method. Additionally, GEE makes the workflow operationally applicable and easily accessible to the broader community.
Journal Article
Untangling the Causal Links between Satellite Vegetation Products and Environmental Drivers on a Global Scale by the Granger Causality Method
by Verrelst, Jochem; Kovács, Dávid D.; Reyes-Muñoz, Pablo
in Air temperature; Artificial satellites in remote sensing; Causality
2023
The Granger Causality (GC) statistical test explores the causal relationships between different time series variables. By employing the GC method, the underlying causal links between environmental drivers and global vegetation properties can be untangled, which opens possibilities to forecast the increasing strain on ecosystems by droughts, global warming, and climate change. This study aimed to quantify the spatial distribution of four distinct satellite vegetation products’ (VPs) sensitivities to four environmental land variables (ELVs) at the global scale using the GC method. The GC analysis assessed the spatially explicit response of the VPs: (i) the fraction of absorbed photosynthetically active radiation (FAPAR), (ii) the leaf area index (LAI), (iii) solar-induced fluorescence (SIF), and, finally, (iv) the normalized difference vegetation index (NDVI) to the ELVs. These ELVs can be categorized as water availability, assessing root zone soil moisture (SM) and accumulated precipitation (P), and energy availability, considering the effect of air temperature (T) and solar shortwave (R) radiation. The results indicate SM and P are key drivers, particularly causing changes in the LAI. SM alone accounts for 43%, while P accounts for 41%, of the explicitly caused areas over arid biomes. SM further significantly influences the LAI at northern latitudes, covering 44% of cold and 50% of polar biome areas. These areas exhibit a predominant response to R, which is a possible trigger for snowmelt, showing more than 40% caused by both cold and polar biomes for all VPs. Finally, T’s causality is evenly distributed amongst all biomes with fractional covers between ∼10 and 20%. By using the GC method, the analysis presents a novel way to monitor the planet’s ecosystem, based solely on two years of input data, with four VPs acquired by the synergy of Sentinel-3 (S3) and 5P (S5P) satellite data streams.
The findings indicated unique, biome-specific responses of vegetation to distinct environmental drivers.
Journal Article
Hyperactive learning for data-driven interatomic potentials
by Kovács, Dávid Péter; Ortner, Christoph; Sachs, Matthias
in 639/301/1034/1035; 639/301/1034/1037; Accuracy
2023
Data-driven interatomic potentials have emerged as a powerful tool for approximating ab initio potential energy surfaces. The most time-consuming step in creating these interatomic potentials is typically the generation of a suitable training database. To aid this process, hyperactive learning (HAL), an accelerated active learning scheme, is presented as a method for rapid automated training database assembly. HAL adds a biasing term to a physically motivated sampler (e.g. molecular dynamics), driving atomic structures towards uncertainty and in turn generating unseen or valuable training configurations. The proposed HAL framework is used to develop atomic cluster expansion (ACE) interatomic potentials for the AlSi10 alloy and polyethylene glycol (PEG) polymer starting from roughly a dozen initial configurations. The HAL-generated ACE potentials are shown to be able to determine macroscopic properties, such as melting temperature and density, with close to experimental accuracy.
Journal Article
The design space of E(3)-equivariant atom-centred interatomic potentials
by Musaelian, Albert; Csányi, Gábor; Drautz, Ralf
in 639/301/1034/1035; 639/301/1034/1037; 639/638/563/606
2025
Molecular dynamics simulation is an important tool in computational materials science and chemistry, and in the past decade it has been revolutionized by machine learning. This rapid progress in machine learning interatomic potentials has produced a number of new architectures in just the past few years. Particularly notable among these are the atomic cluster expansion, which unified many of the earlier ideas around atom-density-based descriptors, and Neural Equivariant Interatomic Potentials (NequIP), a message-passing neural network with equivariant features that exhibited state-of-the-art accuracy at the time. Here we construct a mathematical framework that unifies these models: atomic cluster expansion is extended and recast as one layer of a multi-layer architecture, while the linearized version of NequIP is understood as a particular sparsification of a much larger polynomial model. Our framework also provides a practical tool for systematically probing different choices in this unified design space. An ablation study of NequIP, via a set of experiments looking at in- and out-of-domain accuracy and smooth extrapolation very far from the training data, sheds some light on which design choices are critical to achieving high accuracy. A much-simplified version of NequIP, which we call BOTnet (for body-ordered tensor network), has an interpretable architecture and maintains its accuracy on benchmark datasets.
Batatia and colleagues introduce a computational framework that combines message-passing networks with the atomic cluster expansion architecture and incorporates a many-body description of the geometry of molecular structures. The resulting models are interpretable and accurate.
Journal Article