Catalogue Search | MBRL
12 result(s) for "Krishnakumar, Arjun"
Accurate predictions on small data with a tabular foundation model
2025
Tabular data, spreadsheets organized in rows and columns, are ubiquitous across scientific fields, from biomedicine to particle physics to economics and climate science [1,2]. The fundamental prediction task of filling in missing values of a label column based on the rest of the columns is essential for various applications as diverse as biomedical risk models, drug discovery and materials science. Although deep learning has revolutionized learning from raw data and led to numerous high-profile success stories [3–5], gradient-boosted decision trees [6–9] have dominated tabular data for the past 20 years. Here we present the Tabular Prior-data Fitted Network (TabPFN), a tabular foundation model that outperforms all previous methods on datasets with up to 10,000 samples by a wide margin, using substantially less training time. In 2.8 s, TabPFN outperforms an ensemble of the strongest baselines tuned for 4 h in a classification setting. As a generative transformer-based foundation model, this model also allows fine-tuning, data generation, density estimation and learning reusable embeddings. TabPFN is a learning algorithm that is itself learned across millions of synthetic datasets, demonstrating the power of this approach for algorithm development. By improving modelling abilities across diverse fields, TabPFN has the potential to accelerate scientific discovery and enhance important decision-making in various domains.
Tabular Prior-data Fitted Network, a tabular foundation model, provides accurate predictions on small data and outperforms all previous methods on datasets with up to 10,000 samples by a wide margin.
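TabPFN's key departure from classical learners is that "training" amounts to a single forward pass: the labelled training rows are handed to the network as context alongside the query rows, with no per-dataset optimisation loop. The sketch below is not TabPFN itself — it is a plain 1-nearest-neighbour stand-in — but it illustrates that in-context interface shape: one call that consumes the labelled context and the query rows together and returns predictions.

```python
import math

def predict_in_context(context_X, context_y, query_X):
    """Toy stand-in for an in-context tabular predictor: labelled rows are
    supplied as context at prediction time and there is no training loop.
    (1-nearest-neighbour here; TabPFN uses a transformer pretrained on
    millions of synthetic datasets.)"""
    preds = []
    for q in query_X:
        # Score every context row by Euclidean distance to the query row.
        dists = [math.dist(q, x) for x in context_X]
        preds.append(context_y[dists.index(min(dists))])
    return preds

# Tiny demo: two well-separated clusters, labels recoverable by proximity.
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
y = [0, 0, 1, 1]
print(predict_in_context(X, y, [[0.2, 0.1], [4.8, 5.1]]))  # → [0, 1]
```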
Journal Article
Evaluating Glycemic Control in Patients of South Asian Origin With Type 2 Diabetes Using a Digital Therapeutic Platform: Analysis of Real-World Data
2021
Digital therapeutics are evidence-based therapeutic interventions driven by high-quality software programs for the treatment, prevention, or management of a medical disorder or disease. Many studies in the western population have shown the effectiveness of mobile app-based digital therapeutics for improving glycemic control in patients with type 2 diabetes (T2D). However, few studies have assessed similar outcomes in the South Asian population.
This study aims to investigate the real-world effectiveness of the Wellthy CARE digital therapeutic for improving glycemic control among the South Asian population of Indian origin.
We analyzed deidentified data from 102 patients with T2D from India enrolled in a 16-week structured self-management program delivered using the Wellthy CARE mobile app. Patients recorded their meals, weight, physical activity, and blood sugar in the app, and they received lessons on self-care behaviors (healthy eating, being active, monitoring, medication adherence, problem solving, healthy coping, and reducing risks); feedback provided by an artificial intelligence-powered chatbot; and periodic interactions with certified diabetes educators via voice calls and chats. The primary outcome of the program was a change in glycated hemoglobin A1c (HbA1c). Secondary outcomes included the difference between preintervention and postintervention fasting blood glucose (FBG) and postprandial blood glucose (PPBG) levels; changes in BMI and weight at the completion of 16 weeks; and the association between program engagement and the changes in HbA1c, FBG, and PPBG levels.
At the end of 16 weeks, the average change in HbA1c was -0.49% (n=102; 95% CI -0.73 to -0.25; P<.001). Of all the patients, 63.7% (65/102) had improved HbA1c levels, with a mean change of -1.16% (n=65; 95% CI -1.40 to -0.92; P<.001). The mean preintervention and postintervention FBG levels were 145 mg/dL (n=51; 95% CI 135-155) and 134 mg/dL (n=51; 95% CI 122-146; P=.02), and the mean PPBG levels were 188 mg/dL (n=51; 95% CI 172-203) and 166 mg/dL (n=51; 95% CI 153-180; P=.03), respectively. The mean changes in BMI and weight were -0.47 kg/m² (n=59; 95% CI -0.71 to -0.22; P<.001) and -1.32 kg (n=59; 95% CI -2.01 to -0.63; P<.001), respectively. There was a stepwise decrease in HbA1c, FBG, and PPBG levels as program engagement increased. Patients in the highest tertile of program engagement had a significantly greater reduction in HbA1c (-0.84% vs -0.06%; P=.02), FBG (-21.4 mg/dL vs -0.18 mg/dL; P=.02), and PPBG (-22.03 mg/dL vs 2.35 mg/dL; P=.002) than those in the lowest tertile.
The use of the Wellthy CARE digital therapeutic by patients with T2D showed a significant reduction in HbA1c, FBG, and PPBG levels after 16 weeks. Higher levels of participation were associated with improved glycemic control, suggesting the potential of the Wellthy CARE platform for better management of the disease.
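The headline figures above follow the standard paired pre/post analysis: the mean within-patient change with a 95% confidence interval. A minimal standard-library sketch of that computation (using a normal approximation rather than the t-distribution such a study would likely use; the patient values below are made up for illustration):

```python
from statistics import NormalDist, mean, stdev

def mean_change_ci(pre, post, confidence=0.95):
    """Mean paired change (post - pre) with a normal-approximation CI."""
    diffs = [b - a for a, b in zip(pre, post)]
    m = mean(diffs)
    se = stdev(diffs) / len(diffs) ** 0.5           # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # ≈ 1.96 for 95%
    return m, m - z * se, m + z * se

# Hypothetical HbA1c values (%) for four patients, pre and post programme.
pre  = [8.1, 7.9, 9.0, 8.4]
post = [7.9, 7.5, 8.4, 7.6]
m, lo, hi = mean_change_ci(pre, post)
print(f"mean change {m:.2f}% (95% CI {lo:.2f} to {hi:.2f})")
```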
Journal Article
Weight-Entanglement Meets Gradient-Based Neural Architecture Search
by
Hutter, Frank
,
Safari, Mahmoud
,
Krishnakumar, Arjun
in
Entanglement
,
Neural architecture search
2025
Weight sharing is a fundamental concept in neural architecture search (NAS), enabling gradient-based methods to explore cell-based architectural spaces significantly faster than traditional black-box approaches. In parallel, weight-entanglement has emerged as a technique for more intricate parameter sharing amongst macro-architectural spaces. Since weight-entanglement is not directly compatible with gradient-based NAS methods, these two paradigms have largely developed independently in parallel sub-communities. This paper aims to bridge the gap between these sub-communities by proposing a novel scheme to adapt gradient-based methods for weight-entangled spaces. This enables us to conduct an in-depth comparative assessment and analysis of the performance of gradient-based NAS in weight-entangled search spaces. Our findings reveal that this integration of weight-entanglement and gradient-based NAS brings forth the various benefits of gradient-based methods, while preserving the memory efficiency of weight-entangled spaces. The code for our work is openly accessible at https://github.com/automl/TangleNAS.
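The gradient-based NAS methods discussed here typically relax the discrete choice between candidate operations into a softmax-weighted mixture, so that architecture parameters can be optimised by gradient descent alongside the network weights. A minimal, framework-free sketch of that relaxation (toy scalar operations and hand-picked architecture parameters, not the paper's code):

```python
import math

def mixed_op(x, ops, alphas):
    """DARTS-style continuous relaxation: instead of picking one operation,
    output a softmax(alpha)-weighted sum of all candidate operations."""
    exps = [math.exp(a) for a in alphas]
    weights = [e / sum(exps) for e in exps]
    return sum(w * op(x) for w, op in zip(weights, ops))

# Toy candidate operations on a scalar input.
ops = [lambda x: x,        # identity / skip connection
       lambda x: 2.0 * x,  # "conv"-like scaling op
       lambda x: 0.0]      # zero op (prunes the edge)

# With equal alphas, every op contributes equally.
print(mixed_op(3.0, ops, [0.0, 0.0, 0.0]))  # → 3.0  ((3 + 6 + 0) / 3)

# As one alpha dominates, the mixture collapses towards that op.
print(round(mixed_op(3.0, ops, [0.0, 10.0, 0.0]), 3))  # → 6.0 (the 2x op)
```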
Weight-Entanglement Meets Gradient-Based Neural Architecture Search
by
Hutter, Frank
,
Safari, Mahmoud
,
Krishnakumar, Arjun
in
Black boxes
,
Compatibility
,
Entanglement
2023
Weight sharing is a fundamental concept in neural architecture search (NAS), enabling gradient-based methods to explore cell-based architecture spaces significantly faster than traditional blackbox approaches. In parallel, weight entanglement has emerged as a technique for intricate parameter sharing among architectures within macro-level search spaces. Since weight-entanglement poses compatibility challenges for gradient-based NAS methods, these two paradigms have largely developed independently in parallel sub-communities. This paper aims to bridge the gap between these sub-communities by proposing a novel scheme to adapt gradient-based methods for weight-entangled spaces. This enables us to conduct an in-depth comparative assessment and analysis of the performance of gradient-based NAS in weight-entangled search spaces. Our findings reveal that this integration of weight-entanglement and gradient-based NAS brings forth the various benefits of gradient-based methods (enhanced performance, improved supernet training properties and superior any-time performance), while preserving the memory efficiency of weight-entangled spaces. The code for our work is openly accessible at https://anonymous.4open.science/r/TangleNAS-527C.
confopt: A Library for Implementation and Evaluation of Gradient-based One-Shot NAS Methods
2025
Gradient-based one-shot neural architecture search (NAS) has significantly reduced the cost of exploring architectural spaces with discrete design choices, such as selecting operations within a model. However, the field faces two major challenges. First, evaluations of gradient-based NAS methods heavily rely on the DARTS benchmark, despite the existence of other available benchmarks. This overreliance has led to saturation, with reported improvements often falling within the margin of noise. Second, implementations of gradient-based one-shot NAS methods are fragmented across disparate repositories, complicating fair and reproducible comparisons and further development. In this paper, we introduce Configurable Optimizer (confopt), an extensible library designed to streamline the development and evaluation of gradient-based one-shot NAS methods. Confopt provides a minimal API that makes it easy for users to integrate new search spaces, while also supporting the decomposition of NAS optimizers into their core components. We use this framework to create a suite of new DARTS-based benchmarks, and combine them with a novel evaluation protocol to reveal a critical flaw in how gradient-based one-shot NAS methods are currently assessed. The code can be found at https://github.com/automl/ConfigurableOptimizer.
Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
by
Hutter, Frank
,
Kadlecová, Gabriela
,
Klein, Aaron
in
Budgets
,
Floating point arithmetic
,
Large language models
2026
Small language models (SLMs) offer an efficient and accessible alternative to large language models (LLMs), delivering strong performance while using far fewer resources. We introduce a simple and effective framework for pretraining SLMs that brings together three complementary ideas. First, we identify structurally sparse sub-network initializations that consistently outperform randomly initialized models of similar size under the same compute budget. Second, we use evolutionary search to automatically discover high-quality sub-network initializations, providing better starting points for pretraining. Third, we apply knowledge distillation from larger teacher models to speed up training and improve generalization. Together, these components make SLM pretraining substantially more efficient: our best model, discovered using evolutionary search and initialized with LLM weights, matches the validation perplexity of a comparable Pythia SLM while requiring 5.16x and 1.26x fewer floating point operations for token budgets of 10B and 100B, respectively. We release all code publicly, offering a practical and reproducible path toward cost-efficient small language model development at scale.
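Of the three ingredients, knowledge distillation is the most self-contained: the student is trained against the teacher's temperature-softened output distribution rather than hard labels alone. A minimal standard-library sketch of that soft-target loss (toy logits, not the paper's setup):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    exps = [math.exp(l / T) for l in logits]
    return [e / sum(exps) for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on T-softened distributions, scaled by T^2
    (the usual factor keeping gradient magnitudes comparable across T)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))                # → 0.0 (perfect match)
print(distillation_loss(teacher, [0.0, 0.0, 0.0]) > 0.0)  # → True
```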
NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies
2022
Zero-cost proxies (ZC proxies) are a recent architecture performance prediction technique aiming to significantly speed up algorithms for neural architecture search (NAS). Recent work has shown that these techniques hold great promise, but certain aspects, such as evaluating and exploiting their complementary strengths, are under-studied. In this work, we create NAS-Bench-Suite-Zero: we evaluate 13 ZC proxies across 28 tasks, creating by far the largest dataset (and unified codebase) for ZC proxies, enabling orders-of-magnitude faster experiments on ZC proxies, while avoiding confounding factors stemming from different implementations. To demonstrate the usefulness of NAS-Bench-Suite-Zero, we run a large-scale analysis of ZC proxies, including a bias analysis, and the first information-theoretic analysis, which concludes that ZC proxies capture substantial complementary information. Motivated by these findings, we present a procedure to improve the performance of ZC proxies by reducing biases such as cell size, and we also show that incorporating all 13 ZC proxies into the surrogate models used by NAS algorithms can improve their predictive performance by up to 42%. Our code and datasets are available at https://github.com/automl/naslib/tree/zerocost.
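A zero-cost proxy assigns each candidate architecture a score from a single cheap computation (often one forward/backward pass) instead of hours of training, and search then ranks candidates by that score. The toy below uses a deliberately trivial proxy (log parameter count of a fully connected chain; real ZC proxies such as synflow or jacov are more sophisticated) purely to show the rank-without-training workflow:

```python
import math

def param_count(widths):
    """Parameters of a fully connected chain with the given layer widths
    (weights plus biases)."""
    return sum(a * b + b for a, b in zip(widths, widths[1:]))

def zc_proxy_score(widths):
    """Deliberately trivial zero-cost proxy: log parameter count.
    Real proxies (synflow, jacov, ...) use one forward/backward pass."""
    return math.log(param_count(widths))

# Candidate architectures: input width 8, hidden widths, output width 2.
candidates = [[8, 4, 2], [8, 16, 2], [8, 32, 16, 2]]

# Rank all candidates by proxy score -- no training involved.
ranked = sorted(candidates, key=zc_proxy_score, reverse=True)
print(ranked[0])  # highest-scoring candidate
```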
NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy
by
Hutter, Frank
,
Safari, Mahmoud
,
Zabergja, Guri
in
Algorithms
,
Benchmarks
,
Image classification
2022
The release of tabular benchmarks, such as NAS-Bench-101 and NAS-Bench-201, has significantly lowered the computational overhead for conducting scientific research in neural architecture search (NAS). Although they have been widely adopted and used to tune real-world NAS algorithms, these benchmarks are limited to small search spaces and focus solely on image classification. Recently, several new NAS benchmarks have been introduced that cover significantly larger search spaces over a wide range of tasks, including object detection, speech recognition, and natural language processing. However, substantial differences among these NAS benchmarks have so far prevented their widespread adoption, limiting researchers to using just a few benchmarks. In this work, we present an in-depth analysis of popular NAS algorithms and performance prediction methods across 25 different combinations of search spaces and datasets, finding that many conclusions drawn from a few NAS benchmarks do not generalize to other benchmarks. To help remedy this problem, we introduce NAS-Bench-Suite, a comprehensive and extensible collection of NAS benchmarks, accessible through a unified interface, created with the aim to facilitate reproducible, generalizable, and rapid NAS research. Our code is available at https://github.com/automl/naslib.
Distribution and Characterization of Microplastics Along the Coastal Shoreline of Thiruvananthapuram District, Kerala, India
by
Muthuchamy, Muthukumar
,
Aiswriya, Vijayalekshmi Padmachandran
,
Arjun, H. S
in
Analytical methods
,
Beaches
,
Bioaccumulation
2024
Plastic pollution has been a widespread issue across the world, from the invention of plastic in the early 1900s to its flourishing during the industrial era. The threat posed by plastics is further elevated by the emergence of microplastics, which have a smaller size and larger surface area than larger plastic debris. Plastic particles in the micro- and nano-size range are ubiquitous in all environmental compartments and have the potential to penetrate biological systems, resulting in bioaccumulation and biomagnification. Evaluating the presence of microplastics in the water and soil of an area is necessary for the implementation of precautionary and remedial measures. The present study evaluates the extent of microplastic pollution along the beaches of Thiruvananthapuram district, Kerala, India. Surface sediment samples were collected from 25 locations along the coastline of Thiruvananthapuram. Microplastics were quantified and categorized based on their colour, shape, size, and composition, using visual identification and ATR-FTIR spectroscopy. The results revealed that the majority of plastic particles present were fibres, accounting for around 80.80%. The maximum distribution of particles was reported from the sampling locations at Thazhampalli-Chirayinkeezhu and Bheemapally. Around 78.05% of particles were coloured, while the remainder were either white or colourless. Nylon fibres and polypropylene fragments were the dominant polymer types found. The results point to fishing activities as the major source of microplastic input along the coastal beach sediments.
Journal Article
Poplar: a phylogenomics pipeline
2025
Abstract
Motivation
Generating phylogenomic trees from genomic data is essential to understanding biological systems. Each step of this complex process has received extensive attention and has been significantly streamlined over the years. Given the public availability of data, obtaining genomes for a wide selection of species is straightforward. However, analyzing that data to generate a phylogenomic tree is a multistep process with legitimate scientific and technical challenges, often requiring significant input from a domain scientist.
Results
We present Poplar, a new, streamlined computational pipeline that addresses the logistical issues that arise when constructing phylogenomic trees. It provides a framework that runs state-of-the-art software for the essential steps of a phylogenomic pipeline, beginning from a genome with or without an annotation and resulting in a species tree. Running Poplar requires no external databases, and it supports parallel execution on clusters and in cloud computing environments. The trees generated by Poplar match closely with state-of-the-art published trees, and using Poplar is far simpler and quicker than manually running a phylogenomic pipeline.
Availability and implementation
Freely available on GitHub at https://github.com/sandialabs/poplar. Implemented using Python and supported on Linux.
Journal Article