Catalogue Search | MBRL
Explore the vast range of titles available.
159 result(s) for "Supercomputing"
Physics-inspired spatiotemporal-graph AI ensemble for the detection of higher order wave mode signals of spinning binary black hole mergers
by Tian, Minyang; Huerta, E. A.; Kumar, Prayush
in black holes, gravitational waves, supercomputing
2024
We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers whose waveforms include the higher-order gravitational wave modes $(\ell, |m|) = \{(2,2), (2,1), (3,3), (3,2), (4,4)\}$, and mode-mixing effects in the $\ell = 3, |m| = 2$ harmonics. These AI models combine hybrid dilated convolutional neural networks, to accurately model both short- and long-range temporal sequential information of gravitational waves, with graph neural networks that capture spatial correlations among gravitational wave observatories, consistently describing and identifying the presence of a signal in a three-detector network encompassing the Advanced LIGO and Virgo detectors. We first trained these spatiotemporal-graph AI models on synthetic noise, using 1.2 million modeled waveforms to densely sample the signal manifold, within 1.7 h using 256 NVIDIA A100 GPUs on the Polaris supercomputer at the Argonne Leadership Computing Facility. This distributed training approach exhibited optimal classification performance and strong scaling up to 512 NVIDIA A100 GPUs. With these AI ensembles we processed data from a three-detector network and found that an ensemble of four AI models achieves state-of-the-art performance for signal detection, reporting two misclassifications for every decade of searched data. We distributed AI inference over 128 GPUs on the Polaris supercomputer and 128 nodes on the Theta supercomputer, completing the processing of a decade of gravitational wave data from a three-detector network within 3.5 h. Finally, we fine-tuned these AI ensembles to process the entire month of February 2020, part of the O3b LIGO/Virgo observing run, and found 6 gravitational waves, concurrently identified in Advanced LIGO and Advanced Virgo data, with zero false positives. This analysis was completed in one hour using a single NVIDIA A100 GPU.
Journal Article
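The temporal-modelling idea at the heart of this abstract, stacking dilated convolutions so the receptive field grows exponentially with depth, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' model; the function names and the kernel/dilation choices are invented for the example.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Causal 1-D dilated convolution: y[t] = sum_j w[j] * x[t - j*dilation]."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated layers: 1 + sum (k-1)*d_i.
    Doubling the dilation each layer makes this grow exponentially with depth,
    which is what lets such stacks model long-range waveform structure."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With kernel size 3 and dilations 1, 2, 4, 8, four layers already see 31 samples of context, which is why dilated stacks capture both short- and long-range structure cheaply.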
Status, challenges and trends of data-intensive supercomputing
2022
Supercomputing technology has supported the solution of cutting-edge scientific and complex engineering problems since its inception, serving as a comprehensive representation of the most advanced computer hardware and software technologies of each era. Over nearly 80 years of development, supercomputing has progressed from being oriented towards computationally intensive tasks to a hybrid of computationally and data-intensive tasks. Driven by the continuous development of high performance data analytics (HPDA) applications, such as big data, deep learning, and other intelligent tasks, supercomputing storage systems face challenges such as a sudden increase in data volume for computational processing tasks, increased and diversified computing power of supercomputing systems, and higher reliability and availability requirements. Against this background, data-intensive supercomputing, deeply integrated with data centers and smart computing centers, aims to solve the problems of complex data-type optimization, mixed-load optimization, multi-protocol support, and interoperability in the storage system, making it the main focus of research and development today and for some time to come. This paper first introduces key concepts in HPDA and data-intensive computing, and then illustrates the extent to which existing platforms support data-intensive applications by analyzing the most representative supercomputing platforms today (Fugaku, Summit, Sunway TaihuLight, and Tianhe-2A). This is followed by an illustration of the actual demand for data-intensive applications in today's mainstream scientific and industrial communities, from the perspectives of both scientific and commercial applications. Next, we provide an outlook on future trends and the potential challenges data-intensive supercomputing faces. In short, this paper provides researchers and practitioners with a quick overview of the key concepts and developments in supercomputing, and identifies the current and future data-intensive supercomputing research hotspots and key issues that need to be addressed.
Journal Article
Increasing horizontal resolution in numerical weather prediction and climate simulations: illusion or panacea?
The steady path of doubling the global horizontal resolution approximately every 8 years in numerical weather prediction (NWP) at the European Centre for Medium-Range Weather Forecasts may be substantially altered with emerging novel computing architectures. It coincides with the need to appropriately address and determine forecast uncertainty with increasing resolution, in particular when convective-scale motions start to be resolved. Blunt increases in model resolution will quickly become unaffordable and may not lead to improved NWP forecasts. Consequently, there is a need to adjust proven numerical techniques accordingly. An informed decision on the modelling strategy for harnessing exascale, massively parallel computing power thus also requires a deeper understanding of the sensitivity to uncertainty, for each part of the model, and ultimately a deeper understanding of multi-scale interactions in the atmosphere and their numerical realization in ultra-high-resolution NWP and climate simulations. This paper explores opportunities for substantial increases in forecast efficiency by judicious adjustment of the formal accuracy or relative resolution in spectral and physical space. One path is to reduce the formal accuracy with which the spectral transforms are computed. The other explores the importance of the ratio of the horizontal resolution in gridpoint space to the number of wavenumbers in spectral space. This is relevant both for high-resolution simulations and for ensemble-based uncertainty estimation.
Journal Article
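The gridpoint-to-wavenumber ratio discussed in this abstract can be made concrete with the standard linear/quadratic/cubic grid rules used in spectral models: for a triangular truncation T, a ratio of 2, 3, or 4 implies at least 2T+1, 3T+1, or 4T+1 points on a latitude circle. A minimal sketch under that textbook rule; the function names are ours, not from any model's code.

```python
EARTH_CIRCUMFERENCE_KM = 40075.0  # equatorial circumference, approximate

def min_points_per_circle(truncation, ratio):
    """Points needed on a latitude circle for triangular truncation T:
    ratio=2 (linear grid) -> 2T+1, ratio=3 (quadratic) -> 3T+1, ratio=4 (cubic) -> 4T+1."""
    return ratio * truncation + 1

def equatorial_spacing_km(truncation, ratio):
    """Approximate equatorial grid spacing implied by a truncation and grid ratio."""
    return EARTH_CIRCUMFERENCE_KM / min_points_per_circle(truncation, ratio)
```

For a fixed truncation, moving from a linear to a cubic grid refines the gridpoint spacing without adding spectral wavenumbers, which is exactly the trade-off the paper examines.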
CoreNEURON: An Optimized Compute Engine for the NEURON Simulator
2019
The NEURON simulator has been developed over the past three decades and is widely used by neuroscientists to model the electrical activity of neuronal networks. Large network simulation projects using NEURON have supercomputer allocations that individually measure in the millions of core hours. Supercomputer centers are transitioning to next-generation architectures, and the work accomplished per core hour for these simulations could improve by an order of magnitude if NEURON were able to better utilize the new hardware capabilities. To adapt NEURON to evolving computer architectures, the compute engine of the NEURON simulator has been extracted and optimized as a library called CoreNEURON. This paper presents the design, implementation, and optimizations of CoreNEURON. We describe how CoreNEURON can be used as a library with NEURON and then compare the performance of different network models on multiple architectures, including IBM BlueGene/Q, Intel Skylake, Intel MIC, and NVIDIA GPUs. We show how CoreNEURON can simulate existing NEURON network models with 4-7x less memory usage and 2-7x less execution time while maintaining binary result compatibility with NEURON.
Journal Article
The organization of serotonergic fibers in the Pacific angelshark brain: neuroanatomical and supercomputing analyses
by Vojta, Thomas; Metzler, Ralf; Janušonis, Skirmantas
in 5-hydroxytryptamine (5-HT), axon, density
2025
Serotonergic axons (fibers) are a universal feature of all vertebrate brains. They form meshworks, typically quantified with regional density measurements, and appear to support neuroplasticity. The self-organization of this system remains poorly understood, partly because of the strong stochasticity of individual fiber trajectories. In an extension of our previous analyses of the mouse brain, serotonergic fibers were investigated in the brain of the Pacific angelshark (Squatina californica), a representative of a unique (ray-like) lineage of the squalomorph sharks. First, the fundamental cytoarchitecture of the angelshark brain was examined, including the expression of ionized calcium-binding adapter molecule 1 (Iba1, AIF-1) and the mesencephalic trigeminal nucleus. Second, serotonergic fibers were visualized with immunohistochemistry, which showed that fibers in the forebrain tend to move toward the dorsal pallium and also accumulate at higher densities at pial borders. Third, a population of serotonergic fibers was modeled inside a digital model of the angelshark brain using a supercomputing simulation. The simulated fibers were defined as sample paths of reflected fractional Brownian motion (FBM), a continuous-time stochastic process. The regional densities generated by these simulated fibers reproduced key features of the biological serotonergic fiber densities in the telencephalon, a brain division with considerable physical uniformity and no major “obstacles” (dense axon tracts). These results demonstrate that the paths of serotonergic fibers may be inherently stochastic, and that a large population of such paths can give rise to a consistent, non-uniform, and biologically realistic fiber density distribution. Local densities may be induced by the constraints of the three-dimensional geometry of the brain, with no axon guidance cues. However, they can be further refined by anisotropies that constrain fiber movement (e.g., major axon tracts, active self-avoidance, chemical gradients). In the angelshark forebrain, such constraints may reduce to an attractive effect of the dorsal pallium, suggesting that anatomically complex distributions of fiber densities can emerge from the interplay of a small set of stochastic and deterministic processes.
Journal Article
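The simulation approach this abstract describes, sample paths of reflected fractional Brownian motion confined by the brain geometry, can be illustrated in one dimension. This is a toy sketch only (exact Cholesky sampling is O(n^3) and the unit interval stands in for the brain boundary); it is not the authors' supercomputing code.

```python
import numpy as np

def fbm_path(n, hurst, rng):
    """Exact fractional Brownian motion sample on t = 1..n via Cholesky
    factorization of the FBM covariance (O(n^3): fine for short toy paths)."""
    t = np.arange(1, n + 1, dtype=float)
    cov = 0.5 * (t[:, None] ** (2 * hurst) + t[None, :] ** (2 * hurst)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * hurst))
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

def reflect(x, lo=0.0, hi=1.0):
    """Fold an unconstrained path into [lo, hi] (reflecting boundaries)."""
    width = hi - lo
    y = np.mod(x - lo, 2.0 * width)
    return lo + np.where(y < width, y, 2.0 * width - y)

# one toy "fiber": a reflected, rescaled FBM path confined to the unit interval
rng = np.random.default_rng(0)
path = reflect(0.5 + fbm_path(300, hurst=0.8, rng=rng) / 300 ** 0.8)
```

Accumulating the histogram of many such paths gives a density map; with a Hurst index above 0.5 the paths are persistent and pile up near the reflecting boundaries, echoing the pial-border accumulation the paper reports.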
Molecular docking-based computational platform for high-throughput virtual screening
by Jin, Zhong; Yu, Kunqian; Li, Hui
in Computer aided design, Computer Hardware, Computer Science
2022
Structure-based virtual screening is a key, routine computational method in computer-aided drug design. Such screening can identify potentially highly active compounds and thereby speed up novel drug design. Molecular docking-based virtual screening can find active compounds in large ligand databases by estimating the binding affinities between receptors and ligands. In this study, we analyzed the challenges of virtual screening, with the aim of identifying highly active compounds faster and more easily than is generally possible. We discuss the accuracy and speed of molecular docking software and strategies for high-throughput molecular docking, focusing on the current challenges of ultra-large-scale virtual screening and our solutions to them. The development of web services helps lower the barrier to virtual drug screening; we introduce related websites for docking and virtual screening, focusing on the development of pre- and post-processing interactive visualization and large-scale computing.
Journal Article
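At its core, the high-throughput screening loop this abstract describes reduces to scoring a large ligand set and keeping the strongest predicted binders. A minimal sketch, with `score` standing in for a real docking-engine call; the names are ours, not from any docking package.

```python
import heapq

def screen(ligands, score, top_k):
    """Toy virtual screen: score every ligand and keep the top_k best.
    Lower score = stronger predicted binding, as with docking energies."""
    return heapq.nsmallest(top_k, ligands, key=score)

# usage: pretend the "docking score" is distance from some ideal value 42
hits = screen(range(100), score=lambda x: abs(x - 42), top_k=3)
```

In a real pipeline each `score` call is an expensive docking run, so the set of ligands is typically sharded across nodes and only the per-shard top lists are merged, which is what makes the problem embarrassingly parallel.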
PICSAR-QED: a Monte Carlo module to simulate strong-field quantum electrodynamics in particle-in-cell codes for exascale architectures
2022
Physical scenarios where the electromagnetic fields are so strong that quantum electrodynamics (QED) plays a substantial role are one of the frontiers of contemporary plasma physics research. Investigating these scenarios requires state-of-the-art particle-in-cell (PIC) codes able to run on leading high-performance computing (HPC) machines and, at the same time, able to simulate strong-field QED processes. This work presents the PICSAR-QED library, an open-source, portable implementation of a Monte Carlo module designed to provide modern PIC codes with the capability to simulate such processes, optimized for HPC. Detailed tests and benchmarks are carried out to validate the physical models in PICSAR-QED, to study how numerical parameters affect those models, and to demonstrate its capability to run on different architectures (CPUs and GPUs). Its integration with WarpX, a state-of-the-art PIC code designed to deliver scalable performance on upcoming exascale supercomputers, is also discussed and validated against results from the existing literature.
Journal Article
A demand-centered scheduling framework for shared supercomputing resources: modeling, metrics, and case insights
2025
The exponential growth of artificial intelligence and data-intensive applications has led to a significant surge in demand for supercomputing resources. However, limited infrastructure capacity and rising construction costs have made traditional supply-side expansion strategies increasingly unsustainable. In response, many nations are exploring joint utilization systems that maximize resource efficiency by enabling the flexible allocation of distributed computing assets. This study proposes a novel dynamic scheduling framework designed to enhance demand-side management in such environments. The methodology involves estimating a demand model using price elasticity analysis and developing a new composite index to quantitatively evaluate resource management efficiency across multiple centers. A comparative case study was conducted using simulated data from seven specialized supercomputing centers, analyzing different scheduling strategies under varying joint resource ratios. To verify the effectiveness of the proposed framework, an additional comparative analysis was performed for three organizations under identical resource conditions. The results reveal that the dynamic scheduling method delivered up to 3.5 times more effective average resource delivery than the static method. Furthermore, while static scheduling produced a response failure rate exceeding 30%, dynamic scheduling reduced this to approximately 8%, demonstrating a superior ability to meet fluctuating demands with the same amount of resources while also reducing surplus idle resources. The study additionally introduces a system-wide efficiency index that enables real-time monitoring of temporal and institutional demand variance. These findings provide both theoretical and practical contributions to the design and governance of shared HPC infrastructures, and the proposed approach offers a scalable foundation for policy frameworks and operational strategies in multi-institutional supercomputing environments.
Journal Article
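The static-versus-dynamic contrast in this abstract can be illustrated with a toy allocation model in which each center keeps a fixed local share of capacity and a joint pool is reassigned to unmet demand. All names and numbers below are invented for the illustration; this is not the paper's demand model.

```python
def serve(demands, capacities, joint_ratio):
    """Toy scheduler: each center serves from its local share first; then a
    shared pool (joint_ratio of total capacity) covers remaining demand greedily."""
    served = [min(d, c * (1.0 - joint_ratio)) for d, c in zip(demands, capacities)]
    pool = joint_ratio * sum(capacities)
    for i, d in enumerate(demands):
        grant = min(d - served[i], pool)  # cover this center's unmet demand
        served[i] += grant
        pool -= grant
    return served

# a demand spike at one center: pooling absorbs it, a static split cannot
static = serve([8, 1, 1], [4, 4, 4], joint_ratio=0.0)   # -> [4, 1, 1]
pooled = serve([8, 1, 1], [4, 4, 4], joint_ratio=0.5)   # -> [8, 1, 1]
```

Even this crude model shows the mechanism behind the reported failure-rate gap: under static allocation the spiking center fails on half its demand while the other centers sit idle, whereas the shared pool redirects that idle capacity to where demand actually is.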
Typhoon Case Comparison Analysis Between Heterogeneous Many-Core and Homogenous Multicore Supercomputing Platforms
by Xu, Da; Wang, Chengzhi; Han, Qiqi
in Earth and Environmental Science, Earth Sciences, Heat budget
2023
In this paper, a typical experiment is carried out with a high-resolution air-sea coupled model, the coupled ocean-atmosphere-wave-sediment transport (COAWST) model, on both heterogeneous many-core (SW) and homogeneous multicore (Intel) supercomputing platforms. We construct a hindcast of Typhoon Lekima on both the SW and Intel platforms, compare the simulation results between the two platforms, and compare the key elements of the atmospheric and ocean modules to reanalysis data. The comparative experiment in this typhoon case indicates that the domestic many-core computing platform and the general cluster yield almost no differences in the simulated typhoon path and intensity, and that the differences in surface pressure (PSFC) in the WRF model and sea surface temperature (SST) in the short-range forecast are very small, whereas a major difference emerges at high latitudes after the first 10 days. Further heat budget analysis verifies that the differences in SST after 10 days are mainly caused by shortwave radiation variations, as influenced by subsequently generated typhoons in the system. The typhoons generated in the hindcast after the first 10 days follow markedly different trajectories on the two platforms.
Journal Article
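The comparison methodology here, running an identical hindcast on two architectures and tracking where the outputs diverge, can be sketched as a per-timestep RMSE between the two runs. An illustrative helper only; the array names are ours, not from COAWST.

```python
import numpy as np

def divergence_series(run_a, run_b):
    """RMSE between two runs of the same case at each output time.
    Inputs are (time, ...) arrays, e.g. SST fields from two platforms;
    the series shows when platform-level differences start to grow."""
    diff = np.asarray(run_a, dtype=float) - np.asarray(run_b, dtype=float)
    return np.sqrt((diff ** 2).reshape(diff.shape[0], -1).mean(axis=1))
```

Plotting such a series against forecast day would make the paper's finding visible at a glance: near-zero divergence in the short range, then growth after day 10 as the platforms' hindcasts spawn typhoons with different trajectories.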