22 results for "Guiang, Jonathan"
SkimROOT: Accelerating LHC Data Filtering with Near-Storage Processing
Data analysis in high-energy physics (HEP) begins with data reduction, where vast datasets are filtered to extract relevant events. At the Large Hadron Collider (LHC), this process is bottlenecked by slow data transfers between storage and compute nodes. To address this, we introduce SkimROOT, a near-data filtering system leveraging Data Processing Units (DPUs) to accelerate LHC data analysis. By performing filtering directly on storage servers and returning only the relevant data, SkimROOT minimizes data movement and reduces processing delays. Our prototype demonstrates significant efficiency gains, achieving a 44.3× performance improvement, paving the way for faster physics discoveries.
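The core idea is simple enough to sketch: evaluate the event selection where the data lives and ship only the passing events. Below is a minimal illustration of that pattern using uproot; the file path, tree name, branch names, and cut are hypothetical, and this is not the SkimROOT implementation itself.

```python
# Minimal sketch of the near-storage skimming idea (not SkimROOT's code):
# run the selection on the storage node and return only passing events.
import uproot

def skim(path, cut="nJet >= 2"):
    # Runs on (or near) the storage server: open the ROOT file locally,
    # evaluate the event selection, and materialize only passing rows.
    with uproot.open(path) as f:
        tree = f["Events"]  # tree and branch names are hypothetical
        # uproot evaluates `cut` as an expression over the branches, so
        # unselected events never leave the storage node.
        return tree.arrays(["nJet", "Jet_pt"], cut=cut)

selected = skim("store/nanoaod_sample.root")  # hypothetical file
```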
400Gbps benchmark of XRootD HTTP-TPC
Due to the increased network traffic expected during the HL-LHC era, the T2 sites in the USA will be required to have 400 Gbps of available bandwidth to their storage solutions. With this in mind, we are pursuing a scale test of the XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in the software stack that could prevent us from achieving the target transfer rate; to that end, we have set up a testbed of multiple XRootD servers at both UCSD and Caltech, connected through a dedicated link capable of 400 Gbps end-to-end. Building upon our experience deploying containerized XRootD servers, we use Kubernetes to easily deploy and test different configurations of the testbed. In this work, we present our experience running these tests and the lessons learned.
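For context, an HTTP third-party copy boils down to a COPY request asking one storage endpoint to pull from (or push to) another, so the payload never transits the client. The following is a hedged sketch of a pull-mode request per the WLCG TPC profile; the endpoints, paths, and token are made up.

```python
# Hedged sketch of an HTTP-TPC "pull" request: the client asks the
# destination server to fetch the file directly from the source.
import requests

resp = requests.request(
    "COPY",
    "https://xrootd-dest.example.edu:1094/store/test/file.root",  # hypothetical
    headers={
        # Pull mode: the destination retrieves the file from this URL.
        "Source": "https://xrootd-src.example.edu:1094/store/test/file.root",
        # Credential the destination forwards to the source endpoint.
        "TransferHeaderAuthorization": "Bearer <token>",
    },
    verify="/etc/grid-security/certificates",  # CA path, site-specific
)
# The server streams periodic progress ("performance marker") lines
# in the response body while the transfer runs.
for line in resp.iter_lines():
    print(line.decode())
```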
Data Movement Manager (DMM) for the SENSE-Rucio Interoperation Prototype
The Data Movement Manager (DMM) is a prototype interface that connects CERN's data management software, Rucio, with SENSE, the Software-Defined Networking (SDN) service developed by ESnet. It enables SDN-enabled high-energy physics data flows using the existing worldwide LHC computing grid infrastructure. A key feature of DMM is transfer-priority-based bandwidth allocation, which optimizes network usage. Additionally, it provides fine-grained monitoring of underperforming flows by leveraging end-to-end data flow monitoring, achieved through access to host-level (network interface) throughput metrics and transfer-tool (FTS) job-level metrics. This paper details the design and implementation of DMM.
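As a rough illustration of the transfer-priority-based allocation idea (not DMM's actual code), consider splitting a link's capacity among pending transfers in proportion to their priorities:

```python
# Illustrative sketch: each pending transfer gets a share of the link
# proportional to its assigned priority. Names and numbers are made up.
def allocate_bandwidth(link_capacity_gbps, transfers):
    """transfers: dict mapping transfer id -> integer priority."""
    total = sum(transfers.values())
    return {tid: link_capacity_gbps * prio / total
            for tid, prio in transfers.items()}

# Hypothetical example: a 400 Gbps link shared by three transfers.
print(allocate_bandwidth(400, {"rule-a": 5, "rule-b": 3, "rule-c": 2}))
# -> {'rule-a': 200.0, 'rule-b': 120.0, 'rule-c': 80.0}
```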
CRIU - Checkpoint Restore in Userspace for computational simulations and scientific applications
Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation, and they require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary point in time and later continue their computation on another compute resource; this is usually referred to as checkpointing. While some applications can manage checkpointing programmatically, it would be preferable if the batch scheduling system could do it independently. This paper evaluates the feasibility of using CRIU (Checkpoint Restore in Userspace), an open-source tool for GNU/Linux environments, with emphasis on the OSG's OSPool HTCondor setup. CRIU can checkpoint the process state into a disk image and can deal with both open files and established network connections seamlessly. Furthermore, it can checkpoint both traditional Linux processes and containerized workloads. The functionality appears adequate for many scenarios supported in the OSPool, although some limitations prevent it from being usable in all circumstances.
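For a sense of how a batch system might drive CRIU externally, here is a minimal Python sketch around the criu command-line tool; the PID and image directory are hypothetical, and CRIU typically requires elevated privileges:

```python
# Minimal sketch of externally checkpointing/restoring a job with CRIU,
# roughly what a scheduler would do around a running process.
import subprocess

def checkpoint(pid, image_dir):
    # Serialize the process tree rooted at `pid` into image files on disk.
    subprocess.run(
        ["criu", "dump", "-t", str(pid), "-D", image_dir, "--shell-job"],
        check=True,
    )

def restore(image_dir):
    # Recreate the process (memory, open files, sockets) from the images,
    # possibly on a different node that can access the image directory.
    subprocess.run(
        ["criu", "restore", "-D", image_dir, "--shell-job"],
        check=True,
    )
```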
Automated Network Services for Exascale Data Movement
The Large Hadron Collider (LHC) experiments distribute data by leveraging a diverse array of National Research and Education Networks (NRENs), where experiment data management systems treat networks as a "black box" resource. After the High Luminosity upgrade, the Compact Muon Solenoid (CMS) experiment alone will produce roughly 0.5 exabytes of data per year. NREN networks are a critical part of the success of CMS and the other LHC experiments. However, during data movement, NRENs are unaware of data priorities, importance, or the need for quality of service, which poses a challenge for operators trying to coordinate the movement of data and obtain predictable data flows across multi-domain networks. The overarching goal of SENSE (the Software-defined network for End-to-end Networked Science at Exascale) is to enable national labs and universities to request and provision end-to-end intelligent network services for their application workflows by leveraging SDN (Software-Defined Networking) capabilities. This work aims to allow the LHC experiments and Rucio, the data management software used by the CMS experiment, to allocate and prioritize certain data transfers over the wide area network. In this paper, we present the current progress of the integration of SENSE, a multi-domain end-to-end SDN orchestrator with QoS (Quality of Service) capabilities, with Rucio.
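On the Rucio side, the natural hook for this kind of prioritization is the priority attached to a replication rule, which interfacing tools can translate into network service requests. A hedged sketch follows, assuming a configured Rucio client; the rule id is hypothetical, and whether a given deployment maps this priority to SENSE provisioning depends on the integration layer:

```python
# Hedged sketch: raise the priority of an existing replication rule.
# Rule priorities propagate to the transfer tool (FTS); an SDN-aware
# layer can additionally map them to bandwidth guarantees on the WAN.
from rucio.client import Client

client = Client()
client.update_replication_rule("8abc...", {"priority": 5})  # id is made up
```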
Moving the California distributed CMS XCache from bare metal into containers using Kubernetes
The University of California system maintains excellent networking between its campuses and a number of other universities in California, including Caltech, most of them connected at 100 Gbps. The UCSD and Caltech Tier-2 centers have joined their disk systems into a single logical caching system, with worker nodes from both sites accessing data from disks at either site. This setup has operated successfully for the last two years. However, coherently managing nodes at multiple physical locations is not trivial and requires an update to the operations model used. The Pacific Research Platform (PRP) provides a Kubernetes resource pool spanning resources in the science demilitarized zones (DMZs) of several campuses in California and worldwide. We show how we migrated the XCache services from bare-metal deployments into containers using the PRP cluster. This paper presents the reasoning behind our hardware decisions and our experience migrating to and operating in a mixed environment.
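As an illustration of what a containerized deployment amounts to (not the PRP's actual manifests), the following sketch creates a minimal XCache Deployment with the official Kubernetes Python client; the image tag, namespace, and port are assumptions:

```python
# Illustrative sketch of deploying a containerized cache service on a
# Kubernetes pool such as the PRP's. Image, namespace, port assumed.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod

container = client.V1Container(
    name="xcache",
    image="opensciencegrid/xcache:latest",  # hypothetical image tag
    ports=[client.V1ContainerPort(container_port=1094)],  # xrootd default
)
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="xcache"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "xcache"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "xcache"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment("osg", deployment)  # namespace assumed
```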
Measurements of the Higgs Boson Through Vector Boson Scattering and Software and Computing for Exascale Data Science
This dissertation presents the analyses of WH and VVH production through vector boson scattering (VBS). The VBS WH analysis excludes scenarios where the HWW and HZZ couplings have opposite signs beyond 5 standard deviations. The VBS VVH analysis places limits on the HHVV coupling between -0.03 and 2.04 times the Standard Model value. Both analyses are based on proton-proton collision data recorded by the CMS experiment at the CERN LHC from 2016 to 2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 138 fb⁻¹. In addition, two projects for the High-Luminosity LHC upgrade are described: a highly parallelizable track-finding algorithm and managed networking for exabyte-scale data science.
Anomaly Detection for Automated Data Quality Monitoring in the CMS Detector
Successful operation of large particle detectors like the Compact Muon Solenoid (CMS) at the CERN Large Hadron Collider requires rapid, in-depth assessment of data quality. We introduce the “AutoDQM” system for Automated Data Quality Monitoring using advanced statistical techniques and unsupervised machine learning. Anomaly detection algorithms based on the beta-binomial probability function and principal component analysis are tested on the full set of proton-proton collision data collected by CMS in 2022. AutoDQM identifies anomalous “bad” data affected by significant detector malfunction at a rate 4–6 times higher than “good” data, demonstrating its effectiveness as a general data quality monitoring tool.
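To make the beta-binomial test concrete, here is a hedged sketch of a per-bin comparison in the same spirit (not the AutoDQM code itself): given a reference histogram, score how surprising each bin of a new run is. The histograms and flagging threshold are invented.

```python
# Sketch of a beta-binomial per-bin anomaly test: the reference counts
# set a posterior for each bin's rate, and we ask how extreme the new
# run's counts are under that posterior-predictive distribution.
import numpy as np
from scipy.stats import betabinom

def bin_pvalues(new_hist, ref_hist):
    n = new_hist.sum()
    # Flat prior + reference counts -> Beta(ref_i + 1, ref_tot - ref_i + 1).
    a = ref_hist + 1
    b = ref_hist.sum() - ref_hist + 1
    # Tail probabilities of a count at least / at most this extreme.
    sf = betabinom.sf(new_hist - 1, n, a, b)   # P(X >= observed)
    cdf = betabinom.cdf(new_hist, n, a, b)     # P(X <= observed)
    return np.minimum(1.0, 2 * np.minimum(sf, cdf))  # two-sided p-value

ref = np.array([100, 250, 400, 250, 100])  # hypothetical reference run
new = np.array([ 10,  25, 160,  25,  10])  # hypothetical "bad" run
print(bin_pvalues(new, ref) < 1e-3)        # flag anomalous bins
```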