22 results for "Guiang, Jonathan"
SkimROOT: Accelerating LHC Data Filtering with Near-Storage Processing
Data analysis in high-energy physics (HEP) begins with data reduction, where vast datasets are filtered to extract relevant events. At the Large Hadron Collider (LHC), this process is bottlenecked by slow data transfers between storage and compute nodes. To address this, we introduce SkimROOT, a near-data filtering system leveraging Data Processing Units (DPUs) to accelerate LHC data analysis. By performing filtering directly on storage servers and returning only the relevant data, SkimROOT minimizes data movement and reduces processing delays. Our prototype demonstrates significant efficiency gains, achieving a 44.3× performance improvement, paving the way for faster physics discoveries.
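The core idea is simple enough to sketch: evaluate the event selection where the data lives and ship only the passing events. Below is a minimal illustration of that pattern using uproot; the file path, tree name, branch names, and cut are hypothetical, and this is not the SkimROOT implementation itself.

```python
# Minimal sketch of the near-storage skimming idea (not SkimROOT's code):
# run the selection on the storage node and return only passing events.
import uproot

def skim(path, cut="nJet >= 2"):
    # Runs on (or near) the storage server: open the ROOT file locally,
    # evaluate the event selection, and materialize only passing rows.
    with uproot.open(path) as f:
        tree = f["Events"]  # tree and branch names are hypothetical
        # uproot evaluates `cut` as an expression over the branches, so
        # unselected events never leave the storage node.
        return tree.arrays(["nJet", "Jet_pt"], cut=cut)

selected = skim("store/nanoaod_sample.root")  # hypothetical file
```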
400Gbps benchmark of XRootD HTTP-TPC
Due to the increased network traffic expected during the HL-LHC era, the T2 sites in the USA will be required to have 400 Gbps of available bandwidth to their storage solutions. With this in mind, we are pursuing a scale test of the XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in the software stack that could prevent us from achieving the target transfer rate; to that end, we have set up a testbed of multiple XRootD servers at both UCSD and Caltech, connected through a dedicated link capable of 400 Gbps end-to-end. Building upon our experience deploying containerized XRootD servers, we use Kubernetes to easily deploy and test different configurations of the testbed. In this work, we present our experience running these tests and the lessons learned.
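For context, an HTTP third-party copy boils down to a COPY request asking one storage endpoint to pull from (or push to) another, so the payload never transits the client. The following is a hedged sketch of a pull-mode request per the WLCG TPC profile; the endpoints, paths, and token are made up.

```python
# Hedged sketch of an HTTP-TPC "pull" request: the client asks the
# destination server to fetch the file directly from the source.
import requests

resp = requests.request(
    "COPY",
    "https://xrootd-dest.example.edu:1094/store/test/file.root",  # hypothetical
    headers={
        # Pull mode: the destination retrieves the file from this URL.
        "Source": "https://xrootd-src.example.edu:1094/store/test/file.root",
        # Credential the destination forwards to the source endpoint.
        "TransferHeaderAuthorization": "Bearer <token>",
    },
    verify="/etc/grid-security/certificates",  # CA path, site-specific
)
# The server streams periodic progress ("performance marker") lines
# in the response body while the transfer runs.
for line in resp.iter_lines():
    print(line.decode())
```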
Data Movement Manager (DMM) for the SENSE-Rucio Interoperation Prototype
The Data Movement Manager (DMM) is a prototype interface that connects CERN's data management software, Rucio, with SENSE, the Software-Defined Networking (SDN) service developed by ESnet. It enables SDN-enabled high-energy physics data flows using the existing worldwide LHC computing grid infrastructure. A key feature of DMM is transfer-priority-based bandwidth allocation, which optimizes network usage. Additionally, it provides fine-grained monitoring of underperforming flows by leveraging end-to-end data flow monitoring, achieved through access to host-level (network interface) throughput metrics and transfer-tool (FTS) job-level metrics. This paper details the design and implementation of DMM.
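As a rough illustration of the transfer-priority-based allocation idea (not DMM's actual code), consider splitting a link's capacity among pending transfers in proportion to their priorities:

```python
# Illustrative sketch: each pending transfer gets a share of the link
# proportional to its assigned priority. Names and numbers are made up.
def allocate_bandwidth(link_capacity_gbps, transfers):
    """transfers: dict mapping transfer id -> integer priority."""
    total = sum(transfers.values())
    return {tid: link_capacity_gbps * prio / total
            for tid, prio in transfers.items()}

# Hypothetical example: a 400 Gbps link shared by three transfers.
print(allocate_bandwidth(400, {"rule-a": 5, "rule-b": 3, "rule-c": 2}))
# -> {'rule-a': 200.0, 'rule-b': 120.0, 'rule-c': 80.0}
```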
CRIU - Checkpoint Restore in Userspace for computational simulations and scientific applications
Creating new materials, discovering new drugs, and simulating systems are essential processes for research and innovation, and they require substantial computational power. While many applications can be split into many smaller independent tasks, some cannot and may take hours or weeks to run to completion. To better manage those longer-running jobs, it would be desirable to stop them at any arbitrary point in time and later continue their computation on another compute resource; this is usually referred to as checkpointing. While some applications can manage checkpointing programmatically, it would be preferable if the batch scheduling system could do it independently. This paper evaluates the feasibility of using CRIU (Checkpoint Restore in Userspace), an open-source tool for GNU/Linux environments, with emphasis on the OSG's OSPool HTCondor setup. CRIU can checkpoint the process state into a disk image and can deal with both open files and established network connections seamlessly. Furthermore, it can checkpoint both traditional Linux processes and containerized workloads. The functionality appears adequate for many scenarios supported in the OSPool, although some limitations prevent it from being usable in all circumstances.
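For a sense of how a batch system might drive CRIU externally, here is a minimal Python sketch around the criu command-line tool; the PID and image directory are hypothetical, and CRIU typically requires elevated privileges:

```python
# Minimal sketch of externally checkpointing/restoring a job with CRIU,
# roughly what a scheduler would do around a running process.
import subprocess

def checkpoint(pid, image_dir):
    # Serialize the process tree rooted at `pid` into image files on disk.
    subprocess.run(
        ["criu", "dump", "-t", str(pid), "-D", image_dir, "--shell-job"],
        check=True,
    )

def restore(image_dir):
    # Recreate the process (memory, open files, sockets) from the images,
    # possibly on a different node that can access the image directory.
    subprocess.run(
        ["criu", "restore", "-D", image_dir, "--shell-job"],
        check=True,
    )
```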
Automated Network Services for Exascale Data Movement
The Large Hadron Collider (LHC) experiments distribute data by leveraging a diverse array of National Research and Education Networks (NRENs), where experiment data management systems treat networks as a "black box" resource. After the High Luminosity upgrade, the Compact Muon Solenoid (CMS) experiment alone will produce roughly 0.5 exabytes of data per year. NREN networks are a critical part of the success of CMS and the other LHC experiments. However, during data movement, NRENs are unaware of data priorities, importance, or the need for quality of service, which poses a challenge for operators trying to coordinate the movement of data and obtain predictable data flows across multi-domain networks. The overarching goal of SENSE (the Software-defined network for End-to-end Networked Science at Exascale) is to enable national labs and universities to request and provision end-to-end intelligent network services for their application workflows by leveraging SDN (Software-Defined Networking) capabilities. This work aims to allow the LHC experiments and Rucio, the data management software used by the CMS experiment, to allocate and prioritize certain data transfers over the wide area network. In this paper, we present the current progress of the integration of SENSE, a multi-domain end-to-end SDN orchestrator with QoS (Quality of Service) capabilities, with Rucio.
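On the Rucio side, the natural hook for this kind of prioritization is the priority attached to a replication rule, which interfacing tools can translate into network service requests. A hedged sketch follows, assuming a configured Rucio client; the rule id is hypothetical, and whether a given deployment maps this priority to SENSE provisioning depends on the integration layer:

```python
# Hedged sketch: raise the priority of an existing replication rule.
# Rule priorities propagate to the transfer tool (FTS); an SDN-aware
# layer can additionally map them to bandwidth guarantees on the WAN.
from rucio.client import Client

client = Client()
client.update_replication_rule("8abc...", {"priority": 5})  # id is made up
```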
Moving the California distributed CMS XCache from bare metal into containers using Kubernetes
The University of California system maintains excellent networking between its campuses and a number of other universities in California, including Caltech, most of them connected at 100 Gbps. The UCSD and Caltech Tier-2 centers have joined their disk systems into a single logical caching system, with worker nodes from both sites accessing data from disks at either site. This setup has operated successfully for the last two years. However, coherently managing nodes at multiple physical locations is not trivial and requires an update to the operations model used. The Pacific Research Platform (PRP) provides a Kubernetes resource pool spanning resources in the science demilitarized zones (DMZs) of several campuses in California and worldwide. We show how we migrated the XCache services from bare-metal deployments into containers using the PRP cluster. This paper presents the reasoning behind our hardware decisions and our experience migrating to and operating in a mixed environment.
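As an illustration of what a containerized deployment amounts to (not the PRP's actual manifests), the following sketch creates a minimal XCache Deployment with the official Kubernetes Python client; the image tag, namespace, and port are assumptions:

```python
# Illustrative sketch of deploying a containerized cache service on a
# Kubernetes pool such as the PRP's. Image, namespace, port assumed.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod

container = client.V1Container(
    name="xcache",
    image="opensciencegrid/xcache:latest",  # hypothetical image tag
    ports=[client.V1ContainerPort(container_port=1094)],  # xrootd default
)
deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="xcache"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "xcache"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "xcache"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
client.AppsV1Api().create_namespaced_deployment("osg", deployment)  # namespace assumed
```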
Measurements of the Higgs Boson Through Vector Boson Scattering and Software and Computing for Exascale Data Science
This dissertation presents the analyses of WH and VVH production through vector boson scattering (VBS). The VBS WH analysis excludes scenarios where the HWW and HZZ couplings have opposite signs beyond 5 standard deviations. The VBS VVH analysis places limits on the HHVV coupling between -0.03 and 2.04 times the Standard Model value. Both analyses are based on proton-proton collision data recorded by the CMS experiment at the CERN LHC from 2016 to 2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 138 fb⁻¹. In addition, two projects for the High-Luminosity LHC upgrade are described: a highly parallelizable track-finding algorithm and managed networking for exabyte-scale data science.
Anomaly Detection for Automated Data Quality Monitoring in the CMS Detector
Successful operation of large particle detectors like the Compact Muon Solenoid (CMS) at the CERN Large Hadron Collider requires rapid, in-depth assessment of data quality. We introduce the “AutoDQM” system for Automated Data Quality Monitoring using advanced statistical techniques and unsupervised machine learning. Anomaly detection algorithms based on the beta-binomial probability function and principal component analysis are tested on the full set of proton-proton collision data collected by CMS in 2022. AutoDQM identifies anomalous “bad” data affected by significant detector malfunction at a rate 4–6 times higher than “good” data, demonstrating its effectiveness as a general data quality monitoring tool.
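To make the beta-binomial test concrete, here is a hedged sketch of a per-bin comparison in the same spirit (not the AutoDQM code itself): given a reference histogram, score how surprising each bin of a new run is. The histograms and flagging threshold are invented.

```python
# Sketch of a beta-binomial per-bin anomaly test: the reference counts
# set a posterior for each bin's rate, and we ask how extreme the new
# run's counts are under that posterior-predictive distribution.
import numpy as np
from scipy.stats import betabinom

def bin_pvalues(new_hist, ref_hist):
    n = new_hist.sum()
    # Flat prior + reference counts -> Beta(ref_i + 1, ref_tot - ref_i + 1).
    a = ref_hist + 1
    b = ref_hist.sum() - ref_hist + 1
    # Tail probabilities of a count at least / at most this extreme.
    sf = betabinom.sf(new_hist - 1, n, a, b)   # P(X >= observed)
    cdf = betabinom.cdf(new_hist, n, a, b)     # P(X <= observed)
    return np.minimum(1.0, 2 * np.minimum(sf, cdf))  # two-sided p-value

ref = np.array([100, 250, 400, 250, 100])  # hypothetical reference run
new = np.array([ 10,  25, 160,  25,  10])  # hypothetical "bad" run
print(bin_pvalues(new, ref) < 1e-3)        # flag anomalous bins
```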