Catalogue Search | MBRL
23 results for "Fattibene, E"
Experience in dynamic tape drive allocation to manage scientific data
2023
The main computing and storage facility of INFN (Italian Institute for Nuclear Physics), running at CNAF, hosts and manages tens of Petabytes of data produced by the LHC (Large Hadron Collider) experiments at CERN and by other scientific collaborations in which INFN is involved. Most of these data are stored on tape resources of different technologies. All the tape drives can be used for administrative tasks (such as repack, audit, and space reclamation), as well as to write and read the data of all the experiments. Moreover, the usage of tape resources by scientific communities will become considerably more intense in the next years, and the amount of data on tape will double by 2025. For these reasons, the issue of concurrent access to tape drives is significant. We designed a software solution to optimize the efficiency of the shared usage of tape drives in our environment and put it in production in January 2020. In this paper we present our experience with this dynamic tape resource allocation in production. Comparing it with the previous static allocation method, we observed an improvement in reading throughput of up to 85%. Moreover, we describe the new features added to our solution to optimize the efficiency of the shared usage of tape drives of different technologies.
Journal Article
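Though the paper's algorithm is not reproduced here, the core idea of dynamic allocation can be illustrated with a short sketch: free drives are granted to experiments in proportion to their pending request queues, instead of a fixed static share. All names and the proportional policy are illustrative assumptions, not the CNAF implementation.

```python
# Hypothetical sketch of dynamic tape drive allocation: idle drives are
# assigned to experiments in proportion to queued tape requests.
from collections import Counter

def allocate_drives(free_drives, pending):
    """free_drives: list of idle drive IDs.
    pending: dict experiment -> number of queued tape requests.
    Returns dict drive ID -> experiment."""
    total = sum(pending.values())
    if total == 0 or not free_drives:
        return {}
    # Proportional quotas with largest-remainder rounding, so every
    # free drive is assigned while respecting queue-depth ratios.
    quotas = {exp: len(free_drives) * n / total for exp, n in pending.items()}
    grants = Counter({exp: int(q) for exp, q in quotas.items()})
    leftover = len(free_drives) - sum(grants.values())
    for exp, _ in sorted(quotas.items(),
                         key=lambda kv: kv[1] - int(kv[1]),
                         reverse=True)[:leftover]:
        grants[exp] += 1
    assignment, drives = {}, iter(free_drives)
    for exp, n in grants.items():
        for _ in range(n):
            assignment[next(drives)] = exp
    return assignment

print(allocate_drives(["D1", "D2", "D3", "D4"], {"atlas": 30, "cms": 10}))
# -> e.g. {'D1': 'atlas', 'D2': 'atlas', 'D3': 'atlas', 'D4': 'cms'}
```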
Last developments of the INFN CNAF Long Term Data Preservation (LTDP) project: the CDF data recover and safekeeping
2018
The INFN CNAF Tier-1 has been the Italian national data center for INFN computing activities since 2005. As one of the reference sites for data storage and computing in the High Energy Physics (HEP) community, it offers resources to all four LHC experiments and many other HEP and non-HEP collaborations. The CDF experiment used the INFN Tier-1 resources for many years and, after the end of data taking in 2011, it faced the challenge of both preserving the large amount of scientific data produced and keeping the whole body of information accessible and reusable in the future through the experiment's specific computing model. For this reason, starting from the end of 2012, the CDF Italian collaboration, together with INFN CNAF and Fermilab (FNAL), introduced a Long Term Data Preservation (LTDP) project with the purpose of preserving and sharing all the CDF data and the related analysis framework and knowledge. This is particularly challenging since part of the software releases is no longer supported and the amount of data to be preserved is rather large. The first objective of the collaboration was the copy of all the CDF RUN-2 raw data and user-level ntuples (about 4 PB) from FNAL to the INFN CNAF tape library backend over a dedicated network link. This task was successfully accomplished over the last years and, in addition, a system implementing regular integrity checks of the data has been developed. This system ensures that all the data are completely accessible, and it can automatically retrieve an identical copy of any problematic or corrupted file from the original dataset at FNAL. The setup of a dedicated software framework, which allows users to access and analyse the data with the complete CDF analysis chain, was also carried out, together with detailed documentation for users and system administrators for the long-term future. Furthermore, a second and more ambitious objective emerged during 2016 with a feasibility study for reading the first CDF RUN-1 dataset, now stored as a unique copy on a large number (about 4,000) of old Exabyte tape cartridges. With the installation of compatible refurbished tape drive autoloaders, an initial test bed was completed and the first phase of the Exabyte tape reading activity started. In the present article, we illustrate the state of the art of the LTDP project, with particular attention to the technical solutions adopted to store and maintain the CDF data and the analysis framework, and to overcome the issues that have arisen during the recent activities. The CDF model could also prove useful for designing new data preservation projects for other experiments or use cases.
Journal Article
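A minimal sketch of the kind of periodic integrity check the abstract describes: recompute each file's checksum, compare it with the catalogued value, and trigger a re-fetch of any corrupted file from the original dataset at FNAL. The catalogue format, the checksum algorithm, and the refetch hook are hypothetical placeholders, not the CNAF system.

```python
# Hypothetical integrity-check loop: verify stored checksums and
# re-fetch any file that is missing or corrupted.
import hashlib
from pathlib import Path

def checksum(path, algo="md5", chunk=1 << 20):
    """Stream a file through a hash to avoid loading it in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def verify_dataset(catalogue, refetch):
    """catalogue: dict path -> expected checksum.
    refetch: callable retrieving a fresh copy of a corrupted file,
    e.g. by triggering a transfer from the FNAL master copy."""
    bad = []
    for path, expected in catalogue.items():
        if not Path(path).exists() or checksum(path) != expected:
            bad.append(path)
            refetch(path)
    return bad  # list of files that needed recovery
```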
Dynamic sharing of tape drives accessing scientific data
by Cavalli, A., Sapunenko, V., Fattibene, E.
in Data management, Large Hadron Collider, Nuclear physics
2018
The data management infrastructure operated at CNAF, the central computing and storage facility of INFN (Italian Institute for Nuclear Physics), is based on both disk and tape storage resources. About 40 Petabytes of scientific data produced by the LHC (Large Hadron Collider) at CERN and by other experiments in which INFN is involved are stored on tape. This is the highest-latency storage tier within the HSM (Hierarchical Storage Management) environment. Writing and reading requests on tape media are satisfied through a set of Oracle-StorageTek T10000D tape drives, shared among different scientific communities. In the next years, the usage of tape drives will become more intense, not only due to the growing amount of scientific data to manage but also due to the general trend, announced by the main user communities, to use tapes as "slow disk" storage. In order to reduce hardware purchases, a key point is to minimize the inactivity periods of tape drives. In this paper we present a software solution designed to optimize the efficiency of the shared usage of tape drives in our environment.
Journal Article
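A small sketch of the quantity this paper targets: drive inactivity, here expressed as the busy fraction of each drive over a time window, computed from mount/dismount intervals. The event format is an assumption for illustration only.

```python
# Hypothetical drive-utilization metric from mount/dismount intervals.
def drive_utilization(events, window_start, window_end):
    """events: list of (drive, mount_time, dismount_time) tuples,
    times as POSIX seconds. Returns dict drive -> busy fraction."""
    busy = {}
    span = window_end - window_start
    for drive, t0, t1 in events:
        # Clip each mounted interval to the observation window.
        overlap = max(0, min(t1, window_end) - max(t0, window_start))
        busy[drive] = busy.get(drive, 0) + overlap
    return {d: s / span for d, s in busy.items()}

print(drive_utilization([("T10KD-01", 0, 1800), ("T10KD-02", 0, 3600)], 0, 3600))
# -> {'T10KD-01': 0.5, 'T10KD-02': 1.0}
```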
Abstracting application deployment on Cloud infrastructures
2017
Deploying a complex application on a Cloud-based infrastructure can be a challenging task. In this contribution we present an approach for Cloud-based deployment of applications and its present or future implementation in the framework of several projects, such as "!CHAOS: a cloud of controls" [1], a project funded by MIUR (Italian Ministry of Research and Education) to create a Cloud-based deployment of a control system and data acquisition framework, "INDIGO-DataCloud" [2], an EC H2020 project targeting, among other things, high-level deployment of applications on hybrid Clouds, and "Open City Platform" [3], an Italian project aiming to provide open Cloud solutions for Italian Public Administrations. We chose to use an orchestration service to hide the complex deployment of the application components, and to build an abstraction layer on top of the orchestration one. Through the Heat [4] orchestration service, we prototyped a dynamic, on-demand, scalable platform of software components, based on OpenStack infrastructures. On top of the orchestration service we developed a prototype of a web interface exploiting the Heat APIs. The user can start an instance of the application without any knowledge of the underlying Cloud infrastructure and services. Moreover, the platform instance can be customized by choosing parameters related to the application, such as the size of a file system or the number of instances of a NoSQL DB cluster. As soon as the desired platform is running, the web interface offers the possibility to scale some infrastructure components. In this contribution we describe the solution design and implementation, based on the application requirements, the details of the development of both the Heat templates and the web interface, together with possible exploitation strategies of this work in Cloud data centers.
Journal Article
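As an illustration of the abstraction-layer idea, the sketch below creates a stack through the OpenStack Heat REST API (POST /v1/{tenant_id}/stacks), the kind of call a web backend like the one described could issue on the user's behalf. The endpoint, token handling, parameter names and template body are placeholders; the actual prototype's interface is not reproduced here.

```python
# Hypothetical backend call: launch a Heat stack on behalf of a user
# who only supplies high-level parameters (e.g. FS size, DB cluster size).
import json
import urllib.request

def launch_stack(heat_endpoint, token, tenant_id, name, template, parameters):
    body = json.dumps({
        "stack_name": name,
        "template": template,        # full Heat Orchestration Template
        "parameters": parameters,    # e.g. {"fs_size": 100, "db_nodes": 3}
    }).encode()
    req = urllib.request.Request(
        f"{heat_endpoint}/v1/{tenant_id}/stacks",
        data=body,
        headers={"X-Auth-Token": token, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # contains the new stack's id and links
```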
Looking at the sub-TeV sky with cosmic muons detected in the EEE MRPC telescopes
by Garbini, M., Mazziotta, M. N., Bossini, E.
in Applied and Technical Physics, Astronomical maps, Atomic
2015
Distributions of secondary cosmic muons were measured by the Multigap Resistive Plate Chamber (MRPC) telescopes of the Extreme Energy Events (EEE) Project, which span a large angular and temporal acceptance through their sparse sites, to test the possibility of searching for anomalies over long runs. After correcting for the time exposure and geometrical acceptance of the telescopes, the data were transformed into equatorial coordinates, and equatorial sky maps were obtained from different sites using a preliminary dataset of 110M events in the sub-TeV energy range.
Journal Article
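The coordinate step described above (a local arrival direction turned into equatorial coordinates, given site and time) can be sketched with astropy; using astropy is an illustrative assumption, as the abstract does not name the EEE analysis tools, and the site and event values are placeholders.

```python
# Illustrative transform: altitude/azimuth at a telescope site and time
# -> equatorial (right ascension / declination) coordinates.
from astropy import units as u
from astropy.coordinates import AltAz, EarthLocation, SkyCoord
from astropy.time import Time

site = EarthLocation(lat=44.49 * u.deg, lon=11.34 * u.deg, height=50 * u.m)
event_time = Time("2015-03-01T12:00:00", scale="utc")
local = SkyCoord(alt=60 * u.deg, az=120 * u.deg,
                 frame=AltAz(obstime=event_time, location=site))
equatorial = local.transform_to("icrs")
print(equatorial.ra.deg, equatorial.dec.deg)
```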
Extending the farm on external sites: the INFN Tier-1 experience
2017
The Tier-1 at CNAF is the main INFN computing facility, offering computing and storage resources to more than 30 different scientific collaborations, including the 4 experiments at the LHC. A huge increase in computing needs is also foreseen in the following years, mainly driven by the experiments at the LHC (especially starting with Run 3 in 2021) but also by other upcoming experiments such as CTA [1]. While we are considering the upgrade of the infrastructure of our data center, we are also evaluating the possibility of using CPU resources available in other data centers or even leased from commercial cloud providers. Hence, at INFN Tier-1, besides participating in the EU project HNSciCloud, we have also pledged a small amount of computing resources (∼2000 cores) located at the Bari ReCaS data center [2] to the WLCG experiments for 2016, and we are testing the use of resources provided by a commercial cloud provider. While the Bari ReCaS data center is directly connected to the GARR network [3], with the obvious advantage of a low-latency and high-bandwidth connection, in the case of the commercial provider we rely only on the General Purpose Network. In this paper we describe the set-up phase and the first results of these installations, started in the last quarter of 2015, focusing on the issues that we have had to cope with and discussing the measured results in terms of efficiency.
Journal Article
EEE - Extreme Energy Events: an astroparticle physics experiment in Italian High Schools
2016
The Extreme Energy Events (EEE) project aims to study Extensive Air Showers (EAS) from primary cosmic rays with energies above 10¹⁸ eV by detecting the secondary muon component at ground level, using an array of telescopes with high spatial and time resolution. The second goal of the EEE project is to involve High School teachers and students in this advanced research work and to introduce them to scientific culture: to achieve both goals, the telescopes are located inside High School buildings, and the detector construction, assembly and monitoring - together with data taking and analysis - are done by researchers from scientific institutions in close collaboration with them. At present there are 42 telescopes in just as many High Schools scattered all over Italy, islands included, plus two at CERN and three in INFN units. We report here some preliminary physics results from the first two common data-taking periods, together with the outreach impact of the project.
Journal Article
A Grid storage accounting system based on DGAS and HLRmon
by Fattibene, E, Cristofori, A, Guarise, A
in Accounting, Accounting systems, Central processing units
2012
Accounting in a production-level Grid infrastructure is of paramount importance in order to measure the utilization of the available resources. While several CPU accounting systems are deployed within the European Grid Infrastructure (EGI), storage accounting systems stable enough to be adopted in a production environment are not yet available. As a consequence, there is growing interest in storage accounting, and work on this is being carried out in the Open Grid Forum (OGF), where a Usage Record (UR) definition suitable for storage resources has been proposed for standardization. In this paper we present a storage accounting system which is composed of three parts: a sensor layer, a data repository with a transport layer (Distributed Grid Accounting System - DGAS) and a web portal providing graphical and tabular reports (HLRmon). The sensor layer is responsible for the creation of URs according to the schema (described in this paper) that is currently being discussed within OGF. DGAS is one of the CPU accounting systems used within EGI, in particular by the Italian Grid Infrastructure (IGI) and some other National Grid Initiatives (NGIs) and projects. The DGAS architecture is evolving in order to collect Usage Records for different types of resources. This improvement allows DGAS to be used as a 'general' data repository and transport layer. HLRmon is the web portal acting as an interface to DGAS. It has been improved to retrieve storage accounting data from the DGAS repository and create reports in an easy way. This is very useful not only for Grid users and administrators but also for the stakeholders.
Journal Article
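A sketch of what a sensor-layer usage record along the lines of the OGF proposal might look like. The field names loosely follow the OGF StAR (storage accounting record) draft; the exact schema discussed in the paper and the transport into DGAS are not reproduced here, so treat every field as an assumption.

```python
# Hypothetical storage usage record emitted by a sensor layer.
import json
import uuid
from datetime import datetime, timezone

def make_storage_ur(site, storage_system, group, bytes_used, start, end):
    return {
        "RecordId": str(uuid.uuid4()),
        "CreateTime": datetime.now(timezone.utc).isoformat(),
        "Site": site,
        "StorageSystem": storage_system,
        "Group": group,                      # VO owning the space
        "StartTime": start,
        "EndTime": end,
        "ResourceCapacityUsed": bytes_used,  # bytes occupied in the period
    }

ur = make_storage_ur("INFN-T1", "storm.example.org", "atlas",
                     123_456_789_000, "2012-01-01T00:00:00Z", "2012-01-02T00:00:00Z")
print(json.dumps(ur, indent=2))
```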
WMSMonitor advancements in the EMI era
by Dongiovanni, D, Fattibene, E, Cesini, D
in Categories, Computer architecture, Electromagnetic interference
2012
In production Grid infrastructures deploying the EMI (European Middleware Initiative) middleware release, the Workload Management System (WMS) is the service responsible for the distribution of user tasks to the remote computing resources. Monitoring the reliability of this service, the job lifecycle and the workflow patterns generated by different user communities is an important and challenging activity. Initially designed to monitor and manage a distributed cluster of gLite WMS/LB (Logging and Bookkeeping) services, WMSMonitor has proved to be a useful and flexible tool for a variety of user categories. After asynchronously extracting information from all monitored instances, WMSMonitor re-aggregates it by different keys (WMS instance, Virtual Organization, User, etc.), providing insight into both service status and usage for service administrators, developers, advanced Grid users and performance testers. The positive feedback on WMSMonitor from various production Grid sites pushed us to improve the tool, enhancing its flexibility and scalability through a new architecture. Moreover, the tool has been made compliant with recent evolutions in the monitored services. We therefore present the new version of WMSMonitor, which can monitor EMI WMS/LB services and features an improved user interface with better reporting capabilities. Among the main novelties, we mention the collection of Job Submission Service (JSS) error-type statistics and the adoption of the ActiveMQ messaging system, which now allows multiple data consumers to exploit the collected information. Finally, it is worth mentioning that the implemented architecture and the use of a messaging layer commonly adopted in EMI Grid applications make WMSMonitor a flexible tool that can be easily extended to monitor other Grid services.
Journal Article
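The ActiveMQ layer mentioned above can be sketched as a producer publishing per-WMS statistics to a topic over STOMP, so that multiple consumers can read the same stream. This uses the third-party stomp.py client; the broker address, credentials, topic and payload fields are all hypothetical.

```python
# Hypothetical monitoring producer: publish WMS statistics to ActiveMQ
# over STOMP so multiple consumers can exploit the collected data.
import json
import stomp

conn = stomp.Connection([("broker.example.org", 61613)])
conn.connect("monitor", "secret", wait=True)
conn.send(
    destination="/topic/wmsmonitor.stats",
    body=json.dumps({"wms": "wms01.cnaf.infn.it", "queued_jobs": 1520,
                     "jss_errors": {"timeout": 3, "auth": 1}}),
    headers={"content-type": "application/json"},
)
conn.disconnect()
```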