Catalogue Search | MBRL
19,382 result(s) for "Benchmarking"
Estimating the coherence of noise
by Flammia, Steven T; Wallman, Joel; Granade, Chris
in Benchmarking; characterization; Coherence
2015
Noise mechanisms in quantum systems can be broadly characterized as either coherent (i.e., unitary) or incoherent. For a given fixed average error rate, coherent noise mechanisms will generally lead to a larger worst-case error than incoherent noise. We show that the coherence of a noise source can be quantified by the unitarity, which we relate to the change in purity averaged over input pure states. We then show that the unitarity can be efficiently estimated using a protocol based on randomized benchmarking that is robust to state-preparation and measurement errors. We also show that the unitarity provides a lower bound on the optimal achievable gate infidelity under a given noisy process.
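The coherent/incoherent distinction the abstract draws can be seen directly in the purity Tr(ρ²). The sketch below is illustrative only (the states, rotation angle, and depolarizing rate are invented, not taken from the paper): a unitary over-rotation leaves a pure state pure, while a depolarizing channel lowers its purity.

```python
import numpy as np

def purity(rho):
    """Purity Tr(rho^2): 1 for pure states, 1/d for the maximally mixed state."""
    return float(np.real(np.trace(rho @ rho)))

# Pure input state |0><0|
rho = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)

# Coherent (unitary) noise: a small over-rotation about X. Purity is preserved.
theta = 0.1
U = np.array([[np.cos(theta / 2), -1j * np.sin(theta / 2)],
              [-1j * np.sin(theta / 2), np.cos(theta / 2)]])
rho_coherent = U @ rho @ U.conj().T

# Incoherent noise: depolarizing channel with rate p. Purity drops below 1.
p = 0.1
rho_incoherent = (1 - p) * rho + p * np.eye(2) / 2

print(purity(rho), purity(rho_coherent), purity(rho_incoherent))
```

Both channels here have a comparable average error rate, yet only the incoherent one degrades purity, which is the handle the unitarity exploits.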
Journal Article
Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline
by Chougule, Kapeel; Agda, Jireh R. A.; Ou, Shujun
in Accuracy; Animal Genetics and Genomics; Animals
2019
Background
Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations.
Results
We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species.
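The metrics listed in the Results are standard binary-classification quantities derived from a confusion matrix. A minimal sketch (the function name and the counts are hypothetical, not values from the study):

```python
def annotation_metrics(tp, fp, tn, fn):
    """Classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fdr = fp / (tp + fp)                    # false discovery rate = 1 - precision
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy,
            "fdr": fdr, "f1": f1}

# Hypothetical counts for one annotator scored against a curated reference
print(annotation_metrics(tp=900, fp=100, tn=8800, fn=200))
```

F1, as the harmonic mean of precision and sensitivity, penalizes annotators that trade one for the other, which is why it is a common single-number summary in such benchmarks.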
Conclusions
The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Journal Article
Benchmarking transaction and analytical processing systems: the creation of a mixed workload benchmark and its application
Systems for Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) are currently separate. The potential of the latest technologies and changes in operational and analytical applications over the last decade have given rise to the unification of these systems, which can be of benefit for both workloads. Research and industry have reacted and prototypes of hybrid database systems are now appearing. Benchmarks are the standard method for evaluating, comparing and supporting the development of new database systems. Because of the separation of OLTP and OLAP systems, existing benchmarks are only focused on one or the other. With the rise of hybrid database systems, benchmarks to assess these systems will be needed as well. Based on the examination of existing benchmarks, a new benchmark for hybrid database systems is introduced in this book. It is furthermore used to determine the effect of adding OLAP to an OLTP workload and is applied to analyze the impact of typically used optimizations in the historically separate OLTP and OLAP domains in mixed-workload scenarios.
Comparative evaluation and performance of large language models on expert level critical care questions: a benchmark study
by van de Sande, Davy; Gommers, Diederik; Goeijenbier, Marco
in Accuracy; Anesthesiology; Artificial intelligence
2025
Background
Large language models (LLMs) show increasing potential for use in healthcare for administrative support and clinical decision making. However, reports on their performance in critical care medicine are lacking.
Methods
This study evaluated five LLMs (GPT-4o, GPT-4o-mini, GPT-3.5-turbo, Mistral Large 2407 and Llama 3.1 70B) on 1181 multiple choice questions (MCQs) from the gotheextramile.com database, a comprehensive database of critical care questions at European Diploma in Intensive Care examination level. Their performance was compared to random guessing and 350 human physicians on a 77-MCQ practice test. Metrics included accuracy, consistency, and domain-specific performance. Costs, as a proxy for energy consumption, were also analyzed.
Results
GPT-4o achieved the highest accuracy at 93.3%, followed by Mistral Large 2407 (87.9%), Llama 3.1 70B (87.5%), GPT-4o-mini (83.0%), and GPT-3.5-turbo (72.7%); random guessing yielded 41.5% (p < 0.001). On the practice test, all models surpassed human physicians, scoring 89.0%, 84.4%, 80.9%, 80.3%, and 66.5%, respectively, compared with 42.7% for random guessing (p < 0.001) and 61.9% for the human physicians. However, in contrast to the other evaluated LLMs (p < 0.001), GPT-3.5-turbo did not significantly outperform the physicians (p = 0.196). Despite high overall consistency, all models gave some consistently incorrect answers. The most expensive model was GPT-4o, costing over 25 times more than the least expensive model, GPT-4o-mini.
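A comparison of observed accuracy against a chance baseline like those reported above can be tested with a one-sided binomial test. The sketch below is illustrative, not the study's actual statistical method: the function is invented, the normal approximation stands in for an exact binomial test, and the count 1102/1181 is back-derived from the reported 93.3%.

```python
import math

def accuracy_vs_chance(correct, n, p_chance):
    """One-sided test of whether observed accuracy exceeds a chance rate,
    using the normal approximation to the binomial (fine for n ~ 1000)."""
    acc = correct / n
    se = math.sqrt(p_chance * (1 - p_chance) / n)
    z = (acc - p_chance) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail P(Z >= z)
    return acc, p_value

# 1102/1181 correct reproduces the reported 93.3% for GPT-4o; 0.415 is the
# study's empirical random-guessing baseline.
acc, p = accuracy_vs_chance(1102, 1181, 0.415)
print(f"accuracy={acc:.3f}, p={p:.2e}")
```

With n above a thousand, an accuracy gap of fifty percentage points over chance yields a z-score in the dozens, so p < 0.001 is unsurprising for all the models listed.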
Conclusions
LLMs exhibit exceptional accuracy and consistency, with four outperforming human physicians on a European-level practice exam. GPT-4o led in performance but raised concerns about energy consumption. Despite their potential in critical care, all models produced consistently incorrect answers, highlighting the need for more thorough and ongoing evaluations to guide responsible implementation in clinical settings.
Journal Article
Lean six sigma for dummies
The jargon-crowded language and theory of Lean Six Sigma can be intimidating for both beginners and experienced users. Written in plain English and packed with lots of helpful examples, this easy-to-follow guide arms you with tools and techniques for implementing Lean Six Sigma and offers guidance on everything from policy deployment to managing change in your organisation and everything in between.
A benchmark of batch-effect correction methods for single-cell RNA sequencing data
by Ang, Kok Siong; Goh, Michelle; Zhang, Xiaomeng
in Algorithms; Animal Genetics and Genomics; Animals
2020
Background
Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.
Results
We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics: kBET, LISI, ASW, and ARI. We also investigate the use of batch-corrected data to study differential gene expression.
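Of the four metrics, the adjusted Rand index (ARI) is the simplest to state: it scores agreement between a reference cell-type labelling and a post-correction clustering, corrected for chance. A from-scratch sketch (the toy labels are invented for illustration; the benchmark itself would use real cluster assignments):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two partitions: 1.0 for identical partitions
    (up to relabelling), ~0 for chance-level agreement."""
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))  # contingency table
    a = Counter(labels_true)
    b = Counter(labels_pred)
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())
    expected = sum_a * sum_b / comb(n, 2)   # chance-agreement correction
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)

# Toy example: cell types recovered exactly under relabelling -> ARI = 1.0
true = ["T", "T", "B", "B", "NK", "NK"]
pred = [1, 1, 2, 2, 3, 3]
print(adjusted_rand_index(true, pred))  # 1.0
```

Because ARI is invariant to how clusters are named, it rewards a correction method that keeps cell types separable without requiring the clustering to reproduce the reference labels literally.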
Conclusion
Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.
Journal Article
The current state of benchmarking use and networks in facilities management
by Dodd, Justin R.; Kasana, Dipin; Smithwick, Jake
in Benchmarks; Cost control; Facilities management
2023
Purpose
The purpose of this paper is to address the knowledge gap in how facilities management (FM) professionals use benchmarking techniques, in order to identify ways to improve industry benchmarking practices and guide future FM benchmarking research.
Design/methodology/approach
Data were collected by surveying 585 FM practitioners representing various countries, organization sizes, types, and industries. The data were summarized and analyzed through frequency tables, charts, and cross-tabulations. The survey results were compared with a previously published study of benchmarking use to identify similarities and differences between benchmarking for FM functions and for core business functions.
Findings
The findings indicate that while FM-oriented benchmarking has been adopted at levels similar to other industries, it tends to be simplistic, lacks a strategic position in the company, often relies on self-reported survey data, is often performed by an individual with no formal benchmarking team, and does not utilize process benchmarking or benchmarking networks. These findings emphasize the need for benchmarking education, advocacy for FM as a strategic business partner, and the development of verified data sources and networks specific to facilities management functions.
Practical implications
These findings provide needed data on the state of FM practitioners' use of benchmarking for FM functions in North America. The results can serve as an assessment of the industry, improve practitioner use and knowledge, and identify further avenues for academic study.
Originality/value
The value of this study lies in filling identified knowledge gaps in how FM practitioners use benchmarking in practice. These data are absent from the research literature and offer the potential to help bridge the academic-practitioner divide, ensuring that future research addresses practitioner needs for the industry.
Journal Article
Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods
by Li, Bo; Regev, Aviv; Dobin, Alexander
in Accuracy; Animal Genetics and Genomics; Benchmarking
2019
Background
Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.
Results
We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes.
Conclusion
The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
Journal Article