Catalogue Search | MBRL
19,382 result(s) for "Benchmarking"
Estimating the coherence of noise
by Flammia, Steven T; Wallman, Joel; Granade, Chris
in Benchmarking; characterization; Coherence
2015
Noise mechanisms in quantum systems can be broadly characterized as either coherent (i.e., unitary) or incoherent. For a given fixed average error rate, coherent noise mechanisms will generally lead to a larger worst-case error than incoherent noise. We show that the coherence of a noise source can be quantified by the unitarity, which we relate to the change in purity averaged over input pure states. We then show that the unitarity can be efficiently estimated using a protocol based on randomized benchmarking that is robust to state-preparation and measurement errors. We also show that the unitarity provides a lower bound on the optimal achievable gate infidelity under a given noisy process.
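The coherent/incoherent distinction the abstract draws can be seen directly in the purity Tr(ρ²). The sketch below is illustrative only (the states, rotation angle, and depolarizing rate are invented, not taken from the paper): a unitary over-rotation leaves a pure state pure, while a depolarizing channel lowers its purity.

```python
import numpy as np

def purity(rho):
    """Purity Tr(rho^2): 1 for pure states, 1/d for the maximally mixed state."""
    return float(np.real(np.trace(rho @ rho)))

# Pure input state |0><0|
rho = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)

# Coherent (unitary) noise: a small over-rotation about X. Purity is preserved.
theta = 0.1
U = np.array([[np.cos(theta / 2), -1j * np.sin(theta / 2)],
              [-1j * np.sin(theta / 2), np.cos(theta / 2)]])
rho_coherent = U @ rho @ U.conj().T

# Incoherent noise: depolarizing channel with rate p. Purity drops below 1.
p = 0.1
rho_incoherent = (1 - p) * rho + p * np.eye(2) / 2

print(purity(rho), purity(rho_coherent), purity(rho_incoherent))
```

Both channels here have a comparable average error rate, yet only the incoherent one degrades purity, which is the handle the unitarity exploits.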
Journal Article
Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline
by Chougule, Kapeel; Agda, Jireh R. A.; Ou, Shujun
in Accuracy; Animal Genetics and Genomics; Animals
2019
Background
Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations.
Results
We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species.
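The metrics listed in the Results are standard binary-classification quantities derived from a confusion matrix. A minimal sketch (the function name and the counts are hypothetical, not values from the study):

```python
def annotation_metrics(tp, fp, tn, fn):
    """Classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fdr = fp / (tp + fp)                    # false discovery rate = 1 - precision
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy,
            "fdr": fdr, "f1": f1}

# Hypothetical counts for one annotator scored against a curated reference
print(annotation_metrics(tp=900, fp=100, tn=8800, fn=200))
```

F1, as the harmonic mean of precision and sensitivity, penalizes annotators that trade one for the other, which is why it is a common single-number summary in such benchmarks.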
Conclusions
The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Journal Article
Benchmarking transaction and analytical processing systems: the creation of a mixed workload benchmark and its application
Systems for Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) are currently separate. The potential of the latest technologies and changes in operational and analytical applications over the last decade have given rise to the unification of these systems, which can be of benefit for both workloads. Research and industry have reacted and prototypes of hybrid database systems are now appearing. Benchmarks are the standard method for evaluating, comparing and supporting the development of new database systems. Because of the separation of OLTP and OLAP systems, existing benchmarks are only focused on one or the other. With the rise of hybrid database systems, benchmarks to assess these systems will be needed as well. Based on the examination of existing benchmarks, a new benchmark for hybrid database systems is introduced in this book. It is furthermore used to determine the effect of adding OLAP to an OLTP workload and is applied to analyze the impact of typically used optimizations in the historically separate OLTP and OLAP domains in mixed-workload scenarios.
Comparative evaluation and performance of large language models on expert level critical care questions: a benchmark study
by van de Sande, Davy; Gommers, Diederik; Goeijenbier, Marco
in Accuracy; Anesthesiology; Artificial intelligence
2025
Background
Large language models (LLMs) show increasing potential for use in healthcare for administrative support and clinical decision making. However, reports on their performance in critical care medicine are lacking.
Methods
This study evaluated five LLMs (GPT-4o, GPT-4o-mini, GPT-3.5-turbo, Mistral Large 2407 and Llama 3.1 70B) on 1181 multiple choice questions (MCQs) from the gotheextramile.com database, a comprehensive database of critical care questions at European Diploma in Intensive Care examination level. Their performance was compared to random guessing and 350 human physicians on a 77-MCQ practice test. Metrics included accuracy, consistency, and domain-specific performance. Costs, as a proxy for energy consumption, were also analyzed.
Results
GPT-4o achieved the highest accuracy at 93.3%, followed by Mistral Large 2407 (87.9%), Llama 3.1 70B (87.5%), GPT-4o-mini (83.0%), and GPT-3.5-turbo (72.7%); random guessing yielded 41.5% (p < 0.001). On the practice test, all models surpassed human physicians, scoring 89.0%, 84.4%, 80.9%, 80.3%, and 66.5%, respectively, compared with 42.7% for random guessing (p < 0.001) and 61.9% for the human physicians. However, in contrast to the other evaluated LLMs (p < 0.001), GPT-3.5-turbo did not significantly outperform the physicians (p = 0.196). Despite high overall consistency, all models gave some consistently incorrect answers. The most expensive model was GPT-4o, costing over 25 times more than the least expensive model, GPT-4o-mini.
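A comparison of observed accuracy against a chance baseline like those reported above can be tested with a one-sided binomial test. The sketch below is illustrative, not the study's actual statistical method: the function is invented, the normal approximation stands in for an exact binomial test, and the count 1102/1181 is back-derived from the reported 93.3%.

```python
import math

def accuracy_vs_chance(correct, n, p_chance):
    """One-sided test of whether observed accuracy exceeds a chance rate,
    using the normal approximation to the binomial (fine for n ~ 1000)."""
    acc = correct / n
    se = math.sqrt(p_chance * (1 - p_chance) / n)
    z = (acc - p_chance) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail P(Z >= z)
    return acc, p_value

# 1102/1181 correct reproduces the reported 93.3% for GPT-4o; 0.415 is the
# study's empirical random-guessing baseline.
acc, p = accuracy_vs_chance(1102, 1181, 0.415)
print(f"accuracy={acc:.3f}, p={p:.2e}")
```

With n above a thousand, an accuracy gap of fifty percentage points over chance yields a z-score in the dozens, so p < 0.001 is unsurprising for all the models listed.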
Conclusions
LLMs exhibit exceptional accuracy and consistency, with four outperforming human physicians on a European-level practice exam. GPT-4o led in performance but raised concerns about energy consumption. Despite their potential in critical care, all models produced consistently incorrect answers, highlighting the need for more thorough and ongoing evaluations to guide responsible implementation in clinical settings.
Journal Article
Lean six sigma for dummies
The jargon-crowded language and theory of Lean Six Sigma can be intimidating for both beginners and experienced users. Written in plain English and packed with lots of helpful examples, this easy-to-follow guide arms you with tools and techniques for implementing Lean Six Sigma and offers guidance on everything from policy deployment to managing change in your organisation and everything in between.
A benchmark of batch-effect correction methods for single-cell RNA sequencing data
by Ang, Kok Siong; Goh, Michelle; Zhang, Xiaomeng
in Algorithms; Animal Genetics and Genomics; Animals
2020
Background
Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.
Results
We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics: kBET, LISI, ASW, and ARI. We also investigate the use of batch-corrected data to study differential gene expression.
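Of the four metrics, the adjusted Rand index (ARI) is the simplest to state: it scores agreement between a reference cell-type labelling and a post-correction clustering, corrected for chance. A from-scratch sketch (the toy labels are invented for illustration; the benchmark itself would use real cluster assignments):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two partitions: 1.0 for identical partitions
    (up to relabelling), ~0 for chance-level agreement."""
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))  # contingency table
    a = Counter(labels_true)
    b = Counter(labels_pred)
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())
    expected = sum_a * sum_b / comb(n, 2)   # chance-agreement correction
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)

# Toy example: cell types recovered exactly under relabelling -> ARI = 1.0
true = ["T", "T", "B", "B", "NK", "NK"]
pred = [1, 1, 2, 2, 3, 3]
print(adjusted_rand_index(true, pred))  # 1.0
```

Because ARI is invariant to how clusters are named, it rewards a correction method that keeps cell types separable without requiring the clustering to reproduce the reference labels literally.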
Conclusion
Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.
Journal Article
The current state of benchmarking use and networks in facilities management
by Dodd, Justin R.; Kasana, Dipin; Smithwick, Jake
in Benchmarks; Cost control; Facilities management
2023
Purpose
The purpose of this paper is to address the knowledge gap in how facilities management (FM) professionals use benchmarking techniques, in order to identify ways to improve industry benchmarking practices and guide future FM benchmarking research.
Design/methodology/approach
Data were collected by surveying 585 FM practitioners representing various countries, organization sizes, types, and industries. The data were summarized and analyzed through frequency tables, charts, and cross-tabulations. The survey results were compared with a previously published study of benchmarking use to identify similarities and differences between benchmarking for FM functions and for core business functions.
Findings
The findings indicate that while FM-oriented benchmarking has been adopted at levels similar to other industries, it tends to be simplistic, lacks a strategic position in the company, often relies on self-reported survey data, is often performed by an individual with no formal benchmarking team, and does not utilize process benchmarking or benchmarking networks. These findings emphasize the need for benchmarking education, advocacy for FM as a strategic business partner, and the development of verified data sources and networks specific to facilities management functions.
Practical implications
These findings provide needed data on the state of FM practitioners' use of benchmarking for FM functions in North America. The results can serve as an assessment of the industry, improve practitioner use and knowledge, and identify further avenues for academic study.
Originality/value
The value of this study lies in filling identified knowledge gaps in how FM practitioners use benchmarking in practice. These data are absent from the research literature and offer the potential to help bridge the academic-practitioner divide, ensuring that future research addresses practitioner needs for the industry.
Journal Article
Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods
by Li, Bo; Regev, Aviv; Dobin, Alexander
in Accuracy; Animal Genetics and Genomics; Benchmarking
2019
Background
Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.
Results
We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes.
Conclusion
The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
Journal Article