Catalogue Search | MBRL
127 result(s) for "Hofstee, H."
Optimizing performance of GATK workflows using Apache Arrow In-Memory data framework
by Al-Ars, Zaid; Hofstee, H. Peter; Ahmad, Tanveer
in Algorithms, Animal Genetics and Genomics, Apache Arrow
2020
Background
Immense improvements in sequencing technologies enable the production of large amounts of high-throughput, cost-effective next-generation sequencing (NGS) data. This data needs to be processed efficiently for further downstream analyses. Computing systems need these large amounts of data closer to the processor (with low latency) for fast and efficient processing. However, existing workflows depend heavily on disk storage and access, so processing this data incurs huge disk I/O overheads. Previously, due to the cost, volatility and other physical constraints of DRAM, it was not feasible to place large working data sets in memory. However, recent developments in storage-class memory and non-volatile memory technologies have enabled computing systems to place huge amounts of data in memory and process it directly from there, avoiding disk I/O bottlenecks. To exploit the benefits of such memory systems efficiently, data must be placed in memory in a proper format and accessed at high throughput, avoiding (de)serialization and copy overheads between processes. For this purpose, we use the newly developed Apache Arrow, a cross-language development framework that provides a language-independent columnar in-memory data format for efficient in-memory big data analytics. This allows genomics applications developed in different programming languages to communicate in memory without having to access disk storage, avoiding (de)serialization and copy overheads.
Implementation
We integrate an Apache Arrow based in-memory Sequence Alignment/Map (SAM) format and its shared-memory object store library into widely used high-throughput genomics data processing applications such as BWA-MEM, Picard and GATK to allow in-memory communication between these applications. This also allows us to exploit the cache locality of tabular data and parallel processing capabilities through shared memory objects.
Results
Our implementation shows that adopting an in-memory SAM representation in high-throughput genomics data processing applications results in better system resource utilization, fewer memory accesses due to high cache locality exploitation, and parallel scalability due to shared memory objects. Our implementation focuses on the GATK best practices recommended workflows for germline analysis on whole genome sequencing (WGS) and whole exome sequencing (WES) data sets. We compare a number of existing in-memory data placing and sharing techniques, such as ramDisk and Unix pipes, to show how columnar in-memory data representation outperforms both. We achieve a speedup of 4.85x and 4.76x for WGS and WES data, respectively, in overall execution time of variant calling workflows. Similarly, a speedup of 1.45x and 1.27x for these data sets, respectively, is achieved compared to the second fastest workflow. In some individual tools, particularly in sorting, duplicates removal and base quality score recalibration, the speedup is even more promising.
Availability
The code and scripts used in our experiments are available in both container and repository form at https://github.com/abs-tudelft/ArrowSAM.
Journal Article
Introduction to the Cell multiprocessor
2005
This paper provides an introductory overview of the Cell multiprocessor. Cell represents a revolutionary extension of conventional microprocessor architecture and organization. The paper discusses the history of the project, the program objectives and challenges, the design concept, the architecture and programming models, and the implementation.
Journal Article
Nailfold capillary density is associated with the presence and severity of pulmonary arterial hypertension in systemic sclerosis
by Dijkmans, B A C; Hofstee, H M A; Postmus, P E
in Adult, Aged, Biological and medical sciences
2009
Objective: The aim of this study was to investigate whether there are differences in capillary nailfold changes in patients with systemic sclerosis (SSc) with and without pulmonary arterial hypertension (PAH), and whether these changes are associated with PAH severity and disease specificity.
Methods: Capillary density and loop dimensions were studied in 21 healthy controls, 20 patients with idiopathic PAH (IPAH) and 40 patients with SSc. Of the 40 patients with SSc, 19 had no PAH (SSc–nonPAH) and 21 had PAH (SSc–PAH), of whom eight had PAH during exercise.
Results: Capillary density was lower in SSc–PAH compared with patients who had SSc–nonPAH (4.33/mm vs 6.56/mm respectively, p = 0.001), but loop dimensions were equal. In comparison with IPAH, patients with SSc–PAH had reduced capillary density (4.33/mm vs 7.86/mm, p<0.001) and larger loop dimensions (total width 101.05 µm vs 44.43 µm, p<0.001). Capillary density in healthy controls (9.87/mm) was significantly higher when compared with SSc–nonPAH (6.56/mm), SSc–PAH (4.33/mm) and with IPAH (7.86/mm). No differences in capillary dimensions were present between healthy controls and IPAH. Capillary density correlated with mean pulmonary arterial pressure (PAP) at rest in SSc–PAH (r = −0.58, p = 0.039) and IPAH (r = −0.67, p = 0.001).
Conclusions: Reduction of nailfold capillary density, but not capillary loop dimensions, is associated with PAH, and correlates with the severity of PAH in both SSc and IPAH. This suggests that either systemic microvascular changes play a part in the development of PAH, or that PAH itself contributes to systemic microvascular changes.
Journal Article
Nailfold capillary abnormalities in sclerodermatous chronic GVHD
by Zweegman, S; Hofstee, H M A; de Waal, T T
in 631/250/1904, 692/699/1670/122/1801, 692/699/249/1529
2013
Chronic GVHD (cGVHD) complicating allo-SCT commonly presents as sclerotic skin changes resembling systemic sclerosis (SSc), suggesting a common pathophysiological pathway. Damage to capillaries is considered an early event in the pathogenesis of SSc, and is associated with characteristic nailfold capillary abnormalities. Whether such nailfold capillary abnormalities occur in sclerodermatous cGVHD is unknown. Nailfold videocapillaroscopy (NVC) was used to evaluate capillary morphology, density and loop dimensions in 14 patients with sclerodermatous cGVHD, 14 sex- and age-matched SSc patients, and 14 healthy controls. None of the cGVHD patients or controls showed severe capillary abnormalities, whereas all SSc patients did. cGVHD patients and controls showed no differences in capillary density (9.05 vs 9.16 loops/mm, respectively, P=0.84) or capillary loop dimensions (total loop width 44.36 vs 45.56 μm, respectively, P=0.84). Compared with cGVHD patients, SSc patients had a reduced capillary density (9.05 vs 5.25 loops/mm, respectively, P<0.001) and an increase in capillary loop dimensions (total loop width 44.36 vs 99.97 μm, respectively, P<0.001). In conclusion, sclerodermatous cGVHD patients do not show the characteristic microvascular abnormalities seen in SSc, suggesting that capillary damage does not contribute to the pathophysiology of sclerodermatous cGVHD, and making NVC unsuitable for early identification.
Journal Article
Cell Broadband Engine processor vault security architecture
by Shimizu, K.; Liberty, J. S.; Hofstee, H. P.
in Computer architecture, Computer peripherals, Computer platforms
2007
Current data protection technologies such as those based on public-key encryption and broadcast encryption focus on the secure control and protection of data. Although these protection schemes are effective and mathematically sound, they are susceptible to systematic attacks that utilize any underlying platform weakness, bypassing the cryptographic strengths of the actual schemes. Thus, ensuring that the computing platform supports the cryptographic data protection layers is a critical issue. The Cell Broadband Engine (Cell/B.E.) processor security architecture has three core features that are well suited for this purpose. It provides hardware-enforced process isolation in which code and data can execute in a physically isolated memory space. It also provides the ability to perform hardware-supported authentication of any software stack (i.e., "secure boot") during run time. Finally, the architecture provides a hardware key to act as the root of an encryption chain. Data encrypted directly or indirectly by this key can be decrypted and provided only to an application that is running in the isolated memory and that has been verified. This significantly reduces an adversary's chances of manipulating software to expose the key that is fundamental to a data protection or authentication scheme. Furthermore, it provides a foundation for an application to attest itself to a remote party by demonstrating access to a secret.
Journal Article
Efficacy and Safety of Outpatient Treatment Based on the Hestia Clinical Decision Rule with or without N-Terminal Pro–Brain Natriuretic Peptide Testing in Patients with Acute Pulmonary Embolism. A Randomized Clinical Trial
by Faber, Laura M.; Peltenburg, Henny; Brouwer, Rolf E.
in Cardiopulmonary Resuscitation - statistics & numerical data, Computed Tomography Angiography, Decision Support Techniques
2016
Outpatient treatment of pulmonary embolism (PE) may lead to improved patient satisfaction and reduced healthcare costs. However, trials to assess its safety and the optimal method for patient selection are scarce.
To validate the utility and safety of selecting patients with PE for outpatient treatment by the Hestia criteria and to compare the safety of the Hestia criteria alone with the Hestia criteria combined with N-terminal pro-brain natriuretic peptide (NT-proBNP) testing.
We performed a randomized noninferiority trial in 17 Dutch hospitals. We randomized patients with PE without any of the Hestia criteria to direct discharge or additional NT-proBNP testing. We discharged the latter patients as well if NT-proBNP did not exceed 500 ng/L or admitted them if NT-proBNP was greater than 500 ng/L. The primary endpoint was 30-day adverse outcome defined as PE- or bleeding-related mortality, cardiopulmonary resuscitation, or intensive care unit admission. The noninferiority margin for the primary endpoint was 3.4%.
We randomized 550 patients. In the NT-proBNP group, 34 of 275 (12%) had elevated NT-proBNP values and were managed as inpatients. No patient (0 of 34) with an elevated NT-proBNP level treated in hospital (0%; 95% confidence interval [CI], 0-10.2%), versus no patient (0 of 23) with a post hoc-determined elevated NT-proBNP level from the direct discharge group (0%; 95% CI, 0-14.8%), experienced the primary endpoint. In both trial cohorts, the primary endpoint occurred in none of the 275 patients (0%; 95% CI, 0-1.3%) subjected to NT-proBNP testing, versus in 3 of 275 patients (1.1%; 95% CI, 0.2-3.2%) in the direct discharge group (P = 0.25). During the 3-month follow-up, recurrent venous thromboembolism occurred in two patients (0.73%; 95% CI, 0.1-2.6%) in the NT-proBNP group versus three patients (1.1%; 95% CI, 0.2-3.2%) in the direct discharge group (P = 0.65).
Outpatient treatment of patients with PE selected on the basis of the Hestia criteria alone was associated with a low risk of adverse events. Given the low number of patients with elevated NT-proBNP levels, this trial was unable to draw definite conclusions regarding the incremental value of NT-proBNP testing in patients who fulfill the Hestia criteria. Clinical trial registered with www.trialregister.nl/trialreg/admin/rctview.asp?TC=2603 (NTR2603).
Journal Article
Custom circuit design as a driver of microprocessor performance
2000
This paper presents a survey of some of the most aggressive custom designs for CMOS processor products and prototypes in IBM. We argue that microprocessor performance growth, which has traditionally been driven primarily by CMOS technology and microarchitectural improvements, can receive a substantial contribution from improvements in circuit design and physical organization.
Journal Article
Soils of Seabee Hook, Cape Hallett, northern Victoria Land, Antarctica
2006
The soils of the Seabee Hook area of Cape Hallett in northern Victoria Land, Antarctica, were mapped and characterized. Seabee Hook is a low-lying gravel spit of beach deposits built up by coastal currents carrying basalt material from nearby cliffs. Seabee Hook is the location of an Adélie penguin (Pygoscelis adeliae) colony which influences the soils with additions of guano, dead birds, eggshells and feathers. A soil-landscape model was developed and a soil association was identified between the soils formed on mounds (relict beach ridges) favoured by penguins for nests (Typic Haplorthel) and the soils in the areas between the mounds (Typic Haplorthel/Typic Aquorthel). Soils formed on the mounds inhabited by penguins contained guano in the upper 50 cm, overlying sub-rounded beach-deposited gravel and sand. Soils between mounds had a thin veneer (< 5 cm) of guano overlying basaltic gravelly sand similar to that in the lower parts of the mound soils. The soils had high concentrations of nitrogen, organic carbon, phosphorus, cadmium, zinc, copper, and increased electrical conductivity, within horizons influenced by penguin guano. Five buried penguin bones were collected from the base of soil profiles and radiocarbon dated. The dates indicate that Seabee Hook has been colonized by penguins for at least 1000 years.
Journal Article
Giving Text Analytics a Boost
by Polig, Raphael; Hagleitner, Christoph; Hofstee, H Peter
in Data management, Information retrieval, Reconfigurable hardware
2018
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text analytics system, which offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing the so-called "Big Data" in an efficient way, despite the high memory bandwidth that is available. We show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information extraction queries can be improved by an order of magnitude. We present how such a system can be deployed by extending SystemT's existing compilation flow and by using a multi-threaded communication interface that can efficiently use the bandwidth of the accelerator.
Groundwater characteristics at Seabee Hook, Cape Hallett, Antarctica
2006
Seabee Hook is a low-lying gravel spit adjacent to Cape Hallett, northern Victoria Land, in the Ross Sea region of Antarctica and hosts an Adélie penguin (Pygoscelis adeliae) rookery. Dipwells were inserted to monitor changes in depth to, and volume of, groundwater, and tracer tests were conducted to estimate aquifer hydraulic conductivity and groundwater velocity. During summer (November–February), meltwater forms a shallow, unconfined aquifer perched on impermeable ice-cemented soil. Groundwater extent and volume depend on the amount of snowfall, as meltwater is primarily sourced from melting snow drifts. Groundwater velocity through the permeable gravel and sand was up to 7.8 m day⁻¹, and hydraulic conductivities of 4.7 × 10⁻⁴ m s⁻¹ to 3.7 × 10⁻⁵ m s⁻¹ were measured. The presence of the penguin rookery and the proximity of the sea affect groundwater chemistry, with elevated concentrations of salts (1205 mg L⁻¹ sodium, 332 mg L⁻¹ potassium) and nutrients (193 mg L⁻¹ nitrate, 833 mg L⁻¹ ammonia, 10 mg L⁻¹ total phosphorus) compared with groundwater sourced away from the rookery, and with other terrestrial waters in Antarctica.
Journal Article