Catalogue Search | MBRL

Workflow Analysis for CGH Generation with Speckle Reduction and Occlusion Culling Using GPU Acceleration

by Blesa, Alfonso; Serón, Francisco J.; Sanz, Diego in Algorithms, Analysis, CGH simulation

2025

Although GPUs are widely used in Computer-Generated Holography (CGH), their specific application to concrete problems such as occlusion or speckle filtering through temporal multiplexing is not yet standardized and has not been fully explored. This work aims to optimize the software architecture by taking the GPU architecture into account in a novel way for these particular tasks. We present an optimized algorithm for CGH computation that provides a joint solution to the problems of speckle noise and occlusion. The workflow includes the generation and illumination of a 3D scene, the calculation of the CGH including color, occlusion, and temporal speckle-noise filtering, followed by scene reconstruction through both simulation and experimental methods. The research focuses on implementing a temporal multiplexing technique that simultaneously performs speckle denoising and occlusion culling for point clouds, evaluating two types of occlusion that differ in whether the occlusion effect dominates over the depth effect in a scene stored in a CGH, while leveraging the parallel processing capabilities of GPUs to achieve a more immersive and high-quality visual experience. To this end, the total computational cost associated with generating color and occlusion CGHs is evaluated, quantifying the relative contribution of each factor. The results indicate that, under strict occlusion conditions, temporal multiplexing filtering does not significantly impact the overall computational cost of CGH calculation.
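The speckle-filtering idea in the abstract above can be illustrated with a small statistical sketch. It is not the authors' algorithm: it covers only the temporal-averaging part (no occlusion culling, no hologram propagation), and it models each pixel as an independent sum of random phasors, which is an assumption made purely for illustration. All function names and parameters are hypothetical.

```python
import cmath
import math
import random


def speckle_frame(num_pixels, num_scatterers, rng):
    """One intensity frame. ASSUMPTION: every pixel is an independent sum of
    unit phasors with uniformly random phases (a toy speckle model)."""
    frame = []
    for _ in range(num_pixels):
        field = sum(
            cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(num_scatterers)
        )
        frame.append(abs(field) ** 2 / num_scatterers)
    return frame


def time_averaged_intensity(num_frames, num_pixels, num_scatterers, rng):
    """Average the intensities of several frames with fresh random phases,
    mimicking frames shown in quick succession (temporal multiplexing)."""
    acc = [0.0] * num_pixels
    for _ in range(num_frames):
        frame = speckle_frame(num_pixels, num_scatterers, rng)
        for i, value in enumerate(frame):
            acc[i] += value
    return [value / num_frames for value in acc]


def speckle_contrast(intensity):
    """Speckle contrast: standard deviation of intensity divided by its mean."""
    mean = sum(intensity) / len(intensity)
    var = sum((v - mean) ** 2 for v in intensity) / len(intensity)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    rng = random.Random(0)
    for frames in (1, 4, 16):
        avg = time_averaged_intensity(frames, 400, 20, rng)
        print(frames, "frame(s): contrast =", round(speckle_contrast(avg), 3))
```

Under this toy model the contrast is expected to fall roughly as one over the square root of the number of averaged frames. Each frame is independent of the others, which is the property that makes this kind of workload a natural fit for parallel hardware.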

Journal Article

Real-time colour hologram generation based on ray-sampling plane with multi-GPU acceleration

by Sato, Hirochika; Oi, Ryutaro; Nakayama, Hirotaka in 639/624/1075/146, 639/624/1107/510, Central processing units

2018

Although electro-holography can reconstruct three-dimensional (3D) motion pictures, its computational cost is too heavy to allow for real-time reconstruction of 3D motion pictures. This study explores accelerating colour hologram generation using light-ray information on a ray-sampling (RS) plane with a graphics processing unit (GPU) to realise a real-time holographic display system. We refer to an image corresponding to light-ray information as an RS image. Colour holograms were generated from three RS images with resolutions of 2,048 × 2,048; 3,072 × 3,072 and 4,096 × 4,096 pixels. The computational results indicate that the generation of the colour holograms using multiple GPUs (NVIDIA GeForce GTX 1080) was approximately 300–500 times faster than generation using a central processing unit. In addition, the results demonstrate that 3D motion pictures were successfully reconstructed from RS images of 3,072 × 3,072 pixels at approximately 15 frames per second using an electro-holographic reconstruction system in which colour holograms were generated from RS images in real time.

Journal Article

High resolution topology optimization using graphics processing units (GPUs)

by Challis, Vivien J.; Roberts, Anthony P.; Grotowski, Joseph F. in Bulk modulus, Computation, Computational Mathematics and Numerical Analysis

2014

We present a Graphics Processing Unit (GPU) implementation of the level set method for topology optimization. The solution of three-dimensional topology optimization problems with millions of elements becomes computationally tractable with this GPU implementation and NVIDIA supercomputer-grade GPUs. We demonstrate this by solving the inverse homogenization problem for the design of isotropic materials with maximized bulk modulus. We trace the maximum bulk modulus optimization results to very high porosities to demonstrate the detail achievable with a high computational resolution. By utilizing a parallel GPU implementation rather than a sequential CPU implementation, similar increases in tractable computational resolution would be expected for other topology optimization problems.

Journal Article

Paper-like Electronic Displays: Large-Area Rubber-Stamped Plastic Sheets of Electronics and Microencapsulated Electrophoretic Inks

by Katz, Howard; Rogers, John A.; Raju, V. R. in Dielectric materials, Displays, Electric current

2001

Electronic systems that use rugged lightweight plastics potentially offer attractive characteristics (low-cost processing, mechanical flexibility, large area coverage, etc.) that are not easily achieved with established silicon technologies. This paper summarizes work that demonstrates many of these characteristics in a realistic system: organic active matrix backplane circuits (256 transistors) for large (≈5 × 5-inch) mechanically flexible sheets of electronic paper, an emerging type of display. The success of this effort relies on new or improved processing techniques and materials for plastic electronics, including methods for (i) rubber stamping (microcontact printing) high-resolution (≈1 µm) circuits with low levels of defects and good registration over large areas, (ii) achieving low leakage with thin dielectrics deposited onto surfaces with relief, (iii) constructing high-performance organic transistors with bottom contact geometries, (iv) encapsulating these transistors, (v) depositing, in a repeatable way, organic semiconductors with uniform electrical characteristics over large areas, and (vi) low-temperature (≈100°C) annealing to increase the on/off ratios of the transistors and to improve the uniformity of their characteristics. The sophistication and flexibility of the patterning procedures, high level of integration on plastic substrates, large area coverage, and good performance of the transistors are all important features of this work. We successfully integrate these circuits with microencapsulated electrophoretic "inks" to form sheets of electronic paper.

Journal Article

Affective value of game items: a mood management and selective exposure approach

by Koo, Dong-Mo; Kim, Sang Jin; Bae, Joonheui in Affinity, Arousal, Boredom

2019

Purpose: The purpose of this paper is to investigate the relationship between game items and mood management to show the affective value of game items. Specifically, the study examines the impact of interaction between two negative mood states (stress vs boredom) and types of game items (functional vs decorative) on the purchasing intention of game items. Design/methodology/approach: Two experiments were conducted to predict the outcomes of using game items. Findings: Game users effectively manage their level of arousal and mood valence using game items. The selective exposure theory provides additional understanding of different purchasing behaviors, suggesting that stressed users are more likely to purchase decorative items while bored users purchase functional items to manage their mood. Research limitations/implications: The study results show the affective role of game items in mood management. While previous studies focused on the cognitive and functional aspects of purchasing game items, this study extends the value of game items as augmented products. Practical implications: When launching new games, companies should provide game users free game items for mood management. In addition, to increase intervention potential and behavioral affinity, marketers need to develop and launch more game item types. Originality/value: This study extends the understanding of affective value of game items by applying mood management and selective exposure theories to explain the purchase intention of game items.

Journal Article

Wanting More, Getting Less: Gaming Performance Measurement as a Form of Deviant Workplace Behavior

by Graf, Laura; Stumpf-Wollersheim, Jutta; Wendler, Wiebke S. in Academic achievement, Behavior, Business and Management

2019

Investigating the causes of unethical behaviors in academia, such as scientific misconduct, has become a highly important research subject. The current performance measurement practices (e.g., equating research performance with the number of publications in top-tier journals) are frequently referred to as being responsible for scientists' unethical behaviors. We conducted qualitative semi-structured interviews with different stakeholders of the higher education system (e.g., professors and policy makers; N = 43) to analyze the influence of performance measurement on scientists' behavior. We followed a three-step coding procedure and found (1) that the participants described a variety of positive behavioral consequences (e.g., higher productivity) but mainly negative behavioral consequences (e.g., questionable publishing practices) of current performance measurement practices in academia; (2) that scientists' behavior can be described as gaming performance measurement (i.e., achieving performance goals by reducing performance quality and focusing on those tasks that are measured); and (3) that gaming performance measurement shares the same characteristics as deviant workplace behavior (i.e., a voluntary violation of organizational norms that harms the university). We discuss that gaming performance measurement has not been considered as a type of deviant workplace behavior in the previous literature. Furthermore, we draw from research on deviant workplace behavior and goal setting to discuss psychological processes that may underlie gaming performance measurement. Our results indicate the importance of connecting literature on deviant workplace behavior and goal setting to advance our understanding of gaming performance measurement.

Journal Article

Memory Coalescing Implementation of Metropolis Resampling on Graphics Processing Unit

by Dülger, Özcan; Demirekler, Mübeccel; Oğuztüzün, Halit in Algorithms, Graphics processing units, Resampling

2018

Owing to the many cores in its architecture, the graphics processing unit (GPU) offers promise for parallel execution of the particle filter. A stage of the particle filter that is particularly challenging to parallelize is resampling. There are parallel resampling algorithms in the literature such as Metropolis resampling, which does not require a collective operation such as a cumulative sum over weights and does not suffer from numerical instability. However, with a large number of particles, Metropolis resampling becomes slow. This is because of the non-coalesced access problem on the global memory of the GPU. In this article, we offer solutions for this problem of Metropolis resampling. We introduce two implementation techniques, named Metropolis-C1 and Metropolis-C2, and compare them with the original Metropolis resampling on an NVIDIA Tesla K40 board. In the first scenario, where these two techniques achieve their fastest execution times, Metropolis-C1 is faster than the others, but yields the worst results in quality. However, Metropolis-C2 is closer to Metropolis resampling in quality. In the second scenario, where all three algorithms yield similar quality, although Metropolis-C1 and Metropolis-C2 get slower, they are still faster than the original Metropolis resampling.
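As a rough illustration of the baseline algorithm named in this abstract, here is a minimal sequential sketch of Metropolis resampling in Python. It is not the Metropolis-C1 or Metropolis-C2 variant, whose details the abstract does not give, and the function and parameter names are hypothetical. On a GPU, each pass of the outer loop would typically be handled by one thread.

```python
import random


def metropolis_resample(weights, num_iters, rng=None):
    """Return one ancestor index per particle.

    Each particle runs its own short Metropolis chain over particle indices,
    so no cumulative sum (or any other collective operation) over the weights
    is needed. Weights are assumed non-negative and need not be normalized.
    """
    rng = rng or random.Random()
    n = len(weights)
    ancestors = []
    for i in range(n):          # on a GPU: one thread per particle
        k = i                   # the chain starts at the particle's own index
        for _ in range(num_iters):
            j = rng.randrange(n)    # propose a uniformly random particle
            u = rng.random()
            # Accept with probability min(1, w[j] / w[k]), written without
            # a division so that zero weights are handled safely.
            if u * weights[k] <= weights[j]:
                k = j
        ancestors.append(k)
    return ancestors


if __name__ == "__main__":
    print(metropolis_resample([0.1, 0.2, 0.3, 0.4], 32, random.Random(0)))
```

The read of `weights[j]` at a random index is the access pattern that, across neighbouring GPU threads, lands on scattered global-memory addresses; that is the non-coalesced access the abstract refers to.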

Journal Article

OpenMP and CUDA simulations of Sella Zerbino Dam break on unstructured grids

by Petaccia, G.; Leporati, F.; Torti, E. in Architecture (computers), Breaking, Computation

2016

This paper presents two 2D dam break parallelized models based on shallow water equations (SWE) written in conservative form. The models were implemented exploiting multicore PC systems and graphics processing unit (GPU) architectures under the OpenMP and the NVIDIA™’s compute unified device architecture (CUDA) frameworks. The mathematical model is solved using a finite-volume technique on an unstructured grid, with Roe’s approximate Riemann solver, a first-order upwind scheme. The upwind treatment of the source terms is implemented. A technique to cope with a wetting-drying advance front is adopted, together with the inclusion of the influence of source terms in the stability constraint in order to prevent negative water depths at the dry fronts. The proposed model is first applied to a laboratory test and then to a real dam break that occurred in Italy in 1935. Results on different grid sizes are compared to show the computing efficiency between the original sequential model and the parallelized models.

Journal Article

Accelerated bulk memory operations on heterogeneous multi-core systems

by Shi, Weidong; Lee, JongHyuk; Gil, JoonMin in 3-D graphics, Bandwidths, Central processing units

2018

Over the past few years, the traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit, enabling general-purpose computing on GPUs (GPGPU). Recently, revolutionary measures have been taken in this direction: the integrated GPU, in which CPUs and GPUs are integrated into the same package or even onto the same die. However, in a system-on-chip the GPU takes up considerable silicon resources, yet it is unlikely to contribute to overall system performance when running non-graphical workloads or non-GPGPU applications. This paper presents a novel approach that uses an integrated GPU to accelerate conventional operations normally performed on CPUs, namely bulk memory operations such as memcpy or memcmp. Offloading bulk memory operations to the GPU has many benefits: (i) the throughput-oriented GPU outperforms the CPU in bulk memory operations; (ii) for on-die GPUs with unified cache between the GPU and the CPU, the CPU can utilize the GPU private cache to store the moved data and reduce the CPU cache bottleneck; (iii) additional lightweight hardware can also support asynchronous offloads; and (iv) unlike the prior art using a dedicated hardware copy engine (e.g., DMA), our approach utilizes as much GPU hardware resources as possible. The performance results based on our solution showed that offloaded bulk memory operations ran up to 4.3 times faster than on the CPU on micro-benchmarks while using fewer resources. Using eight real-world applications and a cycle-based full-system simulation environment, five of eight applications showed about 30% speedup and two applications showed about 20% speedup.

Journal Article

An experimental evaluation of extreme learning machines on several hardware devices

by Zhang, Qi; Wu, Gang; Li, Liang in Acceleration, Algorithms, Artificial Intelligence

2020

As an important learning algorithm, the extreme learning machine (ELM) is known for its excellent learning speed. With the expansion of ELM’s applications in the field of classification and regression, the need for real-time performance is increasing. Although the use of hardware acceleration is an obvious solution, how to select the appropriate acceleration hardware for ELM-based applications is a topic worthy of further discussion. For this purpose, we designed and evaluated optimized ELM algorithms on three kinds of state-of-the-art acceleration hardware, i.e., multi-core CPU, Graphics Processing Unit (GPU), and Field-Programmable Gate Array (FPGA), which are all suitable for matrix multiplication optimization. The experimental results showed that the speedup ratio of these optimized algorithms on acceleration hardware reached 10–800. Therefore, we suggest (1) using a GPU to accelerate ELM algorithms for large datasets, and (2) using an FPGA for small datasets because of its lower power consumption, especially for some embedded applications. We have also released our source code.
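For readers unfamiliar with ELMs, the following is a minimal pure-Python sketch of the basic scheme: hidden-layer weights are drawn at random and left untrained, and only the output weights are fitted by regularized least squares. It is not the optimized implementation the abstract evaluates. The function names, the tanh activation, the weight range and the ridge value are all illustrative assumptions.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def hidden_layer(x, w, bias):
    """Random-feature hidden layer (tanh activation is an assumption)."""
    return [
        math.tanh(sum(wi * xi for wi, xi in zip(w_row, x)) + b)
        for w_row, b in zip(w, bias)
    ]


def train_elm(xs, ys, num_hidden, ridge=1e-6, rng=None):
    """Fit output weights beta from (H^T H + ridge I) beta = H^T y."""
    rng = rng or random.Random()
    dim = len(xs[0])
    # Hidden weights and biases are random and never updated.
    w = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_hidden)]
    bias = [rng.uniform(-1.0, 1.0) for _ in range(num_hidden)]
    h = [hidden_layer(x, w, bias) for x in xs]
    hth = [
        [sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
         for j in range(num_hidden)]
        for i in range(num_hidden)
    ]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(num_hidden)]
    return w, bias, solve_linear(hth, hty)


def predict_elm(model, x):
    w, bias, beta = model
    return sum(b * h for b, h in zip(beta, hidden_layer(x, w, bias)))


if __name__ == "__main__":
    xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    ys = [0.0, 1.0, 1.0, 0.0]  # XOR as a tiny toy problem
    model = train_elm(xs, ys, 20, rng=random.Random(0))
    print([round(predict_elm(model, x), 3) for x in xs])
```

Nearly all the work here is in forming the hidden-layer matrix and the products H^T H and H^T y. This matches the abstract's observation that matrix multiplication is the operation all three hardware targets are suited to optimizing.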

Journal Article
