Catalogue Search | MBRL

Developing conversational interfaces for iOS : add responsive voice control to your apps

by Mitrevski, Martin in iOS (Electronic resource) , Multimodal user interfaces (Computer systems)

"Learn how to incorporate your own conversational interfaces into iOS applications. This book will help you work comfortably with multiple frameworks, including Apple's Speech and SiriKit frameworks; Google's API.AI conversational interfaces platform; and Facebook's Wit.ai. You'll explore the basics of natural language processing on iOS and see how to develop sentiment analysis with Apple's new Core ML framework. You'll also understand the primary challenges conversational interfaces face, and how to future proof your design. With the introduction of SiriKit and the Speech framework, iOS developers now have huge opportunities to work with conversational interfaces in their apps. The latest advancements in natural language processing and machine learning allow for the development of complex conversational interfaces. This book incorporates all aspects of conversational interfaces on iOS--from voice transcription to natural language processing and entities extraction to text to speech commands."-- Provided by publisher.

Book

Multimodal Large Language Models in Health Care: Applications, Challenges, and Future Outlook

by Sheikh, Javaid , Renault, Max-Antoine , Damseh, Rafat in Application , Artificial intelligence , Chatbots

2024

In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future research directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data–driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.

Journal Article

Designing with the body : somaesthetic interaction design

by Höök, Kristina, author in Multimodal user interfaces (Computer systems) Design and construction. , Human-machine systems Design and construction. , Somesthesia.

Book

Gold nanoshell-localized photothermal ablation of prostate tumors in a clinical pilot device study

by Winoker, Jared S. , Anastos, Harry , Knauer, Cynthia J. in Aged , Animal models , Biocompatibility

2019

Biocompatible gold nanoparticles designed to absorb light at wavelengths of high tissue transparency have been of particular interest for biomedical applications. The ability of such nanoparticles to convert absorbed near-infrared light to heat and induce highly localized hyperthermia has been shown to be highly effective for photothermal cancer therapy, resulting in cell death and tumor remission in a multitude of preclinical animal models. Here we report the initial results of a clinical trial in which laser-excited gold-silica nanoshells (GSNs) were used in combination with magnetic resonance–ultrasound fusion imaging to focally ablate low-intermediate-grade tumors within the prostate. The overall goal is to provide highly localized regional control of prostate cancer that also results in greatly reduced patient morbidity and improved functional outcomes. This pilot device study reports feasibility and safety data from 16 cases of patients diagnosed with low- or intermediate-risk localized prostate cancer. After GSN infusion and high-precision laser ablation, patients underwent multiparametric MRI of the prostate at 48 to 72 h, followed by postprocedure mpMRI/ultrasound targeted fusion biopsies at 3 and 12 mo, as well as a standard 12-core systematic biopsy at 12 mo. GSN-mediated focal laser ablation was successfully achieved in 94% (15/16) of patients, with no significant difference in International Prostate Symptom Score or Sexual Health Inventory for Men observed after treatment. This treatment protocol appears to be feasible and safe in men with low- or intermediate-risk localized prostate cancer without serious complications or deleterious changes in genitourinary function.

Journal Article

Social signal processing

by Burgoon, Judee K., editor , Magnenat-Thalmann, Nadia, 1946- editor , Pantic, Maja, 1970- editor in Human-computer interaction. , Signal processing. , Human face recognition (Computer science)

"Social Signal Processing is the first book to cover all aspects of the modeling, automated detection, analysis, and synthesis of nonverbal behavior in human-human and human-machine interactions. Authoritative surveys address conceptual foundations, machine analysis and synthesis of social signal processing, and applications. Foundational topics include affect perception and interpersonal coordination in communication; later chapters cover technologies for automatic detection and understanding such as computational paralinguistics and facial expression analysis and for the generation of artificial social signals such as social robots and artificial agents. The final section covers a broad spectrum of applications based on social signal processing in healthcare, deception detection, and digital cities, including detection of developmental diseases and analysis of small groups. Each chapter offers a basic introduction to its topic, accessible to students and other newcomers, and then outlines challenges and future perspectives for the benefit of experienced researchers and practitioners in the field"-- Provided by publisher.

Book

Multimodal Sentiment Analysis Representations Learning via Contrastive Learning with Condense Attention Fusion

by Wang, Huiru , Ren, Zenyu , Ma, Chunming in Data fusion , Datasets , Emotion regulation

2023

Multimodal sentiment analysis has gained popularity as a research field for its ability to predict users’ emotional tendencies more comprehensively. The data fusion module is a critical component of multimodal sentiment analysis, as it allows for integrating information from multiple modalities. However, it is challenging to combine modalities and remove redundant information effectively. In our research, we address these challenges by proposing a multimodal sentiment analysis model based on supervised contrastive learning, which leads to more effective data representation and richer multimodal features. Specifically, we introduce the MLFC module, which utilizes a convolutional neural network (CNN) and Transformer to solve the redundancy problem of each modal feature and reduce irrelevant information. Moreover, our model employs supervised contrastive learning to enhance its ability to learn standard sentiment features from data. We evaluate our model on three widely-used datasets, namely MVSA-single, MVSA-multiple, and HFM, demonstrating that our model outperforms the state-of-the-art model. Finally, we conduct ablation experiments to validate the efficacy of our proposed method.

Journal Article

Towards better performing transport networks

by Jourquin, Bart, 1964- , Rietveld, Piet , Westin, Kerstin in Transport multimodal. , Transport routier. , Transport aérien.

Book

Ultrathin Bronchoscopy with Multimodal Devices for Peripheral Pulmonary Lesions. A Randomized Trial

by Oki, Masahide , Kurimoto, Noriaki , Asano, Fumihiro in Adult , Aged , Aged, 80 and over

2015

Rationale: The combination of an ultrathin bronchoscope, navigational technology, and endobronchial ultrasound (EBUS) seems to combine the best of mutual abilities for evaluating peripheral pulmonary lesions, but ultrathin bronchoscopes that allow the use of EBUS have not been developed so far. Objectives: To compare the diagnostic yield of transbronchial biopsy under EBUS, fluoroscopy, and virtual bronchoscopic navigation guidance using a novel ultrathin bronchoscope with that using a thin bronchoscope with a guide sheath for peripheral pulmonary lesions. Methods: In four centers, patients with suspected peripheral pulmonary lesions less than or equal to 30 mm in the longest diameter were included and randomized to undergo transbronchial biopsy with EBUS, fluoroscopy, and virtual bronchoscopic navigation guidance using a 3.0-mm ultrathin bronchoscope (UTB group) or a 4.0-mm thin bronchoscope with a guide sheath (TB-GS group). Measurements and Main Results: A total of 310 patients were enrolled and randomized, among whom 305 patients (150, UTB group; 155, TB-GS group) were analyzed. The ultrathin bronchoscope could reach more distal bronchi than the thin bronchoscope (median fifth- vs. fourth-generation bronchi; P < 0.001). Diagnostic histologic specimens were obtained in 74% (42% for benign and 81% for malignant lesions) of the UTB group and 59% (36% for benign and 70% for malignant lesions) of the TB-GS group (P = 0.044, Mantel-Haenszel test). Complications including pneumothorax, bleeding, chest pain, and pneumonia occurred in 3% and 5% in the respective groups. Conclusions: The diagnostic yield of the UTB method is higher than that of the TB-GS method. Clinical trial registered with www.umin.ac.jp/ctr/ (UMIN 000003177).

Journal Article

Effective Techniques for Multimodal Data Fusion: A Comparative Analysis

by Wróblewska, Anna , Sysko-Romańczuk, Sylwia , Pawłowski, Maciej in comparative analysis , data fusion , Deep learning

2023

Data processing in robotics is currently challenged by the effective building of multimodal and common representations. Tremendous volumes of raw data are available and their smart management is the core concept of multimodal learning in a new paradigm for data fusion. Although several techniques for building multimodal representations have been proven successful, they have not yet been analyzed and compared in a given production setting. This paper explored three of the most common techniques, (1) the late fusion, (2) the early fusion, and (3) the sketch, and compared them in classification tasks. Our paper explored different types of data (modalities) that could be gathered by sensors serving a wide range of sensor applications. Our experiments were conducted on Amazon Reviews, MovieLens25M, and MovieLens1M datasets. Their outcomes allowed us to confirm that the choice of fusion technique for building multimodal representation is crucial to obtain the highest possible model performance resulting from the proper modality combination. Consequently, we designed criteria for choosing this optimal data fusion technique.

Journal Article

Combining Multimodal Techniques to Approach the Study of Academic Lectures

by Bernad-Mechó, Edgar in academic lectures , análisis de la interacción , análisis del discurso multimodal

2021

This article offers a methodological reflection on the use of multimodal techniques for the study of academic lectures. Three distinct multimodal approaches have been put forward to explore the use of language holistically, namely, multimodal social semiotics (MSS), multimodal discourse analysis (MDA) and multimodal interaction analysis (MIA). These approaches differ in their main focus (the social context, the system of semiotic resources available to the speakers, and the social actors, respectively) and in the tools they provide to conduct multimodal analyses. To exemplify how analyses may be conducted within each of the paradigms in the context of academic lectures in English, I examine an excerpt extracted from an African-American history lecture from Yale University by a native English speaker in which he organizes his discourse in between content sections. Through the use of short multimodal transcriptions, I discuss how MSS can be used for reflections on the social contexts of academic lectures, MDA describes the use of semiotic resources employed by the lecturers, and MIA can be used to look into how lecturers structure their speech into sequences of actions. Ultimately, I suggest a combination of multimodal methodologies to obtain a broader account of the intricacies of discourse in academic settings.

Journal Article
