7,225 result(s) for "Voice control"
The Use of Voice Control in 3D Medical Data Visualization: Implementation, Legal, and Ethical Issues
Voice-controlled devices are becoming increasingly common in everyday life as well as in medicine: smartphones with voice assistants that make functions easier to access, IoT (Internet of Things) devices that let us control parts of our homes with voice commands via sensors and communication networks, and even medical robots that a doctor can direct with voice instructions. Over the last decade, voice-controlled systems have made great progress, both in the accuracy of voice processing and in usability. Voice control is closely intertwined with artificial intelligence (AI), as mapping spoken commands into written text and understanding them is mostly carried out by some kind of trained AI model. Our research had two objectives. The first was to design and develop a system that enables doctors to evaluate medical data in 3D using voice control. The second was to describe the legal and ethical issues involved in using AI-based solutions for voice control. During our research, we created a voice control module for existing software called PathoVR, using a model trained by Google to interpret the voice commands given by the user. The research presented in this paper can be divided into two parts. In the first, we designed and developed a system that allows the user to evaluate 3D pathological medical serial sections using voice commands. In the second, we investigated the legal and ethical issues that may arise when voice control is used in the medical field. We identified legal and ethical barriers to the use of artificial intelligence in voice control that must be addressed before this technology can become part of everyday medicine.
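The abstract describes mapping recognized voice commands to actions in a 3D viewer. A minimal sketch of such a command dispatch layer is below; the command names and handlers are purely illustrative, since the actual PathoVR command set is not given in the abstract.

```python
# Hypothetical command-to-action dispatch for a 3D viewer. The ASR front end
# (not shown) would supply the recognized text; only the routing is sketched.
ACTIONS = {}

def command(name):
    """Register a handler under a spoken command phrase."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register

@command("rotate left")
def rotate_left():
    return "rotating model left"

@command("zoom in")
def zoom_in():
    return "zooming in"

def dispatch(recognized_text):
    # Normalize the recognized phrase, then look up a registered handler.
    handler = ACTIONS.get(recognized_text.strip().lower())
    return handler() if handler else None
```

Unknown phrases fall through to `None`, which a real system would treat as a rejection rather than guessing an action.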
A Secure and Robust Multimodal Framework for In-Vehicle Voice Control: Integrating Bilingual Wake-Up, Speaker Verification, and Fuzzy Command Understanding
Intelligent in-vehicle voice systems face critical challenges in robustness, security, and semantic flexibility under complex acoustic conditions. To address these issues holistically, this paper proposes a novel multimodal and secure voice-control framework. The system integrates a hybrid dual-channel wake-up mechanism, combining a commercial English engine (Picovoice) with a custom lightweight ResNet-Lite model for Chinese, to achieve robust cross-lingual activation. For reliable identity authentication, an optimized ECAPA-TDNN model is introduced, enhanced with spectral augmentation, sliding window feature fusion, and an adaptive threshold mechanism. Furthermore, a two-tier fuzzy command matching algorithm operating at character and pinyin levels is designed to significantly improve tolerance to speech variations and ASR errors. Comprehensive experiments on a test set encompassing various Chinese dialects, English accents, and noise environments demonstrate that the proposed system achieves high performance across all components: the wake-up mechanism maintains commercial-grade reliability for English and provides a functional baseline for Chinese; the improved ECAPA-TDNN attains low equal error rates of 2.37% (quiet), 5.59% (background music), and 3.12% (high-speed noise), outperforming standard baselines and showing strong noise robustness against the state of the art; and the fuzzy matcher boosts command recognition accuracy to over 95.67% in quiet environments and above 92.7% under noise, substantially outperforming hard matching by approximately 30%. End-to-end tests confirm an overall interaction success rate of 93.7%. This work offers a practical, integrated solution for developing secure, robust, and flexible voice interfaces in intelligent vehicles.
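The two-tier fuzzy matching idea (character level first, pinyin level as fallback) can be sketched roughly as follows. This is only an illustration: the command inventory is made up, the pinyin renderings are hand-written rather than produced by a pinyin library such as pypinyin, and the thresholds are arbitrary.

```python
from difflib import SequenceMatcher

# Hypothetical command inventory mapping each Chinese command to a
# hand-written pinyin rendering (a real system would generate these).
COMMANDS = {
    "打开车窗": "da kai che chuang",   # open the window
    "关闭车窗": "guan bi che chuang",  # close the window
    "播放音乐": "bo fang yin yue",     # play music
}

def sim(a, b):
    return SequenceMatcher(None, a, b).ratio()

def match_command(asr_text, asr_pinyin, char_thresh=0.8, pinyin_thresh=0.6):
    # Tier 1: character-level fuzzy match against the raw ASR string.
    best = max(COMMANDS, key=lambda c: sim(asr_text, c))
    if sim(asr_text, best) >= char_thresh:
        return best
    # Tier 2: pinyin-level matching, which tolerates homophone
    # substitutions that character-level matching misses.
    best = max(COMMANDS, key=lambda c: sim(asr_pinyin, COMMANDS[c]))
    if sim(asr_pinyin, COMMANDS[best]) >= pinyin_thresh:
        return best
    return None  # reject: no command is close enough
```

A homophone ASR error such as "大开车窗" fails the character tier but is recovered at the pinyin tier, which is the kind of tolerance to ASR errors the abstract describes.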
In-Vehicle Speech Recognition for Voice-Driven UAV Control in a Collaborative Environment of MAV and UAV
Most conventional speech recognition systems have mainly concentrated on voice-driven control of personal user devices such as smartphones. A speech recognition system used in a special environment therefore needs to be developed with that environment in mind. In this study, a speech recognition framework for voice-driven control of unmanned aerial vehicles (UAVs) is proposed in a collaborative environment between manned aerial vehicles (MAVs) and UAVs, where multiple MAVs and UAVs fly together and pilots on board MAVs control multiple UAVs with their voices. Standard speech recognition systems consist of several modules, including front-end, recognition, and post-processing. Among them, this study focuses on the recognition and post-processing modules in terms of in-vehicle speech recognition. In order to stably control UAVs via voice, the environmental conditions of the UAVs must be handled carefully. First, we define control commands that the MAV pilot delivers to UAVs and construct training data. Next, for the recognition module, we investigate an acoustic model suitable for the characteristics of the UAV control commands and the UAV system with hardware resource constraints. Finally, two approaches are proposed for post-processing: grammar network-based syntax analysis and transaction-based semantic analysis. For evaluation, we developed a speech recognition system in a collaborative simulation environment between a MAV and a UAV and successfully verified the validity of each module. In recognition experiments on connected words consisting of two to five words, the recognition rates of hidden Markov model (HMM) and deep neural network (DNN)-based acoustic models were 98.2% and 98.4%, respectively. However, in terms of computational cost, the HMM model was about 100 times more efficient than the DNN. In addition, the relative improvement in error rate with the proposed post-processing was about 65%.
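The grammar network-based syntax analysis mentioned above can be illustrated with a simple slot grammar that accepts or rejects connected-word command hypotheses. The callsigns, actions, and parameters below are made up for the sketch; the paper's actual command vocabulary is not given in the abstract.

```python
# Illustrative slot grammar in the spirit of grammar-network post-processing:
# a command is <callsign> <action>, optionally followed by <parameter>.
CALLSIGNS = {"eagle-one", "eagle-two", "hawk-one"}
ACTIONS = {"climb", "descend", "turn", "hover", "return"}
PARAMS = {"left", "right", "high", "low"}

def valid_command(words):
    """Accept <callsign> <action> or <callsign> <action> <parameter>."""
    if len(words) not in (2, 3):
        return False
    if words[0] not in CALLSIGNS or words[1] not in ACTIONS:
        return False
    return len(words) == 2 or words[2] in PARAMS
```

Rejecting syntactically invalid hypotheses before execution is what lets post-processing of this kind cut the recognizer's effective error rate.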
Towards artificial general intelligence with hybrid Tianjic chip architecture
There are two general approaches to developing artificial general intelligence (AGI)[1]: computer-science-oriented and neuroscience-oriented. Because of the fundamental differences in their formulations and coding schemes, these two approaches rely on distinct and incompatible platforms[2-8], retarding the development of AGI. A general platform that could support the prevailing computer-science-based artificial neural networks as well as neuroscience-inspired models and algorithms is highly desirable. Here we present the Tianjic chip, which integrates the two approaches to provide a hybrid, synergistic platform. The Tianjic chip adopts a many-core architecture, reconfigurable building blocks and a streamlined dataflow with hybrid coding schemes, and can not only accommodate computer-science-based machine-learning algorithms, but also easily implement brain-inspired circuits and several coding schemes. Using just one chip, we demonstrate the simultaneous processing of versatile algorithms and models in an unmanned bicycle system, realizing real-time object detection, tracking, voice control, obstacle avoidance and balance control. Our study is expected to stimulate AGI development by paving the way to more generalized hardware platforms. The ‘Tianjic’ hybrid electronic chip combines neuroscience-oriented and computer-science-oriented approaches to artificial general intelligence, demonstrated by controlling an unmanned bicycle.
Underwater wireless communication via TENG-generated Maxwell’s displacement current
Underwater communication is a critical and challenging issue on account of the complex underwater environment. This study introduces an underwater wireless communication approach via Maxwell’s displacement current generated by a triboelectric nanogenerator. An underwater electric field can be generated through a wire connected to a triboelectric nanogenerator, while a current signal can be induced in an underwater receiver a certain distance away. The received current signals are largely immune to disturbances from salinity, turbidity, and submerged obstacles. Even after passing through a 100 m long spiral water pipe, the electric signals show no waveform distortion. By modulating and demodulating the current signals generated by a sound-driven triboelectric nanogenerator, text and images can be transmitted in a water tank at 16 bits/s. An underwater lighting system is operated wirelessly by a triboelectric nanogenerator-based voice-activated controller. This triboelectric nanogenerator-based approach can form the basis for an alternative wireless communication method in complex underwater environments. Underwater communication, despite constant development, still remains a challenging technology. Here, the authors report an underwater wireless communication approach based on the triboelectric nanogenerator, which provides a self-powered communication system in complex underwater environments.
Design and implementation of a voice-controlled digital ultrasonic flaw detector
This paper proposes a voice-controlled digital ultrasonic flaw detector, aimed at solving problems that traditional ultrasonic flaw detectors face in industrial environments, such as glove contamination and inconvenient operation in dim lighting. By integrating an offline voice recognition chip, the flaw detector achieves parameter adjustment and operation control through voice commands, reducing reliance on touch screens and key panels. Experimental results show that this scheme offers low cost, low power consumption, and fast response, and can effectively improve the efficiency of industrial ultrasonic testing.
Voice Flows to and around Leaders: Understanding When Units Are Helped or Hurt by Employee Voice
In two studies, we develop and test theory about the relationship between speaking up, one type of organizational citizenship behavior, and unit performance by accounting for where employee voice is flowing. Results from a qualitative study of managers and professionals across a variety of industries suggest that voice to targets at different formal power levels (peers or superiors) and locations in the organization (inside or outside a focal unit) differs systematically in terms of its usefulness in generating actions to a unit's benefit on the issues raised and in the likely information value of the ideas expressed. We then theorize how distinct voice flows should be differentially related to unit performance based on these core characteristics and test our hypotheses using time-lagged field data from 801 employees and their managers in 93 units across nine North American credit unions. Results demonstrate that voice flows are positively related to a unit's effectiveness when they are targeted at the focal leader of that unit—who should be able to take action—whether from that leader's own subordinates or those in other units, and negatively related to a unit's effectiveness when they are targeted at coworkers who have little power to effect change. Together, these studies provide a structural framework for studying the nature and impact of multiple voice flows, some along formal reporting lines and others that reflect the informal communication structure within organizations. This research demonstrates that understanding the potential performance benefits and costs of voice for leaders and their units requires attention to the structure and complexity of multiple voice flows rather than to an undifferentiated amount of voice.
Machine learning-assisted wearable sensing systems for speech recognition and interaction
The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable wireless flexible skin-attached acoustic sensor (SAAS) capable of capturing the vibrations of vocal organs and skin movements, thereby enabling voice recognition and human-machine interaction (HMI) in harsh acoustic environments. This system utilizes piezoelectric micromachined ultrasonic transducers (PMUTs), which feature high sensitivity (-198 dB), wide bandwidth (10 Hz-20 kHz), and excellent flatness (±0.5 dB). Flexible packaging enhances comfort and adaptability during wear, while integration with the Residual Network (ResNet) architecture significantly improves the classification of laryngeal speech features, achieving an accuracy exceeding 96%. Furthermore, we demonstrated SAAS’s data collection and intelligent classification capabilities in multiple HMI scenarios. Finally, the speech recognition system was able to recognize everyday sentences spoken by participants with an accuracy of 99.8% through a deep learning model. With advantages including a simple fabrication process, stable performance, easy integration, and low cost, SAAS presents a compelling solution for applications in voice control, HMI, and wearable electronics. Voice communication faces challenges from noise and obstructions. Here, the authors present a flexible PMUT-based wearable sensor, focusing on signal capture, noise resistance, and applications in HMI, IoT, and speech disorder assistance.
3D printed triboelectric nanogenerator as self-powered human-machine interactive sensor for breathing-based language expression
Human-machine interfaces (HMIs) are important windows through which a human communicates with the outside world. Current HMI devices such as cellphones, tablets, and computers can help people with aphasia express language. However, these conventional HMI devices are not friendly to groups who have also lost the ability to make physical movements, such as intensive care unit (ICU) or vegetative patients, for language expression. Herein, we report a breath-driven triboelectric nanogenerator (TENG) acting as an HMI sensor for language expression through human breathing, without voice control or manual operation. The TENG is integrated within a mask and fabricated via a three-dimensional (3D) printing method. When the mask is worn, the TENG produces responsive electric signals corresponding to the airflow from breathing, and is capable of recognizing human breathing types with different intensities, lengths, and frequencies. On the basis of this breathing recognition ability, a breathing-based language expression system is further developed by introducing the Morse code as a communication protocol. Compared with conventional language expression devices, this system can extract subjective information from a person's breathing behavior and output corresponding language text, relying on neither voice nor physical movement. This research introduces, for the first time, a self-powered breathing-based language expression method to the field of HMI technology by using a 3D printed TENG, and could make HMI interactions more friendly and engaging.
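The Morse-code protocol the abstract describes (short breaths as dots, long breaths as dashes) reduces to standard Morse encoding and decoding once breaths are classified. A minimal sketch of that text layer is below; the breath-classification step itself is hardware-dependent and omitted.

```python
# International Morse code table for letters; dots stand in for short
# breaths and dashes for long breaths in the breathing-based protocol.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
REVERSE = {code: letter for letter, code in MORSE.items()}

def encode(text):
    """Encode letters as Morse symbols separated by spaces."""
    return " ".join(MORSE[c] for c in text.upper() if c in MORSE)

def decode(code):
    """Decode space-separated Morse symbols back to letters."""
    return "".join(REVERSE[s] for s in code.split())
```

Round-tripping a word through `encode` and `decode` recovers it exactly, which is what makes Morse workable as a low-bandwidth breathing protocol.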
AGV Control using Voice Command
Automated guided vehicles (AGVs) have applications in various fields, ranging from the process industry to many others. AGVs date back to the early 1950s and have since undergone several modifications in structure, design, and technique. In its simplest form, an AGV completes its task using navigation. This paper reviews the various technological advancements in the field of automated guided vehicles over the past few years. Various navigational techniques and structural designs are addressed, covering the navigation techniques used by manufacturers around the world and the AGV structures currently in use in the market. In addition, the voice recognition technique is also addressed.