Catalogue Search | MBRL
20 result(s) for "Fuzzy match"
Risk factors for recurrent tuberculosis after successful treatment in a high burden setting: a cohort study
by Wilson, Douglas; Cudahy, Patrick George Tobias; Cohen, Ted
in Adult, Air bases, Antitubercular Agents - therapeutic use
2020
Background
People successfully completing treatment for tuberculosis remain at elevated risk for recurrent disease, either from relapse or reinfection. Identifying risk factors for recurrent tuberculosis may help target post-tuberculosis screening and care.
Methods
We enrolled 500 patients with smear-positive pulmonary tuberculosis in South Africa and collected baseline data on demographics, clinical presentation and sputum mycobacterial cultures for 24-loci mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing. We used routinely-collected administrative data to identify recurrent episodes of tuberculosis occurring over a median of six years after successful treatment completion.
Results
Of 500 patients initially enrolled, 333 (79%) successfully completed treatment for tuberculosis. During the follow-up period 35 patients with successful treatment (11%) experienced a bacteriologically confirmed tuberculosis recurrence. In our Cox proportional hazards model, a 3+ AFB sputum smear grade was significantly associated with recurrent tuberculosis with a hazard ratio of 3.33 (95% CI 1.44–7.7). The presence of polyclonal M. tuberculosis infection at baseline had a hazard ratio for recurrence of 1.96 (95% CI 0.86–4.48).
Conclusion
Our results indicate that AFB smear grade is independently associated with tuberculosis recurrence after successful treatment for an initial episode, while the association between polyclonal M. tuberculosis infection and increased risk of recurrence appears possible.
Journal Article
Hybrid topic modeling method based on dirichlet multinomial mixture and fuzzy match algorithm for short text clustering
by Al-Mulla, Noha A; Jawarneh, Sana; ALmarashdeh, Ibrahim
in Algorithms, Arabic language, Big Data
2024
Topic modeling methods have proved effective for inferring latent topics from short texts. Dealing with short texts is helpful for many real-world applications yet challenging, due to the sparse terms in the text and the high-dimensional representation. Most topic modeling methods require the number of topics to be defined in advance. Similarly, methods based on the Dirichlet Multinomial Mixture (DMM) require the maximum possible number of topics to be set before execution, which is hard to determine due to topic uncertainty and the noise present in the dataset. Hence, a new approach called the Topic Clustering algorithm based on Levenshtein Distance (TCLD) is introduced in this paper. TCLD combines DMM models and the fuzzy matching algorithm to address two key challenges in topic modeling: (a) the outlier problem in topic modeling methods, and (b) the problem of determining the optimal number of topics. TCLD uses the initial clustered topics generated by DMM models and then evaluates the semantic relationships between documents using Levenshtein Distance. Subsequently, it determines whether to keep the document in the same cluster, relocate it to another cluster, or mark it as an outlier. The results demonstrate the efficiency of the proposed approach across six English benchmark datasets, in comparison to seven topic modeling approaches, with 83% improvement in purity and 67% enhancement in Normalized Mutual Information (NMI) across all datasets. The proposed method was also applied to a collected set of Arabic tweets, and the results showed that only 12% of the Arabic short texts were incorrectly clustered, according to human inspection.
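The keep, relocate, or mark-as-outlier decision described in this abstract can be illustrated with a minimal sketch. It assumes an initial clustering already exists (here a plain dictionary stands in for DMM output) and scores a document against each cluster by mean normalized Levenshtein similarity. The function names, the averaging rule, and the 0.5 threshold are all hypothetical choices for illustration; the abstract does not state the paper's actual scoring or cut-off.

```python
from statistics import mean


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize edit distance to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def reassign(doc, current, clusters, threshold=0.5):
    """Return the cluster label for doc, or None to mark it as an outlier.

    clusters maps a label to the texts already in that cluster (assumed
    not to include doc itself). The threshold is an illustrative value.
    """
    scores = {
        label: mean(similarity(doc, m) for m in members) if members else 0.0
        for label, members in clusters.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None              # too far from every cluster: outlier
    if scores.get(current) == scores[best]:
        return current           # on a tie, keep the document where it is
    return best                  # otherwise relocate to the closest cluster
```

With clusters such as {"sports": [...], "tech": [...]}, a document currently labelled "tech" that closely resembles the sports texts would be relocated, while a string unlike anything in either cluster would return None.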
Journal Article
Clustering Enhanced Error-tolerant Top-k Spatio-textual Search
2021
A large number of Location-Based Services are widely available on a variety of portable electronic devices. It is critical for them to efficiently support top-k queries that consider both spatial and textual relevance. Given the errors in both user input and the spatial databases, it is necessary to support error-tolerant spatio-textual search for end-users. Previous research mainly focused on set-based textual relevance, which makes it difficult to find reasonable results when the input tokens do not exactly match those in the records of the spatial database. We design a novel framework to support top-k spatio-textual search with fuzzy token matching. A hierarchical index is proposed to capture signatures of both spatial and textual relevance. Based on it, we devise two algorithms that preferentially access the nodes with more similar objects, while those with dissimilar ones can be pruned. We further propose a clustering-based approach to construct the index by leveraging textual information. We conduct extensive experiments on real-world POI datasets, and the results show that our framework outperforms state-of-the-art methods by a significant margin.
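To make the idea of top-k search with fuzzy token matching concrete, here is a brute-force sketch. It does not reproduce the hierarchical index or pruning algorithms from the abstract; it only shows one plausible way to combine a spatial score with an error-tolerant textual score. The linear weighting `alpha`, the `max_dist` normalization, and the use of `difflib.SequenceMatcher` as the token similarity are assumptions made for illustration.

```python
import heapq
import math
from difflib import SequenceMatcher


def token_sim(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (stand-in for an edit-based measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_text_sim(query_tokens, record_tokens) -> float:
    """Match each query token to its most similar record token, then average."""
    if not query_tokens or not record_tokens:
        return 0.0
    best = [max(token_sim(q, r) for r in record_tokens) for q in query_tokens]
    return sum(best) / len(best)


def spatial_sim(p, q, max_dist: float) -> float:
    """Map Euclidean distance to [0, 1]; anything beyond max_dist scores 0."""
    return max(0.0, 1.0 - math.dist(p, q) / max_dist)


def top_k(query_loc, query_tokens, pois, k=3, alpha=0.5, max_dist=10.0):
    """Return the k POIs with the highest combined score.

    Each POI is a (name, (x, y), tokens) tuple. alpha weights spatial
    against textual relevance; both parameters are illustrative.
    """
    def score(poi):
        _, loc, tokens = poi
        return (alpha * spatial_sim(query_loc, loc, max_dist)
                + (1 - alpha) * fuzzy_text_sim(query_tokens, tokens))
    return heapq.nlargest(k, pois, key=score)
```

A misspelled query such as ["starbuks", "cofee"] still ranks a nearby "Starbucks Coffee" record first, which an exact set-based token match would miss entirely.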
Journal Article
Research on Distributed Resource Management System for Product Design
by Wang, Sheng Fa; Pan, Min
2012
Resources play an important role in product design. In order to manage distributed, massive resources for product design in an open and dynamic interconnection environment, a resource integration framework is proposed. Resources are described by matrices in a resource space. The matrix description, combination, integration, semantic reasoning and semantic search of resources are studied. Based on the resource space model, a fuzzy decision-based method is proposed to decide which resource is optimal. Finally, an example of the product design of an eddy current retarder is given to verify the validity of distributed resource management for product design.
Journal Article
Combining translation memories and statistical machine translation using sparse features
by Way, Andy; Liu, Qun; Escartín, Carla Parra
in Artificial Intelligence, Computational Linguistics, Computer memory
2016
The combination of translation memories (TMs) and statistical machine translation (SMT) has been demonstrated to be beneficial. In this paper, we present a combination approach which integrates TMs into SMT by using sparse features extracted at run-time during decoding. These features can be used on both phrase-based SMT and syntax-based SMT. We conducted experiments on a publicly available English-French data set and an English-Spanish industrial data set. Our experimental results show that these features significantly improve our phrase-based and syntax-based SMT baselines on both language pairs.
Journal Article
A Fuzzy-Match Search Engine for Physician Directories
2014
A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names.
The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly.
Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data.
Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the accuracy of fuzzy-match engine-top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine-top one (71%).
We designed, created a reference implementation, and evaluated a fuzzy-match search engine for physician directories. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
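The three failure types named in this abstract (typographic, phonetic, nickname) suggest a pipeline along the following lines. This is a sketch under stated assumptions, not the authors' implementation: the tiny `NICKNAMES` table, the rule that halves the edit distance when Soundex codes agree, and the helper names are all hypothetical. Only the standard American Soundex coding and the Levenshtein distance are well-established techniques, and the sketch handles single name tokens only.

```python
# Hypothetical nickname table; the paper derives its table from US census data.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}

_SOUNDEX = {c: d for d, letters in {
    "1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r",
}.items() for c in letters}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX.get(letters[0], "")
    for c in letters[1:]:
        digit = _SOUNDEX.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":        # h and w do not separate repeated codes
            prev = digit
    return (code + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def name_distance(query: str, candidate: str) -> float:
    """Edit distance after nickname expansion, discounted on a phonetic match."""
    q = NICKNAMES.get(query.lower(), query.lower())
    c = NICKNAMES.get(candidate.lower(), candidate.lower())
    d = levenshtein(q, c)
    if d and soundex(q) == soundex(c):
        d *= 0.5                 # assumed discount, not taken from the paper
    return d


def top_matches(query: str, directory, k: int = 10):
    """Return the k closest directory names, ties broken alphabetically."""
    return sorted(directory, key=lambda n: (name_distance(query, n), n))[:k]
```

Returning the ten closest names rather than only the best one mirrors the top-ten versus top-one comparison reported in the evaluation.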
Journal Article
A Corpus-based Machine Translation Method of Term Extraction in LSP Texts
2014
To tackle the problems of term extraction in language for specific purposes (LSP), this paper proposes a method that coordinates the use of a corpus and a machine translation system to extract terms from LSP texts. A comparable corpus built for this research contains 167 English texts and 229 Chinese texts, with around 600,000 English tokens and 900,000 Chinese characters. The corpus is annotated with meta-information and tagged with POS for further use. To get the key word list from the corpus, BFSU PowerConc software is used with the referential corpora of Crown and CLOB for English and TORCH and LCMC for Chinese. A VB program is written to generate the multi-word units, then GOOGLE translators' toolkit is used to get translation pairs, and the SDL trados fuzzy match function is applied to extract lists of multi-word terms and their translations. The results show that with this method 70% of translated term pairs score 2.0 on a 0~3 grading scale with a 0.5 interval, as rated by human graders. The method can be applied to extract translation term pairs for computer-aided translation of language for specific purpose texts. Also, the by-product comparable corpus, combined with N-gram multiword unit lists, can be used to assist trainee translators in translation. The findings underline the significance of combining the use of machine translation methods with corpus techniques, and also foresee the necessity of comparable corpora building and sharing and Conc-gram extracting in this field.
Journal Article
Analysis on the construction of sports match prediction model using neural network
2020
To track the development of sports events in time and adjust strategy promptly during a match, the traditional back-propagation neural network (BPNN) algorithm was improved, and a match prediction model was constructed using the adaptive BPNN. The football match data of the Union of European Football Associations Champions League 2016–2017 were taken as the prediction sample. Moreover, taking some match data of Barcelona in 2016–2017 as an example, the fitting accuracy of the improved adaptive BPNN prediction model, the multiple linear regression (MLR) model and the grey degree prediction model were compared and analyzed. The research results showed that the prediction model built by the improved adaptive BPNN algorithm had smaller prediction error after rolling prediction. By comparing with the fitting accuracy of the MLR model and the grey prediction model, it was also found that the prediction error of the proposed model was almost zero, while the error of the other two models was large. This showed that the proposed prediction model had high accuracy and reliability. Therefore, applying a neural network to build a sports match prediction model and then predict match results can provide a certain theoretical basis for sports match practice and for the prediction and analysis of match results.
Journal Article
Study on resource service match and search in manufacturing grid system
by Tao, Fei; Zhao, Dongming; Zhou, Zude
in Algorithms, Computer-Aided Engineering (CAD, CAE) and Design
2009
Resource service match and search (RSMS) is the core of manufacturing grid (MGrid) resource scheduling. In order to realize RSMS effectively between resource demanders and providers, an RSMS framework is proposed and the key technologies to realize it are studied. The descriptive information of resource services is classified into four categories: (a) word concept information, (b) sentence information, (c) number information, including number interval and fuzzy number, and (d) entity class (or data structure) information. The similarity matching algorithms for each kind of descriptive information are investigated, respectively, including word matching algorithms, sentence matching algorithms, number matching algorithms, and entity class matching algorithms. Based on the proposed matching algorithms, the match and search processes of MGrid resource services are divided into four phases: first, matching the basic information of resource services, such as service name and service description, namely, basic matching; second, matching the input and output information of resource services, namely, I/O matching; third, matching the quality of service (QoS) information of resource services, namely, QoS matching; last, combining the above three matching results and generating an integrated matching result, namely, integrated matching. The matching functions and algorithms of each phase are described in detail. A case study illustrates the application of the proposed methods, and the accuracy and efficiency of the proposed method are measured.
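The four-phase process in this abstract (basic, I/O, QoS, then integrated matching) can be sketched as three per-phase scores combined into one. Everything below is an illustrative assumption: the abstract does not give the matching functions, so string similarity, Jaccard overlap, a threshold check, and a weighted sum with made-up weights stand in for them, and the dictionary layout of a service description is hypothetical.

```python
from difflib import SequenceMatcher


def basic_match(demand: dict, service: dict) -> float:
    """Phase 1: similarity of basic information (here, only the service name)."""
    return SequenceMatcher(None, demand["name"].lower(),
                           service["name"].lower()).ratio()


def _jaccard(a, b) -> float:
    a, b = set(a), set(b)
    return 1.0 if not a and not b else len(a & b) / len(a | b)


def io_match(demand: dict, service: dict) -> float:
    """Phase 2: average overlap of declared inputs and outputs."""
    return (_jaccard(demand["inputs"], service["inputs"])
            + _jaccard(demand["outputs"], service["outputs"])) / 2


def qos_match(demand: dict, service: dict) -> float:
    """Phase 3: fraction of required minimum QoS levels the service meets."""
    required = demand["qos"]
    if not required:
        return 1.0
    met = sum(1 for key, minimum in required.items()
              if service["qos"].get(key, 0.0) >= minimum)
    return met / len(required)


def integrated_match(demand: dict, service: dict,
                     weights=(0.3, 0.4, 0.3)) -> float:
    """Phase 4: weighted sum of the three phase scores (weights are assumed)."""
    scores = (basic_match(demand, service),
              io_match(demand, service),
              qos_match(demand, service))
    return sum(w * s for w, s in zip(weights, scores))
```

Ranking candidate services by `integrated_match` then gives a single ordered list for a demander, while the per-phase scores remain available for filtering out services that fail one phase outright.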
Journal Article