20 result(s) for "Fuzzy match"
Risk factors for recurrent tuberculosis after successful treatment in a high burden setting: a cohort study
Background: People successfully completing treatment for tuberculosis remain at elevated risk for recurrent disease, either from relapse or reinfection. Identifying risk factors for recurrent tuberculosis may help target post-tuberculosis screening and care. Methods: We enrolled 500 patients with smear-positive pulmonary tuberculosis in South Africa and collected baseline data on demographics, clinical presentation and sputum mycobacterial cultures for 24-loci mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing. We used routinely collected administrative data to identify recurrent episodes of tuberculosis occurring over a median of six years after successful treatment completion. Results: Of 500 patients initially enrolled, 333 (79%) successfully completed treatment for tuberculosis. During the follow-up period 35 patients with successful treatment (11%) experienced a bacteriologically confirmed tuberculosis recurrence. In our Cox proportional hazards model, a 3+ AFB sputum smear grade was significantly associated with recurrent tuberculosis with a hazard ratio of 3.33 (95% CI 1.44–7.7). The presence of polyclonal M. tuberculosis infection at baseline had a hazard ratio for recurrence of 1.96 (95% CI 0.86–4.48). Conclusion: Our results indicate that AFB smear grade is independently associated with tuberculosis recurrence after successful treatment for an initial episode, while the association between polyclonal M. tuberculosis infection and increased risk of recurrence appears possible.
Hybrid topic modeling method based on dirichlet multinomial mixture and fuzzy match algorithm for short text clustering
Topic modeling methods have proved effective for inferring latent topics from short texts. Dealing with short texts is challenging yet helpful for many real-world applications, because the terms in the text are sparse and the representation is high-dimensional. Most topic modeling methods require the number of topics to be defined in advance. Similarly, methods based on the Dirichlet Multinomial Mixture (DMM) require the maximum possible number of topics before execution, which is hard to determine because of topic uncertainty and the noise present in the dataset. Hence, a new approach called the Topic Clustering algorithm based on Levenshtein Distance (TCLD) is introduced in this paper. TCLD combines DMM models and the fuzzy matching algorithm to address two key challenges in topic modeling: (a) the outlier problem in topic modeling methods, and (b) the problem of determining the optimal number of topics. TCLD uses the initial clustered topics generated by DMM models and then evaluates the semantic relationships between documents using Levenshtein Distance. Subsequently, it determines whether to keep the document in the same cluster, relocate it to another cluster, or mark it as an outlier. The results demonstrate the efficiency of the proposed approach across six English benchmark datasets, in comparison to seven topic modeling approaches, with 83% improvement in purity and 67% enhancement in Normalized Mutual Information (NMI) across all datasets. The proposed method was also applied to a collected Arabic tweet dataset, and the results showed that only 12% of the Arabic short texts were incorrectly clustered, according to human inspection.
Clustering Enhanced Error-tolerant Top-k Spatio-textual Search
A large number of Location-Based Services are widely available on a variety of portable electronic devices. It is critical for them to efficiently support top-k queries considering both spatial and textual relevance. Given the errors in both user input and the spatial databases, it is necessary to support error-tolerant spatio-textual search for end-users. Previous research mainly focused on set-based textual relevance, which makes it difficult to find reasonable results when the input tokens do not exactly match those from the records in the spatial database. We design a novel framework to support top-k spatio-textual search with fuzzy token matching. A hierarchical index is proposed to capture signatures of both spatial and textual relevance. Based on it, we devise two algorithms to preferentially access the nodes with more similar objects while those with dissimilar ones can be pruned. We further propose a clustering-based approach to construct the index by leveraging textual information. We conduct extensive experiments on real-world POI datasets, and the results show that our framework outperforms state-of-the-art methods by a significant margin.
Research on Distributed Resource Management System for Product Design
Resources play an important role in product design. To manage distributed and massive resources for product design in an open and dynamic interconnected environment, a resource integration framework is proposed. Resources are described by matrices in a resource space. The matrix description, combination, integration, semantic reasoning and semantic search on resources are studied. Based on the resource space model, a fuzzy decision-based method is proposed to decide which resource is optimal. Finally, an example of the product design of an eddy current retarder is given to verify the validity of distributed resource management for product design.
Combining translation memories and statistical machine translation using sparse features
The combination of translation memories (TMs) and statistical machine translation (SMT) has been demonstrated to be beneficial. In this paper, we present a combination approach which integrates TMs into SMT by using sparse features extracted at run-time during decoding. These features can be used on both phrase-based SMT and syntax-based SMT. We conducted experiments on a publicly available English-French data set and an English-Spanish industrial data set. Our experimental results show that these features significantly improve our phrase-based and syntax-based SMT baselines on both language pairs.
A Fuzzy-Match Search Engine for Physician Directories
A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to help users find physicians more easily and quickly. Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the accuracy of fuzzy-match engine-top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine-top one (71%). We designed a fuzzy-match search engine for physician directories, created a reference implementation, and evaluated it. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
A Corpus-based Machine Translation Method of Term Extraction in LSP Texts
To tackle the problems of term extraction in language for specific purposes (LSP) fields, this paper proposes a method that coordinates the use of a corpus and a machine translation system to extract terms from LSP texts. A comparable corpus built for this research contains 167 English texts and 229 Chinese texts with around 600,000 English tokens and 900,000 Chinese characters. The corpus is annotated with meta-information and tagged with POS for further use. To get the keyword list from the corpus, BFSU PowerConc software is used with the reference corpora of Crown and CLOB for English and TORCH and LCMC for Chinese. A VB program is written to generate the multi-word units, then Google Translator Toolkit is used to get translation pairs, and the SDL Trados fuzzy match function is applied to extract lists of multi-word terms and their translations. The results show that with this method 70% of translated term pairs score 2.0 on a 0 to 3 grading scale with a 0.5 interval, as rated by human graders. The method can be applied to extract translation term pairs for computer-aided translation of LSP texts. Also, the by-product comparable corpus, combined with N-gram multi-word unit lists, can be used to support trainee translators in translation. The findings underline the significance of combining machine translation methods with corpus techniques, and also point to the need for building and sharing comparable corpora and for Conc-gram extraction in this field.
Analysis on the construction of sports match prediction model using neural network
To follow the development of sports events and adjust strategy during a match in a timely manner, the traditional back-propagation neural network (BPNN) algorithm was improved, and a match prediction model was constructed using the adaptive BPNN. Football match data from the Union of European Football Associations Champions League 2016–2017 were taken as the prediction sample. Moreover, taking some match data of Barcelona in 2016–2017 as an example, the fitting accuracy of the improved adaptive BPNN prediction model, the multiple linear regression (MLR) model and the grey prediction model were compared and analyzed. The results showed that the prediction model built with the improved adaptive BPNN algorithm had a smaller prediction error after rolling prediction. A comparison with the fitting accuracy of the MLR model and the grey prediction model also found that the prediction error of the proposed model was almost zero, while the error of the other two models was large. This indicates that the proposed prediction model has high accuracy and reliability. Therefore, applying a neural network to build a sports match prediction model and then predict match results can provide a theoretical basis for sports match practice and for the prediction and analysis of match results.
Study on resource service match and search in manufacturing grid system
Resource service match and search (RSMS) is the core of manufacturing grid (MGrid) resource scheduling. To realize RSMS effectively between resource demanders and providers, an RSMS framework is proposed and the key technologies to realize it are studied. The descriptive information of resource services is classified into four categories: (a) word concept information, (b) sentence information, (c) number information, including number interval and fuzzy number, and (d) entity class (or data structure) information. The similarity matching algorithms for each kind of descriptive information are investigated, including word matching algorithms, sentence matching algorithms, number matching algorithms, and entity class matching algorithms. Based on the proposed matching algorithms, the match and search processes of MGrid resource services are divided into four phases: first, matching the basic information of resource services, such as service name and service description, namely, basic matching; second, matching the input and output information of resource services, namely, I/O matching; third, matching the quality of service (QoS) information of resource services, namely, QoS matching; last, combining the above three matching results and generating an integrated matching result, namely, integrated matching. The matching functions and algorithms of each phase are described in detail. A case study illustrates the application of the proposed methods, and the accuracy and efficiency of the proposed method are measured.