Catalogue Search | MBRL
Explore the vast range of titles available.
4,950 result(s) for "Information retrieval Automation."
Fuzzy information retrieval
Information retrieval used to mean looking through thousands of strings of text to find words or symbols that matched a user's query. Today, there are many models that help index and search more effectively, so retrieval takes far less time. Information retrieval (IR) is often seen as a subfield of computer science and shares modeling techniques, applications, and storage approaches with disciplines such as artificial intelligence, database management, and parallel computing. This book introduces the topic of IR and how it differs from other computer science disciplines. A brief history of modern IR is presented, and the notation of IR as used in this book is defined. The complex notion of relevance is discussed. Some applications of IR are noted as well, since IR has many practical uses today. Using information retrieval with fuzzy logic to search for software terms can help find software components and ultimately help increase the reuse of software. This is just one practical application of IR that is covered in this book. Some of the classical models of IR are presented as a contrast to extending the Boolean model, including a brief mention of the source of weights for the various models. In a typical retrieval environment, answers are either yes or no, i.e., on or off. Fuzzy logic, on the other hand, can provide a "degree of" match, versus a crisp, i.e., strict, match. This, too, is explored in detail, showing how it can be applied to information retrieval. Fuzzy logic is often considered a soft-computing technique, and this book explores how IR with fuzzy logic, using membership functions as weights, can help indexing, querying, and matching. Since fuzzy set theory and fuzzy logic are applied to IR systems, an explanation of where the fuzziness lies follows. The concept of relevance feedback, including pseudo-relevance feedback, is explored for the various models of IR. For the extended Boolean model, the use of genetic algorithms for relevance feedback is examined. The concept of query expansion is explored using rough set theory. Various term relationships are modeled and presented, and the model is extended for fuzzy retrieval. An example using UMLS terms is presented, and the model is further extended to term relationships beyond synonyms. Finally, the book looks at clustering, both crisp and fuzzy, to see how it can improve retrieval performance. An example is presented to illustrate the concepts.
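The contrast drawn above between crisp and fuzzy matching can be sketched in a few lines. The Python fragment below is illustrative only and is not taken from the book; the documents, membership degrees, and query are invented. It compares a crisp Boolean AND with a fuzzy AND that takes the minimum membership degree as the degree of match.

```python
# Minimal sketch of fuzzy vs. crisp Boolean retrieval (illustrative only;
# the documents, term weights, and query are invented, not from the book).

# Each document maps terms to membership degrees in [0, 1], e.g. derived
# from normalized term frequencies or an indexer's judgement.
docs = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.7, "logic": 0.4},
    "d2": {"fuzzy": 0.2, "retrieval": 0.9},
    "d3": {"retrieval": 0.5, "clustering": 0.8},
}

def crisp_and(doc, terms):
    """Classic Boolean AND: a document either matches or it does not."""
    return all(term in doc for term in terms)

def fuzzy_and(doc, terms):
    """Fuzzy AND: degree of match is the minimum membership over the terms."""
    return min(doc.get(term, 0.0) for term in terms)

query = ["fuzzy", "retrieval"]
for name, doc in docs.items():
    print(name, crisp_and(doc, query), round(fuzzy_and(doc, query), 2))
# d1 True 0.7   d2 True 0.2   d3 False 0.0 -> fuzzy scores rank d1 above d2
```

Where crisp retrieval treats d1 and d2 as equally good answers, the fuzzy degrees provide a ranking, which is the practical benefit the book develops.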
Big Data, Little Data, No Data
by
Borgman, Christine L
in
Big data
,
Communication in learning and scholarship
,
Communication in learning and scholarship -- Technological innovations
2015,2016,2017
"Big Data" is on the covers of Science, Nature, the Economist, and Wired magazines, on the front pages of the Wall Street Journal and the New York Times. But despite the media hyperbole, as Christine Borgman points out in this examination of data and scholarly research, having the right data is usually better than having more data; little data can be just as valuable as big data. In many cases, there are no data -- because relevant data don't exist, cannot be found, or are not available. Moreover, data sharing is difficult, incentives to do so are minimal, and data practices vary widely across disciplines. Borgman, an often-cited authority on scholarly communication, argues that data have no value or meaning in isolation; they exist within a knowledge infrastructure -- an ecology of people, practices, technologies, institutions, material objects, and relationships. After laying out the premises of her investigation -- six "provocations" meant to inspire discussion about the uses of data in scholarship -- Borgman offers case studies of data practices in the sciences, the social sciences, and the humanities, and then considers the implications of her findings for scholarly practice and research policy. To manage and exploit data over the long term, Borgman argues, requires massive investment in knowledge infrastructures; at stake is the future of scholarship.
Improving the translation of search strategies using the Polyglot Search Translator: a randomized controlled trial
by
Clark, Justin Michael
,
Carter, Matthew
,
Jones, Mark
in
Analysis
,
automation
,
Bibliographic data bases
2020
Background: Searching for studies to include in a systematic review (SR) is a time- and labor-intensive process with searches of multiple databases recommended. To reduce the time spent translating search strings across databases, a tool called the Polyglot Search Translator (PST) was developed. The authors evaluated whether using the PST as a search translation aid reduces the time required to translate search strings without increasing errors.Methods: In a randomized trial, twenty participants were randomly allocated ten database search strings and then randomly assigned to translate five with the assistance of the PST (PST-A method) and five without the assistance of the PST (manual method). We compared the time taken to translate search strings, the number of errors made, and how close the number of references retrieved by a translated search was to the number retrieved by a reference standard translation.Results: Sixteen participants performed 174 translations using the PST-A method and 192 translations using the manual method. The mean time taken to translate a search string with the PST-A method was 31 minutes versus 45 minutes by the manual method (mean difference: 14 minutes). The mean number of errors made per translation by the PST-A method was 8.6 versus 14.6 by the manual method. Large variation in the number of references retrieved makes results for this outcome unreliable, although the number of references retrieved by the PST-A method was closer to the reference standard translation than the manual method.Conclusion: When used to assist with translating search strings across databases, the PST can increase the speed of translation without increasing errors. Errors in search translations can still be a problem, and search specialists should be aware of this.
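To make concrete what "translating a search string" involves, here is a deliberately simplified Python sketch. It is not the Polyglot Search Translator's code; the two field-tag rewrites shown (PubMed [tiab] and [ti] to Embase.com :ti,ab and :ti) are only a toy subset of real translation work, which also requires subject-heading mapping (MeSH vs. Emtree) that no mechanical substitution can handle.

```python
import re

# Toy field-tag rewriting from PubMed syntax to Embase.com syntax
# (illustrative only; not the PST's actual rule set).
FIELD_TAG_MAP = {
    r"\[tiab\]": ":ti,ab",  # PubMed Title/Abstract -> Embase.com title/abstract
    r"\[ti\]": ":ti",       # PubMed Title -> Embase.com title
}

def pubmed_to_embase(query: str) -> str:
    for pubmed_tag, embase_tag in FIELD_TAG_MAP.items():
        query = re.sub(pubmed_tag, embase_tag, query, flags=re.IGNORECASE)
    return query

print(pubmed_to_embase('"search filter"[tiab] AND randomized[ti]'))
# '"search filter":ti,ab AND randomized:ti'
```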
Journal Article
Informatica
2023
Informatica, the updated edition of Alex Wright's previously published Glut, continues the journey through the history of the information age to show how information systems emerge. Today's "information explosion" may seem like a modern phenomenon, but we are not the first generation, or even the first species, to wrestle with the problem of information overload. Long before the advent of computers, human beings were collecting, storing, and organizing information: from Ice Age taxonomies to Sumerian archives, Greek libraries to Christian monasteries.
Wright weaves a narrative that connects such seemingly far-flung topics as insect colonies, Stone Age jewelry, medieval monasteries, Renaissance encyclopedias, early computer networks, and the World Wide Web. He suggests that the future of the information age may lie deep in our cultural past.
We stand at a precipice, struggling to cope with a tsunami of data. Wright provides some much-needed historical perspective. We can understand the predicament of information overload not just as the result of technological change but as the latest chapter in an ancient story that we are only beginning to understand.
Optical cryptosystems
2020
Advanced technologies such as artificial intelligence, big data, cloud computing, and the Internet of Things have changed the digital landscape, providing many new and exciting opportunities. However, they also provide ever-shifting gateways for information theft or misuse. Staying ahead requires the development of innovative and responsive security measures, and recent advances in optical technology have positioned it as a promising alternative to digital cryptography. Optical Cryptosystems introduces the subject of optical cryptography and provides up-to-date coverage of optical security schemes. Optical principles, approaches, and algorithms are discussed as well as applications, including image/data encryption-decryption, watermarking, image/data hiding, and authentication verification. This book also includes MATLAB® codes, enabling students and research professionals to carry out exercises and develop newer methods of image/data security and authentication.
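As a rough illustration of the kind of scheme such a book covers, the following Python/NumPy fragment simulates double random phase encoding, a classic optical encryption technique. It is an invented numerical example under assumed parameters, not the MATLAB code supplied with the book.

```python
import numpy as np

# Numerical sketch of double random phase encoding (DRPE): the image is
# multiplied by a random phase mask, Fourier transformed, multiplied by a
# second random phase mask, and inverse transformed. The two masks are the keys.
rng = np.random.default_rng(seed=0)

def drpe_encrypt(image, phase1, phase2):
    field = image * np.exp(2j * np.pi * phase1)            # input-plane mask
    spectrum = np.fft.fft2(field) * np.exp(2j * np.pi * phase2)  # Fourier-plane mask
    return np.fft.ifft2(spectrum)

def drpe_decrypt(cipher, phase1, phase2):
    spectrum = np.fft.fft2(cipher) * np.exp(-2j * np.pi * phase2)
    field = np.fft.ifft2(spectrum) * np.exp(-2j * np.pi * phase1)
    return np.abs(field)

img = rng.random((64, 64))          # stand-in for a grayscale image
p1, p2 = rng.random((2, 64, 64))    # the two secret phase masks (the keys)
restored = drpe_decrypt(drpe_encrypt(img, p1, p2), p1, p2)
print(np.allclose(restored, img))   # True: decryption recovers the image
```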
Machine learning reduced workload with minimal risk of missing studies: development and evaluation of a randomized controlled trial classifier for Cochrane Reviews
by
Marshall, Iain J.
,
Elliott, Julian
,
Mavergames, Chris
in
Algorithms
,
Automation
,
Bibliographic data bases
2021
This study developed, calibrated, and evaluated a machine learning classifier designed to reduce study identification workload in Cochrane for producing systematic reviews.
A machine learning classifier for retrieving randomized controlled trials (RCTs) was developed (the “Cochrane RCT Classifier”), with the algorithm trained using a data set of title–abstract records from Embase, manually labeled by the Cochrane Crowd. The classifier was then calibrated using a further data set of similar records manually labeled by the Clinical Hedges team, aiming for 99% recall. Finally, the recall of the calibrated classifier was evaluated using records of RCTs included in Cochrane Reviews that had abstracts of sufficient length to allow machine classification.
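The calibration step described above, choosing a score threshold that keeps recall for RCTs at a target level, can be sketched as follows. This Python fragment is a hedged illustration using synthetic scores and labels, not Cochrane's actual code or data.

```python
import numpy as np

# Pick the highest score threshold that still keeps recall for the positive
# class (records reporting RCTs) at or above a target such as 99%.
def calibrate_threshold(scores, labels, target_recall=0.99):
    positive_scores = np.sort(scores[labels == 1])
    # Allow at most (1 - target_recall) of true RCTs to fall below the cut-off.
    n_allowed_misses = int(np.floor((1 - target_recall) * len(positive_scores)))
    return positive_scores[n_allowed_misses]  # lowest score we must still accept

# Synthetic calibration data: classifier scores and true labels (1 = RCT).
rng = np.random.default_rng(1)
labels = (rng.random(5000) < 0.03).astype(int)
scores = np.clip(rng.normal(0.2 + 0.6 * labels, 0.15), 0, 1)

threshold = calibrate_threshold(scores, labels, target_recall=0.99)
recall = (scores[labels == 1] >= threshold).mean()
print(f"threshold={threshold:.3f}, recall on calibration set={recall:.3f}")
```

Calibrating for very high recall deliberately sacrifices precision, which is why the reported precision of the classifier is low while almost no eligible RCTs are missed.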
The Cochrane RCT Classifier was trained using 280,620 records (20,454 of which reported RCTs). A classification threshold was set using 49,025 calibration records (1,587 of which reported RCTs), and our bootstrap validation found the classifier had recall of 0.99 (95% confidence interval 0.98–0.99) and precision of 0.08 (95% confidence interval 0.06–0.12) in this data set. The final, calibrated RCT classifier correctly retrieved 43,783 (99.5%) of 44,007 RCTs included in Cochrane Reviews but missed 224 (0.5%). Older records were more likely to be missed than those more recently published.
The Cochrane RCT Classifier can reduce manual study identification workload for Cochrane Reviews, with a very low and acceptable risk of missing eligible RCTs. This classifier now forms part of the Evidence Pipeline, an integrated workflow deployed within Cochrane to help improve the efficiency of the study identification processes that support systematic review production.
• Systematic review processes need to become more efficient.
• Machine learning is sufficiently mature for real-world use.
• A machine learning classifier was built using data from Cochrane Crowd.
• It was calibrated to achieve very high recall.
• It is now live and in use in Cochrane review production systems.
Journal Article
Discovering knowledge in data : an introduction to data mining
2005,2004
DANIEL T. LAROSE received his PhD in statistics from the University of Connecticut. An associate professor of statistics at Central Connecticut State University, he developed and directs Data Mining@CCSU, the world's first online master of science program in data mining. He has also worked as a data mining consultant for Connecticut-area companies. He is currently working on the next two books of his three-volume data mining series, Data Mining Methods and Models and Data Mining the Web: Uncovering Patterns in Web Content, scheduled for publication in 2005 and 2006, respectively.
Handbook of evaluation methods for health informatics
by
Brender, Jytte
in
Decision Support Techniques
,
Evaluation
,
Information storage and retrieval systems
2006
This Handbook provides a complete compendium of methods for the evaluation of IT-based systems and solutions within healthcare. Emphasis is entirely on assessment of the IT system within its organizational environment. The author provides a coherent and complete assessment of methods addressing interactions with, and effects of, technology at the organizational, psychological, and social levels. The book explains the terminology and theoretical foundations underlying the methodological analysis presented, and the author carefully guides the reader through the process of identifying relevant methods corresponding to specific information needs and conditions for carrying out the evaluation study. The Handbook takes a critical view by focusing on assumptions for application and tacit built-in perspectives of the methods, as well as their perils and pitfalls.
• Collects a number of evaluation methods for medical informatics
• Addresses metrics and measures
• Includes an extensive list of annotated references, case studies, and a list of useful Web sites
Autonomous chemical research with large language models
2023
Transformer-based large language models are making significant strides in various fields, such as natural language processing [1–5], biology [6, 7], chemistry [8–10] and computer programming [11, 12]. Here, we show the development and capabilities of Coscientist, an artificial intelligence system driven by GPT-4 that autonomously designs, plans and performs complex experiments by incorporating large language models empowered by tools such as internet and documentation search, code execution and experimental automation. Coscientist showcases its potential for accelerating research across six diverse tasks, including the successful reaction optimization of palladium-catalysed cross-couplings, while exhibiting advanced capabilities for (semi-)autonomous experimental design and execution. Our findings demonstrate the versatility, efficacy and explainability of artificial intelligence systems like Coscientist in advancing research.
Coscientist is an artificial intelligence system driven by GPT-4 that autonomously designs, plans and performs experiments by incorporating large language models empowered by tools such as internet and documentation search, code execution and experimental automation.
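The architecture described, a language-model planner that chooses among tools for search, code execution and lab automation, can be caricatured in a short loop. The sketch below is a generic illustration under assumed interfaces, not Coscientist's implementation; scripted_llm stands in for the GPT-4 planner and the tool functions are stubs.

```python
from typing import Callable, Dict, List

# Generic plan-act loop for a tool-using LLM agent (illustrative only).
TOOLS: Dict[str, Callable[[str], str]] = {
    "SEARCH": lambda query: f"(documentation found for {query!r})",
    "PYTHON": lambda code: f"(result of executing {code!r})",
    "EXPERIMENT": lambda plan: f"(liquid handler executed {plan!r})",
}

def scripted_llm(history: List[str]) -> str:
    """Stand-in planner: emits a fixed action script; a real system calls an LLM."""
    script = ["SEARCH: cross-coupling conditions",
              "PYTHON: compute reagent volumes",
              "EXPERIMENT: run plate of couplings",
              "DONE: optimized conditions identified"]
    return script[min(len(history) - 1, len(script) - 1)]

def run_agent(task: str, max_steps: int = 10) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        decision = scripted_llm(history)
        action, _, argument = decision.partition(": ")
        if action == "DONE":
            return argument
        observation = TOOLS.get(action, lambda _a: "unknown tool")(argument)
        history.append(f"{decision} -> {observation}")
    return "step budget exhausted"

print(run_agent("optimize a palladium-catalysed cross-coupling"))
```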
Journal Article
A Survey on Application of Knowledge Graph
2020
Knowledge graphs, representations of information as a semantic graph, have attracted wide attention in both industry and academia. Their property of providing semantically structured information has yielded important potential solutions for many tasks, including question answering, recommendation, and information retrieval, and many researchers consider them to hold great promise for building more intelligent machines. Although knowledge graphs have already supported multiple "Big Data" applications in all sorts of commercial and scientific domains since Google coined the term in 2012, no previous study has provided a systematic review of the applications of knowledge graphs. Therefore, unlike other related work that focuses on the construction techniques of knowledge graphs, the present paper aims to provide a first survey of these applications across different domains. The paper also points out that while important advances have been made in recent years in applying knowledge graphs' ability to provide semantically structured information to specific domains, several aspects still remain to be explored.
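As a minimal illustration of the survey's starting point, that a knowledge graph stores facts as subject-predicate-object triples which structured queries can exploit, consider the following invented Python example (the triples are made up for illustration).

```python
# Tiny triple store and lookup (illustrative only; triples are invented).
triples = [
    ("Google", "coined_term", "Knowledge Graph"),
    ("Knowledge Graph", "is_a", "semantic graph"),
    ("Knowledge Graph", "supports", "question answering"),
    ("Knowledge Graph", "supports", "recommendation"),
    ("Knowledge Graph", "supports", "information retrieval"),
]

def objects(subject: str, predicate: str) -> list[str]:
    """Return all objects linked from `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# "Which tasks does a knowledge graph support?" answered by a graph lookup:
print(objects("Knowledge Graph", "supports"))
# ['question answering', 'recommendation', 'information retrieval']
```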
Journal Article