Catalogue Search | MBRL
3 result(s) for "MSCOCO"
Content Based Image Retrieval (CBIR) For Storing Of Products
As materials grow more complex and the necessary skills remain scarce among the general workforce, employees need a system that helps them identify materials without additional labour, training or manuals. Hence, a system has been created that identifies a material and returns similar materials when given only an image of the material in question. Because the database is unlinked, the privacy of the company is maintained. Convolutional neural networks are used for object detection, and captioning is then done by a recurrent neural network, which also compares the generated caption with the provided database and returns the best match. LSTM and NLP aid the process of caption generation and search. The dataset used is MSCOCO 2014. The evaluation metric is BLEU, on which the system scored 70.3%. The approach is easy to combine with concepts like KANBAN and makes the layout design of the company considerably more flexible. By reducing the time spent on training and sorting, the efficiency of the firm is increased. The system created for identifying the material can be interlinked with a centralised inventory management system to help track the material in the production process. The layout of the store was not optimised, which resulted in delays in the production process. This can be rectified by using software such as ARENA to design the store layout. With a system in place for material identification, the steps followed in stores can be reduced.
Journal Article
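The abstract above mentions comparing a generated caption with a provided database and returning the best match, but does not say how the comparison is done. As a rough illustration only, the sketch below assumes a simple word-overlap (Jaccard) score; the function names, the catalogue entries and the item IDs are all hypothetical and are not taken from the article.

```python
def tokenize(text):
    """Lowercase a caption and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(caption, catalogue):
    """Return the (item_id, description) pair whose description
    shares the largest fraction of words with the caption."""
    query = tokenize(caption)
    return max(
        catalogue.items(),
        key=lambda entry: jaccard(query, tokenize(entry[1])),
    )


# Hypothetical material database: item ID -> stored description.
catalogue = {
    "M-001": "a steel bolt with a hex head",
    "M-002": "a roll of copper wire",
    "M-003": "a sheet of plywood",
}

# A caption such as a captioning model might produce (made up here).
item_id, description = best_match("a coil of copper wire on a table", catalogue)
print(item_id, description)
```

A real system would more likely compare learned text embeddings than raw word sets; word overlap is used here only because it needs no trained model.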
Learning DALTS for cross-modal retrieval
by Wang, Wenmin; Yu, Zheng
in Adaptation; B6135 Optical, image and video signal processing; C5260B Computer vision and image processing techniques
2019
Cross-modal retrieval has been recently proposed to find an appropriate subspace, where the similarity across different modalities such as image and text can be directly measured. In this study, different from most existing works, the authors propose a novel model for cross-modal retrieval based on a domain-adaptive limited text space (DALTS) rather than a common space or an image space. Experimental results on three widely used datasets, Flickr8K, Flickr30K and Microsoft Common Objects in Context (MSCOCO), show that the proposed method, dubbed DALTS, is able to learn superior text space features which can effectively capture the necessary information for cross-modal retrieval. Meanwhile, DALTS achieves promising improvements in accuracy for cross-modal retrieval compared with the current state-of-the-art methods.
Journal Article
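The idea of measuring similarity across modalities directly in one space can be illustrated with a small ranking sketch. It is not the DALTS model: it assumes that some encoder has already mapped an image and several candidate sentences into the same vector space, and the three-dimensional vectors below are invented for illustration. Cosine similarity is likewise an assumed choice of measure.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_vec, candidates):
    """Sort candidate names by descending cosine similarity to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine(query_vec, candidates[name]),
        reverse=True,
    )


# Hypothetical embeddings; a real system would obtain these from trained encoders.
image_vec = [0.9, 0.1, 0.0]
text_vecs = {
    "a dog on grass": [1.0, 0.0, 0.0],
    "a plate of food": [0.0, 1.0, 0.0],
    "a city street": [0.0, 0.0, 1.0],
}

ranking = rank(image_vec, text_vecs)
print(ranking[0])
```

Swapping the roles (a sentence vector as the query, image vectors as candidates) gives text-to-image retrieval with the same two functions.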
Visual Linguistic Model and Its Applications in Image Captioning
by Kumar, Ravin
in Accuracy; Advances in Computational Approaches for Artificial Intelligence; Comparative analysis
2020
Image captioning is the well-known task of generating a textual description of a given image. Research on this problem requires effort in both the computer vision and natural language processing domains to obtain better-quality image descriptions. In this paper, we propose a new deep learning approach to generating image captions. In this approach, we generate a sequence of visual embeddings for the objects present in the image and their relationships. These visual embeddings are arranged in a particular manner and are then supplied to the encoder part of an attention-based sequence-to-sequence model. In the final step, we receive the generated image captions from the decoder part of our sequence-to-sequence model. We tested its performance on the MSCOCO dataset, and the results suggest that our model generates better image captions on the MSCOCO test set.
Journal Article