Catalogue Search | MBRL
1 result(s) for "Flickr30K"
Learning DALTS for cross-modal retrieval
by Wang, Wenmin; Yu, Zheng
in Adaptation; B6135 Optical, image and video signal processing; C5260B Computer vision and image processing techniques
2019
Cross-modal retrieval has been recently proposed to find an appropriate subspace, where the similarity across different modalities such as image and text can be directly measured. In this study, different from most existing works, the authors propose a novel model for cross-modal retrieval based on a domain-adaptive limited text space (DALTS) rather than a common space or an image space. Experimental results on three widely used datasets, Flickr8K, Flickr30K and Microsoft Common Objects in Context (MSCOCO), show that the proposed method, dubbed DALTS, is able to learn superior text space features which can effectively capture the necessary information for cross-modal retrieval. Meanwhile, DALTS achieves promising improvements in accuracy for cross-modal retrieval compared with the current state-of-the-art methods.
Journal Article