Characterization and classification of semantic image-text relations

dc.bibliographicCitation.firstPage: 31 [eng]
dc.bibliographicCitation.issue: 1 [eng]
dc.bibliographicCitation.volume: 9 [eng]
dc.contributor.author: Otto, C.
dc.contributor.author: Springstein, M.
dc.contributor.author: Anand, A.
dc.contributor.author: Ewerth, R.
dc.date.accessioned: 2020-07-21T09:12:27Z
dc.date.available: 2020-07-21T09:12:27Z
dc.date.issued: 2020
dc.description.abstract: The beneficial, complementary nature of visual and textual information to convey information is widely known, for example, in entertainment, news, advertisements, science, or education. While the complex interplay of image and text to form semantic meaning has been thoroughly studied in linguistics and communication sciences for several decades, computer vision and multimedia research has largely remained on the surface of the problem. An exception is previous work that introduced the two metrics Cross-Modal Mutual Information and Semantic Correlation in order to model complex image-text relations. In this paper, we motivate the necessity of an additional metric called Status in order to cover complex image-text relations more completely. This set of metrics enables us to derive a novel categorization of eight semantic image-text classes based on three dimensions. In addition, we demonstrate how to automatically gather and augment a dataset for these classes from the Web. Further, we present a deep learning system to automatically predict each of the three metrics, as well as a system to directly predict the eight image-text classes. Experimental results show the feasibility of the approach, whereby the predict-all approach outperforms the cascaded approach of the metric classifiers. [eng]
dc.description.version: publishedVersion [eng]
dc.identifier.uri: https://doi.org/10.34657/3700
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/5071
dc.language.iso: eng [eng]
dc.publisher: Berlin : Springer Nature [eng]
dc.relation.doi: https://doi.org/10.1007/s13735-019-00187-6
dc.relation.ispartofseries: International Journal of Multimedia Information Retrieval 9 (2020), Nr. 1 [eng]
dc.relation.issn: 2192-6611
dc.rights.license: CC BY 4.0 Unported [eng]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [eng]
dc.subject: Data augmentation [eng]
dc.subject: Image-text class [eng]
dc.subject: Multimodality [eng]
dc.subject: Semantic gap [eng]
dc.subject.ddc: 020 [eng]
dc.title: Characterization and classification of semantic image-text relations [eng]
dc.type: article [eng]
dc.type: Text [eng]
dcterms.bibliographicCitation.journalTitle: International Journal of Multimedia Information Retrieval [eng]
tib.accessRights: openAccess [eng]
wgl.contributor: TIB [eng]
wgl.subject: Informatik [eng]
wgl.type: Zeitschriftenartikel [eng]
Files
Original bundle
Name: Otto et al 2020, Characterization and classification of semantic image-text relations.pdf
Size: 5.5 MB
Format: Adobe Portable Document Format