Multimodal news analytics using measures of cross-modal entity and context consistency

Müller-Budack, Eric; Theiner, Jonas; Diering, Sebastian; Idahl, Maximilian; Hakimov, Sherzod; Ewerth, Ralph

doi:https://doi.org/10.34657/6829

dc.bibliographicCitation.firstPage	111	eng
dc.bibliographicCitation.journalTitle	International Journal of Multimedia Information Retrieval	eng
dc.bibliographicCitation.lastPage	125	eng
dc.bibliographicCitation.volume	10	eng
dc.contributor.author	Müller-Budack, Eric
dc.contributor.author	Theiner, Jonas
dc.contributor.author	Diering, Sebastian
dc.contributor.author	Idahl, Maximilian
dc.contributor.author	Hakimov, Sherzod
dc.contributor.author	Ewerth, Ralph
dc.date.accessioned	2021-12-22T08:25:46Z
dc.date.available	2021-12-22T08:25:46Z
dc.date.issued	2021
dc.description.abstract	The World Wide Web has become a popular source for gathering information and news. Multimodal information, e.g., text supplemented with photographs, is typically used to convey the news more effectively or to attract attention. The photographs can be decorative or depict additional details, but might also contain misleading information. The quantification of the cross-modal consistency of entity representations can assist human assessors’ evaluation of the overall multimodal message. In some cases such measures might give hints to detect fake news, which is an increasingly important topic in today’s society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of the entities in text and photograph by exploiting state-of-the-art computer vision approaches. In contrast to previous work, our system automatically acquires example data from the Web and is applicable to real-world news. Moreover, an approach that quantifies contextual image-text relations is introduced. The feasibility is demonstrated on two datasets that cover different languages, topics, and domains.	eng
dc.description.version	publishedVersion	eng
dc.identifier.uri	https://oa.tib.eu/renate/handle/123456789/7782
dc.identifier.uri	https://doi.org/10.34657/6829
dc.language.iso	eng	eng
dc.publisher	London : Springer	eng
dc.relation.doi	https://doi.org/10.1007/s13735-021-00207-4
dc.relation.essn	2192-662X
dc.relation.issn	2192-6611
dc.rights.license	CC BY 4.0 International	eng
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	eng
dc.subject.ddc	004	eng
dc.subject.ddc	660	eng
dc.subject.ddc	020	eng
dc.subject.other	Cross-modal consistency	eng
dc.subject.other	News analytics	eng
dc.subject.other	Image-text relations	eng
dc.subject.other	Image repurposing detection	eng
dc.title	Multimodal news analytics using measures of cross-modal entity and context consistency	eng
dc.type	Article	eng
tib.accessRights	openAccess	eng
wgl.contributor	TIB	eng
wgl.subject	Informatik	eng
wgl.type	Zeitschriftenartikel	eng

Files

Original bundle

Name: Müller-Budack2021_Article_MultimodalNewsAnalyticsUsingMe.pdf
Size: 2.2 MB
Format: Adobe Portable Document Format

Collections

Informationswissenschaften