A Multimodal Approach for Semantic Patent Image Retrieval

Pustu-Iren, Kader; Bruns, Gerrit; Ewerth, Ralph

doi:https://doi.org/10.34657/6842

Files

paper6.pdf (1.32 MB)

Date

2021

Authors

Pustu-Iren, Kader

Bruns, Gerrit

Ewerth, Ralph

Volume

2909

Series Title

CEUR Workshop Proceedings ; 2909

Book Title

Proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) 2021 co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021)

Publisher

Aachen, Germany : RWTH Aachen

Link to publisher's version

http://ceur-ws.org/Vol-2909/paper6.pdf

Abstract

Patent images such as technical drawings contain valuable information and are frequently used by experts to compare patents. However, current approaches to patent information retrieval are largely focused on textual information. Consequently, we review previous work on patent retrieval with a focus on illustrations in figures. In this paper, we report on work in progress on a novel approach to patent image retrieval that uses deep multimodal features. Scene text spotting and optical character recognition are employed to extract numerals from an image and subsequently identify references to the corresponding sentences in the patent document. Furthermore, we use a state-of-the-art neural CLIP model to extract structural features from illustrations, and we additionally derive textual features from the related patent text using a sentence transformer model. To fuse our multimodal features for similarity search, we apply re-ranking according to averaged or maximum scores. In our experiments, we compare the impact of the different modalities on the task of similarity search for patent images. The experimental results suggest that patent image retrieval can be performed successfully using the proposed feature sets, and that the best results are achieved when the features of both modalities are combined.
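
The fusion step named in the abstract (re-ranking by averaged or maximum scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the function names `cosine` and `fused_ranking`, and the toy two-dimensional vectors standing in for CLIP image features and sentence transformer text features are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fused_ranking(query, candidates, mode="avg"):
    """Rank candidates by a fused image/text similarity score.

    query      -- (image_vector, text_vector) of the query figure (hypothetical layout)
    candidates -- dict mapping a candidate id to (image_vector, text_vector)
    mode       -- "avg" averages the two per-modality scores, "max" keeps the larger
    """
    q_img, q_txt = query
    scored = []
    for cid, (c_img, c_txt) in candidates.items():
        s_img = cosine(q_img, c_img)  # visual similarity
        s_txt = cosine(q_txt, c_txt)  # textual similarity
        if mode == "avg":
            score = (s_img + s_txt) / 2
        elif mode == "max":
            score = max(s_img, s_txt)
        else:
            raise ValueError("mode must be 'avg' or 'max'")
        scored.append((cid, score))
    # Highest fused score first
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumed): real embeddings would have hundreds of dimensions.
query = ([1.0, 0.0], [1.0, 0.0])
candidates = {
    "A": ([1.0, 0.0], [0.0, 1.0]),  # same drawing style, unrelated text
    "B": ([1.0, 1.0], [1.0, 1.0]),  # moderately similar in both modalities
    "C": ([0.0, 1.0], [0.0, 1.0]),  # dissimilar in both modalities
}

avg_order = [cid for cid, _ in fused_ranking(query, candidates, "avg")]
max_order = [cid for cid, _ in fused_ranking(query, candidates, "max")]
```

In this toy setting, averaging favors candidate B, which is moderately similar in both modalities, while the maximum favors candidate A, which matches strongly in only one. That difference is the kind of trade-off the two fusion variants expose.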

Keywords GND

Conference publication

Conference

PatentSemTech 2021 - Patent Text Mining and Semantic Technologies, July 15th 2021, online

Publication Type

BookPart

Version

publishedVersion

URI

https://oa.tib.eu/renate/handle/123456789/7795
https://doi.org/10.34657/6842

Collections

Informationswissenschaften

License

CC BY 4.0 International

https://creativecommons.org/licenses/by/4.0/
