A Multimodal Approach for Semantic Patent Image Retrieval

dc.bibliographicCitation.volume: 2909 (eng)
dc.contributor.author: Pustu-Iren, Kader
dc.contributor.author: Bruns, Gerrit
dc.contributor.author: Ewerth, Ralph
dc.date.accessioned: 2021-12-22T16:11:27Z
dc.date.available: 2021-12-22T16:11:27Z
dc.date.issued: 2021
dc.description.abstract: Patent images such as technical drawings contain valuable information and are frequently used by experts to compare patents. However, current approaches to patent information retrieval are largely focused on textual information. Consequently, we review previous work on patent retrieval with a focus on illustrations in figures. In this paper, we report on work in progress for a novel approach for patent image retrieval that uses deep multimodal features. Scene text spotting and optical character recognition are employed to extract numerals from an image to subsequently identify references to corresponding sentences in the patent document. Furthermore, we use a neural state-of-the-art CLIP model to extract structural features from illustrations and additionally derive textual features from the related patent text using a sentence transformer model. To fuse our multimodal features for similarity search we apply re-ranking according to averaged or maximum scores. In our experiments, we compare the impact of different modalities on the task of similarity search for patent images. The experimental results suggest that patent image retrieval can be successfully performed using the proposed feature sets, while the best results are achieved when combining the features of both modalities. (eng)
dc.description.version: publishedVersion (eng)
dc.identifier.uri: https://oa.tib.eu/renate/handle/123456789/7795
dc.identifier.uri: https://doi.org/10.34657/6842
dc.language.iso: eng (eng)
dc.publisher: Aachen, Germany : RWTH Aachen (eng)
dc.relation.essn: 1613-0073
dc.relation.ispartof: Proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) 2021 co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021) (eng)
dc.relation.ispartofseries: CEUR Workshop Proceedings ; 2909 (eng)
dc.relation.uri: http://ceur-ws.org/Vol-2909/paper6.pdf
dc.rights.license: CC BY 4.0 Unported (eng)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ (eng)
dc.subject: Patent Image Similarity Search (eng)
dc.subject: Deep Learning (eng)
dc.subject: Multimodal Feature Representations (eng)
dc.subject: Scene Text Spotting (eng)
dc.subject.classification: Konferenzschrift (ger)
dc.subject.ddc: 020 (eng)
dc.title: A Multimodal Approach for Semantic Patent Image Retrieval (eng)
dc.type: bookPart (eng)
dc.type: Text (eng)
tib.accessRights: openAccess (eng)
tib.relation.conference: PatentSemTech 2021 - Patent Text Mining and Semantic Technologies, July 15th 2021, online (eng)
wgl.contributor: TIB (eng)
wgl.subject: Informatik (eng)
wgl.type: Buchkapitel / Sammelwerksbeitrag (eng)
wgl.type: Konferenzbeitrag (eng)
Files
Name: paper6.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format