Semi-supervised identification of rarely appearing persons in video by correcting weak labels

Some recent approaches for character identification in movies and TV broadcasts are realized in a semi-supervised manner by assigning transcripts and/or subtitles to the speakers. However, the labels obtained in this way achieve an accuracy of only 80% to 90%, and the training examples are unevenly distributed across the different actors. In this paper, we propose a novel approach for person identification in video that corrects and extends the training data with reliable predictions to reduce the number of annotation errors. Furthermore, the intra-class diversity of rarely speaking characters is enhanced. To address the imbalance of training data per person, we suggest two complementary prediction scores. These scores are also used to recognize whether or not a face track belongs to a (supporting) character whose identity does not appear in the transcripts or subtitles. Experimental results demonstrate the feasibility of the proposed approach, which outperforms the current state of the art.

Keywords

Face identification in video, semi-supervised learning

Publication Type

ConferenceObject

Version

publishedVersion

URI

https://doi.org/10.34657/690
https://oa.tib.eu/renate/handle/123456789/4431

Collections

Informationswissenschaften

License

CC BY 4.0 International

https://creativecommons.org/licenses/by/4.0/
