Faces Annotation in News Images Based on Multimodal Information
Abstract
News images usually appear alongside descriptive captions. Given an image-caption pair in which several faces in the image are associated with a few names in the caption, the task of face annotation is to infer the correct name for each face. In this work, we propose a novel face annotation framework called face annotation based on multimodal information (FAMI). Unlike previous work, which relies mainly on facial similarity, FAMI infers names by fusing multimodal information extracted from images and captions. Specifically, we first extract multiple types of information that may contribute to annotating faces. We then annotate faces with an information fusion model. Finally, a correction strategy is applied to further improve the annotation results. In our experiments, the proposed framework achieves high-quality face annotation performance compared with several baseline approaches.
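As a rough illustration of the task setup only, the sketch below fuses two hypothetical per-cue score matrices (faces by names) with a weighted sum and then picks the one-to-one assignment of names to faces with the highest total score. The cue names, weights, scores, and the brute-force search are all assumptions made for illustration; the abstract does not specify FAMI's actual fusion model or its correction strategy, and the correction step is not sketched here.

```python
from itertools import permutations

def fuse(cues, weights):
    """Weighted sum of per-cue score matrices (rows: faces, columns: names).

    Assumption: a simple linear fusion; the paper's model may differ.
    """
    n_faces = len(cues[0])
    n_names = len(cues[0][0])
    return [[sum(w * c[i][j] for c, w in zip(cues, weights))
             for j in range(n_names)]
            for i in range(n_faces)]

def annotate(fused, names):
    """Return one name (or None) per face, using each name at most once."""
    n_faces = len(fused)
    # Pad with None so that a face may remain unlabeled (contributes 0).
    candidates = list(range(len(names))) + [None] * n_faces
    best, best_score = None, float("-inf")
    for perm in permutations(candidates, n_faces):
        score = sum(fused[i][j] for i, j in enumerate(perm) if j is not None)
        if score > best_score:
            best, best_score = perm, score
    return [names[j] if j is not None else None for j in best]

# Hypothetical example: two faces, two caption names, two cues.
names = ["Alice", "Bob"]
visual_cue = [[0.6, 0.5], [0.7, 0.2]]    # made-up image-side scores
caption_cue = [[0.2, 0.8], [0.9, 0.1]]   # made-up caption-side scores
fused = fuse([visual_cue, caption_cue], weights=[0.5, 0.5])
labels = annotate(fused, names)
```

The exhaustive search is only practical because a news image typically contains a handful of faces and names; a larger instance would call for a proper assignment solver.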
Keywords
Multimodal information, Face annotation, Information fusion
DOI
10.12783/dtcse/cnai2018/24181