Multi-stream CNN for facial expression recognition in limited training data

Cited by: 23
Authors
Aghamaleki, Javad Abbasi [1]
Chenarlogh, Vahid Ashkani [2]
Affiliations
[1] Damghan Univ, Fac Engn Dept, Damghan, Iran
[2] Islamic Azad Univ, Sci & Res Branch, ECE Dept, Tehran, Iran
Keywords
Facial expression recognition; Convolutional neural network; Limited data; Multi-stream structure; FACE;
DOI
10.1007/s11042-019-7530-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Limited data makes training Convolutional Neural Networks (CNNs) challenging, and acquiring a database at the required scale is not straightforward. In this paper, handcrafted features combined with a multi-stream structure are proposed to improve CNN performance when training data is limited. Three handcrafted features are extracted, using a local binary pattern (LBP) code extractor and the Sobel edge detection operator in the horizontal and vertical directions of the images, and are fed to the multi-stream CNN model. Our model is based on two distinct structures: a three-stream structure and a single-stream structure. The three-stream structure can be employed to improve the recognition rate of facial expression classifiers when the training data is limited. In the three-stream structure, each information channel is fed to its own stream. Furthermore, transfer learning is employed, and the behaviour of the VGG16 architecture trained with limited data is studied for comparison with the proposed method. In addition, the input data is expanded by means of rotation, cropping, and flipping. The three-stream and single-stream structures are then examined with both limited and expanded training data. We evaluate the system against state-of-the-art methods on the CK+ and MUG databases with both limited and expanded data. The results indicate that, with limited data, the proposed strategy improves recognition accuracy (92.19 versus 88.95 on the CK+ database and 85.4 versus 82.5 on the MUG database). Performance also improves in comparison with benchmark methods.
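As a rough illustration of the three handcrafted inputs named in the abstract (an LBP code map plus horizontal and vertical Sobel responses), the following minimal sketch computes them for a small grayscale image. It is not the authors' implementation: the 3x3 neighbourhood, the clockwise bit order, the `>=` comparison, the skipped border pixels, and all function names are assumptions made for this example.

```python
# Illustrative sketch only; every name and parameter here is hypothetical.
# An image is a list of rows of grayscale values. Border pixels are skipped.

# Standard 3x3 Sobel kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # intensity change along columns
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # intensity change along rows

# Neighbours visited clockwise, starting at the top-left pixel (assumed order).
LBP_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-bit local binary pattern code for the interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(LBP_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def filter3x3(img, kernel, r, c):
    """Weighted sum of the 3x3 window centred on the interior pixel (r, c)."""
    return sum(kernel[i][j] * img[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def handcrafted_channels(img):
    """Return (lbp, sobel_x, sobel_y) maps over the interior of img."""
    rows, cols = len(img), len(img[0])
    lbp, gx, gy = [], [], []
    for r in range(1, rows - 1):
        inner = range(1, cols - 1)
        lbp.append([lbp_code(img, r, c) for c in inner])
        gx.append([filter3x3(img, SOBEL_X, r, c) for c in inner])
        gy.append([filter3x3(img, SOBEL_Y, r, c) for c in inner])
    return lbp, gx, gy


# A 4x4 image with a single vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
lbp_map, sobel_x_map, sobel_y_map = handcrafted_channels(image)
```

In a three-stream arrangement like the one the abstract describes, each of the three maps would be passed to its own stream; how the streams are built and merged is not specified in this record and is left out of the sketch.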
Pages: 22861-22882
Number of pages: 22
Related papers
50 items in total
  • [31] Dynamic Facial Expression Recognition based on Two-Stream-CNN with LBP-TOP
    DuoFeng
    Ren, Fuji
    PROCEEDINGS OF 2018 5TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2018, : 355 - 359
  • [32] An Optimized multi-stream decoding algorithm for handwritten word recognition
    Kessentini, Yousri
    Paquet, Thierry
    Guermazi, Ahmed
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 192 - 196
  • [33] Multi-stream HMM for EMG-based speech recognition
    Manabe, H
    Zhang, Z
    PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 4389 - 4392
  • [34] A Multi-Stream Sequence Learning Framework for Human Interaction Recognition
    Haroon, Umair
    Ullah, Amin
    Hussain, Tanveer
    Ullah, Waseem
    Sajjad, Muhammad
    Muhammad, Khan
    Lee, Mi Young
    Baik, Sung Wook
    IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2022, 52 (03) : 435 - 444
  • [35] Multi-stream Deep Networks for Vehicle Make and Model Recognition
    Besbes, Mohamed Dhia Elhak
    Kessentini, Yousri
    Tabia, Hedi
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020, : 413 - 419
  • [36] Hierarchical multi-stream posterior based speech recognition system
    Ketabdar, H
    Bourlard, H
    Bengio, S
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2005, 3869 : 294 - 306
  • [37] Automated speech recognition by multi-stream dynamic time warping
    Mohamadi, T
    Gharbi, AH
    Mezaache, S
    Harrag, A
    CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING 2001, VOLS I AND II, CONFERENCE PROCEEDINGS, 2001, : 527 - 531
  • [38] Facial expression recognition with FRR-CNN
    Xie, Siyue
    Hu, Haifeng
    ELECTRONICS LETTERS, 2017, 53 (04) : 235 - 237
  • [39] Multi-stream Gaussian Mixture Model based Facial Feature Localization
    Kumatani, Kenichi
    Ekenel, Hazim K.
    Gao, Hua
    Stiefelhagen, Rainer
    Ercil, Aytuel
    2008 IEEE 16TH SIGNAL PROCESSING, COMMUNICATION AND APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2008, : 869 - +
  • [40] Multi-stream acoustic model adaptation for noisy speech recognition
    Tamura, Satoshi
    Hayamizu, Satoru
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,