Enabling ISPless Low-Power Computer Vision

Cited by: 1
Authors
Datta, Gourav [1]
Liu, Zeyu [1]
Yin, Zihan [1]
Sun, Linyu [1]
Jaiswal, Akhilesh R. [1]
Beerel, Peter A. [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
DOI
10.1109/WACV56688.2023.00246
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Current computer vision (CV) systems use an image signal processing (ISP) unit to convert the high-resolution raw images captured by image sensors to visually pleasing RGB images. Typically, CV models are trained on these RGB images and have yielded state-of-the-art (SOTA) performance on a wide range of complex vision tasks, such as object detection. In addition, in order to deploy these models on resource-constrained low-power devices, recent works have proposed in-sensor and in-pixel computing approaches that try to partly or fully bypass the ISP and yield significant bandwidth reduction between the image sensor and the CV processing unit by downsampling the activation maps in the initial convolutional neural network (CNN) layers. However, direct inference on the raw images degrades the test accuracy due to the difference in covariance of the raw images captured by the image sensors compared to the ISP-processed images used for training. Moreover, it is difficult to train deep CV models on raw images, because most (if not all) large-scale open-source datasets consist of RGB images. To mitigate this concern, we propose to invert the ISP pipeline, which can convert the RGB images of any dataset to their raw counterparts and enable model training on raw images. We release the raw version of the COCO dataset, a large-scale benchmark for generic high-level vision tasks. For ISP-less CV systems, training on these raw images results in a ~7.1% increase in test accuracy on the Visual Wake Words (VWW) dataset compared to relying on training with traditional ISP-processed RGB datasets. To further improve the accuracy of ISP-less CV models and to increase the energy and bandwidth benefits obtained by in-sensor/in-pixel computing, we propose an energy-efficient form of analog in-pixel demosaicing that may be coupled with in-pixel CNN computations. When evaluated on raw images captured by real sensors from the PASCALRAW dataset, our approach results in an 8.1% increase in mAP. Lastly, we demonstrate a further 20.5% increase in mAP by using a novel application of few-shot learning with thirty shots each for the novel PASCALRAW dataset, which comprises 3 classes. Code is available at https://github.com/godatta/ISP-less-CV.
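To make the idea of inverting an ISP concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes a deliberately simplified forward ISP consisting only of white balance and sRGB gamma encoding, and an RGGB Bayer layout. The function names and the `wb_gains` values are hypothetical and chosen only for illustration.

```python
def srgb_to_linear(v):
    """Invert the standard sRGB transfer curve for a value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def rgb_to_bayer_rggb(rgb, wb_gains=(2.0, 1.0, 1.5)):
    """Map an H x W image of (r, g, b) tuples in [0, 1] to an RGGB mosaic.

    wb_gains are hypothetical white-balance gains that a forward ISP
    might have applied; dividing by them approximately undoes that step.
    """
    height, width = len(rgb), len(rgb[0])
    raw = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # RGGB pattern: even rows hold R, G; odd rows hold G, B.
            if y % 2 == 0:
                channel = 0 if x % 2 == 0 else 1
            else:
                channel = 1 if x % 2 == 0 else 2
            # Undo gamma first, then white balance, keeping one channel per pixel.
            linear = srgb_to_linear(rgb[y][x][channel])
            raw[y][x] = linear / wb_gains[channel]
    return raw


# A tiny 2 x 2 RGB image, one (r, g, b) tuple per pixel.
image = [[(1.0, 0.5, 0.0), (0.2, 1.0, 0.3)],
         [(0.0, 0.0, 0.0), (0.4, 0.4, 1.0)]]
raw = rgb_to_bayer_rggb(image)
```

A real inverse ISP would also need to account for steps such as color correction, tone mapping, denoising, and sensor noise, which this sketch omits.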
Pages: 2429-2438
Page count: 10