Predicting innovative firms using web mining and deep learning

Cited by: 28
Authors
Kinne, Jan [1, 2, 3]
Lenz, David [3, 4]
Affiliations
[1] ZEW Ctr European Econ Res, Dept Econ Innovat & Ind Dynam, Mannheim, Germany
[2] Univ Salzburg, Dept Geoinformat Z GIS, Salzburg, Austria
[3] Istari Ai, Mannheim, Germany
[4] Justus Liebig Univ, Dept Econometr & Stat, Giessen, Germany
Source
PLOS ONE | 2021 / Vol. 16 / Issue 04
Keywords
PATENT STATISTICS; NEURAL-NETWORKS
DOI
10.1371/journal.pone.0249071
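Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract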
Evidence-based STI (science, technology, and innovation) policy making requires accurate indicators of innovation in order to promote economic growth. However, traditional indicators from patents and questionnaire-based surveys often lack coverage, granularity, and timeliness, and may involve high data collection costs, especially when conducted at a large scale. Consequently, they struggle to provide policy makers and scientists with the full picture of the current state of the innovation system. In this paper, we propose a first approach to generating web-based innovation indicators, which may have the potential to overcome some of the shortcomings of traditional indicators. Specifically, we develop a method to identify product innovator firms at large scale and very low cost. We use traditional firm-level indicators from a questionnaire-based innovation survey (German Community Innovation Survey) to train an artificial neural network classification model on labelled (product innovator/no product innovator) web texts of surveyed firms. Subsequently, we apply this classification model to the web texts of hundreds of thousands of firms in Germany to predict whether they are product innovators or not. We then compare these predictions to firm-level patent statistics, survey extrapolation benchmark data, and regional innovation indicators. The results show that our approach produces reliable predictions and has the potential to be a valuable and highly cost-efficient addition to the existing set of innovation indicators, especially due to its coverage and regional granularity.
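The train-then-apply workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a bag-of-words representation and a single sigmoid unit (the smallest possible neural network) as a stand-in for the paper's artificial neural network. The example texts, labels, and the names `tokenize`, `train`, and `predict_proba` are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization of a firm's web text.
    return text.lower().split()

def sigmoid(z):
    # Clip the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def score(index, weights, bias, text):
    # Weighted sum of the counts of known tokens, plus the bias term.
    counts = Counter(tok for tok in tokenize(text) if tok in index)
    return bias + sum(weights[index[tok]] * c for tok, c in counts.items())

def train(texts, labels, epochs=200, lr=0.5):
    # Build the vocabulary from the labelled (survey-matched) web texts.
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    index = {tok: i for i, tok in enumerate(vocab)}
    weights, bias = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            # Gradient of the log loss for one example is (p - y) * input.
            err = sigmoid(score(index, weights, bias, text)) - y
            bias -= lr * err
            counts = Counter(tok for tok in tokenize(text) if tok in index)
            for tok, c in counts.items():
                weights[index[tok]] -= lr * err * c
    return index, weights, bias

def predict_proba(model, text):
    # Estimated probability that the firm is a product innovator.
    index, weights, bias = model
    return sigmoid(score(index, weights, bias, text))

# Hypothetical labelled web texts: 1 = product innovator, 0 = no product innovator.
labelled_texts = [
    "we launched a new patented sensor product this year",
    "our r&d team developed a novel software platform",
    "family bakery serving fresh bread since 1950",
    "local plumbing repair and maintenance services",
]
labels = [1, 1, 0, 0]
model = train(labelled_texts, labels)

# Apply the trained classifier to web texts of firms that were not surveyed.
for text in ["new patented software product", "fresh bread and repair services"]:
    print(f"{predict_proba(model, text):.2f}  {text}")
```

In the setting the abstract describes, the labels would come from the innovation survey and the second step would run over the web texts of hundreds of thousands of firms; the toy data here only shows the shape of the two steps.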
Pages: 18