Semantic Network Analysis Pipeline-Interactive Text Mining Framework for Exploration of Semantic Flows in Large Corpus of Text

Cited by: 1
Authors
Cenek, Martin [1 ,4 ]
Bulkow, Rowan [2 ]
Pak, Eric [3 ]
Oyster, Levi [3 ]
Ching, Boyd [3 ]
Mulagada, Ashika [1 ]
Affiliations
[1] Univ Portland, Comp Sci, Portland, OR 90203 USA
[2] Resource Data Inc, Anchorage, AK 99503 USA
[3] Univ Alaska Anchorage, Comp Sci, Anchorage, AK 99508 USA
[4] 5000 N Willamette Blvd, Portland, OR 97203 USA
Source
APPLIED SCIENCES-BASEL | 2019 / Vol. 9 / Issue 24
Keywords
semantic concept; text mining; computational linguistics; language processing; natural language processing; interactive visualization; MODEL;
DOI
10.3390/app9245302
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Historical topic modeling and semantic concept exploration in a large corpus of unstructured text remains a hard, open problem. Despite advancements in natural language processing tools, statistical linguistics models, graph theory and visualization, there is no framework that combines these piece-wise tools under one roof. We designed and constructed a Semantic Network Analysis Pipeline (SNAP), available as an open-source web service, that implements the workflow a data scientist needs to explore historical semantic concepts in a text corpus. We define a graph-theoretic notion of a semantic concept as a flow of closely related tokens through the corpus of text. The modular workflow pipeline processes text using natural language processing tools, narrows the content statistically, creates semantic networks from lexical token chaining, performs social network analysis of the token networks, and creates a 3D visualization of the semantic concept flows through the corpus for interactive concept exploration. Finally, we illustrate the framework's utility for extracting information from three text corpora: Herman Melville's novel Moby Dick, the transcript of the 2015-2016 United States (U.S.) Senate Hearings on Environment and Public Works, and the Australian Broadcasting Corporation's short news articles on rural and science topics.
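The pipeline stages named in the abstract (token extraction, content narrowing, token chaining into a network, and network analysis) can be illustrated with a minimal sketch. This is not SNAP's implementation: the stop-word list, the sliding-window chaining rule, the `window` parameter, and all function names are assumptions made for illustration, and degree centrality stands in for whatever social network measures the framework actually applies.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list (assumption); a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "was", "it", "that"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_token_network(tokens, window=2):
    """Chain each token to the tokens that follow it within `window` positions.

    Returns a Counter mapping an undirected edge (a sorted token pair)
    to the number of times that pair was chained.
    """
    edges = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:  # skip self-loops
                edges[tuple(sorted((left, right)))] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other nodes that each token is directly linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


if __name__ == "__main__":
    sample = "The whale hunted the ship. The ship chased the whale across the sea."
    network = build_token_network(tokenize(sample), window=2)
    ranks = degree_centrality(network)
    # Print tokens from most to least connected.
    for token, score in sorted(ranks.items(), key=lambda kv: (-kv[1], kv[0])):
        print(f"{token}: {score:.2f}")
```

Tokens with high centrality in such a network are candidates for the "closely related tokens" that the abstract describes as forming a semantic concept; tracking them across successive slices of a corpus would approximate a concept flow.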
Pages: 15
Related papers
50 in total
  • [11] A Data-Driven Text Mining and Semantic Network Analysis for Design Information Retrieval
    Shi, Feng
    Chen, Liuqing
    Han, Ji
    Childs, Peter
    JOURNAL OF MECHANICAL DESIGN, 2017, 139 (11)
  • [12] Approaching fashion design trend applications using text mining and semantic network analysis
    An, Hyosun
    Park, Minjung
    FASHION AND TEXTILES, 2020, 7 (01)
  • [13] Research trends in text mining: Semantic network and main path analysis of selected journals
    Jung, Hoon
    Lee, Bong Gyou
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 162
  • [14] Semantic Network Analysis as a Method for Visual Text Analytics
    Drieger, Philipp
    9TH CONFERENCE ON APPLICATIONS OF SOCIAL NETWORK ANALYSIS (ASNA), 2013, 79 : 4 - 17
  • [15] Dynamic Semantic Network Analysis of Unstructured Text Corpora
    Kharlamov, Alexander
    Gradoselskaya, Galina
    Dokuka, Sofia
    ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS, AIST 2017, 2018, 10716 : 392 - 403
  • [16] WordNet-based lexical semantic classification for text corpus analysis
    Jun Long
    Lu-da Wang
    Zu-de Li
    Zu-ping Zhang
    Liu Yang
    Journal of Central South University, 2015, 22 : 1833 - 1840
  • [17] WordNet-based lexical semantic classification for text corpus analysis
    Long Jun
    Wang Lu-da
    Li Zu-de
    Zhang Zu-ping
    Yang Liu
    JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2015, 22 (05) : 1833 - 1840
  • [18] An efficient framework of utilizing the latent semantic analysis in text extraction
    Ababneh, Ahmad Hussein
    Lu, Joan
    Xu, Qiang
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (03) : 785 - 815
  • [19] An efficient framework of utilizing the latent semantic analysis in text extraction
    Ahmad Hussein Ababneh
    Joan Lu
    Qiang Xu
    International Journal of Speech Technology, 2019, 22 : 785 - 815
  • [20] SEMANTIC FEATURE ANALYSIS - AN INTERACTIVE STRATEGY FOR VOCABULARY DEVELOPMENT AND TEXT COMPREHENSION
    ANDERS, PL
    BOS, CS
    JOURNAL OF READING, 1986, 29 (07) : 610 - 616