Construction of Internet of Things English terms model and analysis of language features via deep learning

Cited by: 5
Authors
Li, Yongbin [1]
Affiliation
[1] Lanzhou Jiaotong Univ, Sch Foreign Languages, Lanzhou, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2022 / Vol. 78 / Issue 05
Keywords
Deep learning; Internet of Things English; Term model; Language features; EXTRACTION
DOI
10.1007/s11227-021-04130-7
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This study aims to extract more detailed technical terms and to form a structured knowledge representation. An unsupervised knowledge representation learning method is constructed, and a term model is built with a deep learning neural network to mine English terms in the Internet of Things (IoT) domain. After the characteristics of IoT English are analyzed, an expert interaction method based on the Delphi method is proposed. In addition, the N-Gram model is used in the unsupervised candidate acquisition process to break long text into short fragments, each of which is a possible word string combination. The experimental data are obtained from the Web of Science database, and two term lists, "adaptive control" and "self-learning", are selected for data retrieval. The term frequency-inverse document frequency (TF-IDF) value is used to screen the words preliminarily. With text from the IoT field as the experimental object, the influence of the N-Gram number on the system is analyzed from three aspects: system running time, average memory occupancy rate, and F1 value of term extraction. The experimental results show that as the N-Gram number increases, both the overall running time of the system and its memory load during the task increase. The F1 value of term extraction reaches its highest level when N = 1, 2, 3, and 4; if the N-Gram number increases further, the F1 value decreases. When K equals 4 and 6, the Silhouette Coefficient results for the terms "adaptive control" and "self-learning" reach their respective turning points, so the two terms are divided into 4 and 6 categories, respectively.
In summary, deep learning technology can effectively and automatically extract professional terms from the original text and classify the extraction results according to the original terminology database. Compared with existing methods, core experts play a central role in acquiring knowledge in specific areas, while external experts play an indispensable role in enriching and improving technology systems.
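The candidate acquisition and preliminary screening steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `ngram_candidates` and `tfidf_scores`, the whitespace tokenization, the plain `tf * log(N / df)` weighting, and the three sample sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_candidates(tokens, max_n):
    """Return every contiguous word string of length 1..max_n (hypothetical helper)."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_scores(docs, max_n):
    """Score each candidate in each document as tf * log(N / df) (assumed weighting)."""
    # Count candidates per document; tokenization is a naive lowercase split.
    per_doc = [Counter(ngram_candidates(d.lower().split(), max_n)) for d in docs]

    # Document frequency: in how many documents each candidate appears.
    df = Counter()
    for counts in per_doc:
        df.update(counts.keys())

    n_docs = len(docs)
    scores = []
    for counts in per_doc:
        total = sum(counts.values())
        scores.append({
            cand: (freq / total) * math.log(n_docs / df[cand])
            for cand, freq in counts.items()
        })
    return scores


# Made-up sample sentences, not data from the paper.
docs = [
    "adaptive control of sensor networks",
    "self-learning sensor networks",
    "adaptive control for smart devices",
]
scores = tfidf_scores(docs, max_n=2)

# Highest-scoring candidates in the second document.
top = sorted(scores[1].items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

Raising `max_n` enlarges the candidate set for every document, which is consistent with the abstract's observation that running time and memory load grow with the N-Gram number.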
Pages: 6296-6317
Page count: 22