Linguistic Summarization using a Weighted N-gram Language Model based on the Similarity of Time-series Data

Cited by: 0
Authors:
Aoki, Kasumi [1]
Kobayashi, Ichiro [2]
Affiliations:
[1] Ochanomizu Univ, Fac Sci, Dept Informat Sci, Tokyo, Japan
[2] Ochanomizu Univ, Grad Sch Humanities & Sci, Adv Sci, Tokyo, Japan
Keywords: -
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
This paper describes a method for verbalizing trends in time-series data. Using the Nikkei Stock Average as an example, we develop a method that generates natural language sentences describing how the stock price moves in the market. The approach to producing linguistic descriptions of stock price trends proceeds in three steps. First, all the time series, including a newly observed series, i.e., the target to be verbalized, are grouped by spectral clustering, with Dynamic Time Warping (DTW) distance as the similarity metric. Second, a bi-gram language model for the newly observed series is built as a weighted combination of the bi-gram language models of the other time series in the same cluster, where each weight is determined by the similarity between the target series and that series. Finally, a linguistic summary of the target series is generated by finding the most likely word sequence under the weighted bi-gram model by means of dynamic programming. Experiments with various numbers of clusters in spectral clustering confirm that the proposed method generates natural language sentences that properly describe the trends of the stock price.
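
The following is a minimal sketch, not the authors' implementation, of the pipeline the abstract outlines: DTW distances, spectral clustering over a similarity matrix, a similarity-weighted bi-gram model, and a dynamic-programming search for the most likely word sequence. All function names (dtw_distance, cluster_series, bigram_counts, weighted_bigram, decode), the Gaussian affinity, and the fixed-length Viterbi-style search are assumptions made for illustration; the paper does not specify these details.

    # Sketch only: hypothetical helper names, not the authors' code.
    # DTW distances -> spectral clustering -> similarity-weighted bi-gram -> DP decoding.
    import numpy as np
    from collections import defaultdict
    from sklearn.cluster import SpectralClustering

    def dtw_distance(a, b):
        """Dynamic Time Warping distance between two 1-D series."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def cluster_series(series, n_clusters=3):
        """Spectral clustering on a DTW-based affinity matrix; returns labels and distances."""
        n = len(series)
        dist = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
        affinity = np.exp(-dist / (dist.std() + 1e-9))  # Gaussian kernel; bandwidth is a guess
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed").fit_predict(affinity)
        return labels, dist

    def bigram_counts(sentences):
        """Raw bi-gram counts from tokenized sentences, with <s>/</s> boundary markers."""
        counts = defaultdict(lambda: defaultdict(float))
        for sent in sentences:
            toks = ["<s>"] + sent + ["</s>"]
            for w1, w2 in zip(toks, toks[1:]):
                counts[w1][w2] += 1.0
        return counts

    def weighted_bigram(models, weights):
        """Combine same-cluster bi-gram counts, each weighted by similarity to the target."""
        combined = defaultdict(lambda: defaultdict(float))
        for model, w in zip(models, weights):
            for w1, nexts in model.items():
                for w2, c in nexts.items():
                    combined[w1][w2] += w * c
        # Normalize to conditional probabilities P(w2 | w1).
        return {w1: {w2: c / sum(nexts.values()) for w2, c in nexts.items()}
                for w1, nexts in combined.items()}

    def decode(probs, max_len=12):
        """DP (Viterbi-style) search: keep the best-scoring path into each word at each
        step and return the highest-scoring path that ends with </s>."""
        best = {"<s>": (0.0, ["<s>"])}   # word -> (log-prob, path so far)
        finished = []
        for _ in range(max_len):
            nxt = {}
            for w1, (lp, path) in best.items():
                for w2, p in probs.get(w1, {}).items():
                    cand = (lp + np.log(p), path + [w2])
                    if w2 == "</s>":
                        finished.append(cand)
                    elif w2 not in nxt or cand[0] > nxt[w2][0]:
                        nxt[w2] = cand
            if not nxt:
                break
            best = nxt
        return max(finished, key=lambda c: c[0])[1][1:] if finished else []

    # Usage sketch: series[0] is the target (no description yet); descs[i] is the
    # tokenized description attached to series[i] for i >= 1.
    # labels, dist = cluster_series(series)
    # same = [i for i in range(1, len(series)) if labels[i] == labels[0]]
    # weights = [np.exp(-dist[0, i]) for i in same]
    # models = [bigram_counts([descs[i]]) for i in same]
    # summary = " ".join(decode(weighted_bigram(models, weights)))

Because only observed bi-grams receive nonzero probability, the sketch omits smoothing, which the abstract does not specify.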
Pages: 595-601
Number of pages: 7
Related papers (50 records in total):
  • [1] Symbolic Translation of Time Series using Piecewise N-gram Similarity Voting
    Delannoy, Siegfried
    Caillault, Emilie
    Bigand, Andre
    Rousseeuw, Kevin
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021, : 327 - 333
  • [2] A WEIGHTED AVERAGE N-GRAM MODEL OF NATURAL-LANGUAGE
    OBOYLE, P
    OWENS, M
    SMITH, FJ
    COMPUTER SPEECH AND LANGUAGE, 1994, 8 (04): : 337 - 349
  • [3] An Approach to Linguistic Summarization based on Comparison among Multiple Time-series Data
    Kobayashi, Mizuki
    Kobayashi, Ichiro
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 1100 - 1103
  • [4] Blind data linkage using n-gram similarity comparisons
    Churches, T
    Christen, P
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2004, 3056 : 121 - 126
  • [5] Splitting input for machine translation using N-gram language model together with utterance similarity
    Doi, T
    Sumita, E
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (06): : 1256 - 1264
  • [6] SPANISH LINGUISTIC STEGANOGRAPHY BASED ON N-GRAM MODEL AND ZIPF LAW
    Muñoz Muñoz, Alfonso
    Argüelles Álvarez, Irina
    ARBOR-CIENCIA PENSAMIENTO Y CULTURA, 2014, 190 (768)
  • [7] Discovery of Corrosion Patterns using Symbolic Time Series Representation and N-gram Model
    Taib, Shakirah Mohd
    Zabidi, Zahiah Akhma Mohd
    Aziz, Izzatdin Abdul
    Mousor, Farahida Hanim
    Abu Bakar, Azuraliza
    Mokhtar, Ainul Akmar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (12) : 554 - 560
  • [8] UNSUPERVISED LANGUAGE MODEL ADAPTATION USING N-GRAM WEIGHTING
    Haidar, Md. Akmal
    O'Shaughnessy, Douglas
    2011 24TH CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2011, : 857 - 860
  • [9] Bangla Word Clustering Based on N-gram Language Model
    Ismail, Sabir
    Rahman, M. Shahidur
    2014 1ST INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATION & COMMUNICATION TECHNOLOGY (ICEEICT 2014), 2014,
  • [10] Linguistic Summarization of Time Series Data using Genetic Algorithms
    Castillo-Ortega, Rita
    Marin, Nicolas
    Sanchez, Daniel
    Tettamanzi, Andrea G. B.
    PROCEEDINGS OF THE 7TH CONFERENCE OF THE EUROPEAN SOCIETY FOR FUZZY LOGIC AND TECHNOLOGY (EUSFLAT-2011) AND LFA-2011, 2011, : 416 - 423