AntNet with Reward-Penalty Reinforcement Learning

Cited by: 21

Authors:

Lalbakhsh, Pooia [1]

Zaeri, Bahram [2]

Lalbakhsh, Ali [3]

Fesharaki, Mehdi N. [4]

Affiliations:

[1] Islamic Azad Univ, Dept Comp Engn, Borujerd Branch, Borujerd, Lorestan, Iran

[2] Islamic Azad Univ Arak Branch, Young Res Club YRC, Arak, Iran

[3] Islamic Azad Univ Sci & Res Campus, Dept Telecommun Engn, Tehran, Iran

[4] Islamic Azad Univ Sci & Res Campus, Dept Comp Engn, Tehran, Iran

Source:

2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, COMMUNICATION SYSTEMS AND NETWORKS (CICSYN) | 2010

Keywords:

Ant colony optimization; AntNet; reward-penalty reinforcement learning; swarm intelligence;

DOI:

10.1109/CICSyN.2010.11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a modification to the learning phase of the AntNet routing algorithm that improves the system's adaptability in the presence of undesirable events. Unlike most ACO algorithms, which use reward-inaction reinforcement learning, the proposed strategy applies both reward and penalty to the action probabilities. As simulation results show, introducing penalty into the AntNet routing algorithm increases exploration of other possible, and sometimes better, selections, which leads to a more adaptive strategy. The proposed algorithm also uses a self-monitoring mechanism called Occurrence-Detection to sense traffic fluctuations and decide how undesirable the current state is. Together, these two strategies yield a self-healing version of the AntNet routing algorithm that can cope with undesirable and unpredictable traffic conditions.
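The contrast the abstract draws between reward-inaction and reward-penalty updates can be illustrated with the textbook linear learning-automaton scheme. The sketch below is not the update rule from the paper, which this record does not give; the function names `reward` and `penalize`, the step sizes `a` and `b`, and the three-neighbour probability vector are all assumptions made for illustration.

```python
def reward(probs, chosen, a):
    """Linear reward step: shift probability mass toward `chosen`."""
    return [p + a * (1.0 - p) if i == chosen else (1.0 - a) * p
            for i, p in enumerate(probs)]


def penalize(probs, chosen, b):
    """Linear penalty step: take mass from `chosen` and spread it
    evenly over the other actions. With b = 0 this is a no-op,
    which is the reward-inaction case."""
    r = len(probs)
    return [(1.0 - b) * p if i == chosen else b / (r - 1) + (1.0 - b) * p
            for i, p in enumerate(probs)]


# Hypothetical routing-table row: probabilities of forwarding to 3 neighbours.
probs = [0.7, 0.2, 0.1]

after_reward = reward(probs, chosen=0, a=0.1)      # neighbour 0 performed well
after_penalty = penalize(probs, chosen=0, b=0.1)   # neighbour 0 performed badly
inaction = penalize(probs, chosen=0, b=0.0)        # reward-inaction: no change

print(after_reward, after_penalty, inaction)
```

Under reward-inaction a poorly performing choice keeps its probability until some other choice is rewarded, whereas the penalty step immediately moves mass to the alternatives. That is one way to read the abstract's claim that penalty increases exploration.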


Pages: 17-21

Page count: 5
