AntNet with Reward-Penalty Reinforcement Learning

Cited by: 21
Authors
Lalbakhsh, Pooia [1]
Zaeri, Bahram [2]
Lalbakhsh, Ali [3]
Fesharaki, Mehdi N. [4]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Borujerd Branch, Borujerd, Lorestan, Iran
[2] Islamic Azad Univ Arak Branch, Young Res Club YRC, Arak, Iran
[3] Islamic Azad Univ Sci & Res Campus, Dept Telecommun Engn, Tehran, Iran
[4] Islamic Azad Univ Sci & Res Campus, Dept Comp Engn, Tehran, Iran
Keywords
Ant colony optimization; AntNet; reward-penalty reinforcement learning; swarm intelligence
DOI
10.1109/CICSyN.2010.11
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper modifies the learning phase of the AntNet routing algorithm to improve the system's adaptability in the presence of undesirable events. Unlike most ACO algorithms, which use reward-inaction reinforcement learning, the proposed strategy applies both reward and penalty to the action probabilities. Simulation results show that adding penalty to AntNet increases exploration of other possible, and sometimes better, selections, which leads to a more adaptive strategy. The proposed algorithm also uses a self-monitoring mechanism called Occurrence-Detection to sense traffic fluctuations and decide how undesirable the current status is. Together, the two strategies yield a self-healing version of the AntNet routing algorithm that can cope with undesirable and unpredictable traffic conditions.
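The difference between reward-inaction and reward-penalty updates on a vector of action probabilities can be sketched as follows. This is a minimal illustration using the textbook linear reward-penalty scheme from learning-automata theory; it is an assumption made for illustration and not the update rule from the paper itself. The function names `reward` and `penalize` and the step sizes `a` and `b` are hypothetical.

```python
def reward(probs, chosen, a=0.1):
    """Shift probability mass toward the chosen action (hypothetical step size a)."""
    return [p + a * (1.0 - p) if i == chosen else (1.0 - a) * p
            for i, p in enumerate(probs)]


def penalize(probs, chosen, b=0.1):
    """Shift probability mass away from the chosen action and spread it
    evenly over the alternatives (hypothetical step size b).
    With b = 0 the vector is left unchanged, which is reward-inaction."""
    r = len(probs)
    return [(1.0 - b) * p if i == chosen else b / (r - 1) + (1.0 - b) * p
            for i, p in enumerate(probs)]


# Example: three candidate next hops, the first one was selected.
probs = [0.5, 0.3, 0.2]
after_reward = reward(probs, chosen=0)
after_penalty = penalize(probs, chosen=0)
```

Both updates keep the vector normalized. The penalty step raises the probabilities of the unselected actions, which corresponds to the extra exploration the abstract attributes to penalty.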
Pages: 17-21
Page count: 5