Unveiling the Power of Self-Attention for Shipping Cost Prediction: The Rate Card Transformer

Cited by: 0
Authors
Sreekar, P. Aditya [1]
Verma, Sahil [1]
Madhavan, Varun [1,2]
Persad, Abhishek [1]
Affiliations
[1] Amazon, Hyderabad, Telangana, India
[2] Indian Inst Technol, Kharagpur, W Bengal, India
Source
Asian Conference on Machine Learning (ACML), 2023, Vol. 222
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Amazon ships billions of packages to its customers annually within the United States. The shipping cost of these packages is used on the day of shipping (day 0) to estimate the profitability of sales. Downstream systems use these day-0 profitability estimates to make financial decisions, such as setting pricing strategies and delisting loss-making products. However, obtaining accurate shipping cost estimates on day 0 is complex for reasons such as delays in carrier invoicing or fixed cost components being recorded at a monthly cadence. Inaccurate shipping cost estimates can lead to poor decisions, such as pricing items too low or too high, or promoting the wrong products to customers. Current solutions for estimating shipping costs on day 0 rely on tree-based models that require extensive manual engineering effort. In this study, we propose a novel architecture called the Rate Card Transformer (RCT) that uses self-attention to encode all package shipping information, such as package attributes, carrier information, and the route plan. Unlike other transformer-based tabular models, the RCT can encode a variable-length list of one-to-many relations of a shipment, allowing it to capture more information about the shipment; for example, it can encode the properties of all products in a package. Our results demonstrate that cost predictions made by the RCT have 28.82% less error than a tree-based GBDT model. Moreover, the RCT outperforms the state-of-the-art transformer-based tabular model, FT-Transformer, by 6.08%. We also show that the RCT learns a generalized manifold of the rate card that can improve the performance of tree-based models.
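To make the abstract's core idea concrete, the following is a minimal PyTorch sketch (not the authors' implementation) of encoding shipment-level attributes together with a variable-length list of per-product attributes as tokens for a self-attention encoder. All feature names, dimensions, pooling choices, and the prediction head are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class RateCardStyleEncoder(nn.Module):
    # Hypothetical sketch: one token for shipment-level fields (carrier, route plan,
    # package dims, ...) plus one token per product in the package, encoded jointly
    # with self-attention and read out as a shipping-cost regression.
    def __init__(self, n_shipment_feats=8, n_item_feats=5, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.shipment_proj = nn.Linear(n_shipment_feats, d_model)
        self.item_proj = nn.Linear(n_item_feats, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # predicted shipping cost

    def forward(self, shipment_feats, item_feats, item_pad_mask):
        # shipment_feats: (batch, n_shipment_feats)
        # item_feats:     (batch, max_items, n_item_feats), zero-padded
        # item_pad_mask:  (batch, max_items) bool, True where the slot is padding
        shipment_tok = self.shipment_proj(shipment_feats).unsqueeze(1)  # (batch, 1, d_model)
        item_toks = self.item_proj(item_feats)                          # (batch, max_items, d_model)
        tokens = torch.cat([shipment_tok, item_toks], dim=1)
        pad_mask = torch.cat([torch.zeros_like(item_pad_mask[:, :1]), item_pad_mask], dim=1)
        encoded = self.encoder(tokens, src_key_padding_mask=pad_mask)
        return self.head(encoded[:, 0]).squeeze(-1)  # read out the shipment token

# Toy usage: two shipments with up to 3 products each (the second has only 1 product).
model = RateCardStyleEncoder()
shipment = torch.randn(2, 8)
items = torch.randn(2, 3, 5)
pad = torch.tensor([[False, False, True], [False, True, True]])
print(model(shipment, items, pad).shape)  # torch.Size([2])

The key point the sketch illustrates is that padding plus an attention mask lets one model handle a variable number of one-to-many child records per shipment, which is what distinguishes the RCT from tabular transformers that assume a fixed set of columns.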
Pages: 13
Related Papers (50 in total; entries 21-30 shown)
  • [21] Efficient memristor accelerator for transformer self-attention functionality
    Bettayeb, Meriem
    Halawani, Yasmin
    Khan, Muhammad Umair
    Saleh, Hani
    Mohammad, Baker
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [22] A lightweight transformer with linear self-attention for defect recognition
    Zhai, Yuwen
    Li, Xinyu
    Gao, Liang
    Gao, Yiping
    ELECTRONICS LETTERS, 2024, 60 (17)
  • [23] Transformer with sparse self-attention mechanism for image captioning
    Wang, Duofeng
    Hu, Haifeng
    Chen, Dihu
    ELECTRONICS LETTERS, 2020, 56 (15): 764+
  • [24] An efficient parallel self-attention transformer for CSI feedback
    Liu, Ziang
    Song, Tianyu
    Zhao, Ruohan
    Jin, Jiyu
    Jin, Guiyue
    PHYSICAL COMMUNICATION, 2024, 66
  • [25] Transformer Self-Attention Network for Forecasting Mortality Rates
    Roshani, Amin
    Izadi, Muhyiddin
    Khaledi, Baha-Eldin
    JIRSS-JOURNAL OF THE IRANIAN STATISTICAL SOCIETY, 2022, 21 (01): 81-103
  • [26] Keyword Transformer: A Self-Attention Model for Keyword Spotting
    Berg, Axel
    O'Connor, Mark
    Cruz, Miguel Tairum
    INTERSPEECH 2021, 2021: 4249-4253
  • [27] Traffic Prediction for Optical Fronthaul Network Using Self-Attention Mechanism-Based Transformer
    Zhao, Xujun
    Wu, Yonghan
    Hao, Xue
    Zhang, Lifang
    Wang, Danshi
    Zhang, Min
    2022 ASIA COMMUNICATIONS AND PHOTONICS CONFERENCE, ACP, 2022: 1207-1210
  • [28] Vehicle Interaction Behavior Prediction with Self-Attention
    Li, Linhui
    Sui, Xin
    Lian, Jing
    Yu, Fengning
    Zhou, Yafu
    SENSORS, 2022, 22 (02)
  • [29] Mechanics of Next Token Prediction with Self-Attention
    Li, Yingcong
    Huang, Yixiao
    Ildiz, M. Emrullah
    Rawat, Ankit Singh
    Oymak, Samet
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [30] Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
    Leem, Saebom
    Seo, Hyunseok
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024: 2956-2964