Unveiling the Power of Self-Attention for Shipping Cost Prediction: The Rate Card Transformer

Cited by: 0
|
Authors
Sreekar, P. Aditya [1 ]
Verma, Sahil [1 ]
Madhavan, Varun [1 ,2 ]
Persad, Abhishek [1 ]
Affiliations
[1] Amazon, Hyderabad, Telangana, India
[2] Indian Inst Technol, Kharagpur, W Bengal, India
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Amazon ships billions of packages to its customers annually within the United States. The shipping cost of these packages is used on the day of shipping (day 0) to estimate the profitability of sales. Downstream systems utilize these day-0 profitability estimates to make financial decisions, such as setting pricing strategies and delisting loss-making products. However, obtaining accurate shipping cost estimates on day 0 is complex for reasons such as delays in carrier invoicing or fixed cost components being recorded at a monthly cadence. Inaccurate shipping cost estimates can lead to poor decisions, such as pricing items too low or too high, or promoting the wrong products to customers. Current solutions for estimating shipping costs on day 0 rely on tree-based models that require extensive manual engineering effort. In this study, we propose a novel architecture called the Rate Card Transformer (RCT) that uses self-attention to encode all package shipping information, such as package attributes, carrier information, and the route plan. Unlike other transformer-based tabular models, the RCT can encode a variable-length list of one-to-many relations of a shipment, allowing it to capture more information about a shipment; for example, the RCT can encode the properties of all products in a package. Our results demonstrate that cost predictions made by the RCT have 28.82% less error than a tree-based GBDT model. Moreover, the RCT outperforms the state-of-the-art transformer-based tabular model, FTTransformer, by 6.08%. We also illustrate that the RCT learns a generalized manifold of the rate card that can improve the performance of tree-based models.
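The abstract describes encoding fixed shipment attributes together with a variable-length list of item (product) features using self-attention. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; it is not the authors' RCT implementation. All module names, feature dimensions, and the one-token-per-feature-group simplification are assumptions made here for illustration only.

```python
# Hypothetical sketch (not the authors' RCT): fixed shipment features and a
# variable-length list of per-item features are turned into tokens and mixed
# by a standard self-attention encoder before a cost regression head.
import torch
import torch.nn as nn


class RateCardTransformerSketch(nn.Module):
    def __init__(self, num_shipment_feats=8, num_item_feats=6, d_model=64,
                 nhead=4, num_layers=2):
        super().__init__()
        # For brevity each feature group is projected to a single token;
        # a per-feature tokenization (FT-Transformer style) would also work.
        self.shipment_proj = nn.Linear(num_shipment_feats, d_model)
        self.item_proj = nn.Linear(num_item_feats, d_model)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))  # summary token
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 1)  # predicted shipping cost

    def forward(self, shipment_feats, item_feats, item_pad_mask):
        # shipment_feats: (B, num_shipment_feats)
        # item_feats:     (B, max_items, num_item_feats), zero-padded
        # item_pad_mask:  (B, max_items) bool, True where the slot is padding
        B = shipment_feats.size(0)
        tokens = torch.cat([
            self.cls.expand(B, -1, -1),                     # summary token
            self.shipment_proj(shipment_feats).unsqueeze(1),
            self.item_proj(item_feats),                     # one token per item
        ], dim=1)
        # Never mask the summary or shipment tokens, only padded item slots.
        pad_mask = torch.cat([
            torch.zeros(B, 2, dtype=torch.bool, device=item_pad_mask.device),
            item_pad_mask,
        ], dim=1)
        encoded = self.encoder(tokens, src_key_padding_mask=pad_mask)
        return self.head(encoded[:, 0]).squeeze(-1)         # (B,) cost estimates


if __name__ == "__main__":
    model = RateCardTransformerSketch()
    shipment = torch.randn(4, 8)
    items = torch.randn(4, 5, 6)                            # up to 5 items per package
    mask = torch.tensor([[False] * 2 + [True] * 3,          # package with 2 items
                         [False] * 5,                       # package with 5 items
                         [False] * 1 + [True] * 4,
                         [False] * 3 + [True] * 2])
    print(model(shipment, items, mask).shape)               # torch.Size([4])
```

Padding plus a key-padding mask is one simple way to handle the one-to-many relation the abstract highlights (a package containing an arbitrary number of products), which fixed-schema tabular models cannot represent directly.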
Pages: 13