Natural Language Processing Applications in Case-Law Text Publishing

Cited by: 2
Authors
Tarasconi, Francesco [1]
Botros, Milad [1]
Caserio, Matteo [1]
Sportelli, Gianpiero [1]
Giacalone, Giuseppe [2]
Uttini, Carlotta [2]
Vignati, Luca [2]
Zanetta, Fabrizio [2]
Affiliations
[1] CELI Language Technol, Via San Quintino 31, I-10121 Turin, Italy
[2] Giuffre Francis Lefebvre, Milan, Italy
Source
Keywords
natural language processing; applications; transfer learning; language models; text classification; information extraction; publishing industry; machine learning; BERT fine-tuning; random forest; Italian language
DOI
10.3233/FAIA200859
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Processing case-law content for electronic publishing is a time-consuming activity that encompasses several sub-tasks and usually involves adding annotations to the original text. Meanwhile, recent advances in Artificial Intelligence and Natural Language Processing enable the automatic and efficient analysis of large volumes of text. In this paper we present our Machine Learning solution to three specific business problems regularly met by a real-world Italian publisher in its day-to-day work: recognition of legal references in text spans, ranking of new content by relevance, and text classification according to a given tree of topics. We experimented with different approaches based on the BERT language model, together with alternatives typically based on Bag-of-Words. In two of the three cases (extraction of legal references and text classification), the best solution, deployed in a controlled production environment, was based on fine-tuned BERT; for relevance ranking, a Random Forest model with hand-crafted features was preferred. We conclude by discussing the concrete impact of the developed prototypes, as perceived by the publisher.
Pages: 154-163
Page count: 10