Domain-based Latent Personal Analysis and its use for impersonation detection in social media

Authors
Osnat Mokryn
Hagit Ben-Shoshan
Affiliations
[1] University of Haifa,Information Systems
[2] University of Haifa,Management
Keywords
Latent Personal Analysis (LPA); Zipf; Authorship attribution; Impersonation; Sockpuppets; Front-users
DOI
Not available
Abstract
Zipf’s law defines an inverse proportion between a word’s rank in a given corpus and its frequency in it, roughly dividing the vocabulary into frequent words and infrequent ones. Here, we stipulate that within a domain an author’s signature can be derived from, in loose terms, the author’s missing popular words and frequently used infrequent words. We devise a method, termed Latent Personal Analysis (LPA), for finding domain-based attributes for entities in a domain: their distance from the domain and their signature, which captures how they differ most from the domain. We identify the most suitable distance metric for the method among several and construct the distances and personal signatures for authors, the domain’s entities. The signature consists of both over-used terms (compared to the average) and missing popular terms. We validate the correctness and power of the signatures in identifying users and set existence conditions. We test LPA in several domains, both textual and non-textual. We then demonstrate the use of the method in explainable authorship attribution: we define algorithms that utilize LPA to identify two types of impersonation in social media: (1) authors with multiple (sockpuppet) accounts and (2) front-user accounts, each operated by several authors. We validate the algorithms and employ them over a large-scale dataset obtained from a social media site with over 4000 users. We corroborate these results using temporal rate analysis. LPA can further be used to devise personal attributes in a wide range of scientific domains in which the constituents have a long-tail distribution of elements.
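The idea of a distance plus a signed signature can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance metric that was selected, so the Jensen-Shannon divergence is used here purely as an assumption, the domain vector is assumed to be the pooled term frequencies of all authors, and the function names (`term_frequencies`, `term_contributions`, `distance_and_signature`) are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(tokens):
    """Normalize raw token counts into a probability vector (as a dict)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def _half_kl_term(p, m):
    # One term's share of 0.5 * KL(P || M); 0 * log(0) is taken as 0.
    return 0.5 * p * math.log2(p / m) if p > 0 else 0.0


def term_contributions(author_freq, domain_freq):
    """Per-term share of the Jensen-Shannon divergence (an assumed metric).

    Each value is (size, sign): sign is +1 when the author over-uses the
    term relative to the domain, -1 when it is under-used or missing.
    """
    contributions = {}
    for term in set(author_freq) | set(domain_freq):
        p = author_freq.get(term, 0.0)
        q = domain_freq.get(term, 0.0)
        m = (p + q) / 2  # mixture of the two distributions
        size = _half_kl_term(p, m) + _half_kl_term(q, m)
        sign = (p > q) - (p < q)
        contributions[term] = (size, sign)
    return contributions


def distance_and_signature(author_tokens, domain_tokens, k=5):
    """Return the author's distance from the domain and a top-k signature."""
    contributions = term_contributions(
        term_frequencies(author_tokens), term_frequencies(domain_tokens)
    )
    distance = sum(size for size, _ in contributions.values())
    # Largest contributions first; ties are broken alphabetically.
    ranked = sorted(contributions.items(), key=lambda kv: (-kv[1][0], kv[0]))
    signature = [(term, sign) for term, (_, sign) in ranked[:k]]
    return distance, signature
```

Under these assumptions, an author who never uses a word that is popular in the domain receives that word in the signature with a negative sign, while a rare word the author repeats often appears with a positive sign, which mirrors the "missing popular terms" and "over-used terms" described in the abstract.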
Pages: 785-828
Page count: 43