Extracting insights from social media with large-scale matrix approximations

Cited by: 2
Authors
Sindhwani, V. [1]
Ghoting, A. [1]
Ting, E. [2]
Lawrence, R. [1]
Affiliations
[1] IBM Corp, Div Res, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] IBM Software Grp, Silicon Valley Lab, San Jose, CA 95141 USA
Keywords
FACTORIZATION; ALGORITHM;
DOI
10.1147/JRD.2011.2163281
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Social media platforms such as blogs, Twitter® accounts, and online discussion sites are large-scale forums where every individual can potentially voice an influential public opinion. According to recent surveys, a massive number of Internet users are turning to such forums to collect recommendations and reviews for products and services, and to shape their individual choices and stances by the commentary of the online community as a whole. The unsupervised extraction of insight from unstructured user-generated web content requires new methodologies that are likely to be rooted in natural language processing and machine-learning techniques. Furthermore, the unprecedented scale of data begging to be analyzed necessitates the implementation of these methodologies on modern distributed computing platforms. In this paper, we describe a flexible new family of low-rank matrix approximation algorithms for modeling topics in a given corpus of documents (e.g., blog posts and tweets). We benchmark distributed optimization algorithms for running these models in a Hadoop™-enabled cluster environment. We describe online learning strategies for tracking the evolution of ongoing topics and rapidly detecting the emergence of new themes in a streaming setting.
Pages: 13
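Illustration (not from the paper): the abstract refers to low-rank matrix approximation for topic modeling without giving the algorithm. As a rough sketch of the general idea only, the snippet below factors a small document-term count matrix X into nonnegative factors W (document-topic weights) and H (topic-term weights) using standard multiplicative updates for nonnegative matrix factorization. This is a textbook technique swapped in for illustration, not the authors' family of algorithms; the function name `nmf_topics`, the toy vocabulary, and the toy counts are all assumptions, and the sketch runs on a single machine rather than on Hadoop.

```python
import numpy as np


def nmf_topics(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (docs x terms) as W @ H with rank k.

    Uses the classic multiplicative updates for the squared Frobenius
    error. Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = X.shape
    # Strictly positive random start keeps the updates well defined.
    W = rng.random((n_docs, k)) + eps
    H = rng.random((k, n_terms)) + eps
    for _ in range(n_iter):
        # Update topic-term weights, then document-topic weights.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


# Toy corpus (assumed data): two documents about phones, two about pizza.
vocab = ["phone", "battery", "screen", "pizza", "cheese", "oven"]
X = np.array(
    [
        [3, 2, 1, 0, 0, 0],
        [2, 3, 2, 0, 0, 0],
        [0, 0, 0, 3, 2, 1],
        [0, 0, 0, 2, 3, 2],
    ],
    dtype=float,
)

W, H = nmf_topics(X, k=2)

# Dominant topic per document and the three heaviest terms per topic.
doc_topics = W.argmax(axis=1)
top_terms = [[vocab[j] for j in np.argsort(-H[t])[:3]] for t in range(2)]
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

print("dominant topic per document:", doc_topics)
print("top terms per topic:", top_terms)
print("relative reconstruction error:", round(float(rel_err), 3))
```

Each row of H can be read as a topic (a weighting over terms) and each row of W as a document's mixture of topics. The streaming and distributed variants mentioned in the abstract are outside the scope of this sketch.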