An analysis of design process and performance in distributed data science teams

Cited by: 10
Authors
Maier, Torsten [1]
DeFranco, Joanna [2]
McComb, Christopher [3]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Penn State Univ, Software Engn, University Pk, PA 16802 USA
[3] Penn State Univ, Engn, Main Campus, University Pk, PA 16802 USA
Keywords
Teamwork; Data science; Distributed teams; Global teamwork; Kaggle data set; Software engineering teams; Technical teams; Social dilemmas; Communication; Cooperation; Size
DOI
10.1108/TPM-03-2019-0024
Chinese Library Classification
C93 [Management]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
Purpose: Often, it is assumed that teams are better at solving problems than individuals working independently. However, recent work in engineering, design and psychology contradicts this assumption. This study aims to examine the behavior of teams engaged in data science competitions. Crowdsourced competitions have seen increased use for software development and data science, and platforms often encourage teamwork between participants.
Design/methodology/approach: We specifically examine the teams participating in data science competitions hosted by Kaggle. We analyze the data provided by Kaggle to compare the effect of team size and interaction frequency on team performance. We also contextualize these results through a semantic analysis.
Findings: This work demonstrates that groups of individuals working independently may outperform interacting teams on average, but that small, interacting teams are more likely to win competitions. The semantic analysis revealed differences in forum participation, verb usage and pronoun usage when comparing top- and bottom-performing teams.
Research limitations/implications: These results reveal a perplexing tension that must be explored further: true teams may experience better performance with higher cohesion, but nominal teams may perform even better on average with essentially no cohesion. Limitations of this research include not factoring in team member experience level and reliance on extant data.
Originality/value: These results are potentially of use to designers of crowdsourced data science competitions as well as managers and contributors to distributed software development projects.
Pages: 419-439
Page count: 21