Bibliomining for automated collection development in a digital library setting: Using data mining to discover web-based scholarly research works

Cited by: 14
Authors
Nicholson, S [1]
Affiliations
[1] Syracuse Univ, Sch Informat Studies, Ctr Sci & Technol 4 127, Syracuse, NY 13244 USA
DOI
10.1002/asi.10313
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This research creates an intelligent agent for automated collection development in a digital library setting. It uses a predictive model based on facets of each Web page to select scholarly works. The criteria came from the academic library selection literature, and a Delphi study was used to refine the list to 41 criteria. A Perl program was designed to analyze a Web page for each criterion and applied to a large collection of scholarly and nonscholarly Web pages. Bibliomining, or data mining for libraries, was then used to create different classification models. Four techniques were used: logistic regression, non-parametric discriminant analysis, classification trees, and neural networks. Accuracy and return were used to judge the effectiveness of each model on test datasets. In addition, a set of problematic pages that were difficult to classify because of their similarity to scholarly research was gathered and classified using the models. The resulting models could be used in the selection process to automatically create a digital library of Web-based scholarly research works. In addition, the technique can be extended to create a digital library of any type of structured electronic information.
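The pipeline described in the abstract (score a page against criteria, classify it, then judge the model on a test set) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three criteria are hypothetical stand-ins for the study's 41, a simple threshold rule takes the place of the four trained models, and the abstract does not define accuracy or return, so they are taken here to mean the share of test pages labeled correctly and the share of truly scholarly pages the model selects.

```python
import re


def extract_criteria(html: str) -> dict:
    """Score one page on a few hypothetical criteria (not the study's 41)."""
    text = re.sub(r"<[^>]+>", " ", html).lower()  # crude tag stripping
    words = re.findall(r"[a-z]+", text)
    return {
        "has_abstract": int("abstract" in words),
        "has_references": int("references" in words or "bibliography" in words),
        "word_count": len(words),
    }


def looks_scholarly(criteria: dict) -> bool:
    """Stand-in threshold rule; NOT one of the four techniques in the abstract."""
    return criteria["has_abstract"] + criteria["has_references"] >= 2


def evaluate(predicted: list, actual: list) -> tuple:
    """Return (accuracy, return_rate) under the assumed definitions above."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    accuracy = correct / len(actual)
    scholarly_total = sum(actual)
    selected_scholarly = sum(p and a for p, a in zip(predicted, actual))
    return_rate = selected_scholarly / scholarly_total if scholarly_total else 0.0
    return accuracy, return_rate


# Tiny hand-made test set: (page HTML, is it scholarly research?)
test_pages = [
    ("<h1>Abstract</h1><p>We study X.</p><h2>References</h2>", True),
    ("<p>Buy cheap shoes today!</p>", False),
    ("<h1>Abstract</h1><p>Conference schedule</p>", False),
    ("<p>Essay text.</p><h2>Bibliography</h2>", True),
]

predicted = [looks_scholarly(extract_criteria(html)) for html, _ in test_pages]
actual = [label for _, label in test_pages]
accuracy, return_rate = evaluate(predicted, actual)
print(f"accuracy={accuracy:.2f} return={return_rate:.2f}")
```

In a fuller version, the criteria dictionaries would become feature vectors for a trained classifier, and the last test page shows why return is tracked separately: a scholarly page the rule misses lowers return even when accuracy stays fairly high.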
Pages: 1081-1090
Page count: 10
Related papers
  • [1] Research and application of web-based data mining
    Lijin, Li
    Xijun, Wen
    2007 International Symposium on Computer Science & Technology, Proceedings, 2007: 731-733
  • [2] The Research and Application of Web-based Data Mining Technology
    Zhu Jian-Xin
    MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562: 3424-3426
  • [3] Development of a web-based system for the collection, analysis and data mining of ecotoxicological related data: A starting project
    Boatti, L.
    Dondero, F.
    Viarengo, A.
    Mignone, F.
    COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY A-MOLECULAR & INTEGRATIVE PHYSIOLOGY, 2010, 157 (01): S5
  • [4] Automated Generation and Dynamic Rendering of Web-based Data Collection Systems
    Zhang, Ziyang
    Chen, Tsung Ting
    Vigneswaren, Kapil
    Hussain, Fatima
    Sharieh, Salah
    Ferworn, Alexander
    2019 IEEE 10TH ANNUAL INFORMATION TECHNOLOGY, ELECTRONICS AND MOBILE COMMUNICATION CONFERENCE (IEMCON), 2019: 184-189
  • [5] Use of web-based GSS tools for research data collection
    Mittleman, D
    LePoire, DJ
    ASSOCIATION FOR INFORMATION SYSTEMS - PROCEEDINGS OF THE FIFTH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 1999), 1999: 841
  • [6] The basis for bibliomining: Frameworks for bringing together usage-based data mining and bibliometrics through data warehousing in digital library services
    Nicholson, S
    INFORMATION PROCESSING & MANAGEMENT, 2006, 42 (03): 785-804
  • [7] Web-Based Data Collection for Older Adults Living With HIV in a Clinical Research Setting: Pilot Observational Study
    Tassiopoulos, Katherine
    Roberts-Toler, Carla
    Fichtenbaum, Carl J.
    Koletar, Susan L.
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2020, 22 (11)
  • [8] A study of Web-based data collection for social-network research
    Yang, Hung-Jen
    Lou, Shi-Jer
    Yang, Hsieh-Hua
    Hu, Wen Chen
    Tseng, Kuo-hung
    WMSCI 2005: 9TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL 1, 2005: 247-252
  • [9] Using data mining for improving web-based course design
    Myller, N
    Suhonen, J
    Sutinen, E
    INTERNATIONAL CONFERENCE ON COMPUTERS IN EDUCATION, VOLS I AND II, PROCEEDINGS, 2002: 959-963
  • [10] Development of a Taxonomy for Indexing Web-Based Mining Safety and Health Research
    Glowacki, A. F.
    FIRST INTERNATIONAL FUTURE MINING CONFERENCE AND EXHIBITION 2008, PROCEEDINGS, 2008, (10): 125-129