Duplicate detection and record consolidation in large bibliographic databases: the COPAC database experience

Authors
Cousins, SA [1]
Affiliation
[1] Univ Manchester, Natl Serv Sect, COPAC Project, Manchester M13 9PL, Lancs, England
DOI
10.1177/0165551984232225
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
COPAC is a union catalogue giving access to the online catalogue records of some of the largest academic research libraries in the UK and Ireland. Like all union catalogues, COPAC is supplied with multiple copies of records representing the same document in the contributing library catalogues. To reduce the level of duplication visible to the COPAC user, it has been necessary to develop duplicate detection and record consolidation procedures. These result in the production of a single record for each document, representing the holdings of several libraries. This paper discusses the ways in which both the duplicate detection and record consolidation procedures are carried out, along with the problem areas encountered. The general structure of these procedures is also described, providing a model of the duplicate record handling mechanisms used in COPAC.
Pages: 231-240
Page count: 10
Related papers
17 in total
  • [2] DUPLICATE RECORD IDENTIFICATION IN BIBLIOGRAPHIC DATABASES
    GOYAL, P
    INFORMATION SYSTEMS, 1987, 12 (03) : 239 - 242
  • [3] DUPLICATE RECORD DETECTION FOR DATABASE CLEANSING
    Rehman, Mariam
    Esichaikul, Vatcharapon
    2009 SECOND INTERNATIONAL CONFERENCE ON MACHINE VISION, PROCEEDINGS (ICMV 2009), 2009, : 333 - 338
  • [4] PC-Filter: A robust filtering technique for duplicate record detection in large databases
    Zhang, J
    Ling, TW
    Bruckner, RM
    Liu, H
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2004, 3180 : 486 - 496
  • [5] AN EXPERT SYSTEM FOR QUALITY-CONTROL AND DUPLICATE DETECTION IN BIBLIOGRAPHIC DATABASES
    RIDLEY, MJ
    PROGRAM-AUTOMATED LIBRARY AND INFORMATION SYSTEMS, 1992, 26 (01) : 1 - 18
  • [7] An incremental clustering scheme for duplicate detection in large databases
    Cesario, E
    Folino, F
    Manco, G
    Pontieri, L
    9TH INTERNATIONAL DATABASE ENGINEERING & APPLICATION SYMPOSIUM, PROCEEDINGS, 2005, : 89 - 95
  • [8] Effective incremental clustering for duplicate detection in large databases
    Folino, Francesco
    Manco, Giuseppe
    Pontieri, Luigi
    10TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 2006, : 45 - 52
  • [9] Identification of FRBR works within bibliographic databases: An experiment with UNIMARC and duplicate detection techniques
    Freire, Nuno
    Borbinha, Jose
    Calado, Pavel
    ASIAN DIGITAL LIBRARIES: LOOKING BACK 10 YEARS AND FORGING NEW FRONTIERS, PROCEEDINGS, 2007, 4822 : 267 - 276
  • [10] Efficient and Robust Detection of Duplicate Videos in a Large Database
    Sarkar, Anindya
    Singh, Vishwakarma
    Ghosh, Pratim
    Manjunath, Bangalore S.
    Singh, Ambuj
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2010, 20 (06) : 870 - 885