Improving Access to Large-scale Digital Libraries Through Semantic-enhanced Search and Disambiguation

Cited by: 9
Authors
Hinze, Annika [1]
Taube-Schock, Craig [1]
Bainbridge, David [1]
Matamua, Rangi [2]
Downie, J. Stephen [3]
Affiliations
[1] Univ Waikato, Comp Sci, Hamilton, New Zealand
[2] Univ Waikato, Maori & Pacific Dev, Hamilton, New Zealand
[3] Univ Illinois, Lib & Informat Sci, Chicago, IL 60680 USA
Keywords
QUERY EXPANSION; SYSTEM
DOI
10.1145/2756406.2756920
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The HathiTrust Digital Library (HTDL) holds 13,000,000 volumes comprising 4.5 billion pages of text, which currently makes it very difficult for scholars to locate sets of documents relevant to their research using traditional lexically-based retrieval techniques. Existing document search tools and document clustering approaches use purely lexical analysis, which cannot address the inherent ambiguity of natural language. A semantic search approach offers the potential to overcome the shortcomings of lexical search, but, even if an appropriate network of ontologies could be decided upon, it would require a full semantic markup of each document. In this paper, we present a conceptual design and report on the initial implementation of a new framework that affords the benefits of semantic search while minimizing the problems associated with applying existing semantic analysis at scale. Our approach avoids the need for complete semantic document markup using pre-existing ontologies by developing an automatically generated Concept-in-Context (CiC) network seeded by a priori analysis of Wikipedia texts and identification of semantic metadata. Our Capisco system analyzes documents by the semantics and context of their content. The disambiguation of search queries is done interactively, to fully utilize the domain knowledge of the scholar. Our method achieves a form of semantic-enhanced search that simultaneously exploits the proven scale benefits provided by lexical indexing.
Pages: 147-156
Number of pages: 10