Improving Access to Large-scale Digital Libraries Through Semantic-enhanced Search and Disambiguation

Cited by: 9
Authors
Hinze, Annika [1]
Taube-Schock, Craig [1]
Bainbridge, David [1]
Matamua, Rangi [2]
Downie, J. Stephen [3]
Affiliations
[1] Univ Waikato, Comp Sci, Hamilton, New Zealand
[2] Univ Waikato, Maori & Pacific Dev, Hamilton, New Zealand
[3] Univ Illinois, Lib & Informat Sci, Chicago, IL 60680 USA
Keywords
QUERY EXPANSION; SYSTEM;
DOI
10.1145/2756406.2756920
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With 13,000,000 volumes comprising 4.5 billion pages of text, it is currently very difficult for scholars to locate relevant sets of documents that are useful in their research from the HathiTrust Digital Library (HTDL) using traditional lexically-based retrieval techniques. Existing document search tools and document clustering approaches use purely lexical analysis, which cannot address the inherent ambiguity of natural language. A semantic search approach offers the potential to overcome the shortcomings of lexical search, but, even if an appropriate network of ontologies could be decided upon, it would require a full semantic markup of each document. In this paper, we present a conceptual design and report on the initial implementation of a new framework that affords the benefits of semantic search while minimizing the problems associated with applying existing semantic analysis at scale. Our approach avoids the need for complete semantic document markup using pre-existing ontologies by developing an automatically generated Concept-in-Context (CiC) network seeded by a priori analysis of Wikipedia texts and identification of semantic metadata. Our Capisco system analyzes documents by the semantics and context of their content. The disambiguation of search queries is done interactively, to fully utilize the domain knowledge of the scholar. Our method achieves a form of semantic-enhanced search that simultaneously exploits the proven scale benefits provided by lexical indexing.
Pages: 147-156
Page count: 10