CD-ROM is an attractive delivery vehicle for full-text databases. Because of its large storage capacity and low access speed, carefully designed indexing structures, including a concordance, are necessary to enable the text to be retrieved efficiently. However, the indexes are large enough to tax the ability of main store to hold them when processing queries. The use of compression techniques can substantially increase the volume of text that a disk can accommodate, and substantially decrease the amount of primary storage needed to hold the indexes. This paper describes a suitable indexing mechanism and its compression potential using modern compression methods. It is possible to double the amount of text that can be stored on a CD-ROM disk and include a full concordance and indexes as well. A single disk can accommodate around 180 million words of text, equivalent to a library of 1000-1500 books, and provide rapid response to a variety of queries involving multiple search terms and word fragments.