Computation of term/document discrimination values by use of the cover coefficient concept

Can, Fazlı
Özkarahan, Esen A.
Indexing in information retrieval (IR) is used to obtain a suitable vocabulary of index terms and an optimal assignment of these terms to documents, in order to increase the effectiveness and efficiency of an IR system. The term discrimination value (TDV) is one of the criteria used for index‐term selection. In this article a new concept, the cover coefficient (CC), is used to compute TDVs. After a brief introduction to the theory of indexing and the CC concept, the article discusses an efficient way of computing TDVs by use of the CC concept, index‐term selection, and weight modification. It is also shown that the computational cost of calculating TDVs with the CC approach compares favorably with the cost of a different approach that uses similarity coefficients. Furthermore, the TDVs obtained by the CC approach are consistent with those of the similarity‐coefficient approach. © 1987 John Wiley & Sons, Inc.
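The abstract does not give formulas, so the following is only a minimal sketch of one way a CC-based TDV could be computed. It assumes the cover coefficient definition commonly cited from the CC literature, c_ij = alpha_i * sum_k(d_ik * beta_k * d_jk), where alpha_i and beta_k are the reciprocals of the row and column sums of the document-by-term matrix D. It also assumes that the TDV of term j is the change in the sum of the diagonal entries c_ii when term j is removed. The function names `decoupling_sum` and `tdv` are hypothetical, and the naive recomputation shown here is not the efficient procedure the article refers to.

```python
def decoupling_sum(D):
    """Sum of the diagonal cover coefficients c_ii of a document-by-term matrix.

    D is a list of rows (documents), each a list of non-negative term weights.
    Assumed definition: c_ii = (1 / row_sum_i) * sum_k(d_ik ** 2 / col_sum_k).
    """
    n_terms = len(D[0])
    col_sums = [sum(row[k] for row in D) for k in range(n_terms)]
    total = 0.0
    for row in D:
        row_sum = sum(row)
        if row_sum == 0:
            # Assumption of this sketch: a document with no terms contributes nothing.
            continue
        inner = sum(row[k] ** 2 / col_sums[k]
                    for k in range(n_terms) if col_sums[k] > 0)
        total += inner / row_sum
    return total


def tdv(D, j):
    """Assumed TDV of term j: decoupling sum with the term minus the sum without it."""
    reduced = [row[:j] + row[j + 1:] for row in D]
    return decoupling_sum(D) - decoupling_sum(reduced)


# Toy matrix: term 0 occurs in both documents, terms 1 and 2 in one document each.
D = [[1, 1, 0],
     [1, 0, 1]]
values = [tdv(D, j) for j in range(3)]
```

Under these assumptions, the term shared by every document in the toy matrix receives a negative value (removing it makes the documents more distinct), while the terms that occur in only one document receive positive values.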