On the Analysis of Indexing Schemes

Joseph M. Hellerstein, Elias Koutsoupias, and Christos H. Papadimitriou
UC Berkeley and UCLA

We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We define a framework for measuring the efficiency of an indexing scheme for a workload based on two characterizations: storage redundancy (how many times each item in the data set is stored), and access overhead (how many times more blocks than necessary does a query retrieve). Using this framework we present some initial results, showing upper and lower bounds and trade-offs between them in the case of multi-dimensional range queries and set queries.

(See the full paper for details.)