Title: Iterative-Improvement-Based Declustering Heuristics for Multi-disk Databases
Authors: Mehmet Koyuturk and Cevdet Aykanat
Status: Published in Information Systems, vol. 30, no. 1, pp. 47-70, 2005.

Abstract:

Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a dataset with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial $K$-way declustering by recursively bipartitioning the dataset, then applies multiway refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic datasets show that the proposed method provides a significant performance improvement compared to the state-of-the-art declustering strategy based on similarity-graph partitioning.
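To make the objective concrete, the following minimal Python sketch evaluates an aggregate parallel query-response time and runs a simple move-based improvement pass. It is an illustration under stated assumptions, not the gain models or implementation schemes of the paper. It assumes that a query's response time is the largest total size of its data items residing on any single disk, that each query carries a relative frequency, and that all disks share one capacity. The names `query_cost`, `aggregate_cost`, and `greedy_refine` are hypothetical.

```python
def query_cost(items, assignment, sizes, num_disks):
    """Assumed parallel response time of one query: the heaviest per-disk load."""
    load = [0] * num_disks
    for item in items:
        load[assignment[item]] += sizes[item]
    return max(load)


def aggregate_cost(queries, assignment, sizes, num_disks):
    """Frequency-weighted sum of per-query response times.

    `queries` is a list of (set_of_items, frequency) pairs; each query plays
    the role of a hyperedge (net) over the data items it retrieves.
    """
    return sum(freq * query_cost(items, assignment, sizes, num_disks)
               for items, freq in queries)


def greedy_refine(queries, assignment, sizes, num_disks, capacity):
    """Naive iterative improvement (an illustrative stand-in, not the paper's method).

    Repeatedly tries moving single items to other disks and keeps a move only
    if it lowers the aggregate cost without exceeding the disk capacity.
    The cost is recomputed from scratch for clarity; efficient gain updates
    are deliberately omitted.
    """
    assignment = dict(assignment)
    used = [0] * num_disks
    for item, disk in assignment.items():
        used[disk] += sizes[item]
    best = aggregate_cost(queries, assignment, sizes, num_disks)
    improved = True
    while improved:
        improved = False
        for item in sorted(assignment):
            src = assignment[item]
            for dst in range(num_disks):
                if dst == src or used[dst] + sizes[item] > capacity:
                    continue
                assignment[item] = dst  # tentative move
                cost = aggregate_cost(queries, assignment, sizes, num_disks)
                if cost < best:
                    best = cost
                    used[src] -= sizes[item]
                    used[dst] += sizes[item]
                    src = dst
                    improved = True
                else:
                    assignment[item] = src  # undo the move
    return assignment, best
```

Because every accepted move strictly lowers a non-negative cost, the loop terminates in a local optimum. Like any iterative-improvement scheme, it does not guarantee a globally optimal declustering.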