Bilkent University
Department of Computer Engineering


Financial Cost-Aware Search Engine Result Caching


Fethi Burak Sazoğlu
MSc Student
Computer Engineering Department
Bilkent University

As the size of the Web increases rapidly, the burden on search engines to present users with fast, fresh and reliable results rises. Considering more than 35 billion pages in the web that are indexed by search engines, processing of queries demand extensive use of resources of search engines.

Web search engines cache results of frequent and/or recent queries to decrease processing overhead of search queries and the workload of search engine backend. The motivation behind search engine caching is the temporal locality of queries, meaning that a query issued to the search engine is likely to be issued again in the near future. Result caching algorithms are evaluated using different metrics and hit rate is the most popular one among them. Hit rate measures the proportion of the queries whose results are retrieved from the cache, instead of search engine backend. Recent works proposed taking the processing overhead of queries into account when evaluating the performance of result caching strategies and developed suitable cost-aware caching strategies.

Cost-aware caching strategies attempt to minimize query processing cost which is the sum of processing costs for miss queries that are processed at backend, ignoring the cost for queries that are answered from the cache. In this work, we propose a financial cost metric that goes one step beyond and takes also the hourly electricity prices into account when computing the cost. We evaluate the most well-known static, dynamic, and hybrid result caching strategies under this new metric. We also propose a financial-cost-aware version of the least-recently-used (LRU) cache and show that it outperforms the baseline LRU algorithm in terms of the financial cost metric.


DATE: 11 March, 2013, Monday @ 15:40