SEMINAR

 

DEPARTMENT OF COMPUTER ENGINEERING

 

ABSTRACT

 

Adaptive Query Processing for Wide-Area Distributed Data Sources

 

by

 

Michael Franklin

 

Department of Computer Science

University of California, Berkeley

 

Wide-area distribution raises new challenges for many components of database technology. Query processing and optimization are two important areas where traditional techniques break down. A key problem in the wide-area environment is the difficulty of predicting response times for remote data access. This unpredictability arises from the dynamic nature of wide-area network performance and from the fact that reliable information about remote data sources is often unavailable to the query processor. Traditional distributed query processing approaches do not work in this environment because they depend on estimates of execution costs and are too static to adapt to unexpected problems and delays.

 

In this talk, I will describe two complementary techniques we have developed to address the problem of query processing in an unpredictable environment. The first technique, Query Scrambling, is a reactive approach that modifies a query execution plan "on-the-fly" when delays are encountered at runtime. Scrambling hides delays by using a response-time-based query optimizer and a multi-threaded execution engine to selectively reschedule and reconstruct the execution plan. The second technique exploits pipelining to allow scheduling to adjust to the arrival properties of the data. It is based on a new non-blocking join operator, called XJoin, which has a small memory footprint, allowing many such operators to be active in parallel. XJoin is optimized to produce initial results quickly and, like Scrambling, aims to hide intermittent delays in data arrival by reactively scheduling background processing. Comparing these two mechanisms provides useful insights into the space of solutions for adaptive query processing. Our ongoing work aims at applying such solutions to data-intensive Internet services.
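
To illustrate what "non-blocking" means for a join operator, the following is a minimal sketch of a pipelined symmetric hash join, which emits results as soon as matching tuples have arrived from both inputs. It is not XJoin itself: it keeps everything in memory and omits the memory management and reactive background processing described above. The function name, the "L"/"R" side labels, and the tuple layout are assumptions made for this illustration only.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals):
    """Yield join results incrementally as tuples arrive from either input.

    `arrivals` is an iterable of (side, key, value) triples, where side is
    "L" or "R" (hypothetical labels for the two remote sources), in
    whatever order the network happens to deliver them.
    """
    # One in-memory hash table per input; every arriving tuple is
    # inserted into its own table and probed against the other one.
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    for side, key, value in arrivals:
        other = "R" if side == "L" else "L"
        tables[side][key].append(value)
        for match in tables[other].get(key, []):
            # Always emit results as (key, left_value, right_value).
            if side == "L":
                yield (key, value, match)
            else:
                yield (key, match, value)


# Tuples from the two sources arrive interleaved and out of step.
arrivals = [("L", 1, "a"), ("R", 1, "x"), ("R", 2, "y"),
            ("L", 2, "b"), ("L", 1, "c")]
results = list(symmetric_hash_join(arrivals))
```

Because neither input has to be read completely before the first result is produced, a delay on one source does not prevent the operator from making progress on tuples that have already arrived from the other.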

 

This is joint work with Tolga Urhan.

 

Biography

 

Michael Franklin is an Associate Professor in Computer Science at the University of California, Berkeley, where his research focuses on the architecture and performance of distributed databases and information systems. Previously, Dr. Franklin led the DIMSUM project to develop a flexible query processing architecture for local and wide-area networks and was a co-developer of the Broadcast Disks data dissemination paradigm. He is currently involved with the Telegraph universal information system project at Berkeley. He is Editor-in-Chief of the ACM SIGMOD Record and is on the editorial boards of ACM Computing Surveys and Distributed and Parallel Databases. He currently serves on the Technology Advisory Boards of several Bay Area Internet startups, including AppStream, CommonObject, RightOrder, and Propel. Dr. Franklin is a 1995 recipient of the NSF CAREER award.

 

 

The seminar will be held on August 22, 2000, at 10:00 in EA502.