High Speed Filing Cabinet

Note: This course is usually offered in the Fall semester.
Other commitments meant that it was cancelled in Fall 1999;
however, I hope to offer it again in Fall 2000.

Theme for Fall 1998/9

Research on the Internet

CS533 People     Newsgroup

Mini-research Exercise

Semester Projects

Technical Presentations

What is Information Retrieval about?

Information Retrieval (IR) is about the process of providing answers to clients' information needs. It is thus concerned with the collection, representation, storage, organisation, accessing, manipulation and display of the information items necessary to satisfy those needs. IR generally investigates automated, computer-based solutions.

IR is not a new field, but it is a very important and challenging one, and finding appropriate responses to the problems it presents is becoming more and more imperative. Ever cheaper computers, high-capacity storage and computer networks have simply served to exacerbate a long-standing problem of the Information Age, summed up in the following quotation.

"There is a growing mountain of research... The investigator is staggered by the findings and conclusions of thousands of other workers - conclusions which he cannot find time to grasp, much less remember. The summation of human experience is being expanded at a prodigious rate and the means we use for threading through the consequent maze to the momentarily important item is the same that was used in the days of the square rigged ships."

V. Bush 1945


In this course, I intend to give a very broad interpretation to IR, emphasising the importance of organising information. Thus, we will include not only classic IR topics but also such diverse areas as research and writing aids, file management systems and learning!

This year, I am particularly interested in investigating the problems posed by the Internet. The phenomenal growth of the World-Wide-Web, and its acceptance and use by people from all walks of life, presents exciting new challenges. While classical IR techniques can be, and are being, used on the web, it is clear that more revolutionary solutions are badly needed. IR is no longer a problem faced only by scientific professionals; it has become everybody's personal nightmare. Swanson highlighted the problem:

"A scientist who nowadays imagines either that he is keeping up with his field or that he can later find in the library whatever may have escaped his notice when it was first written is a victim of what might be called the 'fallacy of abundance'. The fact that so much can be found on any subject creates an illusion that little remains hidden. Although library searches probably seem more often than not to be successful simply because a relatively satisfying amount of material is exhumed, such success may be illusory, since the requester cannot assess the quantity and value of the relevant information which he fails to discover."

D. R. Swanson, "Searching Natural Language Text by Computer",
Science v.132, 21 Oct 1960, pp. 1099-1104 (p. 1099)

Students taking this course are thus expected to propose, investigate and report on possible solutions. I have a number of semester projects which represent very real and challenging problems. It will be hard work, but, as the course progresses, I hope we can all acquire and practise new skills, and learn a little of the fast-changing technology that is shaping our world. Of course, I hope we can have fun while doing so!

General topics expected to be covered



My primary objective in offering this course is to give you the opportunity to develop and practise skills which you will need in your research life. Of course, I also want to give you an idea of the difficulties involved in designing and implementing IR systems. IR is thus both vehicle and topic. I hope we can learn together. I am not an expert in either IR or research, and will not pretend to be. Cooperation is the 'name-of-the-game'. I want us to think and work together, helping each other decide what needs finding out in the first place, then how to find it out, how best to present the results, and, finally, how to criticise the results intelligently and find ways to improve upon them. If we can also do some new research which is publishable, so much the better.

Grading policy


No textbook as such! The course will be based on the material in the following books, which are available for inspection in my office. Students will also be expected to consult recent journal articles and the Internet for more detailed, up-to-date material.

This page maintained by David Davenport [email]
Last updated: 22 December 1998