MSPLS Spring '95 Workshop

Lazy Functional Programming for Full-Text Information Retrieval

Donald A. Ziff
Dept. of Computer Science, University of Chicago

Very few applications have been written in lazy functional programming languages, and hardly any, except the compilers for those languages, are reported in the academic literature. It is by no means a settled question whether "real" applications can be written in a lazy functional programming language. This is in part because these languages typically offer little or no support for interoperability, that is, for combining functional programs with programs or systems written in other languages.

This work describes an experimental textual information retrieval system, Philo/Philis 2, in which lazy functional programming is combined with a varied set of other application techniques, from components written in other languages to off-the-shelf subsystems. The interfacing techniques of functional programming, namely procedural and data abstraction, were used throughout the system and greatly smoothed the overall implementation process.

In the implementation of the retrieval engine, called the Funser (for Functional Server), lazy functional programming is shown to be a powerful and elegant means of achieving several concrete goals: delivering initial results promptly, using space economically, and avoiding unnecessary I/O. An innovative module in this system, the TOMS (Textual Object Management System), is designed as an abstract datatype for structured text; this design allowed the retrieval system, written as a client of the TOMS, to be largely database independent. This work also introduces a new formal model of word-based textual information retrieval, the Matrix model.
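
The way lazy evaluation serves these goals can be sketched in a few lines of Haskell. This is an illustration only, not code from the Funser: the Doc type, the search and firstHits functions, and the corpus are all hypothetical names invented here, and scanning a document's text in memory stands in for the I/O a real engine would perform.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical document type: an identifier paired with its text.
type Doc = (Int, String)

-- Identifiers of all documents whose text contains the query word.
-- The result is a lazy list: no document is scanned until a consumer
-- demands the next hit.
search :: String -> [Doc] -> [Int]
search q docs = [ i | (i, body) <- docs, q' `isInfixOf` map toLower body ]
  where
    q' = map toLower q

-- Only the first n hits. Documents after the n-th hit are never examined,
-- which is the analogue of avoiding unnecessary I/O.
firstHits :: Int -> String -> [Doc] -> [Int]
firstHits n q = take n . search q

-- A hypothetical unbounded corpus in which every third document
-- mentions "liberty". It is never materialized in full.
corpus :: [Doc]
corpus =
  [ (i, if i `mod` 3 == 0 then "Liberty and equality" else "some other text")
  | i <- [1 ..]
  ]
```

Evaluating firstHits 3 "liberty" corpus yields [3,6,9] and terminates even though corpus is infinite, because only the first nine documents are ever inspected. The first hit is available as soon as the third document has been scanned, and documents already passed over can be reclaimed, which mirrors the prompt delivery and economical use of space described above.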

Philo/Philis 2 is used by the ARTFL project (American and French Research on the Treasury of the French Language) as the basis of its on-line retrieval service.

Gerald Baumgartner