"An Incremental Learning Algorithm for Constructing Boolean Functions From Positive and Negative Examples"

Computers and Operations Research, 2002, (July 2002 issue), Vol. 29, No. 12, pp. 1677-1700.

Salvador Nieto Sanchez, Evangelos Triantaphyllou, Jianhua Chen and T. Warren Liao

Scope and Purpose (special requirement by the Comps & OR journal):
There is a growing need for methods that can analyze information and infer patterns in a way that can be useful to the analyst. This is the core of data mining and knowledge discovery from databases. Such methods often infer a Boolean function from observations that belong to different classes. This paper presents a methodology that infers a Boolean function in an incremental setting. The proposed approach can be used in conjunction with existing algorithms that infer a Boolean function from examples. The paper presents both algorithmic developments and also an extensive empirical study. The results presented in the paper suggest that the proposed incremental approach is highly promising.

This paper introduces an incremental algorithm for learning a Boolean function from examples. The functions are constructed in the Disjunctive Normal Form (DNF) or the Conjunctive Normal Form (CNF) and emphasis is placed in inferring functions with as few clauses as possible. This incremental algorithm can be combined with any existing algorithm that infers a Boolean function from examples. In this paper it is combined with the One Clause At a Time (OCAT) approach [Triantaphyllou, et al., 1994] and [Triantaphyllou, 1994] which is a non-incremental learning approach. An extensive computational study was undertaken to assess the performance characteristics of the new approach. As examples we used binary vectors that represent text documents from different categories from the TIPSTER collection. The computational results indicate that the new algorithm is considerably more efficient and it derives more accurate Boolean functions. As it was anticipated, the Boolean functions (in DNF or CNF form) derived by the new algorithm are comprised by more clauses than the functions derived by the non-incremental approach.

Key Words:
Incremental and non-incremental learning, learning from examples, Boolean functions, DNF and CNF form, machine learning, data mining, the One Clause At a Time (OCAT) approach for inferring a Boolean function.

Download this paper as a PDF file (size = 2,350 KB).

Visit Dr. Triantaphyllou's homepage.