Current classification algorithms usually do not try to achieve a balance between fitting and
generalization when they infer models from training data. Furthermore, current algorithms ignore
the fact that there may be different penalty costs for the false-positive, false-negative,
and unclassifiable types. Thus, their performance may not be optimal or may even be coincidental.
This paper proposes a meta-heuristic approach, called the Convexity Based Algorithm (CBA), to
address these issues. The new approach aims at optimally balancing the data fitting and
generalization behaviors of models when some traditional classification approaches are used.
The CBA first defines the total misclassification cost (TC) as a weighted function of the
three penalty costs and the corresponding error rates as mentioned above. Next it partitions the
training data into regions. This is done according to some convexity properties derivable from
the training data and the traditional classification method to be used in conjunction with the CBA.
Next the CBA uses a genetic approach to determine the optimal levels of fitting and generalization.
The TC is used as the fitness function in this genetic approach. Twelve real-life datasets from
a wide spectrum of domains were used to better understand the effectiveness of the proposed approach.
The computational results indicate that the CBA may potentially fill in a critical gap in the use
of current or future classification algorithms.