APPLIED INTELLIGENCE, cilt.25, sa.1, ss.23-36, 2006 (SCI-Expanded)
The least general generalization (LGG) of strings may cause an over-generalization in the generalization process of the clauses of predicates with string arguments. We propose a specific generalization (SG) for strings to reduce over-generalization. SGs of strings are used in the generalization of a set of strings representing the arguments of a set of positive examples of a predicate with string arguments. In order to create a SG of two strings, first, a unique match sequence between these strings is found. A unique match sequence of two strings consists of similarities and differences to represent similar parts and differing parts between those strings. The differences in the unique match sequence are replaced to create a SG of those strings. In the generalization process, a coverage algorithm based on SGs of strings or learning heuristics based on match sequences are used.