(12) United States Patent (10) Patent No.: US 7,849,027 B2
Koran et al. (45) Date of Patent: Dec. 7, 2010
An unsupervised classification approach is improved by imposing some order on the treatment of the records and their attributes, which otherwise would be treated as random variables. A method is provided to identify particular attributes that are most associated with the "good" records within each of the plurality of groups of records within a data set. Based on a supervised scoring method, the records of the data set are processed to indicate their measure of "goodness." There are various ways by which the records can be processed to indicate a bias during unsupervised clustering processing.
31 Claims, 5 Drawing Sheets
Identify items of a data set according to gradations of "goodness," using a supervised objective function.
Based on the identifications of the items, process the items of the data set to indicate a bias for application in an unsupervised clustering step.
Cluster the items of the data set, processed to indicate a bias, using an unsupervised approach.
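The three steps above can be sketched in code. The following is only an illustrative sketch under stated assumptions, not the claimed method: the text does not fix how the bias is indicated, so this example assumes the bias is a per-record weight derived from the goodness score, and it substitutes a simple weighted k-means for the unspecified unsupervised clustering approach. The names goodness, bias_weights, weighted_kmeans, and the "score" and "features" fields are all hypothetical.

```python
import math
import random


def goodness(record):
    # Hypothetical supervised objective: a precomputed score in [0, 1].
    return record["score"]


def bias_weights(records, floor=0.1):
    # Step 2 (assumed form of the bias): turn each goodness score into a
    # clustering weight. The floor keeps low-scoring records from vanishing.
    return [max(floor, goodness(r)) for r in records]


def assign(points, centroids):
    # Label each point with the index of its nearest centroid.
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def weighted_kmeans(points, weights, k, iters=20, seed=0):
    # Step 3 (assumed technique): k-means whose centroid update is a
    # weighted mean, so high-goodness records pull centroids toward them.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    dims = len(points[0])
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [
                (p, w) for p, w, lab in zip(points, weights, labels) if lab == c
            ]
            if not members:
                continue  # leave an empty cluster's centroid unchanged
            total = sum(w for _, w in members)
            centroids[c] = tuple(
                sum(p[d] * w for p, w in members) / total for d in range(dims)
            )
    return assign(points, centroids), centroids


# Step 1: records already carry a goodness score (made-up illustrative data).
records = [
    {"features": (0.0, 0.0), "score": 0.9},
    {"features": (1.0, 0.0), "score": 0.1},
    {"features": (10.0, 10.0), "score": 0.5},
    {"features": (11.0, 10.0), "score": 0.5},
]
points = [r["features"] for r in records]
weights = bias_weights(records)
labels, centroids = weighted_kmeans(points, weights, k=2)
```

In this sketch the centroid of the first group sits much closer to the high-scoring record than to its low-scoring neighbor, which is one way a goodness-derived bias can shape an otherwise unsupervised clustering. Other ways of indicating the bias, such as replicating or filtering records, would fit the same three-step outline.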