IMAGE RETRIEVAL SYSTEMS AND METHODS WITH SEMANTIC AND FEATURE BASED RELEVANCE FEEDBACK
This is a continuation of and claims priority to U.S. patent application Ser. No. 09/702,292 filed Oct. 30, 2000 entitled “Image Retrieval Systems and Methods with Semantic and Feature Based Relevance Feedback” by inventors Wen-Yin Liu, Hong-Jiang Zhang, and Ye Lu.
This invention relates to image retrieval systems.
The popularity of digital images is rapidly increasing due to improving digital imaging technologies and easy availability facilitated by the Internet. More and more digital images are becoming available every day.
Automatic image retrieval systems provide an efficient way for users to navigate through the growing numbers of available images. Traditional image retrieval systems allow users to retrieve images in one of two ways: (1) keyword-based image retrieval or (2) content-based image retrieval. Keyword-based image retrieval finds images by matching keywords from a user query to keywords that have been manually added to the images. One of the more popular collections of annotated images is “Corel Gallery”, an image database from Corel Corporation that includes upwards of 1 million annotated images.
One problem with keyword-based image retrieval systems is that it can be difficult or impossible for a user to precisely describe the inherent complexity of certain images. As a result, retrieval accuracy can be severely limited because images that cannot be described, or can only be described ambiguously, will not be retrieved successfully. In addition, due to the enormous burden of manual annotation, there are few databases with annotated images, although this is changing.
Content-based image retrieval (CBIR) finds images whose low-level image features, such as color histogram, texture, shape, and so forth, are similar to those of an example image. Although CBIR solves the problem of keyword-based image retrieval, it also has severe shortcomings. One drawback of CBIR is that searches may return entirely irrelevant images that just happen to possess similar features. Additionally, individual objects in images contain a wide variety of low-level features. Therefore, using only the low-level features will not satisfactorily describe what is to be retrieved.
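By way of illustration only, the following minimal Python sketch shows one way a color-histogram comparison of the kind mentioned above might be carried out. The function names, the bin count, and the use of histogram intersection as the similarity measure are assumptions made for this example and are not taken from any particular CBIR system.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Build a normalized RGB histogram from (r, g, b) tuples with values 0-255.

    The bin count is an illustrative assumption.
    """
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantize each channel, clamping so that every value lands in a valid bin.
        ri = min(r // step, top)
        gi = min(g // step, top)
        bi = min(b // step, top)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
        count += 1
    if count == 0:
        return hist
    return [h / count for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_similarity(query_pixels, database):
    """Return image names ordered from most to least similar to the query."""
    q = color_histogram(query_pixels)
    scored = [(histogram_intersection(q, color_histogram(px)), name)
              for name, px in database.items()]
    return [name for _, name in sorted(scored, key=lambda t: -t[0])]


# Hypothetical toy data: a "sunset" query and two stored images.
sunset = [(250, 10, 10)] * 3 + [(10, 10, 10)]
database = {
    "red_car": [(240, 20, 20)] * 3 + [(20, 20, 20)],
    "forest": [(10, 200, 10)] * 4,
}
ranking = rank_by_similarity(sunset, database)
```

In this toy data the red car is ranked first for the sunset query because the two images share the same coarse color distribution, which mirrors the drawback noted above: similar low-level features do not imply a relevant image.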
To weed out the irrelevant images returned in CBIR, some CBIR-based image retrieval systems utilize user feedback to gain an understanding as to the relevancy of certain images. After an initial query, such systems estimate the user’s ideal query by monitoring user-entered positive and negative responses to the images returned from the query. This approach reduces the need for a user to provide accurate initial queries.
One type of relevance feedback approach is to estimate ideal query parameters using only the low-level image features. This approach works well if the feature vectors can capture the essence of the query. For example, if the user is searching for an image with complex textures having a particular combination of colors, this query would be extremely difficult to describe but can be reasonably represented by a combination of color and texture features. Therefore, with a few positive and negative examples, the relevance feedback process is able to return reasonably accurate results. On the other hand, if the user is searching for a specific object that cannot be sufficiently represented by combinations of available feature vectors, these relevance feedback systems will not return many relevant results even with a large amount of user feedback.
Some researchers have attempted to apply models used in text information retrieval to image retrieval. One of the most popular models used in text information retrieval is the vector model. The vector model is described in such writings as Buckley and Salton, “Optimization of Relevance Feedback Weights,” in Proc. of SIGIR’95; Salton and McGill, “Introduction to Modern Information Retrieval,” McGraw-Hill Book Company, 1983; and W. M. Shaw, “Term-Relevance Computation and Perfect Retrieval Performance,” Information Processing and Management. Various effective retrieval techniques have been developed for this model and many employ relevance feedback.
Most of the previous relevance feedback research can be classified into two approaches: query point movement and re-weighting. The query point movement method essentially tries to improve the estimate of an “ideal query point” by moving it towards good example points and away from bad example points. A frequently used technique for iteratively improving this estimate is Rocchio’s formula, given below for sets of relevant documents D'_R and non-relevant documents D'_N noted by the user:
$$Q' = \alpha Q + \beta \left( \frac{1}{N_R} \sum_{i \in D'_R} D_i \right) - \gamma \left( \frac{1}{N_N} \sum_{i \in D'_N} D_i \right) \tag{1}$$
where α, β, and γ are suitable constants and N_R and N_N are the number of documents in D'_R and D'_N, respectively. This technique is implemented, for example, in the MARS system, as described in Rui, Y., Huang, T. S., and Mehrotra, S., “Content-Based Image Retrieval with Relevance Feedback in MARS,” in Proc. IEEE Int. Conf. on Image Proc., 1997.
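As a hedged illustration, the query point movement of equation (1) might be sketched in Python as follows: the new query is the weighted old query, plus the weighted mean of the relevant vectors, minus the weighted mean of the non-relevant vectors. The function name and the default constants are assumptions chosen for this example only; the text above states only that the constants are "suitable". An empty feedback set is treated here as contributing a zero vector, which is likewise an assumption.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Return Q' = alpha*Q + beta*mean(relevant) - gamma*mean(non_relevant).

    The default values of alpha, beta, and gamma are illustrative assumptions.
    """
    n = len(query)

    def mean(vectors):
        # An empty feedback set contributes nothing to the update.
        if not vectors:
            return [0.0] * n
        return [sum(v[k] for v in vectors) / len(vectors) for k in range(n)]

    pos = mean(relevant)
    neg = mean(non_relevant)
    return [alpha * query[k] + beta * pos[k] - gamma * neg[k] for k in range(n)]


# Hypothetical two-dimensional feature vectors marked by the user.
new_query = rocchio_update(
    query=[0.0, 0.0],
    relevant=[[2.0, 2.0], [4.0, 2.0]],
    non_relevant=[[0.0, 4.0]],
    alpha=1.0, beta=1.0, gamma=0.5,
)
```

Repeating the update over successive feedback rounds moves the query point progressively toward the good examples and away from the bad ones.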
The central idea behind the re-weighting method is very simple and intuitive. Since each image is represented by an N-dimensional feature vector, the image may be viewed as a point in an N-dimensional space. Therefore, if the variance of the good examples is high along a principal axis j, the values on this axis are most likely not very relevant to the input query, and a low weight w_j can be assigned to the axis. Accordingly, the basic idea is to update the weight w_j using the inverse of the standard deviation of the j-th feature values in the feature matrix. The MARS system mentioned above implements a slight refinement to the re-weighting method called the standard deviation method.
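The basic re-weighting idea can be sketched, under stated assumptions, as follows. Each axis weight is taken as the inverse of the standard deviation of the good examples along that axis. The small constant eps (to avoid division by zero), the normalization of the weights to sum to one, and the weighted Euclidean distance are assumptions added for this example; the sketch does not reproduce the refinement used in the MARS system.

```python
import math


def inverse_std_weights(relevant, eps=1e-6):
    """Return one weight per feature axis, proportional to 1 / standard deviation.

    eps and the normalization to a unit sum are illustrative assumptions.
    """
    m = len(relevant)
    n = len(relevant[0])
    weights = []
    for j in range(n):
        column = [v[j] for v in relevant]
        mu = sum(column) / m
        sigma = math.sqrt(sum((x - mu) ** 2 for x in column) / m)
        weights.append(1.0 / (sigma + eps))
    total = sum(weights)
    return [w / total for w in weights]


def weighted_distance(x, q, weights):
    """Weighted Euclidean distance between a feature vector x and a query q."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, q)))


# Hypothetical good examples: they agree on axis 0 and vary widely on axis 1.
good_examples = [[1.0, 0.0], [1.0, 10.0], [1.0, 5.0]]
w = inverse_std_weights(good_examples)
```

Here axis 0, on which the good examples agree, receives almost all of the weight, while the high-variance axis 1 is nearly ignored when distances to the query are computed.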
Recently, more computationally robust methods that perform global optimization have been proposed. One such proposal is the MindReader retrieval system described in Ishikawa, Y., Subramanya R., and Faloutsos, C., “Mindreader: Query Databases Through Multiple Examples,” In Proc. of the 24th VLDB Conference, (New York), 1998. It formulates the parameter estimation process as a minimization problem. Unlike traditional retrieval systems, whose distance functions can be represented by ellipses aligned with the coordinate axes, the MindReader system proposed a distance function that is not necessarily aligned with the coordinate axes. Therefore, it allows for correlations between attributes in addition to different weights on each component.
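To illustrate the difference between an axis-aligned distance and one that allows correlations between attributes, the following sketch uses a quadratic-form distance d(x, q) = (x - q)^T M (x - q). A diagonal matrix M gives ellipses aligned with the coordinate axes, while non-zero off-diagonal entries rotate them. The function name and the example matrices are hypothetical, and the sketch does not show how the MindReader system estimates M from user examples.

```python
def quadratic_distance(x, q, M):
    """Return (x - q)^T M (x - q) for a symmetric matrix M given as nested lists."""
    d = [a - b for a, b in zip(x, q)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


# Hypothetical matrices: identity (axis-aligned) versus one with correlation terms.
M_axis = [[1.0, 0.0], [0.0, 1.0]]
M_corr = [[1.0, -0.9], [-0.9, 1.0]]

origin = [0.0, 0.0]
along = [1.0, 1.0]    # lies along the correlated direction
across = [1.0, -1.0]  # lies across the correlated direction

d_axis_along = quadratic_distance(along, origin, M_axis)
d_axis_across = quadratic_distance(across, origin, M_axis)
d_corr_along = quadratic_distance(along, origin, M_corr)
d_corr_across = quadratic_distance(across, origin, M_corr)
```

With the axis-aligned matrix the two points are equally far from the query, whereas with the correlated matrix the point lying along the correlated direction is treated as much closer than the point lying across it.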