The merge/purge problem, that is, the semantic integration problem of merging multiple very large databases, can be solved by multiple runs of the sorted neighborhood method or the clustering method with small windows, followed by the computation of the transitive closure over the results of each run....

Patent US5497486 - Method of merging large databases in parallel
http://www.google.com/patents/US5497486?utm_source=gb-gplus-share
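
The multi-pass idea described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the record layout, the sort keys, the window size, and the matching rule `is_match` are all hypothetical choices made for illustration, and the transitive closure is computed here with a union-find structure, which is one common way to do it. Each pass sorts the records on a different key and compares only records that fall within a small sliding window; the matched pairs from all passes are then merged into equivalence classes.

```python
def sorted_neighborhood_pairs(records, key, window, is_match):
    """One pass: sort by `key`, then compare each record only with the
    (window - 1) records that precede it in sorted order."""
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[max(0, pos - window + 1):pos]:
            if is_match(records[i], records[j]):
                pairs.add((min(i, j), max(i, j)))
    return pairs


def transitive_closure(n, pairs):
    """Group record indices 0..n-1 into equivalence classes (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def merge_purge(records, keys, window, is_match):
    """Run one small-window pass per key, then close the union of all pairs."""
    pairs = set()
    for key in keys:
        pairs |= sorted_neighborhood_pairs(records, key, window, is_match)
    return transitive_closure(len(records), pairs)


# Hypothetical records: (last name, first name, id number).
records = [
    ("Smith", "John", "123"),
    ("Smyth", "John", "123"),
    ("Smith", "Jon", "123"),
    ("Smithe", "Mary", "456"),
    ("Jones", "Jane", "789"),
]


def same_id(a, b):
    # Hypothetical matching rule: two records match if their ids are equal.
    return a[2] == b[2]


def by_last(r):
    return r[0]


def by_first(r):
    return r[1]


single_pass = merge_purge(records, [by_last], window=2, is_match=same_id)
multi_pass = merge_purge(records, [by_last, by_first], window=2, is_match=same_id)
print(single_pass)  # [[0, 2], [1], [3], [4]]
print(multi_pass)   # [[0, 1, 2], [3], [4]]
```

In this toy data, sorting on last name alone places "Smyth" too far from the two "Smith" records for a window of 2 to catch it. The second pass on first name finds the pairs (0, 1) and (1, 2), and the closure over both passes puts all three records in one group.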