Patent US20040006456 - Multi-language document search and retrieval system
http://www.google.com/patents/US20040006456

A multi-lingual indexing and search system performs tokenization and stemming in a manner which is independent of whether index entries and search terms appear as words in a dictionary. During the tokenization phase of the process, a string of text is separated into individual word tokens, and predetermined...
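
As a rough illustration of the tokenization phase mentioned in the abstract, the Python sketch below separates a string of text into individual word tokens without consulting any dictionary. The function name `tokenize`, the regular expression, and the lowercasing step are assumptions made for this example and are not taken from the patent. The part of the abstract that is cut off after "predetermined" is not represented here.

```python
import re

# Hypothetical pattern: a token is any run of Unicode letters or digits.
# No dictionary lookup is involved, so unknown or foreign words are kept.
WORD_RE = re.compile(r"[^\W_]+")


def tokenize(text):
    """Split a string of text into lowercase word tokens (illustrative only)."""
    return [match.group(0).lower() for match in WORD_RE.finditer(text)]


if __name__ == "__main__":
    # Mixed German/French input; punctuation and whitespace act as separators.
    print(tokenize("Die Häuser, les maisons!"))
```

Because the split relies only on character classes, the same function applies to text in any language that separates words with whitespace or punctuation. Languages written without such separators would need a different approach.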