|Publication number||US7162482 B1|
|Application number||US 10/167,807|
|Publication date||Jan 9, 2007|
|Filing date||Jun 11, 2002|
|Priority date||May 3, 2000|
|Also published as||US7720852, US20060242193|
|Inventors||Ted E. Dunning|
|Original Assignee||Musicmatch, Inc.|
The present application is a continuation-in-part of U.S. patent application Ser. No. 09/848,982 for “Text Equivalencing Engine,” filed May 3, 2001, the disclosure of which is incorporated herein by reference. U.S. patent application Ser. No. 09/848,982 for “Text Equivalencing Engine” claims priority from provisional U.S. Patent Application Ser. No. 60/201,622, for “Recommendation Engine,” filed May 3, 2000, the disclosure of which is incorporated herein by reference.
A. Technical Field
The present invention is related to information retrieval, and more particularly, to retrieving information about a signal such as a music track or song, and creating a database of signals and the information retrieved about the signals.
B. Background of the Invention
In many contexts, retrieving information about a sequence of unknown items is desirable. One such context is music. Some electronic copies of a piece of music include not only the music itself but also a tag (also known as meta-data). The tag includes information about the music such as the artist, the album name, and the track name. The tag can include any other information associated with the music entered by a listener.
However, some electronic copies of songs do not include the tag information. In particular, copies of songs downloaded from Napster or other sites where music can be downloaded only rarely include complete tag information. Additionally, as systems for recognizing music are fielded, it is likely that those providing copyrighted music without license will make efforts to evade any filtering system put into place. This has happened already with systems based on file name filtering. Thus, it is reasonable to expect slight variations of songs to be encountered in which the variations do not change the file substantially for the listener, but might make naïve comparison methods unreliable. Examples of such modifications include slight trimming of the beginning of a song, changes in overall volume, band-limiting and changes in equalization.
One way to replace missing tag information is to compare the waveforms of an unknown song (i.e. one that does not include tag information) to those of known songs (i.e. songs for which tag information is known). When two waveforms match, tags can be transferred to the unknown song. However, this method is inefficient in time and space because direct waveform comparison is computationally expensive. Also, music files available over the Internet are usually compressed using a compression algorithm such as MP3. Compression makes direct comparison of audio waveforms impossible since there will, by nature, be significant differences between the original signal and the signal recovered from a lossily compressed version. These differences will not generally be audible, but they can be substantial in magnitude, especially when phase information is discarded by the compression algorithm.
An alternative method of determining tag information is to perform feature comparison rather than waveform comparison. Feature comparison is the comparison of analog features associated with the waveform such as correlating spectra or comparing extracted rhythm signatures. Feature comparison is also slow because it involves comparison of analog features that occur over time. Thus, comparison using these analog features requires exhaustive comparison and time-alignment. Moreover, many analog features are not invariant under common transformations such as translation in time, volume scaling and small changes in equalization.
What is needed is a system and method of determining tag information for an unknown music track that avoids the above-described limitations and disadvantages. What is further needed is a system and method that leads to accurate retrieval of tag information for a compressed, trimmed piece of music of limited bandwidth. What is further needed is a system and method that is efficient in time and memory space. Currently available systems require an exhaustive search and expensive comparisons to each exemplar. Both the exhaustive search and the expensive comparison operations should be avoided if possible.
The present invention provides an information retrieval engine capable of retrieving information about an unknown signal. In one embodiment, the unknown signal is an unknown audio signal such as a song and the information is tag information that should be associated with the unknown audio signal. The tag information can be retrieved even for unknown songs that have been trimmed or compressed. The present invention can be used to retrieve the tag information accurately and efficiently. Thus, the present invention overcomes the problem of linear searching and difficulties due to compression and trimming.
The information is retrieved by converting analog signals into sequences of symbols that represent the original analog signal. The conversion to symbolic representation is designed to be robust with respect to commonly occurring transformations such as small time shifts and volume changes. The symbolic representation of each signal is further transformed by considering the symbols representing the signal as analogically equivalent to letters in normal text and then grouping the symbols that represent a signal into the analogical equivalent of words and then grouping the words into documents.
A quantization technique is applied to obtain clusters in the space of spectral representations of the signal and then proximity of a signal to a cluster is taken as indicating the occurrence of the symbol corresponding to that cluster. By suitable design of the feature extraction done prior to clustering, it is possible to cluster such that the clusters formed from an uncompressed signal are not significantly different from the clusters formed from the same signal after it has been compressed. A technique called indexing is used to retrieve the tag information based on the words included in the unknown signal. This indexing operation is essentially similar to that performed in conventional text retrieval engines and thus can make use of an extensive technology in that area.
Indexing involves building a data structure that relates individual words made up of a sequence of symbols to the known signals that contain those words. The tags for each known signal are also stored in association with the known signal. Each known signal may be broken into multiple documents in which case the index relates words back to the documents in which they occur and then relates documents back to the signals in which they occur. To create an index, each known signal is first converted into symbolic representation, sequences of symbols are grouped into words and then sequences of words are converted into documents which may actually overlap. The index from words to documents and from documents to signals is then updated to represent the occurrence of each word and each document.
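The two-level index described above (words to documents, documents to signals, with tags stored per signal) can be sketched as follows. This is a minimal illustration only: the function name `build_index`, the dictionary layout, and the sample signals are hypothetical and are not taken from the specification, and the words are assumed to have been formed already.

```python
from collections import defaultdict

def build_index(known_signals):
    """Build a word -> documents -> signals index.

    known_signals maps a signal id to a dict holding its tags and a list of
    documents, where each document is a list of already-formed words.
    (This input layout is an assumption made for the sketch.)
    """
    word_to_docs = defaultdict(set)  # word -> set of document ids
    doc_to_signal = {}               # document id -> signal id
    signal_tags = {}                 # signal id -> tag information
    for signal_id, entry in known_signals.items():
        signal_tags[signal_id] = entry["tags"]
        for position, words in enumerate(entry["documents"]):
            doc_id = (signal_id, position)
            doc_to_signal[doc_id] = signal_id
            for word in words:
                word_to_docs[word].add(doc_id)
    return word_to_docs, doc_to_signal, signal_tags

# Hypothetical known signals, each already split into documents of words.
known = {
    "song-1": {"tags": {"artist": "A", "track": "One"},
               "documents": [["ab", "c", "ab"], ["c", "de"]]},
    "song-2": {"tags": {"artist": "B", "track": "Two"},
               "documents": [["de", "f"]]},
}
word_to_docs, doc_to_signal, signal_tags = build_index(known)
```

Looking up a word such as "de" in `word_to_docs` yields every document that contains it, and `doc_to_signal` then leads back to the signal whose tags are stored in `signal_tags`.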
In order to search for an unknown signal using an index, the unknown signal is first converted into a sequence of symbols, words and documents in exactly the same fashion as was done with the known signals. The index is then used to find all documents that contain any of the words in the unknown signal. Scores are accumulated for all documents that are part of a known signal and that share any word with the unknown signal. There are many methods that are known in the text retrieval literature for scoring documents. The signal corresponding to the documents with the highest score is then considered to be the same signal as the unknown signal. In one embodiment, the tag information for that known signal is retrieved and associated with the unknown signal. The known and unknown signals need not represent audio signals, but merely need to be signals for which a useful quantization operation exists that is invariant with respect to the transformations that might be expected to be encountered.
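The search procedure above can be sketched in a few lines. The sketch assumes the simplest possible score, namely the number of distinct query words a document shares with the query; the specification leaves the scoring method open, so this choice, the function name `identify`, and the sample index are all hypothetical.

```python
from collections import Counter

def identify(query_words, word_to_docs, doc_to_signal, signal_tags):
    """Return (signal id, score, tags) for the best-scoring known document."""
    scores = Counter()
    # Accumulate a score for every document sharing a word with the query.
    for word in sorted(set(query_words)):
        for doc_id in word_to_docs.get(word, []):
            scores[doc_id] += 1  # assumed score: count of shared distinct words
    if not scores:
        return None  # no known document shares any word with the query
    best_doc, best_score = scores.most_common(1)[0]
    signal_id = doc_to_signal[best_doc]
    return signal_id, best_score, signal_tags[signal_id]

# Hypothetical index contents.
word_to_docs = {
    "ab": ["s1/d0"],
    "c": ["s1/d0", "s1/d1"],
    "de": ["s1/d1", "s2/d0"],
    "f": ["s2/d0"],
}
doc_to_signal = {"s1/d0": "song-1", "s1/d1": "song-1", "s2/d0": "song-2"}
signal_tags = {"song-1": {"track": "One"}, "song-2": {"track": "Two"}}

result = identify(["de", "f", "zz"], word_to_docs, doc_to_signal, signal_tags)
```

Only documents that share at least one word with the query are ever touched, which is how an index avoids an exhaustive comparison against every known signal.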
As can be seen from the above description, the present invention may be applied to many different domains, and is not limited to the application of retrieval of information corresponding to audio signals. In addition, information retrieval according to the techniques of the present invention is not limited to the retrieval of tag information for music songs. Many techniques of the present invention may be applied to retrieval of information in other domains.
The following description of preferred embodiments of the present invention is presented in the context of an information retrieval engine for music tracks, such as may be implemented in an Internet-based jukebox or personalized radio station. One skilled in the art will recognize that the present invention may be implemented in many other domains and environments, both within the context of musical information retrieval, and in other contexts. Accordingly, the following description, while intended to be illustrative of a particular implementation, is not intended to limit the scope of the present invention or its applicability to other domains and environments. Rather, the scope of the present invention is limited and defined solely by the claims.
Referring now to
In one embodiment, the system monitors the user's behavior with regard to searching, listening, and playback control in order to refine recommendations. Advertising, offers, and other information may be selected and presented to the user based on observations of user behavior and analysis as to which material may be of interest to the user. Thus, in this embodiment, it is useful to retrieve tag information for any unknown music tracks.
Stream delivery system 150 interacts with jukebox 103 to specify a sequence of audio files to deliver to jukebox 103. System 150 also delivers the audio files. Stream delivery system 150 also communicates with real-time subscription authorization module 157, which includes real-time server 154 and database server 156 that keeps track of which user accounts are active and enforces global business rules about which accounts can listen to the radio at a given time. Within stream delivery system 150, there are a number of distinct software entities. These include sequence server 153, authorization server 152, and content server 151. Sequence server 153 selects which files to play according to estimated listener preferences as well as predetermined station formats. Authorization server 152 keeps a record of the files that are selected by sequence server 153 and is consulted by content server 151 when files are requested. If jukebox 103 requests files without the necessary security information or in an unexpected order, authorization server 152 flags this anomaly and content server 151 declines to provide the data.
Content database 102 contains descriptions of music tracks, and in one embodiment contains digitized representations of the music tracks themselves. Content database 102 is stored using conventional database storage means, and in one embodiment includes several fields providing descriptive information regarding music tracks, such as title, album, artist, type of music, track length, year, record label, and the like. For tracks without tag information, this descriptive information is retrieved using the information retrieval engine 160.
Referring now to
Index and search module 204 facilitates functionality for accepting user input and searching database 202 for particular music tracks. In one embodiment, the user enters input by accessing web site 206, which provides an interactive user interface for accessing the functions of the present invention. Web site 206 provides the main point of contact with users. A user interacts with web site 206 over a network, using a conventional web browser 205 (such as Microsoft Internet Explorer), running on a client computer. Module 204 accesses database 202 and index 210 in response to user queries. Results are returned to the user via web site 206. In one embodiment, index and search module 204 also dynamically updates content index 210 in order to provide improved efficiency for future searches. Such indexing techniques are well known in the art.
In addition, web site 206 offers the capability for suggesting tracks and artists that may interest the user, based on personal criteria 211, profiles 212, or track-level discovered relationships based on observed user listening behavior determined by log analysis and information retrieval 213 of play logs 214, as described in more detail below. Log analysis and information retrieval 213 also retrieves tag information for unknown songs using the information retrieval engine of the present invention.
Play log 214 is a database that monitors and stores information describing user behavior. Specifically, the user's interaction with jukebox 103, including track selection, repeats, aborts and skips, and the like, are recorded and stored in play log 214. Log analysis and information retrieval module 213 analyzes play log 214 in order to generate a profile of the user, which is stored in profile database 212. Log analysis and information retrieval module 213 also retrieves tag information in accordance with the present invention. The retrieved tag information is stored in play log 214 for unidentified tracks. Log analysis and information retrieval module 213 periodically updates the play log 214 as the information about unknown music tracks is retrieved. In one embodiment, the tag information is retrieved by the information retrieval engine module 213 prior to log analysis in the log analysis module 213.
Now referring to
Waveform converter 305 converts signal 301 to a waveform 308. The waveform 308 is an alternate representation of the signal 301 input into the waveform converter 305. The waveform 308 is input into a spectral feature detector 310. The spectral feature detector detects spectral features 312 of the waveform 308.
The spectral feature detector first converts the signal from audio to Mel Spectral. Mel is a nonlinear frequency scale that matches human frequency discriminatory capabilities. The spectral feature detector then performs another transformation from Mel Spectral to Mel Cepstral as described in “Mel-Frequency Cepstral Analysis,” by Unjung Nam available on the Internet at http://ccrma-www.stanford.edu/~unjung/mylec/mfcc.html. In one embodiment, Mel Cepstral is represented in terms of multiple coefficients 312. A first coefficient 312 encodes amplitude, a second coefficient 312 encodes fundamental frequency, and subsequent coefficients 312 encode successively refined aspects of overall spectral shape. The coefficients 312 are then quantized, by quantizer 315, using any one of several quantization techniques. The quantization techniques are described below in reference to
The quantizer 315 outputs a symbol that represents an approximation of the signal 301 characteristics for a short period of time. These symbols are referred to here as letters. The sequence of letters 320 reflects the time evolution of the signal as it changes to be more or less like the cluster centroids. Because they are ordered and taken from a reasonably small vocabulary, these symbols can be analogized to letters in the textual domain. Therefore, in this application the terms symbols, clusters, and letters will be used interchangeably. The letters 320 are grouped into groups of one or more letters 320 by reference to a dictionary of letter groups that is created by examination of the frequencies of all letter groups in a large training set of signal inputs. These groups of letters are termed words, again following an analogy to the textual domain where the letters may be actual letters and the words actual words. The organization module 330 uses an algorithm described below with reference to
The words 335 are then used to access an index. An index records all of the documents that contain particular words. By examining all index entries for the words in the words 335, all of the documents that have one or more words in common with the words 335 can be found. The known signals used to create the index are temporally separated into sections called documents. Each document is quantized and organized into words which are then indexed so that there is an entry for each unique word in any known signal. The index can thus be searched for words in the unknown signal 301. Every indexed document containing a word in common with the unknown signal 301 is scored in the score and weight module 340. Every word in the unknown signal 301 is weighted in the score and weight module 340. Scoring and weighting algorithms are described in detail below in reference to
The information retrieval engine according to the present invention can determine the tag information for unknown music tracks that have been compressed or otherwise distorted in a fashion that leaves them psycho-acoustically similar to the original. By using quantization techniques and an index of quantized known tag information, tag information can be determined quickly and efficiently. The invention divides signals into documents, so that the known signal is broken down into smaller sections, documents, allowing the information retrieval engine to retrieve information for trimmed signals. One property of the quantization used in this embodiment is that the quantization of a compressed signal and the quantization of the same signal prior to compression are similar because the comparison operation that underlies the quantization process is designed to place psycho-acoustically similar signals near each other. Thus, a quantization technique allows the information retrieval engine to retrieve information for compressed signals.
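The robustness property described above can be illustrated with a small quantizer sketch in which each frame of coefficients is mapped to the letter of its nearest cluster centroid. Euclidean distance, the two-dimensional frames, the centroid values, and the function name `quantize` are assumptions made for illustration; the specification does not fix a distance measure.

```python
import math
import string

def quantize(frames, centroids, alphabet=string.ascii_lowercase):
    """Map each coefficient frame to the letter of its nearest centroid."""
    letters = []
    for frame in frames:
        # Index of the centroid closest to this frame (Euclidean distance assumed).
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(frame, centroids[i]))
        letters.append(alphabet[nearest])
    return "".join(letters)

# Hypothetical centroids and frames; "distorted" stands in for the same
# signal after compression has perturbed each coefficient slightly.
centroids = [(0.0, 0.0), (10.0, 10.0)]
original = [(1.0, 0.0), (9.0, 11.0), (0.5, 0.2)]
distorted = [(1.4, 0.3), (8.6, 10.5), (0.1, 0.6)]

letters_original = quantize(original, centroids)
letters_distorted = quantize(distorted, centroids)
```

As long as a distortion moves a frame by less than the distance to the boundary between clusters, the frame keeps its letter, so both versions of the signal produce the same letter sequence.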
Now referring to
Now referring to
Sets of coefficients representing spectral features are called input instances 501 and are clustered in one embodiment using the k as follows. First, a cluster centroid is chosen 505. Then, each cluster is built around the cluster centroids 510. Finally, the clusters are refined in some fashion. In one embodiment, the cluster centroids 510 are initially chosen at random. In one embodiment, the cluster centroids 510 are chosen by selecting an input instance as the initial value for each cluster centroid. Every instance is assigned to a cluster 515. In one embodiment, the instances 501 are assigned to the cluster 520 to which they are most proximal.
In one embodiment, the clustering groups are exclusive such that any instance belongs to only one group. In another embodiment, the clustering groups are overlapping such that one instance could belong to two groups.
In one embodiment, “k-means” clustering is used. In “k-means” clustering, the number of clusters sought is specified, k. Then points are chosen at random for cluster centroids. Each instance is assigned its closest cluster based on distance. The mean of all instances in each cluster is calculated. For each cluster, the mean of instances assigned to that cluster is considered to be the new cluster centroid and the process is repeated iteratively. Iteration can continue until the instances are assigned to the same clusters in consecutive iterations or until the error between consecutive iterations is sufficiently small, or until some other threshold is reached. One skilled in the art will note that there are many variants of clustering that could be used interchangeably with k-means clustering. Such variants as Gaussian mixtures, Expectation-Maximization, hierarchical clustering and so on are well known in the art.
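A minimal sketch of the k-means iteration described above follows. It assumes Euclidean distance, initial centroids drawn at random from the input instances, and termination when assignments stop changing; the function name `k_means`, the `seed` parameter, and the sample instances are hypothetical.

```python
import math
import random

def k_means(instances, k, max_iterations=100, seed=0):
    """Cluster tuples of coefficients into k groups; returns (centroids, assignment)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct input instances chosen at random.
    centroids = rng.sample(instances, k)
    assignment = None
    for _ in range(max_iterations):
        # Assign every instance to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(x, centroids[c]))
            for x in instances
        ]
        if new_assignment == assignment:
            break  # same clusters in consecutive iterations: converged
        assignment = new_assignment
        # Move each centroid to the mean of the instances assigned to it.
        for c in range(k):
            members = [x for x, a in zip(instances, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment

# Hypothetical instances forming two well-separated groups.
instances = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = k_means(instances, k=2)
```

The resulting centroids play the role of the alphabet: each centroid corresponds to one letter, and a frame is assigned the letter of the centroid it lies closest to.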
Now referring to
In one embodiment of the present invention, the documents are overlapping. Any amount of overlap, or no overlap, can be used. In one embodiment of the present invention, the document overlap is 20 seconds. In an alternative embodiment, the document overlap is 15 seconds.
Known signals are broken into documents to provide smaller sections of the signal to group into words. The documents also aid in signal identification if a signal has been trimmed. If an unknown signal is only a 30-second segment out of a 4-minute signal, the unknown signal may not be identified if the entire known signal is scored. However, by breaking down the known signals into documents, the unknown signal is more likely to be identified.
It is also possible that the unknown signal is 30 seconds long; however, it is not the first 30 seconds of the signal or the second 30 seconds, etc. It could be, for example, the 30 seconds that occurs from 7 seconds into the signal until 37 seconds into the signal. In this case, if there were no overlap at all between the documents, the unknown signal may also be hard to identify because it does not correspond in time to one of the known documents. However, if the known documents overlapped by either 5 or 10 seconds, the unknown signal would be easily identifiable because it would correspond closely in time to a known signal document.
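The division of a signal into overlapping documents can be sketched as a sliding window over the letter sequence. The sketch assumes one letter per unit of time; the function name `split_into_documents` and the lengths used in the example are hypothetical.

```python
def split_into_documents(letters, doc_len, overlap):
    """Split a letter sequence into windows of doc_len that overlap by `overlap`.

    Returns a list of (start offset, document) pairs. The final document may
    be shorter than doc_len when the sequence does not divide evenly.
    """
    if not 0 <= overlap < doc_len:
        raise ValueError("overlap must be non-negative and smaller than doc_len")
    step = doc_len - overlap  # distance between consecutive document starts
    documents = []
    start = 0
    while True:
        documents.append((start, letters[start:start + doc_len]))
        if start + doc_len >= len(letters):
            break  # this window already reaches the end of the sequence
        start += step
    return documents

small = split_into_documents("abcdefghij", doc_len=4, overlap=2)

# A hypothetical 240-letter signal cut into 30-letter documents overlapping by 10.
long_signal = "x" * 240
long_docs = split_into_documents(long_signal, doc_len=30, overlap=10)
```

With an overlap of zero the same function produces disjoint documents, which corresponds to the no-overlap case discussed above.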
If an unknown signal is longer than the predetermined length of a known document, the unknown song is itself separated into documents. In one embodiment, the documents of the unknown signal are the same length as the documents of the known signals. Also, the overlap of the documents of the unknown song is usually the same as that of the documents of the known songs, though this is not necessary to practice the present invention.
Now referring to
The signal for each document is converted into its corresponding waveform and the spectral features of the waveform are detected as described above in reference to
The letters are grouped together to form words 715. A word is a group of one or more letters. Words are formed based on the frequency of the groups of letters. If a letter or group of letters appears within certain threshold frequency limits, then the letter or group of letters is organized to form words. The formation of words is described in further detail below with reference to
In one embodiment, the database also associates the documents and their corresponding words 720. This method is repeated for as many known signals as is desirable to enter into the index. In one embodiment of the present invention, an index can be built that contains every signal currently known. In an alternative embodiment, an index can be built that in some way classifies the signals so that it does not contain every known signal, but instead contains a subset of every known signal. The classification could be based on any criteria as long as it is a plausible design.
Once an index has been built it can be used to identify an unknown signal. The unknown signal can also be broken into documents, quantized, and grouped into words. In one embodiment of the present invention, the words in the unknown documents can be compared to the words in the known documents in order to find a match and identify an unknown piece of music by its tag information.
Now referring to
When words are formed, initially, each letter is assumed to be a word 805. Then, in one embodiment of the present invention, the system looks for pairs of frequently appearing letters 810. These are also considered words. In one embodiment, the system then looks for frequently appearing 3-grams 815 (a three letter word). In one embodiment, the system then looks for frequently appearing 4-grams 815. Finally, the system looks for frequently appearing n-grams 815, where n is any positive integer value greater than 2. In the event of overlap between a 2-gram and any n-gram, the system favors the longer word 825. The result is a series of the longest frequently appearing n-grams.
The following is a hypothetical example of word formation for an arbitrary string of letters from the alphabet.
The same process could be performed to form the word “cqr” as to form the word “age.” After the system finished forming words on this string the following words would be formed: “age” appearing 2 times, “cqr” appearing 2 times, “tl” appearing 2 times, “s” appearing 4 times, “p” appearing 3 times, “l” appearing 2 times, and “r” appearing 2 times.
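The word-formation procedure can be sketched as two steps: collect every n-gram that appears often enough in training data, then segment a letter sequence by preferring the longest known word at each position. The greedy left-to-right segmentation, the single minimum-count threshold, the function names, and the training string below are assumptions made for illustration, and the string is not the one used in the example above.

```python
from collections import Counter

def build_dictionary(training, max_n, min_count):
    """Every single letter is a word; longer n-grams qualify if frequent enough."""
    words = set(training)  # each distinct letter is a word by default
    for n in range(2, max_n + 1):
        counts = Counter(training[i:i + n] for i in range(len(training) - n + 1))
        words.update(gram for gram, count in counts.items() if count >= min_count)
    return words

def segment(letters, words, max_n):
    """Split letters into words, favoring the longest known word at each position."""
    result = []
    i = 0
    while i < len(letters):
        for n in range(min(max_n, len(letters) - i), 0, -1):
            candidate = letters[i:i + n]
            # A single letter is always accepted, so the loop always advances.
            if n == 1 or candidate in words:
                result.append(candidate)
                i += n
                break
    return result

training = "agebagecage"  # hypothetical training string
dictionary = build_dictionary(training, max_n=3, min_count=2)
formed = segment(training, dictionary, max_n=3)
```

In this sketch "ag" and "ge" are frequent enough to be words in their own right, but wherever they overlap with "age" the longer word wins.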
Now referring to
Each document of the query, or the entire query if it was not divided into documents, is quantized 910. Any technique of quantization can be used. Several techniques of quantization are described above with reference to
In one embodiment, the present invention employs a binomial log likelihood ratio analysis for weighting and scoring words in the query and known documents in the index. The log likelihood ratio is a measure of how well a null hypothesis fits the observed data. If the null hypothesis is the assumed independence of occurrence of a word in the query and a word in a known document, for example, the log likelihood ratio measures the likelihood that such independence is a valid assumption. It follows, then, that the log likelihood ratio is a useful indicator of the relationship between the occurrences of the two words, if any.
The log likelihood ratio is based on a likelihood ratio. A likelihood ratio is the ratio of the maximum likelihood of the observed data for all models where the null hypothesis holds to the maximum likelihood of the observed data for all models where the null hypothesis may or may not hold. The log likelihood ratio is the logarithm of the likelihood ratio.
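As an illustration of the definition above, the following sketch computes the standard binomial log likelihood ratio statistic for two samples, for instance a word occurring k1 times among n1 words of one document and k2 times among n2 words of another. The null hypothesis is that both samples share a single occurrence probability. The function names and the sample counts are hypothetical, and the sketch returns the conventional statistic -2 log λ, which may differ in form from the expression used in the present application.

```python
import math

def log_likelihood(k, n, p):
    """Binomial log likelihood of k successes in n trials with probability p.

    The binomial coefficient is omitted because it cancels in the ratio.
    Terms with a zero count are skipped so that 0 * log(0) is treated as 0.
    """
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total

def binomial_llr(k1, n1, k2, n2):
    """Return -2 log(lambda) for the hypothesis that both samples share one p."""
    p1, p2 = k1 / n1, k2 / n2      # maximum likelihood estimates, unrestricted
    p = (k1 + k2) / (n1 + n2)      # maximum likelihood estimate under the null
    log_lambda = (log_likelihood(k1, n1, p) + log_likelihood(k2, n2, p)
                  - log_likelihood(k1, n1, p1) - log_likelihood(k2, n2, p2))
    return -2.0 * log_lambda

same_rate = binomial_llr(5, 10, 50, 100)       # identical proportions
different_rate = binomial_llr(9, 10, 10, 100)  # very different proportions
```

A value near zero indicates that the data are consistent with the null hypothesis, while larger values indicate that the word occurs at a genuinely different rate in the two samples.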
For the present invention, the log likelihood ratio is employed to determine whether a given word is more likely to appear in a first known document than in a second known document. Based on this measure, subsets of documents are defined so as to identify those documents most likely to contain the query words.
In one embodiment, the log likelihood ratio is applied in the present invention to determine whether a particular track occurs more frequently than expected in the selections of a subset of users. Variables are defined as follows:
The log likelihood ratio is then given as:
Referring now to
A total number of documents N is determined 1002. The system determines a document frequency (the number of times document occurred, or DF_j = Σ(k_ij > 0)) 1004. The results are weighted according to a product of up to three components: l=the number of times a word appears in the query document(s); g=the number of times a word appears in the known documents; and n=a normalizing factor based on how many words there are, in total 1005.
The first weighting factor, l, is a local weighting factor. It represents the frequency of the word within the query document. It may be represented and defined according to the following alternatives:
The second weighting factor, g, represents the frequency of the words within all the known documents. It may be represented and defined according to the following alternatives:
The third weighting factor, n, represents a normalizing factor, which serves to reduce the bias for scoring long documents higher than short ones. Using a normalizing factor, a short relevant document should score at least as well as a longer document with general relevance. n may be represented and defined according to the following alternatives:
By employing the above-described combination of three weighting factors in generating weights for words and scores for documents, the present invention avoids the problems of overstating frequently appearing words and overstating coincidental co-occurrence. If a word is frequently occurring, the second weighting factor will tend to diminish its overpowering effect. In addition, the effect of coincidental co-occurrence is lessened by the normalization factor.
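The product of a local factor, a global factor, and a normalizing factor can be sketched as follows. The specific alternatives for each factor are not reproduced in this text, so the choices below are assumptions: a logarithmic local weight log(1 + count), an inverse document frequency global weight log(N / DF), and cosine normalization to unit length. The function name `weight_documents` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def weight_documents(documents):
    """Return one {word: weight} dict per document, using assumed l * g * n factors."""
    n_docs = len(documents)
    counts = [Counter(doc) for doc in documents]
    # Document frequency: number of documents containing the word at least once.
    df = Counter(word for count in counts for word in count)
    weighted = []
    for count in counts:
        # Local factor (assumed): log(1 + k). Global factor (assumed): log(N / DF).
        raw = {word: math.log(1 + k) * math.log(n_docs / df[word])
               for word, k in count.items()}
        # Normalizing factor (assumed): divide by the Euclidean length of the vector.
        norm = math.sqrt(sum(value * value for value in raw.values()))
        weighted.append({word: (value / norm if norm else 0.0)
                         for word, value in raw.items()})
    return weighted

# Hypothetical documents made of already-formed words.
docs = [["age", "age", "s"], ["s", "tl"], ["s", "p", "p"]]
weights = weight_documents(docs)
```

In this example the word "s" occurs in every document, so its global factor is zero and it contributes nothing to any score, which is the damping of frequently appearing words described above.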
In one embodiment, the system of the present invention generates scores as follows. For each track of interest, a large N-dimensional vector is determined. For each known document, a large N-dimensional vector is determined. The techniques of assigning meaning to such vectors and training the vector set to represent similarities among vectors are well known in the art, as described for example in Caid et al., U.S. Pat. No. 5,619,709, for “System and method of context vector generation and retrieval.”
A score can be generated for one or more documents, and the highest-scoring document is treated as if it is the same as the query document. A score for a document is determined by taking the dot product of the query vector and the document vector. In one embodiment of the present invention, the above-described weighting factors are applied to the vector terms in order to improve the results of the scoring process.
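The dot-product scoring step can be sketched with sparse vectors stored as dictionaries from word to weight. The function names and the sample weights are hypothetical; how the weights themselves are obtained is outside the scope of this sketch.

```python
def dot(query_vector, document_vector):
    """Dot product of two sparse vectors stored as {word: weight} dicts."""
    # Iterate over the shorter vector; words missing from the other contribute 0.
    if len(query_vector) > len(document_vector):
        query_vector, document_vector = document_vector, query_vector
    return sum(weight * document_vector.get(word, 0.0)
               for word, weight in query_vector.items())

def best_document(query_vector, document_vectors):
    """Return the id of the known document with the highest score."""
    return max(document_vectors,
               key=lambda doc_id: dot(query_vector, document_vectors[doc_id]))

# Hypothetical weighted vectors.
query = {"age": 0.8, "tl": 0.6}
known_documents = {
    "d1": {"age": 1.0},
    "d2": {"tl": 0.5, "p": 0.5},
}
winner = best_document(query, known_documents)
```

Only words present in both vectors contribute to the score, so documents sharing no word with the query score zero.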
The words are weighted using the general formula l_ij g_j n_i, where l, g, n are defined above 1005. The known documents are scored using the general formula l_ij g_j n_i, where l, g, n are defined above 1005. The specific options can be chosen based on the desired results. In one embodiment of the present invention, it is desired to preserve diversity, to give rarity a bonus, and to normalize. In that embodiment, the weighting options L, I, and C may be chosen. Weighting option L acts to preserve diversity, option I acts to give rarity a bonus, and C acts to normalize the results. Thus, in that embodiment the weighting would be equal to l_L g_I n_C. In other embodiments, different weighting options can be used. In one embodiment, the present invention uses the same weighting for the query words as it does for scoring the known documents. In an alternative embodiment, different weighting options can be chosen for the query words and for scoring the known documents.
Once the query has been weighted and the known documents have been scored and the dot product of two vectors determined, the document or documents with the highest score are determined. The highest scoring documents are then looked up in the database to determine the corresponding signal. The corresponding signal is considered to be the query signal. Thus, the present invention has identified the known signal using the information retrieval engine. The information for the unknown signal is the same as the information for the corresponding signal in the index. In one embodiment, the present invention associates the information with the unknown signal. In one embodiment, the present invention outputs the information to a user of the information engine 1006.
From the above description, it will be apparent that the invention disclosed herein provides a novel and advantageous system and method for information retrieval. The foregoing discussion discloses and describes merely exemplary methods and embodiments of the present invention. As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. For example, the invention may be applied to other domains and environments, and may be employed in connection with additional applications where information retrieval is desirable. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.
|U.S. Classification||707/737, 84/645, 707/E17.101, 707/741, 707/750, 707/999.102, 707/916|
|International Classification||G06F7/00, G06F17/00|
|Cooperative Classification||Y10S707/916, Y10S707/99943, G06F17/30758, G06F17/30663, G06F17/30743|
|European Classification||G06F17/30T2P2E, G06F17/30U3E, G06F17/30U1|
|Date||Code||Event||Description|
|Jun 11, 2002||AS||Assignment||Owner name: MUSICMATCH, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DUNNING, TED E.;REEL/FRAME:013006/0198; Effective date: 20020606|
|Jul 26, 2006||AS||Assignment||Owner name: YAHOO! INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MUSICMATCH, INC.;REEL/FRAME:018005/0153; Effective date: 20060612|
|Jun 9, 2010||FPAY||Fee payment||Year of fee payment: 4|
|Jun 11, 2014||FPAY||Fee payment||Year of fee payment: 8|
|Apr 18, 2016||AS||Assignment||Owner name: EXCALIBUR IP, LLC, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAHOO! INC.;REEL/FRAME:038383/0466; Effective date: 20160418|
|Jun 1, 2016||AS||Assignment||Owner name: YAHOO! INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EXCALIBUR IP, LLC;REEL/FRAME:038951/0295; Effective date: 20160531|
|Jun 3, 2016||AS||Assignment||Owner name: EXCALIBUR IP, LLC, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAHOO! INC.;REEL/FRAME:038950/0592; Effective date: 20160531|