A method for recognizing an audio sample locates an audio file that most closely matches the audio sample from a database indexing a large set of original recordings. Each indexed audio file is represented in the database index by a set of landmark timepoints and associated fingerprints. Landmarks occur at reproducible locations within the file, while fingerprints represent features of the signal at or near the landmark timepoints. To perform recognition, landmarks and fingerprints are computed for the unknown sample and used to retrieve matching fingerprints from the database. For each file containing matching fingerprints, the landmarks are compared with landmarks of the sample at which the same fingerprints were computed. If a large number of corresponding landmarks are linearly related, i.e., if equivalent fingerprints of the sample and retrieved file have the same time evolution, then the file is identified with the sample. The method can be used for any type of sound or music,...
Claims
1. A method for recognizing a media entity from a media sample, comprising:
- computing a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;
- obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;
- generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and
- identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.
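The generating and identifying steps of claim 1 can be illustrated with a minimal sketch. It assumes landmarks are integer time indices, fingerprints are hashable values, and the linear relation has unit slope (no time stretching), so that matching pairs share a constant landmark difference; this is the difference-histogram approach named in claims 29 to 32. The function name `match_score` and the data layout are hypothetical, not taken from the patent.

```python
from collections import Counter, defaultdict

def match_score(sample, file_entries):
    """Return (count, offset) for the most common landmark difference.

    sample, file_entries: iterables of (landmark, fingerprint) pairs.
    Assumes unit slope, i.e. file_landmark = sample_landmark + offset.
    """
    # Group file landmarks by fingerprint so equal fingerprints are found quickly.
    by_fp = defaultdict(list)
    for landmark, fp in file_entries:
        by_fp[fp].append(landmark)

    # Each correspondence (equal fingerprints) votes for one landmark difference.
    offsets = Counter()
    for s_landmark, fp in sample:
        for f_landmark in by_fp.get(fp, ()):
            offsets[f_landmark - s_landmark] += 1

    if not offsets:
        return 0, None
    offset, count = offsets.most_common(1)[0]
    return count, offset

sample = [(0, "a"), (1, "b"), (2, "c")]
file_entries = [(10, "a"), (11, "b"), (12, "c"), (30, "a")]
print(match_score(sample, file_entries))
```

A caller would compare the returned count against a threshold (claim 35) to decide whether the file is identified; the stray correspondence at landmark 30 votes for a different offset and does not affect the peak.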
2. The method of claim 1 wherein said sample landmarks are computed in dependence on said media sample.
3. The method of claim 1 wherein said each sample fingerprint represents one or more features of said media sample at or near said particular sample landmark.
4. The method of claim 1 wherein said sample fingerprints and said file fingerprints have numerical values.
5. The method of claim 1 wherein values of said sample fingerprints specify a method for computing said sample fingerprints.
6. The method of claim 1 wherein said media sample is an audio sample.
7. The method of claim 6 wherein said sample landmarks are timepoints within said audio sample.
8. The method of claim 7 wherein said timepoints occur at local maxima of spectral Lp norms of said audio sample.
9. The method of claim 6 wherein said sample fingerprints are computed from a frequency analysis of said audio sample.
10. The method of claim 6 wherein said sample fingerprints are selected from the group consisting of spectral slice fingerprints, LPC coefficients, and cepstral coefficients.
11. The method of claim 6 wherein said sample fingerprints are computed from a spectrogram of said audio sample.
12. The method of claim 11 wherein salient points of said spectrogram comprise time coordinates and frequency coordinates, and wherein said sample landmarks are computed from said time coordinates, and said sample fingerprints are computed from said frequency coordinates.
13. The method of claim 12, further comprising linking at least one of said salient points to an anchor salient point, wherein one of said sample landmarks is computed from a time coordinate of said anchor salient point, and a corresponding fingerprint is computed from frequency coordinates of at least one of said linked salient points and said anchor point.
14. The method of claim 13, wherein said linked salient points fall within a target zone.
15. The method of claim 14, wherein said target zone is defined by a time range.
16. The method of claim 14, wherein said target zone is defined by a frequency range.
17. The method of claim 14, wherein said target zone is variable.
18. The method of claim 13 wherein said corresponding fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
19. The method of claim 13 wherein said corresponding fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.
20. The method of claim 19, wherein said corresponding fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
21. The method of claim 6 wherein said sample landmarks and said sample fingerprints are computed from salient points of a multidimensional function of said audio sample, wherein at least one of said dimensions is a time dimension and at least one of said dimensions is a non-time dimension.
22. The method of claim 21 wherein said sample landmarks are computed from said time dimensions.
23. The method of claim 21 wherein said sample fingerprints are computed from at least one of said non-time dimensions.
24. The method of claim 21 wherein said salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said multidimensional function.
25. The method of claim 6 wherein said sample fingerprints are time-stretch invariant.
26. The method of claim 6 wherein each sample fingerprint is computed from multiple timeslices of said audio sample.
27. The method of claim 26 wherein said multiple timeslices are offset by a variable amount of time.
28. The method of claim 27 wherein each fingerprint is computed in part from said variable amount.
29. The method of claim 1 wherein said identifying step comprises locating a diagonal line within a scatter plot of said corresponding landmarks.
30. The method of claim 29 wherein locating said diagonal line comprises forming differences between said corresponding landmarks.
31. The method of claim 30 wherein locating said diagonal line further comprises sorting said differences.
32. The method of claim 30 wherein locating said diagonal line further comprises calculating the peak of a histogram of said differences.
33. The method of claim 1 wherein said identifying step comprises computing one of a Hough transform and a Radon transform of said correspondences.
34. The method of claim 33 wherein said identifying step further comprises locating a peak of said Hough transform.
35. The method of claim 1 wherein said identifying step comprises determining whether a number of said correspondences exceeds a threshold value.
36. The method of claim 1 further comprising:
- obtaining from a database index additional fingerprints characterizing file locations of additional media entities to be identified;
- generating additional correspondences between said sample landmarks and file landmarks of said additional media entities, wherein corresponding landmarks have equivalent fingerprints; and
- identifying media entities for which a plurality of said corresponding landmarks are substantially linearly related.
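As a rough sketch of claims 36 and 37, the database index can be modeled as an inverted map from fingerprint to (file, landmark) pairs, with one difference histogram kept per file and the file with the tallest peak chosen as the winner. The names `build_index` and `best_match`, the dictionary layout, and the `threshold` parameter are assumptions made for illustration; unit slope between sample and file landmarks is also assumed.

```python
from collections import Counter, defaultdict

def build_index(files):
    """files: {file_id: [(landmark, fingerprint), ...]}
    Returns an inverted index {fingerprint: [(file_id, landmark), ...]}."""
    index = defaultdict(list)
    for file_id, entries in files.items():
        for landmark, fp in entries:
            index[fp].append((file_id, landmark))
    return index

def best_match(index, sample, threshold=2):
    """Return the file_id with the most linearly related correspondences,
    or None if no file reaches the (hypothetical) threshold."""
    votes = defaultdict(Counter)  # file_id -> histogram of landmark differences
    for s_landmark, fp in sample:
        for file_id, f_landmark in index.get(fp, ()):
            votes[file_id][f_landmark - s_landmark] += 1

    # Score each file by the height of its histogram peak.
    scores = {fid: max(hist.values()) for fid, hist in votes.items()}
    scores = {fid: s for fid, s in scores.items() if s >= threshold}
    if not scores:
        return None
    return max(scores, key=scores.get)

files = {
    "A": [(10, 1), (11, 2), (12, 3)],
    "B": [(5, 1), (9, 2), (20, 7)],
}
index = build_index(files)
print(best_match(index, [(0, 1), (1, 2), (2, 3)]))
```

File "B" shares two fingerprints with the sample, but their landmark differences disagree (5 and 8), so only "A" shows a consistent offset.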
37. The method of claim 36 further comprising selecting a winning media entity from said identified media entities, wherein said winning media entity has a largest plurality of substantially linearly related corresponding landmarks.
38. The method of claim 36 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching a first subset of said additional media entities.
39. The method of claim 38 wherein additional media entities in said first subset have a higher probability of being identified than additional media entities that are not in said first subset.
40. The method of claim 39 wherein said probability of being identified is computed in dependence on a recency of previous identification.
41. The method of claim 39 wherein said probability of being identified is computed in dependence on a frequency of previous identification.
42. The method of claim 38 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching a second subset of said additional media entities, wherein no media entities in said first subset are identified.
43. The method of claim 36, further comprising ranking said additional media entities according to a probability of being identified.
44. The method of claim 43 wherein said probability is computed in part in dependence on a recency of previous identification.
45. The method of claim 44 wherein said probability is computed in part by increasing a recency score of a particular media entity when said particular media entity is identified.
46. The method of claim 44 wherein said probability is computed in part by decreasing recency scores of said additional media entities at regular time intervals.
47. The method of claim 46 wherein said recency scores are decreased exponentially in time.
48. The method of claim 43 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching said additional media entities according to said ranking.
49. The method of claim 36 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises terminating said search at a media entity having a number of said substantially linearly related corresponding landmarks that exceeds a predetermined threshold.
50. The method of claim 1 wherein said method is implemented in a distributed system.
51. The method of claim 50 wherein said computing step is performed in a client device, said obtaining, generating, and identifying steps are performed in a central location, and the method further comprises transmitting said sample fingerprints from said client device to said central location.
52. The method of claim 1, further comprising repeating said computing, obtaining, generating, and identifying steps for sequentially growing size of said media sample.
53. The method of claim 1, further comprising performing said obtaining, generating, and identifying steps at periodic intervals on a rolling buffer storing said computed sample fingerprints.
54. The method of claim 1, further comprising obtaining said media sample and simultaneously performing said computing step.
55. A method for recognizing a media entity from a media sample, comprising:
- receiving a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;
- obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;
- generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and
- identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.
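The test for "substantially linearly related" landmarks can be illustrated with a minimal sketch. It assumes the simplest case, a slope of one, so that corresponding landmarks of a true match share a nearly constant time offset and a histogram of offsets shows a peak. The function name `count_aligned`, the dictionary layout, and the bin width are illustrative assumptions, not elements of the claims.

```python
from collections import Counter

def count_aligned(sample, file, bin_width=0.05):
    """Count corresponding landmarks that share one time offset.

    sample, file: dicts mapping fingerprint -> list of landmark times (s).
    Assumes slope 1, i.e. file_time = sample_time + offset.
    Returns (votes, offset_seconds) for the most populated offset bin.
    """
    offsets = Counter()
    for fp, sample_times in sample.items():
        # Only equivalent fingerprints generate correspondences.
        for t_file in file.get(fp, []):
            for t_sample in sample_times:
                offsets[round((t_file - t_sample) / bin_width)] += 1
    if not offsets:
        return 0, None
    best_bin, votes = offsets.most_common(1)[0]
    return votes, best_bin * bin_width
```

A file would then be identified when `votes` exceeds a chosen threshold; the threshold value is left to the implementer.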
56. A method for recognizing a media sample, comprising: - continually sampling into a sound buffer N seconds of said media sample;
- computing a set of sample fingerprints characterizing a segment of said media sample stored in said sound buffer, wherein said segment has one or more distinct landmarks occurring at reproducible locations of said media sample;
- storing said fingerprints in a rolling buffer;
- obtaining a set of matching fingerprints in a database index, each matching fingerprint characterizing at least one distinct landmark of a media file and being equivalent to at least one fingerprint in said rolling buffer;
- identifying at least one media file having a plurality of matching fingerprints;
- reporting presence of said at least one media file; and
- removing at least one sample fingerprint from said rolling buffer.
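The rolling-buffer steps above can be sketched as follows. This is only an illustration under stated assumptions: fingerprints are plain integers, the database index is a dictionary from fingerprint to file identifiers, and `process_segment`, `min_matches`, and `max_len` are hypothetical names and parameters. Audio capture and fingerprint computation are outside the sketch.

```python
from collections import Counter, deque

def process_segment(rolling, segment_fps, index, min_matches=2, max_len=1000):
    """One pass over a newly fingerprinted segment.

    rolling: deque of sample fingerprints (the rolling buffer).
    segment_fps: fingerprints computed from the current sound-buffer segment.
    index: dict mapping fingerprint -> iterable of file identifiers.
    Returns the sorted identifiers of files with enough matching fingerprints.
    """
    rolling.extend(segment_fps)              # store in the rolling buffer
    hits = Counter()
    for fp in rolling:                       # obtain matching fingerprints
        for file_id in index.get(fp, ()):
            hits[file_id] += 1
    present = sorted(f for f, n in hits.items() if n >= min_matches)
    while len(rolling) > max_len:            # remove the oldest fingerprints
        rolling.popleft()
    return present
```

Calling the function once per segment mirrors the repetition recited for additional segments of the media sample.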
57. The method of claim 56, further comprising repeating said method for additional segments of said media sample.
58. The method of claim 56 wherein said computing, storing, and removing steps are performed in a client device and said locating and identifying steps are performed in a central location, and wherein the method further comprises transmitting said sample fingerprints from said client device to said central location.
59. The method of claim 56 wherein said computing step is performed in a client device and said storing, locating, identifying, and removing steps are performed in a central location, and wherein the method further comprises transmitting said fingerprints from said client device to said central location.
60. A computer system programmed to perform the method steps of claim 1.
61. The method of claim 56, wherein said reproducible locations and said sample fingerprints are computed simultaneously.
62. A program storage device accessible by a computer, tangibly embodying a program of instructions executable by said computer to perform method steps for recognizing a media entity from a media sample, said program of instructions comprising: - code for computing a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;
- code for obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;
- code for generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and
- code for identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.
63. A system for recognizing a media entity from a media sample, comprising: - a landmarking and fingerprinting object for computing a set of particular sample landmarks within said media sample and a set of sample fingerprints, each sample fingerprint characterizing one of said particular sample landmarks;
- a database index containing file landmarks and corresponding file fingerprints for at least one media entity to be identified; and
- an analysis object for:
- locating a set of matching fingerprints in said database index, wherein said matching fingerprints are equivalent to said sample fingerprints;
- generating correspondences between said sample landmarks and said file landmarks, wherein corresponding landmarks have equivalent fingerprints; and
- identifying at least one media entity for which a plurality of said corresponding landmarks are substantially linearly related.
64. A computer-implemented method for recognizing an audio sample, comprising: - creating a database index of at least one audio file in a database, comprising:
- computing landmarks and fingerprints for each audio file, wherein each landmark occurs at a particular location within said audio file and is associated with a fingerprint;
- associating, for each audio file, said landmarks and fingerprints with an identifier; and
- storing said fingerprints, said landmarks, and said identifier in a memory.
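As an illustration of the index-creation steps, the sketch below stores (fingerprint, landmark, identifier) triples in memory and sorts them by fingerprint value so that equal fingerprints can be found by binary search. The names `build_index` and `lookup` and the tuple layout are assumptions made for this example; landmark and fingerprint computation is taken as given.

```python
from bisect import bisect_left, bisect_right

def build_index(files):
    """files: dict mapping identifier -> list of (landmark, fingerprint).

    Returns a list of (fingerprint, landmark, identifier) triples
    sorted by fingerprint value.
    """
    entries = [(fp, lm, ident)
               for ident, pairs in files.items()
               for lm, fp in pairs]
    entries.sort()
    return entries

def lookup(index, fp):
    """Return all (landmark, identifier) entries stored under fingerprint fp."""
    keys = [entry[0] for entry in index]  # rebuilt per call for brevity only
    lo, hi = bisect_left(keys, fp), bisect_right(keys, fp)
    return [(lm, ident) for _, lm, ident in index[lo:hi]]
```

A production index would keep the key list (or a hash table) alongside the entries rather than rebuilding it on every lookup.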
65. The method of claim 64, further comprising sorting said database index by fingerprint value.
66. The method of claim 64 wherein said particular locations of each audio file are computed in dependence on said audio file.
67. The method of claim 64 wherein each fingerprint represents at least one feature of said audio file near said particular location.
68. The method of claim 64 wherein said fingerprints are numerical values.
69. The method of claim 64 wherein values of said fingerprints specify a method for computing said fingerprints.
70. The method of claim 64 wherein said particular locations are timepoints within said audio file.
71. The method of claim 70 wherein said timepoints occur at local maxima of spectral Lp norms of said audio file.
72. The method of claim 64 wherein said fingerprints are computed from a frequency analysis of said audio file.
73. The method of claim 64 wherein said fingerprints are selected from the group consisting of spectral slice fingerprints, LPC coefficients, and cepstral coefficients.
74. The method of claim 64 wherein said fingerprints are computed from a spectrogram of said audio file.
75. The method of claim 74 wherein salient points of said spectrogram comprise time coordinates and frequency coordinates, and wherein said particular locations are computed from said time coordinates, and said fingerprints are computed from said frequency coordinates.
76. The method of claim 75, further comprising linking at least one of said salient points to an anchor salient point, wherein one of said particular locations is computed from a time coordinate of said anchor salient point, and a corresponding fingerprint is computed from frequency coordinates of at least one of said linked salient points and said anchor point.
77. The method of claim 76, wherein said linked salient points fall within a target zone.
78. The method of claim 77, wherein said target zone is defined by a time range.
79. The method of claim 77, wherein said target zone is defined by a frequency range.
80. The method of claim 77, wherein said target zone is variable.
81. The method of claim 76, wherein said corresponding fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
82. The method of claim 76, wherein said corresponding fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.
83. The method of claim 82, wherein said corresponding fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
84. The method of claim 64 wherein said particular locations and said fingerprints are computed from salient points of a multidimensional function of said audio file, wherein at least one of said dimensions is a time dimension and at least one of said dimensions is a non-time dimension.
85. The method of claim 84 wherein said particular locations are computed from said time dimensions.
86. The method of claim 84 wherein said fingerprints are computed from at least one of said non-time dimensions.
87. The method of claim 84 wherein said salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said multidimensional function.
88. The method of claim 64 wherein said fingerprints are time-stretch invariant.
89. The method of claim 64 wherein each fingerprint is computed from multiple timeslices of said audio file.
90. The method of claim 89 wherein said multiple timeslices are offset by a variable amount of time.
91. The method of claim 90 wherein said fingerprints are computed in part from said variable amounts.
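The time-stretch invariance of the quotient and product fingerprints can be checked numerically. The sketch below assumes a linear stretch by a factor s in which all time coordinates are divided by s and all frequency coordinates are multiplied by s (as when a recording is played back faster); that model, and the names `fingerprint` and `stretch`, are assumptions made for illustration.

```python
def fingerprint(anchor, linked):
    """Compute two stretch-invariant quantities from a pair of salient points.

    anchor, linked: (time_s, freq_hz) tuples.
    Returns (frequency quotient, time difference multiplied by frequency).
    """
    (t1, f1), (t2, f2) = anchor, linked
    return (f2 / f1, (t2 - t1) * f1)

def stretch(point, s):
    """Apply a linear stretch: times shrink by s, frequencies grow by s."""
    t, f = point
    return (t / s, f * s)

anchor, linked = (1.0, 440.0), (1.5, 660.0)
original = fingerprint(anchor, linked)
stretched = fingerprint(stretch(anchor, 1.25), stretch(linked, 1.25))
# The two results agree up to floating-point rounding.
print(original, stretched)
```

In the quotient the factor s cancels between numerator and denominator, and in the product the 1/s from the time difference cancels the s from the frequency. A practical fingerprint would quantize these values before using them as index keys.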
92. A method for recognizing a media entity from a media sample, comprising: - generating correspondences between landmarks of said media sample and corresponding landmarks of a media entity to be identified, wherein said landmarks of said media sample and said corresponding landmarks of said media entity have equivalent fingerprints; and
- identifying said media sample and said media entity if a plurality of said correspondences have a linear relationship defined by
- $\text{landmark}^*_n = m \cdot \text{landmark}_n + \text{offset}$,
- where
- $\text{landmark}_n$ is a sample landmark,
- $\text{landmark}^*_n$ is a file landmark that corresponds to $\text{landmark}_n$, and
- $m$ represents slope.
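When the slope m is not known in advance, one way to test for the linear relationship is to try the line through each pair of correspondences and count how many other correspondences lie near it. The sketch below is a brute-force illustration of that idea, not a technique taken from the claims; the function name `best_line` and the tolerance `tol` are assumptions.

```python
from itertools import combinations

def best_line(pairs, tol=0.05):
    """Find the line file = m * sample + offset supported by most pairs.

    pairs: list of (sample_landmark, file_landmark) correspondences.
    Returns (inlier_count, m, offset); (0, None, None) if no line is found.
    """
    best = (0, None, None)
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        if x1 == x2:
            continue  # a vertical line has no finite slope
        m = (y2 - y1) / (x2 - x1)
        offset = y1 - m * x1
        # Count correspondences within tol of the candidate line.
        inliers = sum(abs(y - (m * x + offset)) <= tol for x, y in pairs)
        if inliers > best[0]:
            best = (inliers, m, offset)
    return best
```

The exhaustive search is cubic in the number of correspondences, so it only suits small sets; a histogram over offsets or a randomized search would scale better.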
93. A method for recognizing a media sample, comprising: - identifying media files that have file landmarks that are substantially linearly related to sample landmarks of said media sample; wherein
- said file landmarks and said sample landmarks have equivalent fingerprints; and wherein
- said file landmarks and said sample landmarks have a linear correspondence defined by
- $\text{landmark}^*_n = m \cdot \text{landmark}_n + \text{offset}$,
- where
- $\text{landmark}_n$ is a sample landmark,
- $\text{landmark}^*_n$ is a file landmark that corresponds to $\text{landmark}_n$, and
- $m$ represents slope.
94. A method for comparing an audio sample and an audio entity, comprising: - for each of at least one audio entity to be identified, computing a plurality of entity fingerprints representing said audio entity, wherein each entity fingerprint characterizes one or more features of said audio entity at or near an entity landmark in at least one dimension including time;
- computing a plurality of sample fingerprints representing said audio sample, wherein said sample fingerprints are invariant to time stretching of said audio sample; and
- identifying a matching audio entity that has at least a threshold number of said file fingerprints that are equivalent to said sample fingerprints.
95. The method of claim 94 wherein said sample fingerprints comprise quotients of frequency components of said audio sample.
96. The method of claim 94 wherein said sample fingerprints comprise products of frequency components of said audio sample and time differences between points in said audio sample.
97. A method of characterizing an audio sample, comprising computing at least one fingerprint from a spectrogram of said audio sample, wherein said spectrogram comprises an anchor salient point and linked salient points, and wherein said fingerprint is computed from frequency coordinates of said anchor salient point and at least one linked salient point.
98. The method of claim 97, wherein said linked salient points fall within a target zone.
99. The method of claim 98, wherein said target zone is defined by a time range.
100. The method of claim 98, wherein said target zone is defined by a frequency range.
101. The method of claim 98, wherein said target zone is variable.
102. The method of claim 97 wherein said fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said fingerprint is time-stretch invariant.
103. The method of claim 97 wherein said fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.
104. The method of claim 103, wherein said fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said fingerprint is time-stretch invariant.
105. The method of claim 97 wherein said anchor salient point and said linked salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said spectrogram.
106. A method for comparing an audio sample and an audio entity, comprising: - for each of at least one audio entity to be identified, computing a plurality of entity landmark/fingerprint pairs representing said audio entity, wherein each landmark occurs at a particular location within said audio entity in at least one dimension including time, and wherein each fingerprint characterizes one or more features of said audio entity at or near said particular location;
- computing a plurality of sample landmark/fingerprint pairs representing said audio sample by
- obtaining time and frequency coordinates of at least one salient point of a spectrogram of said audio sample, wherein each salient point serves as an anchor point defining a sample landmark; and
- generating at least one multidimensional sample landmark/fingerprint pair from said at least one salient point, wherein sample landmarks of said audio sample are taken to be time coordinates and wherein corresponding sample fingerprints are computed from at least one of the remaining coordinates; and
- identifying a winning audio entity that has at least a threshold number of said file fingerprints that are equivalent to said sample fingerprints.
107. The method of claim 106, wherein said linked salient points fall within a target zone.
108. The method of claim 107, wherein said target zone is defined by a time range.
109. The method of claim 107, wherein said target zone is defined by a frequency range.
110. The method of claim 107, wherein said target zone is variable.
111. The method of claim 106 wherein said sample fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said sample fingerprint is time-stretch invariant.
112. The method of claim 106 wherein said sample fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.
113. The method of claim 112, wherein said sample fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said sample fingerprint is time-stretch invariant.
114. The method of claim 106 wherein said anchor salient point and said linked salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said spectrogram.
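The linking of salient points to an anchor point within a target zone can be sketched as follows. The example assumes the salient points have already been extracted from a spectrogram as (time, frequency) pairs, and it defines the target zone by both a time range and a frequency range. The function name, the default zone limits, and the fingerprint layout (anchor frequency, linked frequency, rounded time difference) are illustrative assumptions only.

```python
def landmark_fingerprint_pairs(peaks, dt_range=(0.1, 2.0), df_max=500.0):
    """Pair each anchor point with later points inside its target zone.

    peaks: list of (time_s, freq_hz) salient points.
    dt_range: allowed time difference (min, max) after the anchor, in seconds.
    df_max: allowed absolute frequency difference from the anchor, in Hz.
    Returns a list of (landmark, fingerprint) with landmark = anchor time.
    """
    peaks = sorted(peaks)
    out = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > dt_range[1]:
                break  # peaks are time-sorted, so later ones are also too far
            if dt >= dt_range[0] and abs(f2 - f1) <= df_max:
                out.append((t1, (f1, f2, round(dt, 3))))
    return out
```

Each returned landmark is the anchor's time coordinate and each fingerprint is built from the remaining coordinates; replacing the tuple with frequency quotients or time-frequency products would give the time-stretch-invariant variants.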