A method for recognizing an audio sample locates an audio file that most closely matches the audio sample from a database indexing a large set of original recordings. Each indexed audio file is represented in the database index by a set of landmark timepoints and associated fingerprints. Landmarks occur at reproducible locations within the file, while fingerprints represent features of the signal at or near the landmark timepoints. To perform recognition, landmarks and fingerprints are computed for the unknown sample and used to retrieve matching fingerprints from the database. For each file containing matching fingerprints, the landmarks are compared with landmarks of the sample at which the same fingerprints were computed. If a large number of corresponding landmarks are linearly related, i.e., if equivalent fingerprints of the sample and retrieved file have the same time evolution, then the file is identified with the sample. The method can be used for any type of sound or music,...

Inventors: Avery Li-Chun Wang, Julius O. Smith, III
Original Assignee: Landmark Digital Services LLC
Primary Examiner: Susan McFadden
Attorney: Fulbright & Jaworski LLP
Current U.S. Classification: 704/270; 84/609; 704/E15.045; 704/E17.002; 707/E17.101; 709/231


Citations

Cited Patent | Filing date | Issue date | Original Assignee | Title
US4415767 | Oct 19, 1981 | Nov 15, 1983 | Votan | Method and apparatus for speech recognition and reproduction
US4450531 | Sep 10, 1982 | May 22, 1984 | Ensco, Inc. | Broadcast signal recognition system and method
US4843562 | Jun 24, 1987 | Jun 27, 1989 | Broadcast Data Systems Limited Partnership | Broadcast information classification system and method
US4852181 | Sep 22, 1986 | Jul 25, 1989 | Oki Electric Industry Co., Ltd. | Speech recognition for recognizing the catagory of an input speech pattern
US5210820 | May 2, 1990 | May 11, 1993 | Broadcast Data Systems Limited Partnership | Signal recognition system and method
US5276629 | Aug 14, 1992 | Jan 4, 1994 | Reynolds Software, Inc. | Method and apparatus for wave analysis and event recognition
US5400261 | Sep 7, 1993 | Mar 21, 1995 | Reynolds Software, Inc. | Method and apparatus for wave analysis and event recognition
US5918223 | Jul 21, 1997 | Jun 29, 1999 | Muscle Fish | Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information
US6434520 | Apr 16, 1999 | Aug 13, 2002 | International Business Machines Corporation | System and method for indexing and querying audio archives
US6453252 | May 15, 2000 | Sep 17, 2002 | Creative Technology Ltd. | Process for identifying audio content
US6570080 | May 18, 2000 | May 27, 2003 | Yamaha Corporation | Method and system for supplying contents via communication network
US6748360 | Feb 21, 2002 | Jun 8, 2004 | International Business Machines Corporation | System for selling a product utilizing audio content identification
US6834308 | Feb 17, 2000 | Dec 21, 2004 | Audible Magic Corporation | Method and apparatus for identifying media content presented on a media playing device

Referenced by

Citing Patent | Filing date | Issue date | Original Assignee | Title
US7072493 | Nov 8, 2004 | Jul 4, 2006 | Microsoft Corporation | Robust and stealthy video watermarking into regions of successive frames
US7095873 | Jun 28, 2002 | Aug 22, 2006 | Microsoft Corporation | Watermarking via quantization of statistics of overlapping regions
US7130892 | Sep 27, 2001 | Oct 31, 2006 | International Business Machines Corporation | Method and system for music distribution
US7136535 | Dec 19, 2005 | Nov 14, 2006 | Microsoft Corporation | Content recognizer via probabilistic mirror distribution
US7152163 | Nov 4, 2004 | Dec 19, 2006 | Microsoft Corporation | Content-recognition facilitator
US7155386 | Mar 11, 2004 | Dec 26, 2006 | Mindspeed Technologies, Inc. | Adaptive correlation window for open-loop pitch
US7171561 | Oct 17, 2002 | Jan 30, 2007 | The United States of America as represented by the Secretary of the Air Force | Method and apparatus for detecting and extracting fileprints
US7181622 | Nov 3, 2005 | Feb 20, 2007 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7188065 | Nov 4, 2004 | Mar 6, 2007 | Microsoft Corporation | Categorizer of content in digital signals
US7188249 | Nov 3, 2005 | Mar 6, 2007 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7240210 | Nov 4, 2004 | Jul 3, 2007 | Microsoft Corporation | Hash value computer of content of digital signals
US7248715 | Sep 20, 2001 | Jul 24, 2007 | Digimarc Corporation | Digitally watermarking physical media
US7248717 | Jul 27, 2005 | Jul 24, 2007 | Digimarc Corporation | Securing media content with steganographic encoding
US7266244 | Jul 16, 2004 | Sep 4, 2007 | Microsoft Corporation | Robust recognizer of perceptually similar content
US7289643 | Dec 19, 2001 | Oct 30, 2007 | Digimarc Corporation | Method, apparatus and programs for generating and utilizing content signatures
US7302574 | Jun 21, 2001 | Nov 27, 2007 | Digimarc Corporation | Content identifiers triggering corresponding responses through collaborative processing
US7318157 | Nov 12, 2004 | Jan 8, 2008 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7318158 | Nov 3, 2005 | Jan 8, 2008 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7328153 | Jul 22, 2002 | Feb 5, 2008 | Gracenote, Inc. | Automatic identification of sound recordings
US7333957 | Jan 6, 2003 | Feb 19, 2008 | Digimarc Corporation | Connected audio and other media objects
US7346512 | Jan 23, 2006 | Mar 18, 2008 | Landmark Digital Services, LLC | Methods for recognizing unknown media samples using characteristics of known media samples
US7349552 | Jan 6, 2003 | Mar 25, 2008 | Digimarc Corporation | Connected audio and other media objects
US7356188 | Apr 24, 2001 | Apr 8, 2008 | Microsoft Corporation | Recognizer of text-based work
US7379875 | Feb 24, 2004 | May 27, 2008 | Microsoft Corporation | Systems and methods for generating audio thumbnails
US7406195 | Aug 15, 2005 | Jul 29, 2008 | Microsoft Corporation | Robust recognizer of perceptually similar content
US7421096 | Feb 23, 2004 | Sep 2, 2008 |  | Input mechanism for fingerprint-based internet search
US7421128 | Jul 28, 2003 | Sep 2, 2008 | Microsoft Corporation | System and method for hashing digital images
US7430307 | Sep 29, 2004 | Sep 30, 2008 | Olympus Corporation | Data processing apparatus
US7444388 | Apr 13, 2006 | Oct 28, 2008 | Concert Technology Corporation | System and method for obtaining media content for a portable media player
US7453379 | Mar 12, 2007 | Nov 18, 2008 | Citrix Systems, Inc. | Systems and methods for identifying long matches of data in a compression history
US7454417 | Sep 12, 2003 | Nov 18, 2008 | Google Inc. | Methods and systems for improving a search ranking using population information
US7460038 | Mar 12, 2007 | Dec 2, 2008 | Citrix Systems, Inc. | Systems and methods of clustered sharing of compression histories
US7477739 | Jan 21, 2003 | Jan 13, 2009 | Gracenote, Inc. | Efficient storage of fingerprints
US7487180 | Jan 31, 2006 | Feb 3, 2009 | MusicIP Corporation | System and method for recognizing audio pieces via audio fingerprinting
US7489801 | May 16, 2006 | Feb 10, 2009 | Digimarc Corporation | Encoding and decoding signals for digital watermarking
US7499566 | Jul 22, 2005 | Mar 3, 2009 | Digimarc Corporation | Methods for steganographic encoding media
US7505964 | Sep 12, 2003 | Mar 17, 2009 | Google Inc. | Methods and systems for improving a search ranking using related queries
US7516074 | Sep 1, 2005 | Apr 7, 2009 | Auditude, Inc. | Extraction and matching of characteristic fingerprints from audio signals
US7532134 | Mar 12, 2007 | May 12, 2009 | Citrix Systems, Inc. | Systems and methods for sharing compression histories between multiple devices
US7545951 | Nov 14, 2005 | Jun 9, 2009 | Digimarc Corporation | Data transmission by watermark or derived identifier proxy
US7549052 | Feb 11, 2002 | Jun 16, 2009 | Gracenote, Inc.; Koninklijke Philips Electronics N.V. | Generating and matching hashes of multimedia content
US7564992 | Oct 24, 2008 | Jul 21, 2009 | Digimarc Corporation | Content identification through deriving identifiers from video, images and audio
US7568103 | Dec 15, 2004 | Jul 28, 2009 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7574451 | Nov 2, 2004 | Aug 11, 2009 | Microsoft Corporation | System and method for speeding up database lookups for multiple synchronized data streams
US7590259 | Oct 29, 2007 | Sep 15, 2009 | Digimarc Corporation | Deriving attributes from images, audio or video to obtain metadata
US7593576 | Dec 3, 2004 | Sep 22, 2009 | Digimarc Corporation | Systems and methods of managing audio and other media
US7603434 | Apr 13, 2006 | Oct 13, 2009 | Domingo Enterprises, LLC | Central system providing previews of a user's media collection to a portable media player
US7606390 | Aug 14, 2008 | Oct 20, 2009 | Digimarc Corporation | Processing data representing video and audio and methods and apparatus related thereto
US7606790 | Mar 3, 2004 | Oct 20, 2009 | Digimarc Corporation | Integrating and enhancing searching of media content and biometric databases
US7613736 | May 23, 2006 | Nov 3, 2009 | Resonance Media Services, Inc. | Sharing music essence in a recommendation system
US7617398 | Nov 3, 2005 | Nov 10, 2009 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7619545 | Mar 12, 2007 | Nov 17, 2009 | Citrix Systems, Inc. | Systems and methods of using application and protocol specific parsing for compression
US7623823 | Aug 30, 2005 | Nov 24, 2009 | Integrated Media Measurement, Inc. | Detecting and measuring exposure to media content items
US7627477 | Oct 21, 2004 | Dec 1, 2009 | Landmark Digital Services, LLC | Robust and invariant audio pattern matching
US7634405 | Jan 24, 2005 | Dec 15, 2009 | Microsoft Corporation | Palette-based classifying and synthesizing of auditory information
US7634660 | Dec 15, 2004 | Dec 15, 2009 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7636849 | Nov 12, 2004 | Dec 22, 2009 | Microsoft Corporation | Derivation and quantization of robust non-local characteristics for blind watermarking
US7650010 | Nov 21, 2008 | Jan 19, 2010 | Digimarc Corporation | Connected video and audio
US7657752 | Nov 4, 2004 | Feb 2, 2010 | Microsoft Corporation | Digital signal watermaker
US7676060 | Jun 5, 2007 | Mar 9, 2010 |  | Distributed content identification
US7680959 | Jul 11, 2006 | Mar 16, 2010 | Napo Enterprises, LLC | P2P network for providing real time media recommendations
US7706570 | Feb 9, 2009 | Apr 27, 2010 | Digimarc Corporation | Encoding and decoding auxiliary signals
US7707425 | Nov 4, 2004 | Apr 27, 2010 | Microsoft Corporation | Recognizer of content of digital signals
US7711564 | Jun 27, 2002 | May 4, 2010 | Digimarc Corporation | Connected audio and other media objects
US7765192 | Mar 29, 2006 | Jul 27, 2010 | Abo Enterprises, LLC | System and method for archiving a media collection
US7770014 | Apr 30, 2004 | Aug 3, 2010 | Microsoft Corporation | Randomized signal transforms and their applications
US7772478 | Apr 12, 2007 | Aug 10, 2010 | Massachusetts Institute of Technology | Understanding music
US7787973 | Dec 29, 2004 | Aug 31, 2010 | Clear Channel Management Services, Inc. | Generating a composite media stream
US7805500 | Oct 31, 2007 | Sep 28, 2010 | Digimarc Corporation | Network linking methods and apparatus
US7824029 | May 12, 2003 | Nov 2, 2010 | L-1 Secure Credentialing, Inc. | Identification card printer-assembler for over the counter card issuing
US7827110 | Sep 21, 2005 | Nov 2, 2010 |  | Marketing compositions by using a customized sequence of compositions
US7827237 | Mar 12, 2007 | Nov 2, 2010 | Citrix Systems, Inc. | Systems and methods for identifying long matches of data in a compression history
US7840177 | May 23, 2007 | Nov 23, 2010 | Landmark Digital Services, LLC | Device for monitoring multiple broadcast signals
US7844452 | Feb 25, 2009 | Nov 30, 2010 | Kabushiki Kaisha Toshiba | Sound quality control apparatus, sound quality control method, and sound quality control program
US7849131 | May 12, 2006 | Dec 7, 2010 | Gracenote, Inc. | Method of enhancing rendering of a content item, client system and server system
US7853664 | Sep 27, 2000 | Dec 14, 2010 | Landmark Digital Services LLC | Method and system for purchasing pre-recorded music
US7856354 | Feb 25, 2009 | Dec 21, 2010 | Kabushiki Kaisha Toshiba | Voice/music determining apparatus, voice/music determination method, and voice/music determination program
US7861087 | Aug 1, 2007 | Dec 28, 2010 | Citrix Systems, Inc. | Systems and methods for state signing of internet resources
US7865368 | Mar 14, 2008 | Jan 4, 2011 | Landmark Digital Services, LLC | System and methods for recognizing sound and music signals in high noise and distortion
US7865522 | Nov 7, 2007 | Jan 4, 2011 | Napo Enterprises, LLC | System and method for hyping media recommendations in a media recommendation system
US7865585 | Mar 12, 2007 | Jan 4, 2011 | Citrix Systems, Inc. | Systems and methods for providing dynamic ad hoc proxy-cache hierarchies
US7872597 | Oct 5, 2009 | Jan 18, 2011 | Citrix Systems, Inc. | Systems and methods of using application and protocol specific parsing for compression
US7873521 | Jul 8, 2005 | Jan 18, 2011 | Nippon Telegraph and Telephone Corporation | Sound signal detection system, sound signal detection server, image signal search apparatus, image signal search method, image signal search program and medium, signal search apparatus, signal search method and signal search program and medium
US7881931 | Feb 4, 2008 | Feb 1, 2011 | Gracenote, Inc. | Automatic identification of sound recordings
US7884274 | Nov 3, 2003 | Feb 8, 2011 |  | Adaptive personalized music and entertainment
US7904503 | Aug 21, 2001 | Mar 8, 2011 | Gracenote, Inc. | Method of enhancing rendering of content item, client system and server system
US7916047 | Oct 16, 2008 | Mar 29, 2011 | Citrix Systems, Inc. | Systems and methods of clustered sharing of compression histories
US7921296 | May 7, 2007 | Apr 5, 2011 | Gracenote, Inc. | Generating and matching hashes of multimedia content
US7925657 | Mar 17, 2004 | Apr 12, 2011 | Google Inc. | Methods and systems for adjusting a scoring measure based on query breadth
US7936900 | Oct 20, 2009 | May 3, 2011 | Digimarc Corporation | Processing data representing video and audio and methods related thereto
US7949149 | Jun 29, 2009 | May 24, 2011 | Digimarc Corporation | Deriving or calculating identifying data from video signals
US7949494 | Dec 22, 2009 | May 24, 2011 | Blue Spike, Inc. | Method and device for monitoring and analyzing signals
US7953981 | Aug 10, 2009 | May 31, 2011 | Wistaria Trading, Inc. | Optimization methods for the insertion, protection, and detection of digital watermarks in digital data
US7961949 | Oct 12, 2009 | Jun 14, 2011 | Digimarc Corporation | Extracting multiple identifiers from audio and video content
US7965864 | Jun 9, 2009 | Jun 21, 2011 | Digimarc Corporation | Data transmission by extracted or calculated identifying data
US7970167 | Jul 21, 2009 | Jun 28, 2011 | Digimarc Corporation | Deriving identifying data from video and audio
US7970922 | Aug 21, 2008 | Jun 28, 2011 | Napo Enterprises, LLC | P2P real time media recommendations
US7987371 | Jul 9, 2008 | Jul 26, 2011 | Wistaria Trading, Inc. | Optimization methods for the insertion, protection, and detection of digital watermarks in digital data
US7991188 | Aug 31, 2007 | Aug 2, 2011 | Wisteria Trading, Inc. | Optimization methods for the insertion, protection, and detection of digital watermarks in digital data
US8001612 | Aug 12, 2005 | Aug 16, 2011 |  | Distributing digital-works and usage-rights to user-devices
US8010988 | Jun 2, 2006 | Aug 30, 2011 |  | Using features extracted from an audio and/or video work to obtain information about the work
US8020187 | Feb 11, 2010 | Sep 13, 2011 |  | Identifying works, using a sub linear time search or a non exhaustive search, for initiating a work-based action, such as an action on the internet
US8024326 | Jan 9, 2009 | Sep 20, 2011 | Google Inc. | Methods and systems for improving a search ranking using related queries
US8036418 | Sep 22, 2009 | Oct 11, 2011 | Digimarc Corporation | Systems and methods of managing audio and other media
US8046841 | Aug 21, 2007 | Oct 25, 2011 | Wistaria Trading, Inc. | Steganographic method and device
US8051127 | May 26, 2010 | Nov 1, 2011 | Citrix Systems, Inc. | Systems and methods for identifying long matches of data in a compression history
US8055667 | Oct 20, 2009 | Nov 8, 2011 | Digimarc Corporation | Integrating and enhancing searching of media content and biometric databases
US8059646 | Dec 13, 2006 | Nov 15, 2011 | Napo Enterprises, LLC | System and method for identifying music content in a P2P real time recommendation network
US8060477 | Jun 23, 2010 | Nov 15, 2011 | Abo Enterprises, LLC | System and method for archiving a media collection
US8060517 | Apr 8, 2011 | Nov 15, 2011 | Google Inc. | Methods and systems for adjusting a scoring measure based on query breadth
US8060525 | Dec 21, 2007 | Nov 15, 2011 | Napo Enterprises, LLC | Method and system for generating media recommendations in a distributed environment based on tagging play history information with location information
US8063799 | Mar 30, 2009 | Nov 22, 2011 | Citrix Systems, Inc. | Systems and methods for sharing compression histories between multiple devices
US8085978 | Mar 9, 2010 | Dec 27, 2011 | Digimarc Corporation | Distributed decoding of digitally encoded media signals
US8090579 | Feb 8, 2006 | Jan 3, 2012 | Landmark Digital Services | Automatic identification of repeated material in audio signals
US8090606 | Aug 8, 2006 | Jan 3, 2012 | Napo Enterprises, LLC | Embedded media recommendations
US8090713 | Nov 18, 2008 | Jan 3, 2012 | Google Inc. | Methods and systems for improving a search ranking using population information
US8099403 | Mar 30, 2010 | Jan 17, 2012 | Digimarc Corporation | Content identification and management in content distribution networks
US8104079 | Mar 23, 2009 | Jan 24, 2012 |  | Methods, systems and devices for packet watermarking and efficient provisioning of bandwidth
US8108369 | Sep 21, 2009 | Jan 31, 2012 | Accenture Global Services Limited | Customized multi-media services
US8112720 | Apr 5, 2007 | Feb 7, 2012 | Napo Enterprises, LLC | System and method for automatically and graphically associating programmatically-generated media item recommendations related to a user's socially recommended media items
US8117193 | Aug 15, 2008 | Feb 14, 2012 | Lemi Technology, LLC | Tunersphere
US8121343 | Oct 10, 2010 | Feb 21, 2012 | Wistaria Trading, Inc | Optimization methods for the insertion, protection, and detection of digital watermarks in digitized data
US8121843 | Apr 23, 2007 | Feb 21, 2012 | Digimarc Corporation | Fingerprint methods and systems for media signals
US8140331 | Jul 4, 2008 | Mar 20, 2012 | Xia Lou | Feature extraction for identification and classification of audio signals
US8150096 | Mar 23, 2006 | Apr 3, 2012 | Digimarc Corporation | Video fingerprinting to identify video content
US8151113 | Sep 4, 2009 | Apr 3, 2012 | Digimarc Corporation | Methods and devices responsive to ambient audio
US8155582 | Mar 19, 2009 | Apr 10, 2012 | Digimarc Corporation | Methods and systems employing digital content
US8160249 | Dec 22, 2009 | Apr 17, 2012 | Blue Spike, Inc. | Utilizing data reduction in steganographic and cryptographic system
US8160840 | Nov 30, 2010 | Apr 17, 2012 | Yahoo! Inc. | Comparison of data signals using characteristic electronic thumbprints extracted therefrom
US8161286 | Jun 21, 2010 | Apr 17, 2012 | Wistaria Trading, Inc. | Method and system for digital watermarking
US8170273 | Apr 27, 2010 | May 1, 2012 | Digimarc Corporation | Encoding and decoding auxiliary signals
US8171561 | Oct 9, 2008 | May 1, 2012 | Blue Spike, Inc. | Secure personal content server
US8175330 | Aug 18, 2011 | May 8, 2012 | Wistaria Trading, Inc. | Optimization methods for the insertion, protection, and detection of digital watermarks in digitized data
US8175730 | Jun 30, 2009 | May 8, 2012 | SONY Corporation | Device and method for analyzing an information signal
US8185579 | Sep 19, 2008 | May 22, 2012 | Eloy Technology, LLC | System and method for obtaining media content for a portable media player
US8190435 | Nov 24, 2010 | May 29, 2012 | Shazam Investments Limited | System and methods for recognizing sound and music signals in high noise and distortion
US8195821 | Nov 18, 2003 | Jun 5, 2012 | Sony Corporation | Autonomous information processing apparatus and method in a network of information processing apparatuses
US8200602 | May 27, 2009 | Jun 12, 2012 | Napo Enterprises, LLC | System and method for creating thematic listening experiences in a networked peer media recommendation environment
US8205237 | Oct 23, 2007 | Jun 19, 2012 |  | Identifying works, using a sub-linear time search, such as an approximate nearest neighbor search, for initiating a work-based action, such as an action on the internet
US8209180 | Feb 1, 2007 | Jun 26, 2012 | NEC Corporation | Speech synthesizing device, speech synthesizing method, and program
US8214175 | Feb 26, 2011 | Jul 3, 2012 | Blue Spike, Inc. | Method and device for monitoring and analyzing signals
US8224022 | Sep 15, 2009 | Jul 17, 2012 | Digimarc Corporation | Connected audio and other media objects
US8224705 | Sep 10, 2007 | Jul 17, 2012 |  | Methods, systems and devices for packet watermarking and efficient provisioning of bandwidth
US8225099 | Apr 14, 2010 | Jul 17, 2012 | Wistaria Trading, Inc. | Linear predictive coding implementation of digital watermarks
US8238553 | Mar 30, 2009 | Aug 7, 2012 | Wistaria Trading, Inc | Steganographic method and device

Claims

1. A method for recognizing a media entity from a media sample, comprising:

computing a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;

obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;

generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and

identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.
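
The correspondence-generating step of claim 1 can be illustrated with a minimal sketch. It assumes that "equivalent" fingerprints are simply equal hashable values and that landmarks are numeric timepoints; the function name `generate_correspondences` and the toy data are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def generate_correspondences(sample, file_entries):
    """Pair sample and file landmarks whose fingerprints are equal.

    Both arguments are iterables of (landmark, fingerprint) tuples.
    Treating "equivalent" as exact equality is an assumption of this sketch.
    """
    # Index the file landmarks by fingerprint for constant-time lookup.
    index = defaultdict(list)
    for landmark, fingerprint in file_entries:
        index[fingerprint].append(landmark)

    # Every sample fingerprint pairs with each file landmark sharing its value.
    pairs = []
    for sample_landmark, fingerprint in sample:
        for file_landmark in index.get(fingerprint, []):
            pairs.append((sample_landmark, file_landmark))
    return pairs

# Toy data: the sample matches the file at a constant offset of 10,
# plus one spurious match for fingerprint "a" at file landmark 40.
sample = [(0, "a"), (1, "b"), (2, "c")]
file_entries = [(10, "a"), (11, "b"), (12, "c"), (40, "a")]
pairs = generate_correspondences(sample, file_entries)
print(pairs)
```

The resulting pairs are the input to the final identifying step, which checks whether many of them lie on a common line.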

2. The method of claim 1 wherein said sample landmarks are computed in dependence on said media sample.

3. The method of claim 1 wherein said each sample fingerprint represents one or more features of said media sample at or near said particular sample landmark.

4. The method of claim 1 wherein said sample fingerprints and said file fingerprints have numerical values.

5. The method of claim 1 wherein values of said sample fingerprints specify a method for computing said sample fingerprints.

6. The method of claim 1 wherein said media sample is an audio sample.

7. The method of claim 6 wherein said sample landmarks are timepoints within said audio sample.

8. The method of claim 7 wherein said timepoints occur at local maxima of spectral Lp norms of said audio sample.
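
One possible reading of claim 8 is sketched below. It assumes the audio has already been converted to a list of magnitude-spectrum frames and that a landmark is any frame whose Lp norm strictly exceeds that of both neighbours; the frame layout, `hop_seconds`, and the function names are assumptions made for illustration only.

```python
def lp_norm(frame, p=2.0):
    """Lp norm of one spectral frame (a sequence of magnitudes)."""
    return sum(abs(x) ** p for x in frame) ** (1.0 / p)

def landmark_timepoints(spectrogram, hop_seconds, p=2.0):
    """Return times of frames whose Lp norm is a strict local maximum.

    spectrogram: list of frames, one per hop, each a list of magnitudes.
    hop_seconds: assumed spacing between consecutive frames.
    """
    norms = [lp_norm(frame, p) for frame in spectrogram]
    return [
        i * hop_seconds
        for i in range(1, len(norms) - 1)
        if norms[i] > norms[i - 1] and norms[i] > norms[i + 1]
    ]

# Six toy frames with two magnitude bins each; frames 1 and 4 carry the most energy.
spectrogram = [
    [0.1, 0.1], [0.9, 0.4], [0.2, 0.1],
    [0.3, 0.3], [1.0, 0.8], [0.2, 0.2],
]
timepoints = landmark_timepoints(spectrogram, hop_seconds=0.5)
print(timepoints)
```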

9. The method of claim 6 wherein said sample fingerprints are computed from a frequency analysis of said audio sample.

10. The method of claim 6 wherein said sample fingerprints are selected from the group consisting of spectral slice fingerprints, LPC coefficients, and cepstral coefficients.

11. The method of claim 6 wherein said sample fingerprints are computed from a spectrogram of said audio sample.

12. The method of claim 11 wherein salient points of said spectrogram comprise time coordinates and frequency coordinates, and wherein said sample landmarks are computed from said time coordinates, and said sample fingerprints are computed from said frequency coordinates.

13. The method of claim 12, further comprising linking at least one of said salient points to an anchor salient point, wherein one of said sample landmarks is computed from a time coordinate of said anchor salient point, and a corresponding fingerprint is computed from frequency coordinates of at least one of said linked salient points and said anchor point.

14. The method of claim 13, wherein said linked salient points fall within a target zone.

15. The method of claim 14, wherein said target zone is defined by a time range.

16. The method of claim 14, wherein said target zone is defined by a frequency range.

17. The method of claim 14, wherein said target zone is variable.
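
Claims 13 to 17 can be pictured with the following sketch, which links each anchor salient point to later points inside a rectangular target zone. The zone bounds `min_dt`, `max_dt`, and `max_df`, and the choice of a frequency pair as the fingerprint, are hypothetical values picked for the example rather than anything the claims specify.

```python
def link_salient_points(points, min_dt=1, max_dt=3, max_df=50):
    """Link each anchor point to points inside its target zone.

    points: list of (time, frequency) salient points.
    The target zone (an assumption of this sketch) is a time range
    [min_dt, max_dt] after the anchor and a frequency range of +/- max_df.
    Returns (landmark, fingerprint) tuples, where the landmark is the
    anchor time and the fingerprint is (anchor_freq, linked_freq).
    """
    points = sorted(points)
    linked = []
    for t_anchor, f_anchor in points:
        for t_other, f_other in points:
            in_time_range = min_dt <= t_other - t_anchor <= max_dt
            in_freq_range = abs(f_other - f_anchor) <= max_df
            if in_time_range and in_freq_range:
                linked.append((t_anchor, (f_anchor, f_other)))
    return linked

points = [(0, 100), (1, 120), (2, 130), (5, 110)]
result = link_salient_points(points)
print(result)
```

Making the zone bounds parameters rather than constants mirrors claim 17, under which the target zone may vary.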

18. The method of claim 13 wherein said corresponding fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.

19. The method of claim 13 wherein said corresponding fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.

20. The method of claim 19, wherein said corresponding fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
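
The invariance asserted in claims 18 and 20 can be checked numerically. The sketch assumes a time stretch behaves like a playback-speed change, so that times divide by the speed factor while frequencies multiply by it; under that assumption a frequency quotient and a time-difference-times-frequency product are both unchanged. The function names are hypothetical, and a practical system would presumably quantize these values before comparing them.

```python
import math

def invariant_fingerprint(anchor, linked):
    """Fingerprint from a frequency quotient and a time-frequency product."""
    t_anchor, f_anchor = anchor
    t_linked, f_linked = linked
    quotient = f_linked / f_anchor            # as in claim 18
    product = (t_linked - t_anchor) * f_anchor  # as in claim 20
    return (quotient, product)

def stretch(point, factor):
    """Assumed stretch model: times shrink and frequencies grow by `factor`."""
    t, f = point
    return (t / factor, f * factor)

anchor, linked = (2.0, 400.0), (3.5, 600.0)
original = invariant_fingerprint(anchor, linked)
sped_up = invariant_fingerprint(stretch(anchor, 1.25), stretch(linked, 1.25))

# The two fingerprints agree up to floating-point rounding.
print(original, sped_up)
```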

21. The method of claim 6 wherein said sample landmarks and said sample fingerprints are computed from salient points of a multidimensional function of said audio sample, wherein at least one of said dimensions is a time dimension and at least one of said dimensions is a non-time dimension.

22. The method of claim 21 wherein said sample landmarks are computed from said time dimensions.

23. The method of claim 21 wherein said sample fingerprints are computed from at least one of said non-time dimensions.

24. The method of claim 21 wherein said salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said multidimensional function.

25. The method of claim 6 wherein said sample fingerprints are time-stretch invariant.

26. The method of claim 6 wherein each sample fingerprint is computed from multiple timeslices of said audio sample.

27. The method of claim 26 wherein said multiple timeslices are offset by a variable amount of time.

28. The method of claim 27 wherein each fingerprint is computed in part from said variable amount.

29. The method of claim 1 wherein said identifying step comprises locating a diagonal line within a scatter plot of said corresponding landmarks.

30. The method of claim 29 wherein locating said diagonal line comprises forming differences between said corresponding landmarks.

31. The method of claim 30 wherein locating said diagonal line further comprises sorting said differences.

32. The method of claim 30 wherein locating said diagonal line further comprises calculating the peak of a histogram of said differences.
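
Claims 29 to 32 describe finding a diagonal line by forming landmark differences and locating a histogram peak. A minimal sketch follows, assuming the sample and file play at the same speed (slope 1) so that true matches share a constant offset; `best_offset` and `bin_width` are illustrative names, not terms from the patent.

```python
from collections import Counter

def best_offset(pairs, bin_width=1):
    """Find the most common (file_landmark - sample_landmark) difference.

    pairs: list of (sample_landmark, file_landmark) correspondences.
    Returns (offset, count) for the fullest histogram bin, or (None, 0)
    when there are no correspondences.
    """
    if not pairs:
        return None, 0
    # Points on a slope-1 diagonal all have the same difference.
    differences = [file_lm - sample_lm for sample_lm, file_lm in pairs]
    histogram = Counter(d // bin_width for d in differences)
    bin_index, count = histogram.most_common(1)[0]
    return bin_index * bin_width, count

pairs = [(0, 10), (0, 40), (1, 11), (2, 12), (3, 7)]
offset, score = best_offset(pairs)
print(offset, score)
```

The peak count can then serve as a match score for the file, for example when comparing it against a threshold or against the scores of other candidate files.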

33. The method of claim 1 wherein said identifying step comprises computing one of a Hough transform and a Radon transform of said correspondences.

34. The method of claim 33 wherein said identifying step further comprises locating a peak of said Hough transform.

35. The method of claim 1 wherein said identifying step comprises determining whether a number of said correspondences exceeds a threshold value.

36. The method of claim 1 further comprising:

obtaining from a database index additional fingerprints characterizing file locations of additional media entities to be identified;

generating additional correspondences between said sample landmarks and file landmarks of said additional media entities, wherein corresponding landmarks have equivalent fingerprints; and

identifying media entities for which a plurality of said corresponding landmarks are substantially linearly related.

37. The method of claim 36 further comprising selecting a winning media entity from said identified media entities, wherein said winning media entity has a largest plurality of substantially linearly related corresponding landmarks.

38. The method of claim 36 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching a first subset of said additional media entities.

39. The method of claim 38 wherein additional media entities in said first subset have a higher probability of being identified than additional media entities that are not in said first subset.

40. The method of claim 39 wherein said probability of being identified is computed in dependence on a recency of previous identification.

41. The method of claim 39 wherein said probability of being identified is computed in dependence on a frequency of previous identification.

42. The method of claim 38 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching a second subset of said additional media entities, wherein no media entities in said first subset are identified.

43. The method of claim 36, further comprising ranking said additional media entities according to a probability of being identified.

44. The method of claim 43 wherein said probability is computed in part in dependence on a recency of previous identification.

45. The method of claim 44 wherein said probability is computed in part by increasing a recency score of a particular media entity when said particular media entity is identified.

46. The method of claim 44 wherein said probability is computed in part by decreasing recency scores of said additional media entities at regular time intervals.

47. The method of claim 46 wherein said recency scores are decreased exponentially in time.
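
The recency scoring of claims 44 to 47 might be kept as in the sketch below: a score is raised when an entity is identified and every score is multiplied by a constant factor at each regular interval, which yields exponential decay over time. The class name, the `boost` amount, and the half-life parameter are assumptions for illustration.

```python
class RecencyRanker:
    """Keeps per-entity recency scores with exponential decay."""

    def __init__(self, half_life_intervals=10.0):
        # Multiplying by this factor once per interval halves a score
        # after `half_life_intervals` intervals.
        self.decay = 0.5 ** (1.0 / half_life_intervals)
        self.scores = {}

    def record_identification(self, entity_id, boost=1.0):
        """Increase the score of an entity that was just identified."""
        self.scores[entity_id] = self.scores.get(entity_id, 0.0) + boost

    def tick(self):
        """Apply one regular-interval decay step to every score."""
        for entity_id in self.scores:
            self.scores[entity_id] *= self.decay

    def ranking(self):
        """Entity ids ordered from highest to lowest recency score."""
        return sorted(self.scores, key=self.scores.get, reverse=True)

ranker = RecencyRanker(half_life_intervals=1.0)
ranker.record_identification("old_song")
ranker.tick()
ranker.tick()
ranker.record_identification("new_song")
print(ranker.ranking(), ranker.scores)
```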

48. The method of claim 43 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises searching said additional media entities according to said ranking.

49. The method of claim 36 wherein the step of identifying said media entities for which a plurality of said corresponding landmarks are substantially linearly related further comprises terminating said search at a media entity having a number of said substantially linearly related corresponding landmarks that exceeds a predetermined threshold.
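
Searching in ranked order (claim 48) and stopping early once a score exceeds a threshold (claim 49) can be combined as in this sketch. The callable `score_entity` stands in for whatever counts the linearly related landmarks of one candidate; it and the other names are hypothetical.

```python
def search_ranked(ranked_ids, score_entity, threshold):
    """Score candidates in ranked order and stop at the first strong match.

    ranked_ids: entity ids, most probable first.
    score_entity: callable returning the number of linearly related
        corresponding landmarks for one entity (assumed to be supplied).
    Returns (best_id, best_score, entities_checked).
    """
    best_id, best_score = None, 0
    checked = 0
    for entity_id in ranked_ids:
        checked += 1
        score = score_entity(entity_id)
        if score > best_score:
            best_id, best_score = entity_id, score
        if score > threshold:
            break  # early termination: this match is already strong enough
    return best_id, best_score, checked

# Toy scores; "c" is never examined because "b" already exceeds the threshold.
toy_scores = {"a": 2, "b": 9, "c": 30}
found = search_ranked(["a", "b", "c"], toy_scores.get, threshold=5)
print(found)
```

The trade-off is that a later candidate with a higher score may be missed, which is why the ordering of `ranked_ids` matters.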

50. The method of claim 1 wherein said method is implemented in a distributed system.

51. The method of claim 50 wherein said computing step is performed in a client device, said obtaining, generating, and identifying steps are performed in a central location, and the method further comprises transmitting said sample fingerprints from said client device to said central location.

52. The method of claim 1, further comprising repeating said computing, obtaining, generating, and identifying steps for sequentially growing size of said media sample.

53. The method of claim 1, further comprising performing said obtaining, generating, and identifying steps at periodic intervals on a rolling buffer storing said computed sample fingerprints.

54. The method of claim 1, further comprising obtaining said media sample and simultaneously performing said computing step.

55. A method for recognizing a media entity from a media sample, comprising:

receiving a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;

obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;

generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and

identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.

56. A method for recognizing a media sample, comprising:

continually sampling into a sound buffer N seconds of said media sample;

computing a set of sample fingerprints characterizing a segment of said media sample stored in said sound buffer, wherein said segment has one or more distinct landmarks occurring at reproducible locations of said media sample;

storing said fingerprints in a rolling buffer;

obtaining a set of matching fingerprints in a database index, each matching fingerprint characterizing at least one distinct landmark of a media file and is equivalent to at least one fingerprint in said rolling buffer;

identifying at least one media file having a plurality of matching fingerprints;

reporting presence of said at least one media file; and

removing at least one sample fingerprint from said rolling buffer.
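
The rolling buffer of claim 56 could be kept as a double-ended queue from which the oldest fingerprints are dropped once they fall outside a time window. The sketch below assumes fingerprints arrive in time order and that removal is age-based; the claim itself only requires that at least one fingerprint be removed, so `max_age_seconds` and the class name are illustrative choices.

```python
from collections import deque

class RollingFingerprintBuffer:
    """Holds recent (landmark_time, fingerprint) entries in arrival order."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.items = deque()

    def add(self, landmark_time, fingerprint):
        """Store a newly computed fingerprint (assumed to arrive in time order)."""
        self.items.append((landmark_time, fingerprint))

    def expire(self, now):
        """Remove entries older than max_age; return how many were removed."""
        removed = 0
        while self.items and now - self.items[0][0] > self.max_age:
            self.items.popleft()
            removed += 1
        return removed

    def snapshot(self):
        """Current contents, for example to look up in a database index."""
        return list(self.items)

buffer = RollingFingerprintBuffer(max_age_seconds=10)
buffer.add(0, "a")
buffer.add(5, "b")
buffer.add(12, "c")
removed = buffer.expire(now=12)
print(removed, buffer.snapshot())
```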

57. The method of claim 56, further comprising repeating said method for additional segments of said media sample.

58. The method of claim 56 wherein said computing, storing, and removing steps are performed in a client device and said locating and identifying steps are performed in a central location, and wherein the method further comprises transmitting said sample fingerprints from said client device to said central location.

59. The method of claim 56 wherein said computing step is performed in a client device and said storing, locating, identifying, and removing steps are performed in a central location, and wherein the method further comprises transmitting said fingerprints from said client device to said central location.

60. A computer system programmed to perform the method steps of claim 1.

61. The method of claim 56, wherein said reproducible locations and said sample fingerprints are computed simultaneously.

62. A program storage device accessible by a computer, tangibly embodying a program of instructions executable by said computer to perform method steps for recognizing a media entity from a media sample, said program of instructions comprising:

code for computing a set of sample fingerprints, each sample fingerprint characterizing a particular sample landmark within said media sample;

code for obtaining a set of file fingerprints, each file fingerprint characterizing a particular file landmark within a media entity to be identified;

code for generating correspondences between said sample landmarks and said obtained file landmarks, wherein corresponding landmarks have equivalent fingerprints; and

code for identifying said media entity if a plurality of said corresponding landmarks are substantially linearly related.

63. A system for recognizing a media entity from a media sample, comprising:

a landmarking and fingerprinting object for computing a set of particular sample landmarks within said media sample and a set of sample fingerprints, each sample fingerprint characterizing one of said particular sample landmarks;

a database index containing file landmarks and corresponding file fingerprints for at least one media entity to be identified; and

an analysis object for:

locating a set of matching fingerprints in said database index, wherein said matching fingerprints are equivalent to said sample fingerprints;
generating correspondences between said sample landmarks and said file landmarks, wherein corresponding landmarks have equivalent fingerprints; and
identifying at least one media entity for which a plurality of said corresponding landmarks are substantially linearly related.

64. A computer-implemented method for recognizing an audio sample, comprising:

creating a database index of at least one audio file in a database, comprising:

computing landmarks and fingerprints for each audio file, wherein each landmark occurs at a particular location within said audio file and is associated with a fingerprint;

associating, for each audio file, said landmarks and fingerprints with an identifier; and

storing said fingerprints, said landmarks, and said identifier in a memory.

65. The method of claim 64, further comprising sorting said database index by fingerprint value.
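
As a non-authoritative sketch of claims 64 and 65, an index sorted by fingerprint value lets equal fingerprints be found by binary search. Integer fingerprints, the tuple layout, and the names `build_index` and `lookup` are assumptions made for this example.

```python
from bisect import bisect_left

def build_index(files):
    """files maps an identifier to a list of (landmark, fingerprint) pairs."""
    index = [
        (fingerprint, landmark, identifier)
        for identifier, pairs in files.items()
        for landmark, fingerprint in pairs
    ]
    index.sort()  # sorted by fingerprint value first
    return index

def lookup(index, fingerprint):
    """Return (landmark, identifier) for every entry with this integer fingerprint."""
    # A 1-tuple sorts before any longer tuple that starts with the same value.
    start = bisect_left(index, (fingerprint,))
    stop = bisect_left(index, (fingerprint + 1,))
    return [(landmark, identifier) for _, landmark, identifier in index[start:stop]]

files = {"song_a": [(0, 17), (4, 23)], "song_b": [(1, 23), (6, 5)]}
index = build_index(files)
hits = lookup(index, 23)
```

Sorting once at indexing time keeps each lookup logarithmic in the index size, which matters when the database holds many recordings.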

66. The method of claim 64 wherein said particular locations of each audio file are computed in dependence on said audio file.

67. The method of claim 64 wherein each fingerprint represents at least one feature of said audio file near said particular location.

68. The method of claim 64 wherein said fingerprints are numerical values.

69. The method of claim 64 wherein values of said fingerprints specify a method for computing said fingerprints.

70. The method of claim 64 wherein said particular locations are timepoints within said audio file.

71. The method of claim 70 wherein said timepoints occur at local maxima of spectral Lp norms of said audio file.
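
One possible reading of claim 71, sketched under assumptions: each spectrogram frame is reduced to its Lp norm, and frames whose norm exceeds that of their neighbours become landmark timepoints. The frame layout, `frame_period`, and the choice p = 2 are illustrative and not taken from the claim.

```python
def lp_norm(magnitudes, p=2.0):
    """Lp norm of one frame of spectral magnitudes."""
    return sum(abs(m) ** p for m in magnitudes) ** (1.0 / p)

def landmark_timepoints(frames, frame_period, p=2.0):
    """Times (in seconds) of frames whose Lp norm is a local maximum."""
    norms = [lp_norm(frame, p) for frame in frames]
    return [
        i * frame_period
        for i in range(1, len(norms) - 1)
        if norms[i - 1] < norms[i] >= norms[i + 1]
    ]

# Six toy frames with two frequency bins each, spaced 0.5 s apart.
frames = [[1, 0], [3, 4], [1, 1], [0, 2], [6, 8], [2, 2]]
timepoints = landmark_timepoints(frames, frame_period=0.5)
```

The second and fifth frames have the largest norms relative to their neighbours, so the landmarks fall at 0.5 and 2.0 seconds.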

72. The method of claim 64 wherein said fingerprints are computed from a frequency analysis of said audio file.

73. The method of claim 64 wherein said fingerprints are selected from the group consisting of spectral slice fingerprints, LPC coefficients, and cepstral coefficients.

74. The method of claim 64 wherein said fingerprints are computed from a spectrogram of said audio file.

75. The method of claim 74 wherein salient points of said spectrogram comprise time coordinates and frequency coordinates, and wherein said particular locations are computed from said time coordinates, and said fingerprints are computed from said frequency coordinates.

76. The method of claim 75, further comprising linking at least one of said salient points to an anchor salient point, wherein one of said particular locations is computed from a time coordinate of said anchor salient point, and a corresponding fingerprint is computed from frequency coordinates of at least one of said linked salient points and said anchor point.

77. The method of claim 76, wherein said linked salient points fall within a target zone.

78. The method of claim 77, wherein said target zone is defined by a time range.

79. The method of claim 77, wherein said target zone is defined by a frequency range.

80. The method of claim 77, wherein said target zone is variable.
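
The linking of claims 76 to 79 can be sketched as follows, for illustration only. The target zone is assumed to be a rectangle bounded by a time range (`min_dt`, `max_dt`) and a frequency range (`max_df`) relative to the anchor; these parameter names and the toy points are hypothetical.

```python
def link_salient_points(points, min_dt, max_dt, max_df):
    """Link each anchor to later salient points inside its target zone.

    points is a list of (time, frequency) salient points. Returns a list of
    (landmark, fingerprint) pairs, where the landmark is the anchor time and
    the fingerprint is built from the two frequency coordinates.
    """
    points = sorted(points)
    links = []
    for i, (t_anchor, f_anchor) in enumerate(points):
        for t_linked, f_linked in points[i + 1:]:
            dt = t_linked - t_anchor
            if dt > max_dt:
                break  # points are time-sorted, so later ones are out of range too
            if dt >= min_dt and abs(f_linked - f_anchor) <= max_df:
                links.append((t_anchor, (f_anchor, f_linked)))
    return links

points = [(0.0, 300), (0.2, 320), (0.4, 900), (1.5, 310)]
links = link_salient_points(points, min_dt=0.1, max_dt=1.0, max_df=100)
```

Only the point at 0.2 seconds lies within both the time range and the frequency range of the anchor at 0.0 seconds, so a single landmark/fingerprint pair is produced.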

81. The method of claim 76, wherein said corresponding fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.

82. The method of claim 76, wherein said corresponding fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.

83. The method of claim 82, wherein said corresponding fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said corresponding fingerprint is time-stretch invariant.
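
A small worked example may help show why the quotient of claim 81 and the product of claim 83 are time-stretch invariant. It assumes that time stretching is a uniform change in playback speed, which divides every time coordinate by a factor and multiplies every frequency coordinate by the same factor; the function names and the rounding used as quantization are illustrative.

```python
def stretch_invariant_fingerprint(anchor, linked):
    """Fingerprint from a frequency quotient and a time-difference product."""
    t1, f1 = anchor
    t2, f2 = linked
    quotient = round(f2 / f1, 3)        # frequency scaling cancels in the ratio
    product = round((t2 - t1) * f1, 3)  # time and frequency scaling cancel
    return (quotient, product)

def play_faster(point, factor):
    """Assumed stretch model: times shrink and frequencies rise by `factor`."""
    t, f = point
    return (t / factor, f * factor)

anchor, linked = (1.0, 400.0), (1.5, 600.0)
original = stretch_invariant_fingerprint(anchor, linked)
sped_up = stretch_invariant_fingerprint(
    play_faster(anchor, 1.25), play_faster(linked, 1.25)
)
```

Both calls yield the same fingerprint, so a recording played 25% faster would still retrieve the same index entries under this model.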

84. The method of claim 64 wherein said particular locations and said fingerprints are computed from salient points of a multidimensional function of said audio file, wherein at least one of said dimensions is a time dimension and at least one of said dimensions is a non-time dimension.

85. The method of claim 84 wherein said particular locations are computed from said time dimensions.

86. The method of claim 84 wherein said fingerprints are computed from at least one of said non-time dimensions.

87. The method of claim 84 wherein said salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said multidimensional function.

88. The method of claim 64 wherein said fingerprints are time-stretch invariant.

89. The method of claim 64 wherein each fingerprint is computed from multiple timeslices of said audio file.

90. The method of claim 89 wherein said multiple timeslices are offset by a variable amount of time.

91. The method of claim 90 wherein said fingerprints are computed in part from said variable amounts.

92. A method for recognizing a media entity from a media sample, comprising:

generating correspondences between landmarks of said media sample and corresponding landmarks of a media entity to be identified, wherein said landmarks of said media sample and said corresponding landmarks of said media entity have equivalent fingerprints; and

identifying said media sample and said media entity if a plurality of said correspondences have a linear relationship defined by
$\mathrm{landmark}^{*}_{n} = m \cdot \mathrm{landmark}_{n} + \mathrm{offset},$

where
$\mathrm{landmark}_{n}$ is a sample landmark,
$\mathrm{landmark}^{*}_{n}$ is a file landmark that corresponds to $\mathrm{landmark}_{n}$, and
$m$ represents slope.
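
As a hedged sketch, one way to test for the linear relationship above when the slope is unknown is to try the line through each pair of correspondences and count how many other correspondences lie on it. This exhaustive search, the `tolerance` value, and the toy data are assumptions for illustration, not a procedure stated in the claim.

```python
from itertools import combinations

def fit_line(correspondences, tolerance=0.05):
    """Return (m, offset, inliers) for the line supported by the most pairs.

    correspondences is a list of (sample_landmark, file_landmark) pairs.
    """
    best = (None, None, 0)
    for (s1, f1), (s2, f2) in combinations(correspondences, 2):
        if s1 == s2:
            continue  # no slope can be computed from equal sample landmarks
        m = (f2 - f1) / (s2 - s1)
        offset = f1 - m * s1
        inliers = sum(
            1 for s, f in correspondences if abs(f - (m * s + offset)) <= tolerance
        )
        if inliers > best[2]:
            best = (m, offset, inliers)
    return best

# Four pairs on the line file = 2 * sample + 10, plus one spurious pair.
correspondences = [(0.0, 10.0), (1.0, 12.0), (2.0, 14.0), (3.0, 16.0), (1.5, 40.0)]
m, offset, inliers = fit_line(correspondences)
```

The search costs on the order of n cubed operations, so it suits only small sets of correspondences; a histogram over offsets is cheaper when the slope is known in advance.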

93. A method for recognizing a media sample, comprising:

identifying media files that have file landmarks that are substantially linearly related to sample landmarks of said media sample; wherein

said file landmarks and said sample landmarks have equivalent fingerprints; and wherein

said file landmarks and said sample landmarks have a linear correspondence defined by
$\mathrm{landmark}^{*}_{n} = m \cdot \mathrm{landmark}_{n} + \mathrm{offset},$

where
$\mathrm{landmark}_{n}$ is a sample landmark,
$\mathrm{landmark}^{*}_{n}$ is a file landmark that corresponds to $\mathrm{landmark}_{n}$, and
$m$ represents slope.

94. A method for comparing an audio sample and an audio entity, comprising:

for each of at least one audio entity to be identified, computing a plurality of entity fingerprints representing said audio entity, wherein each entity fingerprint characterizes one or more features of said audio entity at or near an entity landmark in at least one dimension including time;

computing a plurality of sample fingerprints representing said audio sample, wherein said sample fingerprints are invariant to time stretching of said audio sample; and

identifying a matching audio entity that has at least a threshold number of said file fingerprints that are equivalent to said sample fingerprints.

95. The method of claim 94 wherein said sample fingerprints comprise quotients of frequency components of said audio sample.

96. The method of claim 94 wherein said sample fingerprints comprise products of frequency components of said audio sample and time differences between points in said audio sample.

97. A method of characterizing an audio sample, comprising computing at least one fingerprint from a spectrogram of said audio sample, wherein said spectrogram comprises an anchor salient point and linked salient points, and wherein said fingerprint is computed from frequency coordinates of said anchor salient point and at least one linked salient point.

98. The method of claim 97, wherein said linked salient points fall within a target zone.

99. The method of claim 98, wherein said target zone is defined by a time range.

100. The method of claim 98, wherein said target zone is defined by a frequency range.

101. The method of claim 98, wherein said target zone is variable.

102. The method of claim 97 wherein said fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said fingerprint is time-stretch invariant.

103. The method of claim 97 wherein said fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.

104. The method of claim 103, wherein said fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said fingerprint is time-stretch invariant.

105. The method of claim 97 wherein said anchor salient point and said linked salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said spectrogram.

106. A method for comparing an audio sample and an audio entity, comprising:

for each of at least one audio entity to be identified, computing a plurality of entity landmark/fingerprint pairs representing said audio entity, wherein each landmark occurs at a particular location within said audio entity in at least one dimension including time, and wherein each fingerprint characterizes one or more features of said audio entity at or near said particular location;

computing a plurality of sample landmark/fingerprint pairs representing said audio sample by
obtaining time and frequency coordinates of at least one salient point of a spectrogram of said audio sample, wherein each salient point serves as an anchor point defining a sample landmark; and
generating at least one multidimensional sample landmark/fingerprint pair from said at least one salient point, wherein sample landmarks of said audio sample are taken to be time coordinates and wherein corresponding sample fingerprints are computed from at least one of the remaining coordinates; and
identifying a winning audio entity that has at least a threshold number of said file fingerprints that are equivalent to said sample fingerprints.

107. The method of claim 106, wherein said linked salient points fall within a target zone.

108. The method of claim 107, wherein said target zone is defined by a time range.

109. The method of claim 107, wherein said target zone is defined by a frequency range.

110. The method of claim 107, wherein said target zone is variable.

111. The method of claim 106 wherein said sample fingerprint is computed from a quotient between two of said frequency coordinates of said linked salient points and said anchor point, whereby said sample fingerprint is time-stretch invariant.

112. The method of claim 106 wherein said sample fingerprint is further computed from at least one time difference between said time coordinate of said anchor point and said time coordinates of said linked salient points.

113. The method of claim 112, wherein said sample fingerprint is further computed from a product of one of said time differences and one of said frequency coordinates of said linked salient points and said anchor point, whereby said sample fingerprint is time-stretch invariant.

114. The method of claim 106 wherein said anchor salient point and said linked salient points are selected from the group consisting of local maxima, local minima, and zero crossings of said spectrogram.