|Publication number||US7858867 B2|
|Application number||US 12/844,363|
|Publication date||Dec 28, 2010|
|Filing date||Jul 27, 2010|
|Priority date||May 1, 2006|
|Also published as||US7790974, US20070261535, US20100288106|
|Inventors||Adil Ahmed Sherwani, Chad C. Gibson, Sumit Basu|
|Original Assignee||Microsoft Corporation|
This application is a divisional of U.S. Non Provisional application Ser. No. 11/415,327, filed May 1, 2006, the entire contents of which are incorporated herein by reference.
Traditional methods for creating a song or musical idea include composing the exact sequences of notes for each instrument involved and then playing all the instruments simultaneously. Contemporary advances in music software for computers allow a user to realize musical ideas without playing any instruments. In such applications, software virtualizes the instruments by generating the sounds required for the song or musical piece and plays the generated sounds through the speakers of the computer.
Existing software applications employ a fixed mapping between high-level parameters and the low-level musical details of the instruments. Such a mapping enables the user to specify a high-level parameter (e.g., a musical genre) to control the output of the instruments. Even though such applications remove the requirement for the user to compose the musical details for each instrument in the composition, the fixed mapping is static, limiting, and non-extensible. For example, with the existing software applications, the user still needs to specify the instruments required, the chord progressions to be used, the structure of song sections, and specific musical sequences in the virtual instruments that sound pleasant when played together with the other instruments. Additionally, the user has to manually replicate the high-level information across all virtual instruments, as there is no unified method to specify the relevant information to all virtual instruments simultaneously. As a result, existing software applications are too complicated for spontaneous experimentation with musical ideas.
Embodiments of the invention dynamically map high-level musical concepts to low-level musical elements. In an embodiment, the invention defines a plurality of musical elements and musical element values associated therewith. Metadata describes each of the plurality of musical elements and associated musical element values. An embodiment of the invention queries the defined plurality of musical elements and associated musical element values based on selected metadata to dynamically produce a set of musical elements and associated musical element values associated with the selected metadata. The produced set of musical elements and associated musical element values is provided to a user.
Aspects of the invention dynamically map low-level musical elements to high-level musical concepts. In particular, aspects of the invention receive audio data (e.g., as analog data or as musical instrument digital interface data) and identify patterns within the received data to determine musical elements corresponding to the identified patterns. Based on the mapping between the low-level musical elements and the high-level musical concepts represented as metadata, an embodiment of the invention identifies the metadata corresponding to the determined musical elements. The identified metadata may be used to dynamically adjust a song model associated with the received data.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Other features will be in part apparent and in part pointed out hereinafter.
Corresponding reference characters indicate corresponding parts throughout the drawings.
In an embodiment, the invention identifies correlations between high-level musical concepts and low-level musical elements such as illustrated in
Exemplary Description Categories and Description Values.
|Explanation||Examples|
|Category of music||Rock, Hip-hop, Jazz|
|Chronological period to which particular musical concepts belong||50s, 70s, 90s|
|The characteristics of a particular composer or performer that give their work a unique and||playing the piano|
|Emotional characteristics of music|||
|A rough measure of how “busy” a piece of music is with respect to the number of instruments and notes playing, durations of notes, and/or level of dissonance and arrhythmic characteristics in the sound|||
The description categories (and values associated therewith) are mapped to lower-level musical elements 104 such as song structure, song section, instrument arrangement, instrument, chord progression, chord, loop, note, and the like. Within the musical elements 104, several layers may also be defined such as shown in
Exemplary Musical Elements and Musical Element Values.
|Explanation||Examples|
|A specific pitch played at a specific time for a specific duration, with some additional musical properties such as velocity, bend, mod,||C, Db, F#|
|Multiple notes played simultaneously||C = C + E + G; Dm = D + F + A|
|Sequence of notes, generally all played by the same instrument||Funk Loop 1 = C D C E D|
|List of instruments played together||Drums, Bass Guitar,|
|Sequence of chords||C Am F G|
|Temporal division of a song containing a single chord progression, instrument arrangement and sequence of loops per|||
|Sequence of song sections||A B A B C B B|
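By way of example and not limitation, the layers of musical elements listed above may be pictured as nested data types. The following Python sketch is illustrative only: the class names, the field names, the beat-based time unit, and the per-instrument layout of loops are assumptions introduced here and are not taken from the described embodiments.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Note:
    pitch: str           # e.g. "C", "Db", "F#"
    start: float         # when the note is played, in beats (assumed unit)
    duration: float      # how long it sounds, in beats (assumed unit)
    velocity: int = 64   # one of the additional musical properties


@dataclass(frozen=True)
class Chord:
    name: str
    pitches: Tuple[str, ...]   # notes played simultaneously


@dataclass
class SongSection:
    chord_progression: List[Chord]       # a sequence of chords
    instrument_arrangement: List[str]    # instruments played together
    loops: Dict[str, List[str]]          # loop names per instrument (assumed layout)


first_note = Note("C", start=0.0, duration=1.0)
C = Chord("C", ("C", "E", "G"))
Dm = Chord("Dm", ("D", "F", "A"))

section_a = SongSection(
    chord_progression=[C, Dm],
    instrument_arrangement=["Drums", "Bass Guitar"],
    loops={"Bass Guitar": ["Funk Loop 1"]},
)

# A song structure is a sequence of song sections, referenced here by label.
sections = {"A": section_a}
song_structure = ["A", "A"]
```

Each higher layer refers only to the layer beneath it, which is one way to keep a change to a chord or loop local to the section that uses it.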
Songs with similar attributes of genre, complexity, mood, and other description categories often use similar expressions at lower musical layers. For example, many blues songs use similar chord progressions, song structures, chords, and riffs. The spread of mappings from higher to lower layers varies from genre to genre. Similarly, songs using specific kinds of musical elements 104 (e.g., instruments, chord progressions, loops, song structures, and the like) are likely to belong to specific description categories (e.g., genre, mood, complexity, and the like) at the higher level. This is the relationship people recognize when listening to a song and identifying the genre to which it belongs. Further, dependencies exist between the values of different musical elements 104 in one embodiment. For example, a particular chord may be associated with a particular loop or instrument. In another embodiment, no such dependencies exist in that the musical elements 104 are orthogonal or independent of each other. Aspects of the invention describe a technique to leverage these mappings to automate the processes of song creation and editing thereby making it easier for musicians and non-musicians to express musical ideas at a high level of abstraction.
Referring next to
For metadata received from the user at 206, aspects of the invention produce a set of musical elements and associated musical element values having the received metadata associated therewith at 208. The metadata may be a particular keyword (e.g., a particular genre such as “rock”), or a plurality of descriptive metadata terms or phrases corresponding to the genre, subgenre, style information, user-specific keywords, or the like. In another embodiment, the metadata is determined without requiring direct input from the user. For example, aspects of the invention may examine the user's music library to determine what types of music the user likes and infer the metadata based on this information.
In one embodiment, aspects of the invention produce the set of musical elements by querying the correlations between the metadata and the musical elements. If no musical elements were produced at 210, the process ends. If the set of musical elements is not empty at 210, one or more musical elements corresponding to each type of musical element are selected to create the song model at 212. For example, musical elements may be selected per song section and/or per instrument. Alternatively or in addition, aspects of the invention select or order musical elements based on a weight associated with each musical element value or with the metadata associated therewith. For example, the weight assigned to “genre=rock” for an electric guitar may be more significant relative to the weight assigned to “genre=country” for the electric guitar. In this manner, aspects of the invention provide a song model without a need for the user to select all the musical elements associated with the song model (e.g., instruments, chords, etc.).
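By way of example and not limitation, the flow from 208 through 212 may be sketched in Python as follows. The sketch is not the claimed implementation: the `CORRELATIONS` records, their weights, and the function names `query` and `build_song_model` are hypothetical and are introduced only to illustrate the query, the empty-set check, and the weighted selection of one value per type of musical element.

```python
from collections import defaultdict

# Hypothetical correlation records: (element type, element value, metadata, weight).
CORRELATIONS = [
    ("instrument", "electric guitar", "genre=rock", 0.9),
    ("instrument", "electric guitar", "genre=country", 0.4),
    ("instrument", "banjo", "genre=country", 0.8),
    ("chord progression", "C Am F G", "genre=rock", 0.7),
    ("chord progression", "C F G", "genre=country", 0.6),
]


def query(metadata):
    """Return every record tagged with the requested metadata (208)."""
    return [record for record in CORRELATIONS if record[2] == metadata]


def build_song_model(metadata):
    """Select the heaviest value per element type (212), or None if nothing matched (210)."""
    matches = query(metadata)
    if not matches:
        return None
    by_type = defaultdict(list)
    for element_type, value, _, weight in matches:
        by_type[element_type].append((weight, value))
    # max() compares the weight first, so the heaviest candidate wins.
    return {etype: max(candidates)[1] for etype, candidates in by_type.items()}


rock_model = build_song_model("genre=rock")
country_model = build_song_model("genre=country")
```

In this sketch the electric guitar carries a larger weight for rock than for country, so it is selected for the rock model while the banjo is selected for the country model.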
The song model with the selected musical element values may be displayed to the user, or used to generate audio data at 214 representing the backing track, song map, or the like. Alternatively or in addition, the song model is sent to virtual instruments via standard musical instrument digital interface (MIDI) streams.
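By way of example, a selected chord may be expressed as MIDI channel messages before being streamed to a virtual instrument. The sketch below relies only on the standard Note On (status 0x90) and Note Off (status 0x80) messages. The helper names, the default velocity, and the convention that middle C (C4) is note number 60 are assumptions made for illustration.

```python
# Semitone offsets within an octave.
PITCH_CLASS = {"C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
               "F#": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def note_number(pitch, octave=4):
    """Map a pitch name to a MIDI note number, assuming C4 (middle C) is 60."""
    return 12 * (octave + 1) + PITCH_CLASS[pitch]


def note_on(channel, number, velocity=64):
    """Build a three-byte Note On message for a channel in the range 0-15."""
    return bytes([0x90 | channel, number, velocity])


def note_off(channel, number):
    """Build the matching three-byte Note Off message."""
    return bytes([0x80 | channel, number, 0])


def chord_on(channel, pitches, velocity=64):
    """Concatenate Note On messages so that the pitches sound together."""
    return b"".join(note_on(channel, note_number(p), velocity) for p in pitches)


# The chord C = C + E + G on the first channel.
c_major = chord_on(0, ["C", "E", "G"], velocity=100)
```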
In one embodiment, one or more computer-readable media have computer-executable instructions for performing the method illustrated in
Referring next to
Computer readable media, which include both volatile and nonvolatile media, removable and non-removable media, may be any available medium that may be accessed by computer 302. By way of example and not limitation, computer readable media comprise computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media. Those skilled in the art are familiar with the modulated data signal, which has one or more of its characteristics set or changed in such a manner as to encode information in the signal. Wired media, such as a wired network or direct-wired connection, and wireless media, such as acoustic, RF, infrared, and other wireless media, are examples of communication media. Combinations of any of the above are also included within the scope of computer readable media.
The computer 302 may operate in a networked environment using logical connections to one or more remote computers. Generally, the data processors of computer 302 are programmed by means of instructions stored at different times in the various computer-readable storage media of the computer 302. Although described in connection with an exemplary computing system environment, including computer 302, embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. The computing system environment is not intended to suggest any limitation as to the scope of use or functionality of any aspect of the invention. Moreover, the computing system environment should not be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary operating environment.
In operation, computer 302 executes computer-executable instructions such as those illustrated in the figures to implement aspects of the invention.
The memory area 310 stores correlations between the plurality of musical elements and the metadata (e.g., musical elements, musical element values, description categories, and description values 313). In addition, the memory area 310 stores computer-executable components including a correlation module 312, an interface module 314, a database module 316, and a backing track module 318. The correlation module 312 defines the plurality of musical elements and associated musical element values and the description categories and associated description values. The interface module 314 receives, from the user 304, the selection of at least one of the description categories and at least one of the description values associated with the selected description category. The database module 316 queries the plurality of musical elements and associated musical element values defined by the correlation module 312 based on the description category and the description value selected by the user 304 via the interface module 314 to produce a set of musical elements and associated musical element values. The backing track module 318 selects, from the set of musical elements and associated musical element values from the database module 316, at least one of the musical element values corresponding to each type of musical element to create the song model.
In one embodiment, the musical elements, musical element values, description categories, and description values 313 are stored in a database as a two-dimensional table. Each row represents a particular instance of a lower-level element (e.g., a particular loop, chord progression, song structure, or instrument arrangement). Each column represents a particular instance of a higher-level element (e.g., a particular genre, mood, or style). Each cell has a weight of 0.0 to 1.0 that indicates the strength of the correspondence between the low-level item and the higher-level item. For example, a particular loop may be tagged with 0.7 for rock, 0.5 for pop and 0.2 for classical. Similarly, the particular loop may be tagged with 0.35 for “happy”, 0.9 for “manic”, and 0.2 for “sad”. The weights may be generated algorithmically, by humans, or both. A blank cell indicates that no weight exists for the particular mapping between the higher-level element and the low-level element.
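By way of example and not limitation, the two-dimensional table may be sketched in Python with one list of weights per row. The weights for the first loop are those given above; the loop names, the second row, and the use of `None` for a blank cell are assumptions made for illustration.

```python
# Columns are higher-level elements; rows are lower-level elements.
COLUMNS = ["rock", "pop", "classical", "jazz", "happy", "manic", "sad"]

TABLE = {
    # Weights from the example above; the "jazz" cell is left blank (None).
    "loop 17": [0.7, 0.5, 0.2, None, 0.35, 0.9, 0.2],
    # A second, purely hypothetical row.
    "loop 42": [0.3, 0.8, None, 0.6, 0.5, None, 0.1],
}


def weight(element, concept):
    """Return the weight in a cell, or None when the cell is blank."""
    return TABLE[element][COLUMNS.index(concept)]


def elements_for(concept):
    """List the elements having a weight for the concept, strongest first."""
    column = COLUMNS.index(concept)
    scored = [(row[column], name) for name, row in TABLE.items()
              if row[column] is not None]
    return [name for _, name in sorted(scored, reverse=True)]
```

For instance, `elements_for("pop")` places "loop 42" ahead of "loop 17", while `elements_for("classical")` omits "loop 42" because that cell is blank.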
In another embodiment, the database includes a one-dimensional table to enable the database to be extended with custom higher-level elements. Each row in the table corresponds to a particular instance of a lower-level element. Each row is further marked with additional tags to map each lower-level element to higher-level items. For example, the tags include an identification of the higher-level items along with a weight. In this manner, new higher-level elements may be created easily and arbitrarily without adding columns to the database. For example, a user creates a genre with a unique name and tags some of the lower-level elements with the unique name and weight corresponding to that genre.
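By way of example and not limitation, the tag-based table may be sketched in Python as follows. The row names, the genre name "surf-noir", and the helper names `tag` and `tagged_with` are hypothetical. The check that a weight lies between 0.0 and 1.0 is an assumption carried over from the two-dimensional embodiment.

```python
# Each row is one lower-level element carrying its own (concept -> weight) tags.
rows = {
    "loop 17": {"genre=rock": 0.7, "mood=manic": 0.9},
    "chord progression C Am F G": {"genre=pop": 0.8},
}


def tag(element, concept, weight):
    """Attach or update a tag; a new concept needs no new column."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    rows.setdefault(element, {})[concept] = weight


def tagged_with(concept):
    """Return the weight of every element carrying the given tag."""
    return {name: tags[concept] for name, tags in rows.items() if concept in tags}


# A user creates a genre with a unique name and tags some elements with it.
tag("loop 17", "genre=surf-noir", 0.6)
tag("chord progression C Am F G", "genre=surf-noir", 0.4)
```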
In another embodiment, the mappings between the lower-level and higher-level elements are generated collectively by a community of users. The mappings may be accomplished with or without weights. If no weights are supplied, the weights may be algorithmically determined based on the number of users who have tagged a particular lower-level element with a particular higher-level element.
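The description does not specify how the number of users is converted into a weight. By way of example only, the Python sketch below assumes one possible rule: the weight of a tag is the fraction of the distinct users who tagged an element at all that applied that particular tag. The function name, the user names, and the tag events are hypothetical.

```python
from collections import defaultdict


def community_weights(tag_events):
    """Derive weights from (user, element, concept) triples supplied without weights."""
    users_per_tag = defaultdict(set)
    users_per_element = defaultdict(set)
    for user, element, concept in tag_events:
        users_per_tag[(element, concept)].add(user)
        users_per_element[element].add(user)
    # Assumed rule: share of the element's taggers who applied this tag.
    return {
        (element, concept): len(users) / len(users_per_element[element])
        for (element, concept), users in users_per_tag.items()
    }


events = [
    ("ann", "loop 17", "genre=rock"),
    ("bob", "loop 17", "genre=rock"),
    ("bob", "loop 17", "genre=rock"),  # a repeated tag by the same user counts once
    ("cat", "loop 17", "genre=pop"),
    ("dan", "loop 17", "genre=rock"),
]
weights = community_weights(events)
```

With these events, three of the four users tagged the loop as rock and one tagged it as pop, giving weights of 0.75 and 0.25 respectively.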
Referring next to
If the user likes the rendered music at 408, the song model is ready for further musical additions at 410. If the user is not satisfied with the rendered audio at 408, some or all of the remaining unselected musical elements from each of the lists are presented to the user for browsing and audition at 412. These unselected musical elements represent the statistically possible options that the user may audition and select if the user dislikes the sound generated based on the automatically selected musical elements. A user interface associated with aspects of the invention enables the user to change any of the musical elements at 414, while audio data reflective of the changed musical elements is rendered to the user at 416. In this manner, the user auditions alternate options from these lists and rapidly selects options that sound better to quickly and easily arrive at a pleasant-sounding song model.
In one example, querying the database at 404 includes retrieving all entries in a database that have been tagged with a valid weight to, for example, “genre=Jazz”. The highest-scoring loops, chord progressions, song structures and instrument arrangements for Jazz are ordered into lists. Selecting the first musical element at 406 includes automatically selecting the highest-scoring instrument arrangement and song structure. For each song section, the highest scoring chord progression is selected. For each instrument in each song section, the highest-scoring loop is selected. In one embodiment, aspects of the invention attempt to minimize repetition of any loop or chord progression within a particular song. Ties may be resolved by random selection. A set of the next-highest, unselected musical elements is provided to the user for auditioning and selection. If the user selects a particular instrument or song section to change, aspects of the invention apply the algorithm only within the selected scope.
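By way of example and not limitation, the ordering, tie resolution, and repetition avoidance described above may be sketched in Python as follows. The function names, the chord progressions, and their weights are hypothetical, and the sketch assumes at least one candidate is available.

```python
import random


def rank(weights):
    """Order candidates by descending weight; ties fall in random order."""
    names = list(weights)
    random.shuffle(names)  # the stable sort below preserves this order among ties
    return sorted(names, key=lambda name: weights[name], reverse=True)


def pick_per_slot(weights, slots):
    """Give each slot the best unused candidate, reusing one only when none remain."""
    ordered = rank(weights)
    chosen, used = {}, set()
    for slot in slots:
        fresh = [name for name in ordered if name not in used]
        choice = (fresh or ordered)[0]
        chosen[slot] = choice
        used.add(choice)
    # Unselected candidates stay available for auditioning.
    alternates = [name for name in ordered if name not in used]
    return chosen, alternates


# Hypothetical weights for entries tagged "genre=Jazz".
jazz_progressions = {
    "Dm7 G7 Cmaj7": 0.9,
    "Cmaj7 A7 Dm7 G7": 0.8,
    "C F G": 0.3,
    "Am F C G": 0.3,
}
chosen, alternates = pick_per_slot(jazz_progressions, ["A", "B", "C"])
```

Here sections A and B always receive the two highest-scoring progressions, section C receives one of the two tied progressions at random, and the other tied progression is returned as an alternate.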
Referring next to
Based on the defined correlations between the musical elements and the metadata (e.g., see
In one embodiment, the identified metadata is provided to the user as rendered audio. For example, the identified metadata is used to query the plurality of musical elements to produce a set of musical elements from which at least one of the musical elements corresponding to each type of musical element is selected. For example, aspects of the invention may select one of the song structures, one of the instrument arrangements, and one of the loops from the produced set of musical elements.
The selected musical elements represent the song model or outline. Audio data is generated based on the selected musical elements and rendered to the user. In such an embodiment, the determined high-level musical attributes such as style, tempo, intensity, complexity, and chord progressions are used to modify the computer-generated musical output of virtual instruments.
For example, in a real-time, live musical performance environment, the supporting musical tracks in the live performance may be dynamically adjusted in real-time as the performance occurs. The dynamic adjustment may occur continuously or at user-configurable intervals (e.g., every few seconds, every minute, after every played note, after every beat, after every end-note, after a predetermined quantity of notes have been played, etc.). Further, holding a note longer during the performance affects the backing track being played. In one example, a current note being played in the performance and the backing track currently being rendered serve as input to an embodiment of the invention to adjust the backing track. As such, the user may specify transitions (e.g., how the backing track responds to the live musical performance). For example, the user may specify smooth transitions (e.g., select musical elements similar to those currently being rendered) or jarring transitions (select musical elements less similar to those currently being rendered).
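The description does not define how similarity between musical elements is measured. By way of example only, the Python sketch below assumes that each loop is described by tag weights and that similarity is one minus the mean absolute difference of those weights. The loop names, the weights, and the function names are hypothetical.

```python
# Hypothetical loops, each described by weights for higher-level concepts.
LOOPS = {
    "verse riff": {"rock": 0.8, "manic": 0.5},
    "verse riff variation": {"rock": 0.75, "manic": 0.5},
    "string pad": {"classical": 0.9, "sad": 0.75},
}


def similarity(a, b):
    """Return 1.0 for identical weights, less as they diverge; a missing tag counts as 0.0."""
    concepts = set(a) | set(b)
    if not concepts:
        return 1.0
    total = sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in concepts)
    return 1.0 - total / len(concepts)


def next_loop(current, candidates, transition="smooth"):
    """Pick the most similar loop for a smooth transition, the least similar for a jarring one."""
    scored = {name: similarity(LOOPS[current], LOOPS[name])
              for name in candidates if name != current}
    pick = max if transition == "smooth" else min
    return pick(scored, key=scored.get)
```

With these values, a smooth transition from "verse riff" selects "verse riff variation", and a jarring transition selects "string pad".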
The notes played by the user give a strong indication of the active chords, and the sequence of chords provides the chord progression. Embodiments of the invention dynamically adjust the chord progressions on the backing tracks responsive to the input notes. Additionally, the sequences of melody-based or riff-based notes indicate a performance loop. From this information, embodiments of the invention determine pre-defined performance loops that sound musically similar (e.g., in pitch, rhythm, intervals, and position on the circle of fifths) to the loop being played. Together, the chord progressions and the similarity to pre-defined performance loops allow embodiments of the invention to estimate the high-level parameters (e.g., genre, complexity, etc.) associated with the music the user is playing. The parameters are determined via the mapping between the high-level musical concepts and the low-level musical elements described herein. The estimated parameters are used to adapt the virtual instruments accordingly by changing not only the chord progressions but also the entire style of playing to suit the user's live performance. As a result, the user has the ability to dynamically influence the performance of virtual instruments via the user's own performance without having to adjust any parameters directly on the computer (e.g., via the user interface).
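By way of example and not limitation, the path from played notes to an estimated genre may be sketched in Python as follows. The chord table contains standard triads; the progression weights and the function names are hypothetical, and exact set matching is a simplifying assumption in place of the pattern identification described herein.

```python
# Standard triads, keyed by the set of pitches sounding together.
CHORDS = {
    frozenset(["C", "E", "G"]): "C",
    frozenset(["D", "F", "A"]): "Dm",
    frozenset(["A", "C", "E"]): "Am",
    frozenset(["F", "A", "C"]): "F",
    frozenset(["G", "B", "D"]): "G",
}

# Hypothetical weights linking chord progressions to genres.
PROGRESSION_WEIGHTS = {
    "C Am F G": {"pop": 0.9, "rock": 0.6},
    "C F G": {"rock": 0.8, "country": 0.7},
}


def identify_chord(notes):
    """Name the chord formed by simultaneously played notes, if it is known."""
    return CHORDS.get(frozenset(notes))


def progression(note_groups):
    """Turn successive groups of notes into a chord progression string."""
    names = [identify_chord(group) for group in note_groups]
    return " ".join(name for name in names if name)


def estimate_genre(note_groups):
    """Return the genre with the largest weight for the detected progression."""
    tags = PROGRESSION_WEIGHTS.get(progression(note_groups), {})
    return max(tags, key=tags.get) if tags else None


played = [["C", "E", "G"], ["A", "C", "E"], ["F", "A", "C"], ["G", "B", "D"]]
```

Here `progression(played)` yields "C Am F G", for which the hypothetical weights favor pop. An unrecognized progression yields no estimate.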
While aspects of the invention have been described in relation to musical concepts, the embodiments of the invention may generally be applied to any concepts that rely on a library of content at the lower level that has been tagged with higher-level attributes describing the content. For example, the techniques may be applied to lyrics generation for songs. Songs in specific genres tend to use particular words and phrases more frequently than others. A system applying techniques described herein may learn the lyrical vocabulary of a song genre and then suggest words and phrases to assist with lyric writing in a particular genre. Alternatively or in addition, a genre may be suggested given a set of lyrics as input data.
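By way of example and not limitation, the lyrics application may be sketched in Python using simple word counts. The sample lyrics, the function names, and the counting approach are assumptions made for illustration; a practical system would likely use a larger corpus and a more robust model.

```python
from collections import Counter


def learn_vocabulary(tagged_lyrics):
    """Count word usage per genre from (genre, text) pairs."""
    vocab = {}
    for genre, text in tagged_lyrics:
        vocab.setdefault(genre, Counter()).update(text.lower().split())
    return vocab


def suggest_words(vocab, genre, n=3):
    """Suggest the n most frequent words seen for a genre."""
    return [word for word, _ in vocab[genre].most_common(n)]


def suggest_genre(vocab, lyrics):
    """Suggest the genre whose learned vocabulary best covers the given lyrics."""
    words = lyrics.lower().split()
    scores = {genre: sum(counts[w] for w in words) for genre, counts in vocab.items()}
    return max(scores, key=scores.get)


# Hypothetical tagged lyrics.
corpus = [
    ("blues", "woke up this morning with the blues"),
    ("blues", "my baby left me this morning"),
    ("country", "my truck broke down on a dusty road"),
    ("country", "dusty road take me home"),
]
vocab = learn_vocabulary(corpus)
```

With this corpus, the words "this" and "morning" are the most frequent for blues, and the phrase "dusty road" maps back to country.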
The figures, description, and examples herein as well as elements not specifically described herein but within the scope of aspects of the invention constitute means for defining the correlations between the plurality of musical elements each having a musical element value associated therewith and the one or more description categories each having a description value associated therewith, and means for identifying the musical elements and associated musical element values based on the selected description category and associated description value.
The order of execution or performance of the operations in embodiments of the invention illustrated and described herein is not essential, unless otherwise specified. That is, the operations may be performed in any order, unless otherwise specified, and embodiments of the invention may include additional or fewer operations than those disclosed herein. For example, it is contemplated that executing or performing a particular operation before, contemporaneously with, or after another operation is within the scope of aspects of the invention.
Embodiments of the invention may be implemented with computer-executable instructions. The computer-executable instructions may be organized into one or more computer-executable components or modules. Generally, program modules include, but are not limited to, routines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types. Aspects of the invention may be implemented with any number and organization of such components or modules. For example, aspects of the invention are not limited to the specific computer-executable instructions or the specific components or modules illustrated in the figures and described herein. Other embodiments of the invention may include different computer-executable instructions or components having more or less functionality than illustrated and described herein.
When introducing elements of aspects of the invention or the embodiments thereof, the articles “a,” “an,” “the,” and “said” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.
Having described aspects of the invention in detail, it will be apparent that modifications and variations are possible without departing from the scope of aspects of the invention as defined in the appended claims. As various changes could be made in the above constructions, products, and methods without departing from the scope of aspects of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
|Cooperative Classification||G10H2240/091, G10H2210/111, G10H2210/381, G10H2210/576, G10H1/0025, G10H2240/085, G10H2230/021, G10H2250/641, G10H2210/151, G10H2210/105, G10H2240/081, G10H2240/135, G10H2210/155|
|Jul 27, 2010||AS||Assignment|
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHERWANI, ADIL AHMED;GIBSON, CHAD;BASU, SUMIT;REEL/FRAME:024747/0593
Effective date: 20060501
|Feb 14, 2012||CC||Certificate of correction|
|May 28, 2014||FPAY||Fee payment|
Year of fee payment: 4
|Dec 9, 2014||AS||Assignment|
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001
Effective date: 20141014