PROTECTING CONTENT FROM ILLICIT REPRODUCTION BY PROOF OF EXISTENCE OF A COMPLETE DATA SET VIA SELF-REFERENCING SECTIONS
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application No. 60/180,838 filed Feb. 7, 2000, and is a continuation of prior application Ser. No. 09/536,944 filed Mar. 28, 2000, now U.S. Pat. No. 7,228,425.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates primarily to the field of consumer electronics, and in particular to the protection of copy-protected content material.
2. Description of Related Art
The illicit distribution of copyright material deprives the holder of the copyright of legitimate royalties for this material, and could provide the supplier of this illicitly distributed material with gains that encourage continued illicit distributions. In light of the ease of information transfer provided by the Internet, content material that is intended to be copy-protected, such as artistic renderings or other material having limited distribution rights, is susceptible to wide-scale illicit distribution. The MP3 format for storing and transmitting compressed audio files has made the wide-scale distribution of audio recordings feasible, because a 30 or 40 megabyte digital audio recording of a song can be compressed into a 3 or 4 megabyte MP3 file. Using a typical 56 kbps dial-up connection to the Internet, this MP3 file can be downloaded to a user's computer in a few minutes. Thus, a malicious party could read songs from an original and legitimate CD, encode the songs into MP3 format, and place the MP3 encoded song on the Internet for wide-scale illegitimate distribution. Alternatively, the malicious party could provide a direct dial-in service for downloading the MP3 encoded song. The illicit copy of the MP3 encoded song can be subsequently rendered by software or hardware devices, or can be decompressed and stored onto a recordable CD for playback on a conventional CD player.
A number of schemes have been proposed for limiting the reproduction of copy-protected content material. The Secure Digital Music Initiative (SDMI) and others advocate the use of "digital watermarks" to identify authorized content material. EP 0981901 "Embedding auxiliary data in a signal" issued 1 Mar. 2000 to Antonius A. C. M. Kalker, discloses a technique for watermarking electronic material, and is incorporated by reference herein. As in its paper watermark counterpart, a digital watermark is embedded in the content material so as to be detectable, but unobtrusive. An audio playback of a digital music recording containing a watermark, for example, will be substantially indistinguishable from a playback of the same recording without the watermark. A watermark detection device, however, is able to distinguish these two recordings based on the presence or absence of the watermark. Because some content material may not be copy-protected and hence may not contain a watermark, the absence of a watermark cannot be used to distinguish legitimate from illegitimate material. On the contrary, the absence of a watermark is indicative of content material that can be legitimately copied freely.
Other copy protection schemes are also available. For example, European patent EP0906700, "Method and system for transferring content information and supplemental information related thereto", issued 7 Apr. 1999 to Johan P. M. G., presents a technique for the protection of copyright material via the use of a watermark "ticket" that controls the number of times the protected material may be rendered, and is incorporated by reference herein.
An accurate reproduction of watermarked material will cause the watermark to be reproduced in the copy of the watermarked material. An inaccurate, or lossy reproduction of watermarked material, however, may not provide a reproduction of the watermark in the lossy copy of the material. A number of protection schemes, including those of the SDMI, have taken advantage of this characteristic of lossy reproduction to distinguish legitimate material from illegitimate material, based on the presence or absence of an appropriate watermark. In the SDMI scenario, two types of watermarks are defined: "robust" watermarks, and "fragile" watermarks. A robust watermark is one that is expected to survive a lossy reproduction that is designed to retain a substantial portion of the original content material, such as an MP3 encoding of an audio recording. That is, if the reproduction retains sufficient information to allow a reasonable rendering of the original recording, the robust watermark will also be retained. A fragile watermark, on the other hand, is one that is expected to be corrupted by a lossy reproduction or other illicit tampering.
In the SDMI scheme, the presence of a robust watermark indicates that the content material is copy protected, and the absence or corruption of a corresponding fragile watermark when a robust watermark is present indicates that the copy protected material has been tampered with in some manner. An SDMI compliant device is configured to refuse to render watermarked material with a corrupted watermark, or with a detected robust watermark but an absent fragile watermark, except if the corruption or absence of the watermark is justified by an "SDMI-certified" process, such as an SDMI compression of copy protected material for use on a portable player. For ease of reference and understanding, the term "render" is used herein to include any processing or transferring of the content material, such as playing, recording, converting, validating, storing, loading, and the like. This scheme serves to limit the distribution of content material via MP3 or other compression techniques, but does not affect the distribution of counterfeit unaltered (uncompressed) reproductions of content material. This limited protection is deemed commercially viable, because the cost and inconvenience of downloading an extremely large file to obtain a song will tend to discourage the theft of uncompressed content material.
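By way of illustration only, the rendering rule described above can be sketched as a small decision function. The names Mark and may_render are hypothetical and are not taken from any SDMI specification; the sketch assumes that a detector has already classified each watermark as present, corrupted, or absent.

```python
from enum import Enum


class Mark(Enum):
    """Hypothetical detector result for one watermark."""
    PRESENT = "present"
    CORRUPTED = "corrupted"
    ABSENT = "absent"


def may_render(robust: Mark, fragile: Mark, sdmi_certified: bool = False) -> bool:
    """Return True if a compliant device would agree to render the material.

    Material is treated as tampered when any watermark is corrupted, or when
    a robust watermark is detected without its fragile counterpart. An
    SDMI-certified process is assumed to justify the corruption or absence.
    """
    tampered = (
        Mark.CORRUPTED in (robust, fragile)
        or (robust is Mark.PRESENT and fragile is Mark.ABSENT)
    )
    return (not tampered) or sdmi_certified


if __name__ == "__main__":
    # Unwatermarked material is not copy protected and may be rendered.
    print(may_render(Mark.ABSENT, Mark.ABSENT))
    # A lossy copy typically keeps the robust mark but loses the fragile one.
    print(may_render(Mark.PRESENT, Mark.ABSENT))
```

In this sketch an MP3 copy of protected material is refused because the fragile watermark does not survive compression, while an exact copy with both watermarks intact is accepted, which is the gap that the present invention addresses.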
BRIEF SUMMARY OF THE INVENTION
It is an object of this invention to extend the protection of copy-protected material to include the protection of uncompressed content material.
This object and others are achieved by selecting a sufficient number of data items for inclusion in a data set so as to discourage a transmission of the entire set over a limited bandwidth communications path, such as the Internet. Each data item comprises one or more sections, and the totality of sections constitutes the complete data set. Each section of the data set contains a watermark that includes an identifier of the section, and an identifier of the data set. In a preferred embodiment, the identifier of the section is the address of the section, and the identifier of the data set is a serial number and an indicator of the total size of the data set. The presence of the data set is confirmed by checking the watermarks of randomly selected sections to verify that the original section that formed the data set is present. If a section is discovered to be missing or altered, subsequent processing of data items of the data set is prevented. In a preferred embodiment, the identifiers are stored as a combination of robust and fragile watermarks.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein:
FIG. 1 illustrates an example system for protecting copy-protected content material in accordance with this invention.
FIG. 2 illustrates an example data structure that facilitates a determination of the presence of an entirety of a data set in accordance with this invention.
FIG. 3 illustrates an example alternative data structure that facilitates a determination of the presence of an entirety of a data set in accordance with this invention.
FIG. 4 illustrates an example flow diagram for creating a data set with security items that facilitate a determination of the presence of an entirety of the data set in accordance with this invention.
FIG. 5 illustrates an example flow diagram of a decoding system for rendering content material in dependence upon the presence of an entirety of a data set in accordance with this invention.
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions.
DETAILED DESCRIPTION OF THE INVENTION
For ease of understanding, the invention is presented herein in the context of digitally recorded songs. As will be evident to one of ordinary skill in the art, the invention is applicable to any recorded information that is expected to be transmitted via a limited bandwidth communications path. For example, the individual content material items may be data records in a larger database, rather than songs of an album.
The theft of an item can be discouraged by making the theft more time consuming or inconvenient than the worth of the stolen item. For example, a bolted-down safe is often used to protect small valuables, because the effort required to steal the safe will typically exceed the gain that can be expected by stealing the safe. Copending U.S. patent application "Protecting Content from Illicit Reproduction by Proof of Existence of a Complete Data Set", U.S. Ser. No. 09/537,815, filed Mar. 28, 2000 for Michael A. Epstein, teaches selecting and binding data items to a data set that is sized sufficiently large so as to discourage a transmission of the data set via a bandwidth limited communications system, such as the Internet, and is incorporated by reference herein. This copending application teaches a binding of the data items in the data set by creating a watermark that contains a data-set-entirety parameter and embedding this watermark into each section of each data item. The copending application also teaches including a section-specific parameter (a random number assigned to each section) in the watermark. The referenced copending application teaches the use of "out of band data" to contain the entirety parameter, or information that can be used to determine the entirety parameter. The section watermarks are compared to this entirety parameter to assure that they are the same sections that were used to create the data set and this entirety parameter. To minimize the likelihood of forgery, the entirety parameter is based on a hash of a composite of section-specific identifiers. The referenced copending application also teaches the use of digitally signed certificates and other techniques that rely on cryptographic techniques, such as hashing and the like.
Copending U.S. patent application "Protecting Content from Illicit Reproduction by Proof of Existence of a Complete Data Set via a Linked List", U.S. Ser. No. 09/537,079, filed Mar. 28, 2000 for Antonius A. M. Staring and Michael A. Epstein, teaches a self-referential data set that facilitates the determination of whether the entirety of the data set is present, without the use of out of band data and without the use of cryptographic functions, such as a hash function. This copending application creates a linked list of sections of a data set, encodes the link address as a watermark of each section, and verifies the presence of the entirety of the data set by verifying the presence of the linked-to sections of some or all of the sections of the data set.
In this invention, each section of a data set is uniquely identified and this section identifier is associated with each section in a secure manner. To assure that a collection of sections are all from the same data set, an identifier of the data set is also securely encoded with each section. Preferably, the section identifier and the data set identifier are encoded as a watermark that is embedded in each section, preferably as a combination of robust and fragile watermarks. Using exhaustive or random sampling, the presence of the entirety of the data set is determined, either absolutely or with statistical certainty. If the entirety of the data set is not present, subsequent processing of the data items of the data set is terminated. In the context of digital audio recordings, a compliant playback or recording device is configured to refuse to render an individual song in the absence of the entire contents of the CD. The time required to download an entire album on a CD in uncompressed digital form, even at DSL and cable modem speeds, can be expected to be greater than an hour, depending upon network loading and other factors. Thus, by requiring that the entire contents of the CD be present, at a download "cost" of over an hour, the likelihood of a theft of a song via a wide-scale distribution on the Internet is substantially reduced.
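The download "cost" referred to above can be estimated with simple arithmetic, sketched below for illustration. The 650 megabyte album size and the 1 Mbps sustained link rate are assumptions chosen for this example and are not figures stated in this description; the function name download_seconds is likewise hypothetical.

```python
MB = 1_000_000  # bytes per megabyte (decimal), assumed for this estimate


def download_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and network loading."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # A single 3 megabyte MP3 file over a 56 kbps dial-up connection.
    mp3 = download_seconds(3 * MB, 56_000)
    # An assumed 650 megabyte uncompressed album over an assumed 1 Mbps link.
    album = download_seconds(650 * MB, 1_000_000)
    print(f"MP3 song: {mp3 / 60:.1f} minutes")
    print(f"Full album: {album / 3600:.2f} hours")
```

Under these assumptions a compressed song transfers in minutes, while the complete uncompressed data set takes well over an hour even before network loading is considered.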
FIG. 1 illustrates an example block diagram of a protection system 100 in accordance with this invention. The protection system 100 comprises an encoder 110 that encodes content material onto a medium 130, and a decoder 120 that renders the content material from the medium 130. The encoder 110 includes a selector 112 that selects content material from a source, a binder 116 that builds an entirety verification structure, and a recorder 114 that records the content material with the entirety verification structure onto the medium 130. The selector 112, for example, may be configured to select content information corresponding to songs that are being compiled into an album. Each selected content material item is termed a data item; each data item includes one or more sections of data comprising the data item. The binder 116 is configured to bind each section to the data set, to facilitate a determination of whether the entirety of the data set is present when a data item of the data set is presented for rendering, for example, when a selected song is presented to a rendering device for playback. The recorder 114 appropriately formats, encodes, and stores the information on the medium 130, using techniques common in the art.
In accordance with this invention, the selector 112 selects data items to be added to the data set until the size of the data set is deemed large enough to discourage a subsequent transmission of the data set via a limited bandwidth communications channel. This "discouraging size" is a subjective value, and will depend upon the assumed available communications bandwidth, the loss incurred by the transmission, and so on.
Other criteria may also be used to determine whether to add additional data items to the data set. For example, if the data items correspond to songs of an existing album collection, all of the songs will typically be added to the data set, regardless of whether the size of the data set has exceeded the determined discouraging size. If all of the songs of the album collection have been selected, and the discouraging size criterion has not yet been reached, other data items are selected to accumulate the required discouraging size. For example, data items comprising random data bits may be added to the data set to increase its size. These random bits will typically be stored as out of band data, CD-ROM data, and the like, to prevent them from being rendered as audible sounds by a conventional CD player. Alternatively, the data items may comprise other sample songs that are provided to encourage the sale of other albums, or images and video sections related to the recorded content material. Similarly, promotional material, such as Internet access subscription programs, may also be included in the recorded information on the recorded medium. These and other means of adding size to a data set will be evident to one of ordinary skill in the art in view of this invention.
In accordance with this invention, the encoder 110 includes a binder 116 that creates a unique identifier for each section, and an identifier for the entirety of the data set. In a preferred embodiment, the identifier of each section is the address that is used for accessing the particular section. The data set identifier can be any somewhat-unique identifier that reduces the likelihood of different data sets having the same identifier, thereby reducing the likelihood of an illicit substitution of sections from different data sets. In a preferred embodiment, for example, the data set identifier includes a 64 bit random number, and a parameter that can be used to determine the total size of the data set. The binder 116 communicates the data set identifier and the unique identifier of each section to the recorder 114 for recording onto the medium 130.
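The selection and binding steps described above can be sketched as follows, by way of example only. The function name build_data_set, the dictionary layout, and the 2048 byte section size are assumptions made for illustration; the identifiers are returned as plain values here, and the embedding of those identifiers as watermarks is outside the scope of the sketch.

```python
import secrets


def build_data_set(items, discouraging_size, section_size=2048):
    """Split data items into sections, pad to a target size, and bind them.

    items: list of bytes objects (for example, the songs of an album).
    Every item is included even if the target size is already exceeded;
    random padding sections are appended only if the target is not reached.
    """
    payloads = []
    total = 0
    for item in items:
        for offset in range(0, len(item), section_size):
            payloads.append(item[offset:offset + section_size])
        total += len(item)

    # Pad with random data until the discouraging size is reached.
    while total < discouraging_size:
        pad = secrets.token_bytes(min(section_size, discouraging_size - total))
        payloads.append(pad)
        total += len(pad)

    set_id = secrets.randbits(64)      # somewhat-unique data set identifier
    total_sections = len(payloads)     # parameter giving the data set size
    return [
        {
            "set_id": set_id,
            "total_sections": total_sections,
            "section_id": address,     # section identifier is its address
            "payload": payload,
        }
        for address, payload in enumerate(payloads)
    ]


if __name__ == "__main__":
    bound = build_data_set([b"a" * 5000, b"b" * 100], discouraging_size=10_000)
    print(len(bound), "sections bound to data set", hex(bound[0]["set_id"]))
```

Because each section carries both its own address and the data set identifier, a section taken from a different data set, or a section presented at the wrong address, can later be recognized by a compliant decoder.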
Preferably, the recorder records the data set identifier and the unique identifier of each section as one or more watermarks that are embedded in each section. In a preferred embodiment, the section identifier and data set identifier are encoded as a combination of a robust watermark and a fragile watermark. In this manner, a removal of the robust watermark will cause damage to the section, and a modification of the section will cause damage to the fragile watermark. Preferably, the data set identifier is encoded as a fragile watermark, and the section identifier is encoded as a robust watermark, because robust watermarks consume more resources, and the section identifier typically requires fewer bits than the data set identifier. In an alternative preferred embodiment, the aforementioned parameter that facilitates a determination of the size of the data set is encoded as a robust watermark and the remainder of the data set identifier and section identifier are encoded as fragile watermarks. Alternative combinations of robust and fragile watermarks may also be used, as would be evident to one of ordinary skill in the art in view of this invention. For example, the total size of the data set may form the bulk of the data set identifier. Or, the data set identifier or total size of the data set may be encoded as a robust watermark that extends across multiple sections. Other watermarks may also be used in addition to, or in combination with, these watermarks, including for example, watermark "tickets" that limit the number of times a data set may be copied. Such a watermark ticket may form the aforementioned data set identifier. Copending U.S. patent application "Copy Protection by Ticket Encryption", Ser. No. 09/333,628, filed Jun. 15, 1999 for Michael Epstein, presents techniques for the protection of copyright material, and is incorporated by reference herein.
The decoder 120 in accordance with this invention comprises a renderer 122 and a gate 124 that is controlled by an entirety checker 126. The renderer 122 is configured to retrieve information from a medium reading device, such as a CD reader 132. As is common in the art, the renderer 122 retrieves the information by specifying a location index, and in response, the reader 132 provides the data located at the specified location index on the medium 130. In a typical memory structure comprising tracks and sections, a section of data is retrieved by specifying a section address.
The dotted lines of FIG. 1 illustrate an example song extractor 142 that extracts a song from the medium 130 and communicates it to an example CD imitator 144, representative of a possible illicit download of the song via the Internet. The CD imitator 144 represents, for example, a software program that provides information in response to a conventional CD-read command. Alternatively, the information received from the song extractor can be written to a CD medium, and provided to the conventional CD reader 132. As noted above, the song extractor 142 is likely to be used because the transmission of the entirety of the contents of the medium 130 is assumed to be discouraged by the purposeful large size of the contents of the medium 130.
In accordance with this invention, the entirety checker 126 is configured to obtain data from the medium 130, typically via the renderer 122, to determine whether the entire data set is present. The renderer 122 is configured to determine the watermark associated with each section of data that is read from the medium 130. The entirety checker 126 uses the watermarks to determine whether the entirety of the data set is available to the renderer 122, as discussed below.
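One possible form of the entirety check is sketched below for illustration. The names entirety_present, read_watermark and detection_probability are hypothetical. The sketch assumes that the reader returns, for a given section address, a decoded watermark tuple of (data set identifier, total number of sections, section identifier), or None when no section or watermark is available, and it considers only watermarked data sets.

```python
import random


def entirety_present(read_watermark, requested_address, samples, rng=None):
    """Check randomly selected sections against the requested section's marks."""
    rng = rng or random.Random()
    first = read_watermark(requested_address)
    if first is None:
        return False
    set_id, total, section_id = first
    if section_id != requested_address or not 0 <= requested_address < total:
        return False
    # Sample distinct addresses; samples >= total gives an exhaustive check.
    for address in rng.sample(range(total), min(samples, total)):
        if read_watermark(address) != (set_id, total, address):
            return False  # missing, altered, or from another data set
    return True


def detection_probability(missing_fraction, samples):
    """Chance that at least one of `samples` independent draws hits a missing
    section; a lower bound when distinct addresses are sampled."""
    return 1 - (1 - missing_fraction) ** samples


if __name__ == "__main__":
    full = {i: (42, 10, i) for i in range(10)}
    extracted = {3: (42, 10, 3)}  # a single song served by a CD imitator
    print(entirety_present(full.get, 3, samples=10))
    print(entirety_present(extracted.get, 3, samples=10))
```

In this sketch the gate would be opened only when entirety_present returns True. Checking every section gives an absolute determination, while a smaller sample trades certainty for speed, as indicated by detection_probability.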
FIG. 2 illustrates an example data structure 200 for storing data items in a data set that facilitates a determination of whether the entirety of the original data set is present. A track T 210 and section S 220 structure is illustrated, consistent with the memory structure of conventional CD and other storage media. As illustrated, each track T 210 may have a different number of sections S 220. In the example data structure 200, each section contains ancillary information 230 that is used by a compliant rendering device to verify that the entirety of the data set is present. As discussed above, in accordance with this invention, the ancillary information 230 of each section S 220 contains a unique identifier of the section and a unique identifier of the data set. The unique identifier of the data set is illustrated as the CDID 232 parameter that is encoded with each section, as discussed above. The unique identifier of each section is illustrated as the track 234 and section 236 identifier of each section.
FIG. 3 illustrates an alternative data structure 300, wherein the unique identifier 334 of each section 220 is a sequential numbering of each section 220, from 0 to N-1, where N 338 is the total number of sections in the data set. In this example data structure, the value of N 338 is included in the ancillary information 230, to facilitate an access to the sections ranging from 0 to N-1. Preferably, the ancillary information 230 containing these identifiers is encoded as a combination of robust and fragile watermarks that are embedded with each section 220.
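For illustration, the ancillary information of the two example data structures can be modeled as simple records. The class and field names below are hypothetical, and the helper number_sections, which lists both forms of identifier side by side for a given track layout, is provided only to show how the two numbering schemes relate; it is not a step described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackSectionInfo:
    """Ancillary information in the style of data structure 200."""
    cdid: int     # identifier of the data set
    track: int    # track identifier
    section: int  # section identifier within the track


@dataclass(frozen=True)
class SequentialInfo:
    """Ancillary information in the style of data structure 300."""
    cdid: int     # identifier of the data set
    number: int   # sequential section number, 0 to N-1
    total: int    # N, the total number of sections in the data set


def number_sections(cdid, sections_per_track):
    """Return (TrackSectionInfo, SequentialInfo) pairs for every section."""
    total = sum(sections_per_track)
    pairs = []
    number = 0
    for track, count in enumerate(sections_per_track):
        for section in range(count):
            pairs.append((
                TrackSectionInfo(cdid, track, section),
                SequentialInfo(cdid, number, total),
            ))
            number += 1
    return pairs


if __name__ == "__main__":
    # Two tracks with different numbers of sections.
    for by_track, by_number in number_sections(7, [2, 3]):
        print(by_track, by_number)
```

The sequential form carries N with every section, so a decoder that has read any one section knows the full range of section numbers from which to sample.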
FIG. 4 illustrates an example flow diagram for creating the example data structure 300 of FIG. 3. The loop 410-435 accumulates data items to form a data set that is sufficiently large so as to discourage a transmission of the data set via a limited bandwidth communications channel, such as a download from the Internet. As each data item is selected, at 410, each section comprising the data item is assigned a section number that is used to identify the section, at 420, and its size is added to the accumulated size of the data set, at 430. After accumulating a sufficiently sized data set, at 435, a some