|Publication number||US20030069898 A1|
|Application number||US 10/208,606|
|Publication date||Apr 10, 2003|
|Filing date||Jul 30, 2002|
|Priority date||Jul 31, 2001|
|Publication number||10208606, 208606, US 2003/0069898 A1, US 2003/069898 A1, US 20030069898 A1, US 20030069898A1, US 2003069898 A1, US 2003069898A1, US-A1-20030069898, US-A1-2003069898, US2003/0069898A1, US2003/069898A1, US20030069898 A1, US20030069898A1, US2003069898 A1, US2003069898A1|
|Inventors||Athena Christodoulou, Richard Taylor, Christopher Tofts|
|Original Assignee||Athena Christodoulou, Richard Taylor, Christopher Tofts|
 It is well known to provide databases for the storage of individual data items. The data items may, for example, be individual photographs or other images, text documents, items of music or other audio information, personnel records, or any other such data items. The purpose of such databases is both to provide storage for the data items and to provide the facility to search through the stored data items in accordance with one or more criteria.
 To allow such searching to be executed, the individual data items are indexed in some manner. Early systems of indexing were relatively simple and included such schemes as grouping a number of data items together within a single category, such that when a search was performed for items within that category the relevant data items could be retrieved.
 However, in time, more advanced indexing schemes have been developed, with one such scheme involving the use of metadata. Metadata describes data, that is, it is data about data. The metadata associated with a data item may be as simple as a set of natural language comments referring to one or more elements of the data item. Alternatively, the metadata may be much more complex comments, or tags, that may refer to the structure of the data. For example, the metadata associated with a text document may include a reference to the subject matter of the document, its author, the number of words, the size of the data item etc. As a further example, metadata associated with a graphical image such as a photograph may include tags identifying different elements of the image, for example that the image is a sunrise/sunset, that it includes people's faces, that it is a landscape, and so on.
 The metadata that is generated for each data item stored within a database is ordinarily much more compact than the raw data itself. An analogy would be the use of classification cards in a public library. Each card represents a book stored in the library and contains certain items of information about the book, for example author and title. The cards themselves occupy a relatively small amount of space compared to the space occupied by the complete library, and by searching the cards a set of books, for example by the same author, can be identified.
 However, the metadata may in some circumstances be much more sizeable than the original data item. Maintaining the library analogy, a single novel, such as ‘Pride and Prejudice’, may have a number of books analysing it associated with it. These volumes of analysis are analogous to the metadata to the original novel, yet would take up more shelf space than the novel itself.
 Powerful search tools that exploit metadata are used to augment conventional search tools.
 The increased functionality of databases and their associated search engines is one of the factors in their increased usage. Another factor is the increased usage of networked systems with local or remote network stations being linked to a central database. The link may be provided by a dedicated transmission cable or a shared transmission cable or wireless connection, or other such connection means. The quality and/or speed of the connection may pose a serious restriction on the amount of data that can be transmitted between the central database and the remote stations. It is therefore a problem to provide a large database with powerful search facilities, especially when dealing with non-textual data, that is easily and quickly accessible from remote stations. It may be equally problematic for data items to be exchanged between the database and the remote stations.
 A further problem is the cost in processing terms required to generate the metadata. A large centralised database may require a prohibitive amount of processing power to deal with the generation of increasing amounts of metadata. Equally, local computers, such as domestic PCs, or portable devices such as personal digital assistants or cameras may not ordinarily have the processing power available to perform the metadata generation under all circumstances or in acceptable time frames.
 At least some of the above problems apply to locally held databases. The consideration then is whether to perform the metadata generation locally or request an additional remote facility to do it even though there is no requirement or intention to transmit either the generated metadata or data item to a centralised database.
 According to the present invention there is provided a data processing system comprising at least one data item acquisition unit and at least one data store, the data item acquisition unit including a data tag generator for generating a data tag associated with each data item, and the data item acquisition unit being arranged to transmit at least the data tag to the data store.
 The data item acquisition unit may also transmit the data item itself to the at least one data store.
 Additionally, the at least one data store may also include a data tag generator.
 Each data store may be connected to one or more other data stores so as to provide a hierarchical arrangement of data stores.
 The data item acquisition units may include a decision module that evaluates if the data tag should be generated at the data acquisition unit or if the data item should be sent to the data store and the data tag generated there. The evaluation may take into consideration the size and complexity of the data item and hence the processing power required to generate the data tag, and may also include an evaluation of the efficiency of transmitting the data item to the data store for data processing based on the size of the data item and the quality and/or speed of the connection between data acquisition unit and data store. The evaluation may also take into consideration any previously stipulated privacy requirements.
 A use of the metadata is to preserve the privacy of the original data item. If the data item includes elements that it is desired to keep secret, only the metadata associated with the remaining elements need be transmitted to the data store. In a similar manner, a search query may be transmitted to the data store with only the metadata essential for the search without transmitting the original data item itself.
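 By way of illustration only, the privacy-preserving use of metadata described above might be sketched as follows. The function name `redact_metadata`, the dictionary representation of a data tag, and the field names are assumptions made for this sketch and are not taken from the specification.

```python
def redact_metadata(metadata, private_keys):
    """Return a copy of the metadata without the fields marked private.

    `metadata` is assumed to be a flat dict of tag name -> value, and
    `private_keys` a set of tag names that must not leave the acquisition unit.
    """
    return {key: value for key, value in metadata.items()
            if key not in private_keys}


# Hypothetical tags for a text document; only the non-private tags
# would be transmitted to the data store.
tags = {"subject": "quarterly report", "author": "A. Example", "word_count": 1200}
public_tags = redact_metadata(tags, private_keys={"author"})
```

 Only `public_tags` would then be sent to the data store, so the data item itself and the withheld element never leave the data acquisition unit.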
 The data tag generator at the data acquisition unit is preferably arranged to detect a failure to generate a data tag for a data item. A failure to generate the metadata may occur due to one or more of a number of reasons. It is normal practice not to limit the amount of system memory required during metadata generation. There is therefore the possibility for a failure to occur due to the data acquisition unit running out of memory. Equally, the processing power required to generate the metadata may be greater than that available at the data acquisition unit at that time. A further cause of failure may be that the data acquisition unit is not appropriately configured with the relevant contextual data for the data item being presented.
 The failure may be a ‘hard’ failure, in which case no metadata is generated and the data acquisition unit may transmit the data item to the data store, the data tag (metadata) generation then occurring at the data store. Alternatively, appropriate configuration information held by the data tag generator at the data store may be transmitted to the data acquisition unit to allow successful data tag generation to occur at the data acquisition unit. The failure may alternatively be a ‘soft’ failure, in which case the metadata generated prior to the failure occurring may be transmitted to the data store, or equally simplified metadata may be generated instead and transmitted to the data store.
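 The handling of 'hard' and 'soft' failures described above might, purely as a sketch, be expressed as a small dispatch function. The names `Failure` and `handle_result`, the action strings, and the `can_reconfigure` flag are hypothetical and are introduced here only for illustration.

```python
from enum import Enum


class Failure(Enum):
    NONE = "none"   # metadata generated successfully
    SOFT = "soft"   # some metadata generated before the failure
    HARD = "hard"   # no metadata generated


def handle_result(failure, tags, can_reconfigure=False):
    """Choose the next action for the data acquisition unit.

    Returns an (action, payload) pair; the action names are illustrative.
    """
    if failure is Failure.NONE:
        return ("send_tags", tags)
    if failure is Failure.SOFT:
        # Transmit whatever metadata was generated before the failure.
        return ("send_partial_tags", tags)
    # Hard failure: either fetch configuration information from the data
    # store and retry locally, or hand the data item over to the data store.
    if can_reconfigure:
        return ("request_configuration", None)
    return ("send_item_to_store", None)
```

 The sketch omits the further option of generating simplified metadata after a soft failure, which could be added as another branch.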
 Alternatively or additionally, the data acquisition unit may be arranged such that new data tag configuration information may be defined in relation to a presented data item and the new data tag configuration information transmitted to the data store for inclusion in the data tag generator located therein.
 According to a second aspect of the present invention there is provided a method of processing data, the method comprising generating a data tag associated with a data item, said data tag generation occurring at a data acquisition unit, and transmitting at least said data tag to a data store.
 Advantageously the data item may also be transmitted to the data store. Data tag generation may also occur at the data store.
 The method may further include evaluating if it is appropriate to generate the data tags at the or each acquisition unit or to transmit the data item to a data store for data tag generation to occur there. The evaluation procedure may include determining the size and complexity of the data item to be processed, providing an estimate of the processing power required to generate the associated data tag in response to the determination, comparing the estimated processing power with the processing power of the available data tag generator, and in response to the comparison either generating the data tag locally or transmitting the data item to a data store.
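 As a minimal sketch of the evaluation procedure just described, the following assumes a simple cost model in which the required processing power is the product of the data item's size and a complexity factor. The specification does not prescribe a cost model; `DataItem`, `estimate_required_power` and `choose_tagging_site` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    size_bytes: int
    complexity: float  # assumed unitless factor; 1.0 is a baseline item


def estimate_required_power(item):
    """Assumed cost model: work grows with both size and complexity."""
    return item.size_bytes * item.complexity


def choose_tagging_site(item, available_power):
    """Compare the estimate with the local generator's available power."""
    required = estimate_required_power(item)
    return "local" if required <= available_power else "data_store"
```

 For example, under these assumptions `choose_tagging_site(DataItem(1000, 1.0), available_power=5000)` selects local generation, while the same item with a complexity of 10.0 would be transmitted to the data store.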
 The method may further comprise transmitting configuration information from a data store to a data acquisition device and configuring a data tag generator located at the data acquisition device in accordance with the configuration information.
 Additionally or alternatively, data tag generator configuration information may be transmitted from a data item acquisition device to a data store, said configuration information being defined in relation to a user presented data item.
 According to a third aspect of the present invention there is provided a data item acquisition device comprising a data tag generator and being arranged to transmit a generated data tag associated with an acquired data item to a data store.
 The data item acquisition device may further comprise an evaluation module that is arranged to determine the processing power required to generate a data tag for a data item, compare the required processing power to the processing power available at the data tag generator, and in response to the comparison either enable the data tag generator to generate the data tag or enable the transmission of the data item to a data store.
 The data item acquisition device may form an integral or peripheral part of a personal computer. Additionally one or more data capture devices, for example an electronic camera, scanner or microphone, may be connected to an input of the data acquisition device. Alternatively the data acquisition unit may be integrated within a data capture or data storage device.
 According to a fourth aspect of the present invention there is provided a data store arranged to store a plurality of data tags associated with respective data items, and arranged to export, upon request, data tag generator configuration information for use by data tag generators.
 Preferably the data store is arranged to store the respective data items associated with the data tags. The data store may additionally comprise a data tag generator for generating data tags associated with data items input to the store.
 Additionally the data store may comprise a search engine arranged to locate data tags conforming to a user search request and cause the data items associated with the located data tags to be output from the data store.
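 A search engine of the kind described above could, as one possible sketch, match a user request against the stored data tags and output the associated data items. The in-memory list of entries, the exact-match rule, and the name `search` are assumptions made for illustration; a practical data store would likely use an index.

```python
def search(store, **criteria):
    """Return the data items whose tags match every search criterion."""
    return [
        entry["item"]
        for entry in store
        if all(entry["tags"].get(key) == value
               for key, value in criteria.items())
    ]


# Hypothetical store contents: each entry pairs a data item with its tags.
store = [
    {"item": "photo_001.jpg", "tags": {"kind": "image", "scene": "sunset"}},
    {"item": "report.txt", "tags": {"kind": "text", "author": "Smith"}},
    {"item": "photo_002.jpg", "tags": {"kind": "image", "scene": "landscape"}},
]

sunsets = search(store, kind="image", scene="sunset")
```

 Note that the search inspects only the data tags; the data items themselves are read only when the matching entries are output.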
 Preferably the data store is connected to one or more other data stores. Additionally the data store may further comprise an evaluation module that is arranged to determine the processing power required to generate a data tag for a data item, compare the required processing power to the processing power available at the data tag generator, and in response to the comparison either enable the data tag generator to generate the data tag or enable the transmission of the data item to a data store.
 According to a fifth aspect of the present invention, there is provided a computer program product for causing a data processor to execute the method according to the second aspect of the present invention.
 The present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 shows a schematic representation of a data processing system according to an embodiment of the present invention connected to a number of data input devices; and
FIG. 2 shows a further embodiment of the present invention having a multi-layer structure.
FIG. 1 shows a data acquisition device or unit 2 connected to a data store 4. The data acquisition unit 2 is connected to one or more data input devices. Examples of data input devices that are shown are a discrete data storage unit 6, for example a hard disk, a digital camera 8, and a document scanner 10. Other input devices such as video or sound recorders could also be provided. Located in the data acquisition unit 2 is a datatag generator 12 also known as a metadata generator. The metadata generator is arranged to process data items input from one or more of the data input devices to generate datatags or metadata for each data item. A data store 4 is connected to the data acquisition unit 2. The data store unit includes one or more data storage devices 14, such as known hard disk drives. Connected to the data storage devices 14 is a data query and/or indexing unit that is arranged to perform conventional data searching procedures. The data storage devices 14 are arranged to store either a plurality of data tags, a plurality of individual data items, or both data items and their associated datatags. The data acquisition unit 2 and data store 4 are connected by any suitable data transmission channel, for example by fibre optic cable, or by wireless connections.
 In use, data items will be input to the data acquisition unit 2 from one of the data input devices 6-10. On receipt of the data items the metadata generator 12 will perform data processing to generate metadata associated with the input data items. The metadata may then be transmitted from the data acquisition unit 2 to the data store 4, together with, for example, a request from the data acquisition unit for the data store 4 to provide further data items that have similar metadata associated with them. This method of operation has the advantage that metadata generation is performed locally at the data acquisition unit 2 and not at the data store 4, thus freeing resources at the data store 4 that may be applied more efficiently to searching the contents of the data store 4 for requested data items. Additionally, having generated the associated metadata at the data acquisition unit 2, both the metadata and the associated data item may be transmitted to the data store 4 to be added to the data items and metadata stored there. In this way a database that is stored at the data store 4 may be expanded and updated relying solely on metadata and data items provided by remote stations without utilising the central resources at the central data store.
 In a further embodiment of the present invention the data acquisition unit 2 may include a decision unit 18 connected to the metadata generator 12. The function of the decision unit 18 is to perform an evaluation of whether generation of the metadata for an input data item would be better performed locally at the data acquisition unit 2 or centrally at the data store 4. In these embodiments, the data store 4 also includes a metadata generator 20. The evaluation of where to perform the metadata processing may take into account a number of parameters, for example the size and complexity of the data item(s) and therefore the processing power required to perform the metadata processing, or the size of the data item(s) in comparison with the transmission abilities of the connection between the data acquisition unit 2 and the data store 4. The evaluation may also take into consideration any stipulated privacy requirements. For example, it may be stipulated by a user that the author or originator of a data item(s) is not included in the metadata. This will have an impact on the complexity of the generated metadata. As a further example, if the connection between the data acquisition unit 2 and the data store 4 is of limited capacity, the decision unit 18 may evaluate that it is more efficient to generate the metadata locally at the data acquisition unit 2 using the metadata generator 12 rather than attempt to transmit the relatively large amount of data down the restricted transmission capacity of the connection between the data acquisition unit 2 and the data store 4. Alternatively, an evaluation may be made that it is more efficient to transmit the data item(s) unprocessed to the data store 4 to be processed by the more powerful metadata generator 20 located at the data store 4.
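 The trade-off weighed by the decision unit 18 might be sketched as a comparison of estimated completion times. This is an illustrative model only: the function `decide_site`, its parameters, the linear work estimate `ops_per_byte`, and the `must_stay_local` flag (one possible reading of a stipulated privacy requirement) are all assumptions and do not appear in the specification.

```python
def decide_site(size_bytes, local_ops_per_s, remote_ops_per_s,
                link_bytes_per_s, ops_per_byte=1.0, must_stay_local=False):
    """Pick where to generate metadata by comparing estimated times.

    Local time is processing only; remote time is transmission of the
    raw data item plus processing at the data store.
    """
    if must_stay_local:
        # Assumed privacy rule: the data item may not leave the unit.
        return "local"
    work = size_bytes * ops_per_byte
    local_time = work / local_ops_per_s
    remote_time = size_bytes / link_bytes_per_s + work / remote_ops_per_s
    return "local" if local_time <= remote_time else "data_store"
```

 With a restricted connection the transmission term dominates and local generation is chosen; with a fast connection and a more powerful generator at the data store, sending the unprocessed data item becomes the quicker option.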
 In certain embodiments of the present invention the metadata generator 12 located at the data acquisition unit 2 is arranged to report any failures to generate metadata for data items. A failure to generate metadata for a data item may occur due to the metadata generator 12 located at the data acquisition unit 2 not being configured with the appropriate contextual data to generate metadata for a particular kind of data item. Alternative or additional causes of failure may include running out of memory during generation of the metadata or insufficient processing resources being available. The available processing resources may vary depending on other tasks being performed by the data acquisition unit at any given time. When such a failure occurs, the decision unit 18 may transmit the particular data item to the data store 4 in order for the metadata generator 20 located at the data store 4 to generate the metadata; request revised configuration information from the metadata generator 20 located at the data store 4, so that the metadata generator 12 located at the data acquisition unit 2 can be reconfigured to generate metadata for the data item at the data acquisition unit 2; generate simplified metadata that requires fewer system resources or less configuration information; or simply transmit to the data store 4 the metadata that had been successfully generated prior to the failure occurring.
 It will be appreciated by those skilled in the art that the data acquisition unit 2 may be a dedicated processing unit or may be integrated as either hardware or software within a general personal computer or the like. In the latter case, the decision unit 18 may also take into account the demands on the computer's processor when evaluating whether to generate metadata at the data acquisition unit or not. It will be appreciated that metadata processing may be performed at the data acquisition unit 2 during what would otherwise be “idle” processing periods.
FIG. 2 shows an embodiment of the present invention utilising a multi-layered, hierarchical arrangement of data storage units. A number of “first layer” data stores 4 are provided. Each data store includes a metadata generator 20. Connected to each first layer data store 4 are one or more data acquisition units 2. In FIG. 2 the data acquisition units 2 are schematically shown as being connected to different data input sources. The sources shown include a digital camera 30, a document source 32, audio source 34 and video source 36. Each data acquisition unit 2 also includes a metadata generator 12. The data acquisition units 2 are arranged to operate in the same manner as those described in FIG. 1.
 Each of the first layer data stores 4 is connected to a second layer data store 38. Preferably, but not necessarily, the second layer data store 38 has increased storage capacity in comparison to the first layer data stores 4. Although only two layers of data stores are shown in FIG. 2, it will be appreciated that any number of layers can be used. In use, the decision making process that occurs at the data acquisition units 2, as described with reference to FIG. 1, also takes place at each of the levels of data stores. Thus the system is flexible enough to perform the metadata processing at whichever layer is deemed most appropriate.
 In further embodiments, the data input sources, for example the digital camera 30 shown in FIG. 2, may themselves include a metadata generator. This allows a further sub-layer of metadata processing and decision making to be performed.
 By providing metadata generators at the various different layers of the system the processing is distributed throughout the system. This provides the advantage that both processing power throughout the system and the use of transmission connections can be optimised. The processing power of the metadata generators at the different layers may either be identical, or may increase towards the highest layer. In the latter case, only the more complex or large data items would need to be processed by the more powerful metadata generators, with the simpler data items being processed at lower levels by individual metadata generators.
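 Where the processing power increases towards the highest layer, the choice of layer might be sketched as selecting the lowest layer whose metadata generator is powerful enough for the data item. The function `select_layer` and the representation of the layers as a list of power figures, ordered from the lowest layer upwards, are assumptions made for this illustration.

```python
def select_layer(required_power, layer_powers):
    """Return the index of the lowest layer able to process the item.

    `layer_powers` lists the processing power available at each layer,
    ordered from the lowest layer (index 0) to the highest. Returns None
    if no layer has sufficient power.
    """
    for index, power in enumerate(layer_powers):
        if required_power <= power:
            return index
    return None
```

 Simple data items are thereby handled at index 0 (for example, a data acquisition unit), and only the larger or more complex data items are passed up to the more powerful generators.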
 In the same manner as described with relation to FIG. 1, the lower level metadata generators may be “updated” with new configuration information provided by metadata generators in the higher levels.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US2151733||May 4, 1936||Mar 28, 1939||American Box Board Co||Container|
|CH283612A *||Title not available|
|FR1392029A *||Title not available|
|FR2166276A1 *||Title not available|
|GB533718A||Title not available|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7370121||Jul 7, 2003||May 6, 2008||Nid Solutions, Inc.||System and method for the capture, storage and manipulation of remote information|
|US7782365||May 23, 2006||Aug 24, 2010||Searete Llc||Enhanced video/still image correlation|
|US7872675||Oct 31, 2005||Jan 18, 2011||The Invention Science Fund I, Llc||Saved-image management|
|US7876357||Jun 2, 2005||Jan 25, 2011||The Invention Science Fund I, Llc||Estimating shared image device operational capabilities or resources|
|US7899949||Apr 16, 2008||Mar 1, 2011||Nid Solutions, Inc.||System and method for the capture, storage and manipulation of remote information|
|US7920169||Apr 26, 2005||Apr 5, 2011||Invention Science Fund I, Llc||Proximity of shared image devices|
|US8072501||Sep 20, 2006||Dec 6, 2011||The Invention Science Fund I, Llc||Preservation and/or degradation of a video/audio data stream|
|US9001215 *||Nov 28, 2007||Apr 7, 2015||The Invention Science Fund I, Llc||Estimating shared image device operational capabilities or resources|
|US9041826||Aug 18, 2006||May 26, 2015||The Invention Science Fund I, Llc||Capturing selected image objects|
|US9076208||Feb 28, 2006||Jul 7, 2015||The Invention Science Fund I, Llc||Imagery processing|
|US9082456||Jul 26, 2005||Jul 14, 2015||The Invention Science Fund I Llc||Shared image device designation|
|US9093121||Jun 29, 2011||Jul 28, 2015||The Invention Science Fund I, Llc||Data management of an audio data stream|
|US20040078578 *||Jul 7, 2003||Apr 22, 2004||Harsch Khandelwal||System and method for the capture, storage and manipulation of remote information|
|US20080219589 *||Nov 28, 2007||Sep 11, 2008||Searete LLC, a liability corporation of the State of Delaware||Estimating shared image device operational capabilities or resources|
|US20140184817 *||Dec 27, 2012||Jul 3, 2014||Brent Chartrand||Enabling a metadata storage subsystem|
|US20140184828 *||Dec 28, 2012||Jul 3, 2014||Brent Chartrand||Enabling a metadata storage subsystem|
|U.S. Classification||1/1, 707/E17.095, 707/999.107|
|Jul 30, 2002||AS||Assignment|
Owner name: HEWLETT-PACKARD COMPANY, CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEWLETT-PACKARD LIMITED;REEL/FRAME:013154/0746
Effective date: 20020729
|Sep 30, 2003||AS||Assignment|
Owner name: HEWLETT-PACKARD DEVELOPMENT COMPANY L.P.,TEXAS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HEWLETT-PACKARD COMPANY;REEL/FRAME:014061/0492
Effective date: 20030926