CA2143288A1 - Data storage system with set lists which contain elements associated with parents for defining a logical hierarchy and general record pointers identifying specific data sets
- Publication number
- CA2143288A1
- Authority
- CA
- Canada
- Prior art keywords
- data
- list element
- set list
- data set
- parent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9017—Indexing; Data structures therefor; Storage structures using directory or table look-up
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
Abstract
In a computer having one or more secondary storage devices attached thereto, a Finite Data Environment Processor (FDEP) manages Data Sets (22) residing on the secondary storage devices and in memory using Set Lists (24) (SLs) and General Record Pointers (26) (GRPs). The Data Sets (22) contain either data or logical organizational information. The Set Lists (24) comprise Data Sets (22) organized into a hierarchy. The General Record Pointers (26) identify information in terms of Data Sets (22) and records within them. Using the principal idea that a Data Set is uniquely identifiable, the present invention eliminates problems normally associated with referencing the location of data after the data has been moved.
Description
METHOD AND APPARATUS FOR DATA STORAGE AND RETRIEVAL
BACKGROUND OF THE INVENTION
1. Field of the Invention.
This invention relates in general to computer operating systems, and in particular, to a method for storing and retrieving data in a computer system.
2. Description of Related Art.
One problem, which has persisted throughout the history of digital computing, is the inability to directly identify a data entity, move it, and still maintain all references to the entity.
For example, consider the state-of-the-art methods of information storage. One state-of-the-art method is a containment technique where an entity (directory) contains a list of subordinate entities (directories or files). Although the containment method is highly effective in ordering information, it does not support the general ability to access information after the information has been moved. Moving data within a hierarchy inevitably leads to the invalidation of existing links to the information moved. There are three prior art solutions to this problem.
The first solution is the concept of a current or working directory. A current directory is a state where a program identifies a particular location within secondary storage and all operations assume information is located in this directory or is subordinate to it.
The second solution is an absolute path. An absolute path is where a program identifies information by a stream of hierarchically dependent entities which are concatenated together to form an absolute data specification. Thus, if "C" is within "B" and "B" is within "A", then an absolute path would be "\A\B\C".
The third solution is the concept of a search path. A search path is a list of predefined locations where programs and sometimes program data are stored. If a body of requested data is not within the search path or the current directory, then it is assumed not to exist.
Note that many systems use combinations of these methodologies.
However, there remains the problem, common to all the state-of-the-art methodologies defined above, of maintaining information links to data which has moved.
Consider a program which accesses data with an absolute path \A\B\C. If the data located in C is moved to a different location, then the program will not be able to locate the data it needs. Further, consider a program which uses the concept of current directory, where the data file is moved out of the current directory.
Finally, consider the program which uses data located in a search path, where the data is moved to a directory not in the search path. In all of these instances the data is effectively lost to the program. The same reasoning applies to data within a database. Consider the examples above but replace directories with files and files with sets of records.
Thus, the problem is how to retain direct information links when data is moved. In almost all cases, the information link is recorded in a static image, such as a program or in a database.
The solution to the problem of directly linking information and still having the ability to move it is dependent on altering some of the basic perceptions regarding information structure. The state-of-the-art approach is to use volumes, directories and files to identify each data entity, where volumes and directories are methods of containment. Specifically, a volume physically contains directory identifiers and directories physically contain file identifiers. Thus, all files and directories identified within a volume must physically exist within that volume.
The basic premise of the containment method of information structure storage is that a hierarchy is based on more significant entities physically containing less significant entities. Although this premise works, it is not efficient in that it prohibits several of the characteristics identified below. The following are perceptional modifications required to understand the present invention.
First, a data object may logically belong to a larger data object, but that does not mean it has to be physically contained within it. For example, a directory can contain many file identifiers, but that does not mean the file identifiers must be physically contained in the directory. In the present invention, each file identifier can identify the directory it belongs to and a more effective information structure storage method can be used.
Second, all information structure is hierarchical to some degree. When information ceases to have a hierarchical structure (for example relational), the existing containment method (volume, directory, file) of storing hierarchical structure does not work unless additional intermediate processes are used. The present invention can support direct links to information at a fundamental level and can therefore directly support non-hierarchical information structures such as relational or object-oriented, where the containment method cannot.
Third, the amount of data structure required by any process may vary in size, depth and width. Therefore, any mechanisms which cannot handle extremes in size, depth and width are ineffective. For example, consider a directory with 10,000 file identifiers and the problems associated with its use and maintenance when in memory. The present invention experiences similar space restrictions, but does not experience as many or as severe problems as containment methodology with regard to use or maintenance.
Fourth, the amount of space required to uniquely identify a data entity deep within a hierarchy can be excessive using standard containment methodology. The containment method of uniquely identifying a data entity becomes progressively less efficient as depth within a hierarchy increases. For example, the following are two strings to identify a body of data deep in a hierarchical structure:
1 - "C:\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
2 - "C:\MY_DISK12\VOLUME_A\UTILS_AC\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
The first data identifier is over 50 bytes long and could easily be much longer as depicted in the second data identifier, which is over 80 bytes. In the present invention, a direct reference is a maximum of 20 bytes long, regardless of how deep the reference is within a hierarchy.
Fifth, it is safe to assume that information moves.
The containment method of storing information structure requires the data identifier to move when the data moves. This produces two problems, the overhead of moving the identifier and more importantly, the problems associated with losing links to the data moved. The present invention ensures that data links are retained, regardless of where the data is moved to.
Finally, information accessed via direct links is faster than information accessed via indirect links. The containment method of data-structure-storage uses names to identify a data entity. This means that the location of a data identifier is established by a search, which is seldom a binary search and never an aggregate indexed reference.
As a result, locating a body of data using the containment methodology is slow and cumbersome. The present invention's method of data-structure-storage is efficient because data location is either a binary search or an aggregate indexed reference.
Examples of prior art methods of information storage include European Patent Application No. 0-410-210-A2 by IBM entitled "Method for Dynamically Expanding and Rapidly Accessing File Directories," European Patent Application No. 0-040-694-A2 by IBM entitled "Method and System for Cataloging Data Sets," and European Patent Application No.
0-474-395-A3 by IBM entitled, "Data Storage Hierarchy With Shared Storage Level." These prior art methods are discussed in more detail below.
European Patent Application No. 0-410-210-A2 discloses a computer-implemented method for the name-oriented accessing of files having at least zero records, any access path to files and records through an external store coupling the computer being defined by a pair of related directories. A first directory of record entries is sorted on a two-part token. The token consists of a unique sequence number assigned to the record and the sequence number of any parent record entry. Each record entry includes the token, file or record name, and external store address or pointer. A traverse through the tokens constitutes a leaf-searchable s-tree. Rapid access to target records is by way of a name-sorted, inverted directory of names and tokens as a subset, which is reconstitutable from the first directory in the event of unavailability.
European Patent Application No. 0-040-694-A2 discloses a data set catalog structure that eliminates the requirement for base catalog/data volume synchronization in a multi-processing environment while enabling the operating efficiency of directly addressing the data volumes.
The catalog is distributed between a keyed sequential base catalog and, on each data volume, an entry sequential data volume set. Catalog information which must be synchronized with application data sets is stored in
volume records in the volume data set. The method of the invention operates to use and maintain a data base catalog to open a user data set, the data base catalog including a first keyed data set and, on each volume containing user data sets, a second keyed data set, comprising the steps of searching said first keyed data set for a direct pointer to a first volume record in said second keyed data set, comparing the key of the user data set to be opened with the key of said first volume record, and if the keys match, opening for access the data set addressed by a direct pointer in said first volume record. If the keys do not match, searching said second keyed data set for a second volume record containing the direct key, updating the direct pointer in said first keyed data set to address said second volume record, and opening for access the user data set addressed by a direct pointer in said second volume record.
European Patent Application No. 0-474-395-A3 discloses a data storage hierarchy which inherently allows for a level 1 storage file to be uniquely identified across an entire network. A directory naming convention is employed which includes an internal identifier and a name for each subdivision on the network. Because each file can be uniquely identified across the network, a single level one storage space in a file space, or a directory therein, can be used for the entire network.
Also, because of the inherent uniqueness of the naming system, common DASD control files, otherwise required to map between level one storage files and their level zero source files can be eliminated.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method and apparatus for data storage and retrieval. In a computer having one or more secondary storage devices attached thereto, a Finite Data Environment Processor (FDEP) manages Data Sets residing on the secondary storage devices and in memory using Set Lists (SLs) and General Record Pointers (GRPs). The Data Sets contain either data or logical organizational information. The Set Lists comprise Data Sets organized into a hierarchy. The General Record Pointers identify information in terms of Data Sets and records within them. Using the principal idea that a Data Set is uniquely identifiable, the present invention eliminates problems normally associated with referencing the location of data after the data has been moved.
BRIEF DESCRIPTION OF THE DRAWINGS
Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
Figure 1 is a block diagram illustrating the environment in which the present invention operates;
Figure 2 illustrates the structure of a Set List;
Figure 3 illustrates the structure of a Set List Element;
Figure 4 illustrates the structure of a Data Set Definition Record and the Data Set Types defined therein;
Figure 5 illustrates the structure of a General Record Pointer;
Figure 6 illustrates the structure of an External General Record Pointer;
Figure 7 illustrates the structure of a Finite Data List;
Figures 8A and 8B illustrate the structure of a Finite Data Set;
Figure 9 is a block diagram illustrating the data structures used to resolve a General Record Pointer;
Figure 10 illustrates the structure of a Memory Maintenance Record;
Figure 11 is a block diagram illustrating the logic used to locate a Finite Data Set;
Figure 12 is a block diagram illustrating the data structures used to resolve an External General Record Pointer;
Figure 13 is a block diagram illustrating the process flow used to resolve a General Record Pointer;
Figure 14 is a block diagram illustrating the process flow used to search memory when resolving a General Record Pointer;
Figure 15 is a block diagram illustrating the process flow used to search secondary storage when resolving a General Record Pointer;
Figure 16 is a block diagram illustrating the process flow used to search Set Lists when resolving a General Record Pointer;
Figure 17 illustrates the path structure of the Finite Data Set in Figures 8A and 8B;
Figure 18 illustrates the results of a simple single Data Set copy operation;
Figure 19 illustrates the results of a Data Set copy operation which includes subordinate hierarchy;
Figure 20 illustrates the results of a Data Set move operation;
Figure 21 illustrates the results of a Data Set promote operation;
Figure 22 illustrates the results of a first Data Set insertion operation;
Figure 23 illustrates the results of a second Data Set insertion operation; and Figure 24 illustrates the structure of a Storage Address Identifier.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
In the following description of the preferred embodiment, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
GLOSSARY
CDI Combined Data Identifier A CDI is an 8-byte field consisting of a 4-byte Site Owner Identifier and a 4-byte Data Set Identifier. A CDI is used to uniquely identify a Data Set. A CDI used in conjunction with a Data Record Identifier constitutes a General Record Pointer.
CDS Containment Data Set A CDS is a class of Data Set which may logically contain any number of other Data Sets as children. However, a CDS has exactly one immediate parent. A CDS can contain all Data Set classes. The CDS does not identify its children; instead, all children (Set List Elements) identify the CDS as their parent.
All children of a CDS exist in the same Set List. A CDS obeys the Hierarchical Organization Rules when existing in a hierarchy. A CDS is strictly a logical entity in that no real physical data is ever present.
A CDS is used to define a conventional hierarchical information structure such as the drive/directory/file structures of MS-DOS and UNIX. Each CDS at a Site can be reached via a Set List Element, and has no further physical representation.
DRI Data Record Identifier A DRI is a 4-byte field which identifies a single record within a given Data Set. If a DRI within a General Record Pointer is set to null (represented by a one's complement -1), then it identifies the entire Data Set. A
General Record Pointer which has all fields set to null is a null General Record Pointer.
DSI Data Set Identifier A DSI is a 4-byte field which identifies one Data Set within a given Finite Data Environment. Note that a DSI may not be unique (by itself) when Set List Elements from other Sites are imported. If data is not imported into a Site, then a DSI will be unique.
EGRP External General Record Pointer An EGRP is a 20-byte data identifier. An EGRP consists of a 4-byte Finite Data Set Site Owner Identifier, a 4-byte Finite Data Identifier, a 4-byte Data Set Identifier Site Owner Identifier, a 4-byte Data Set Identifier and a 4-byte Data Record Identifier as a single record. An EGRP will uniquely identify a Data Set absolutely anywhere.
FDEP Finite Data Environment Processor An FDEP is the process which maintains Set Lists and all related structures and information. There is exactly one FDEP for a given Site. An FDEP is assumed to have a unique serial number, which is used as the Site Owner Identifier when a new Finite Data Set (or Set List Element) is created.
FDI Finite Data Identifier An FDI is structurally identical to a Combined Data Identifier, i.e., a 4-byte Site Owner Identifier and a 4-byte Data Set Identifier.
The difference between a Combined Data Identifier and a Finite Data Identifier is that a Finite Data Identifier only identifies Set Lists, never a Containment Data Set or a Physical Data Set. A Combined Data Identifier can identify any Data Set class (and therefore type).
FDL Finite Data List An FDL is an array or list of records which identify a specific Finite Data Set and the location of its Set List in secondary storage.
There is exactly one FDL on a Site. All Finite Data Sets (and therefore Set Lists) at a Site are identified in the FDL.
FDS Finite Data Set An FDS is both a class and a type of Data Set.
An FDS can have all Data Set classes (Containment Data Set, Finite Data Set, and Physical Data Set) as its children (or subordinates). An FDS may have one or more parents. This enables an FDS to exist at several different levels of logical hierarchy, without duplication. Each FDS at a Site can be reached via a unique Finite Data List Element, and is represented by a unique Set List. In addition, an FDS is also represented by the various Set List Elements (in other Set Lists) that point to it. All control data (Set List Elements) for all data within an FDS are completely contained by that FDS. However, an FDS represents no real physical data. The FDS
is used throughout the present invention to provide new and efficient ways of accessing and organizing data.
GRP General Record Pointer A GRP is a 12-byte data identifier. A GRP consists of a 4-byte Site Owner Identifier, a 4-byte Data Set Identifier and a 4-byte Data Record Identifier. A GRP uniquely identifies a given data entity anywhere. However, when a Data Set is referenced which is not in the current Set List, a search must be performed to locate it. For this reason the External General Record Pointer is a convenience, added to enhance GRP access speed.
MMR Memory Maintenance Record An MMR is a record used to maintain information specific to a Data Set or Set List located in memory.
MRI Memory Record Identifier An MRI is a 4-byte field which contains the number of a specific Memory Maintenance Record.
The MRI is used as the key field in Memory Maintenance Records for sorting and location.
MSL Master Set List An MSL is a Set List which identifies the overall structure of information at a given Site. An MSL has exactly the same structure as a Set List. There is exactly one MSL for a Site and all Finite Data Sets within that Site will exist within the Finite Data Set embodied by the MSL.
PDS Physical Data Set A PDS is a class of Data Set which identifies a physical data entity (e.g., a record, a list).
Since it is a physical entity, a PDS may not contain any other Data Sets as children.
However, a PDS still retains a logical connection by identifying exactly one immediate parent. A PDS obeys the Hierarchical Organization Rules when existing in a hierarchy. A PDS is used to store actual bytes (or records) of information. Note that the parent of a PDS may be a Containment Data Set or a Finite Data Set. Each PDS at a Site can be reached via a unique Set List Element, and is represented by a unique physical data body (in secondary storage and RAM).
SAI Storage Address Identifier An SAI is a 12-byte field consisting of a 2-byte Storage Device Number, a 4-byte Remote Disk Number, and a 6-byte Offset Within The Storage Device. The first field allows for a maximum of 2^16-1 storage devices. The second field services all storage media with removable "disks". This includes floppy, tape, floptical, removable-hard-drive, etc. By setting the second field to the appropriate disk number, data can be located uniquely across many disks. This field can be used to automatically indicate a "disk" number to a program or end-user. The second field supports a maximum of 2^32-1 removable disks per device.
The third and last field in the SAI is a relative offset into the device. If the remote disk number is non-null, then this is an offset in that disk. The third field is always an absolute address within that device. The third field supports an address space of 2^48-1 bytes.
Site A Site is one or more computers using a common series of secondary storage devices and a common Finite Data Environment Processor.
SL Set List An SL is an array or list of Set List Elements which define the hierarchy of a Finite Data Set. An SL is always sorted based on the Self Identifier field of the Set List Elements.
SLE Set List Element An SLE is a record which contains all control and structure information related to a given Data Set. An SLE contains a Self Identifier field by which it can be located and sorted.
SOI Site Owner Identifier An SOI is a 4-byte field which contains a unique identifier associated with exactly one Site. An SOI is unique and can be used to identify the Site from which a given Finite Data Set originated. An SOI is provided by the Finite Data Environment Processor when a new Finite Data Set or Set List Element is created.
The Finite Data Environment Processor is assumed to have a serial number for each occurrence. The SOI is that serial number.
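For illustration, the fixed byte layouts defined in this glossary can be collected into a short C sketch. The patent fixes only the field widths; the type names, field names, and the choice of C are assumptions, and the structures would need compiler-specific packing for the stated 8-, 12- and 20-byte sizes to hold exactly.

#include <stdint.h>

typedef uint32_t SOI;   /* Site Owner Identifier (4 bytes) */
typedef uint32_t DSI;   /* Data Set Identifier (4 bytes) */
typedef uint32_t DRI;   /* Data Record Identifier (4 bytes); one's complement -1
                           (all bits set) names the entire Data Set */

typedef struct { SOI soi; DSI dsi; } CDI;            /* Combined Data Identifier (8 bytes)  */
typedef CDI FDI;  /* Finite Data Identifier: identical layout, but only ever names a Set List */

typedef struct { SOI soi; DSI dsi; DRI dri; } GRP;   /* General Record Pointer (12 bytes)   */

typedef struct {        /* External General Record Pointer (20 bytes) */
    FDI fdi;            /* which Set List (8 bytes)                   */
    CDI cdi;            /* which Data Set within it (8 bytes)         */
    DRI dri;            /* which record within that Data Set (4 bytes) */
} EGRP;

typedef struct {        /* Storage Address Identifier (12 bytes)            */
    uint16_t device;    /* Storage Device Number: up to 2^16-1 devices      */
    uint32_t disk;      /* Remote Disk Number: up to 2^32-1 removable disks */
    uint8_t  offset[6]; /* 48-bit offset within the storage device          */
} SAI;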
COMPONENTS AND CONCEPTS
Figure 1 is a block diagram illustrating the environment in which the present invention operates. A
computer 10 has one or more secondary storage devices 12 attached thereto. The computer 10 operates under the control of an operating system 14 resident in the memory 16 of the computer 10. The operating system 14 controls the execution by the computer 10 of both application programs 18 and a Finite Data Environment Processor (FDEP) 20. The FDEP 20 manages Data Sets 22 residing on the secondary storage devices 12 and in memory 16 using Set Lists (SLs) 24 and General Record Pointers (GRPs) 26. The Data Sets 22 are the building blocks on which all the other components operate. The SLs 24 are comprised of Data Sets 22 organized into a hierarchy.
The GRPs 26 identify information in terms of Data Sets 22 and records within them. The FDEP 20 is a process that performs functions on one or more Data Sets 22.
The following sections describe these basic components and concepts in more detail.
DATA SETS
A Data Set 22 may be an array or list of records, or contiguous binary data. A Data Set 22 is either a logical entity such as a directory, or a physical entity such as a file. The principal idea of the present invention is that a Data Set 22 is uniquely identifiable, so the problems normally associated with referencing the location of data are not an issue.
SET LISTS
Figure 2 illustrates the structure of a Set List (SL) 24. An SL 24 is a Data Set 22 comprising a list of records 28 which identify a relative relationship between a Data Set 22 and those Data Sets 22 it is logically contained within. Thus, an SL 24 stores hierarchical information by recording the ancestry (parent) of a Data Set 22. Figure 2 also illustrates that there is one Master Set List 30 at each Site.
Figure 3 illustrates the structure of a Set List Element (SLE) 28, which is a record within an SL 24.
The number of SLEs 28 in an SL 24 is only limited by the amount of memory available. Each SLE 28 within an SL 24 is a Data Set 22 and has fields to control all aspects of its behavior.
An SLE 28 contains all information pertaining to a Data Set 22, including Self Identifier 32, Parent Identifier 34, Memory Record Identifier (MRI) 36, Data Set Name String 38, Number Of Parents 40, Data Set Type (DST) 42, Data Set Status (DSS) 44, Storage Address Identifier (SAI) 46, SLE Checksum 48, Data Set Checksum 50, Archive Checksum 52, Security Code 54, and Record Size 56.
The Self Identifier field 32 provides the identity of the Data Set 22. The Self Identifier field 32 is a key field used for sorting and location activities.
The Parent Identifier field 34 is used to identify a parent SLE 28 within the SL 24 or contains a null. If the Parent Identification field 34 contains a null, then the SLE 28 is a root SLE 28. A root SLE 28 is a Data Set 22 of greatest significance in a given SL 24. An SL
24 can have one or more root SLEs 28. If the Parent Identification field 34 contains a non-null, then the parent SLE 28 is located in the current SL 24.
The Memory Record Identifier (MRI) field 36 identifies where in memory the Data Set 22 is stored.
The Data Set Name String field 38 inside an SLE 28 is the name of the Data Set 22. The Data Set Name String field 38 is primarily for presentation to end-users, although it can be used to locate a Data Set 22 as well.
The Number Of Parents field 40 identifies how many times the Data Set 22 is referenced within other SLs 24 of greater significance. The Number Of Parents field 40 is used only when the Parent Identification field 34 contains a null.
The Data Set Type (DST) field 42 identifies the type of the Data Set 22. Figure 4 illustrates some example Data Set 22 types and their respective values.
The Data Set Status (DSS) field 44 identifies the status of the Data Set 22.
The SAI field 46 identifies the location in secondary storage of the Data Set 22.
The SLE Checksum field 48, Data Set Checksum field 50 and Archive Checksum field 52 are used for integrity checking at all stages of data movement and storage.
The Security Code field 54 is used to encrypt and decrypt a Data Set 22.
The Record Size field 56 of the SLE 28 is used to resolve references to records in the Data Set 22.
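The thirteen SLE fields just described can likewise be gathered into an illustrative C structure, building on the identifier sketch above. The field names, the fixed name length, and the integer widths chosen for the type, status, checksum, and security fields are assumptions; the patent does not specify them.

/* A Set List Element; the numbers in comments are the reference elements of Figure 3. */
typedef struct {
    CDI      self;             /* Self Identifier 32: sort and search key         */
    CDI      parent;           /* Parent Identifier 34: null marks a root SLE     */
    uint32_t mri;              /* Memory Record Identifier 36                     */
    char     name[64];         /* Data Set Name String 38 (illustrative length)   */
    uint32_t num_parents;      /* Number Of Parents 40: used when parent is null  */
    uint16_t dst;              /* Data Set Type 42                                */
    uint16_t dss;              /* Data Set Status 44                              */
    SAI      sai;              /* Storage Address Identifier 46: null for a CDS   */
    uint32_t sle_checksum;     /* SLE Checksum 48                                 */
    uint32_t data_checksum;    /* Data Set Checksum 50                            */
    uint32_t archive_checksum; /* Archive Checksum 52                             */
    uint32_t security;         /* Security Code 54: encryption/decryption key     */
    uint32_t record_size;      /* Record Size 56: resolves record references      */
} SLE;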
GENERAL RECORD POINTER
The primary problem with the containment method of storing hierarchy is the difficulty associated with linking data and maintaining the links when the data moves. The solution to the problem is to break the links into two components: a static and a dynamic component. The dynamic component is the SL 24, because the SL 24 identifies the location of information within a relative environment. The static component is a General Record Pointer (GRP).
Figure 5 illustrates the structure of the GRP 26.
The GRP 26 is a 12-byte identifier comprising a 4-byte Site Owner Identifier (SOI) 58 containing a unique identifier associated with the Site, a 4-byte Data Set Identifier (DSI) 60 containing a unique identifier associated with a Data Set 22, and a 4-byte Data Record Identifier (DRI) 62 identifying a record within the Data Set 22.
The GRP 26 is a static component, because a Data Set 22 can move within an SL 24 or be moved to a different SL 24, without affecting the validity of the GRP 26.
The GRP 26 remains valid wherever the Data Set 22 moves, because the Data Set 22 has the same identifier as the GRP 26 and need only be matched to the GRP 26.
COMBINED DATA IDENTIFIER
The combination of an SOI 58 and DSI 60 is also termed a Combined Data Identifier (CDI) 64, which is used to uniquely identify a Data Set 22 on any Site.
Using the SOI 58 to further qualify the identity of a Data Set 22 allows the Data Set 22 to be moved into and out of various Sites.
For example, Site "A" can import Data Set "1" from Site "B". However, there may already exist a Data Set "1" in Site "A". Therefore, the SOI 58 is required to uniquely identify the Data Set as Site "B" and Data Set "1".
In Figure 2, each of the single and double line links is a CDI 64. The double line links 32 are CDIs 64 which identify the associated Data Set 22 via the Self Identifier field 32. The single line links 34 are CDIs 64 which identify another SLE 28 via the Parent Identification field 34. In this way, the hierarchical structure of a Set List 24 can be maintained in its simplest possible form.
FINITE DATA IDENTIFIER
The combination of an SOI 58 and DSI 60 is also termed a Finite Data Identifier (FDI) 66. The difference between an FDI 66 and a CDI 64 is that an FDI
66 only identifies SLs 24. In contrast, a CDI 64 can identify any class of Data Set 22.
EXTERNAL GENERAL RECORD POINTER
When a Data Set 22 is referenced which is not in the current SL 24, or when the SL 24 is not identified, then a search must be performed to locate it. For this reason, an External General Record Pointer (EGRP) is an added convenience to enhance GRP 26 access speed.
Figure 6 illustrates the structure of the EGRP 68.
The EGRP 68 is a 20-byte identifier comprising an 8-byte FDI 66 (including both the SOI 58 and DSI 60 of the desired SL 24) and an 8-byte CDI 64 (including both the SOI 58 and DSI 60 of the desired Data Set 22), and a 4-byte DRI 62. The EGRP 68 identifies a unique Data Set 22 anywhere.
FINITE DATA SETS
Perhaps the most significant of the concepts in the present invention is the Finite Data Set (FDS) concept.
An FDS is a logical entity which identifies relative hierarchical structure. An FDS contains internal hierarchical structure, while external links exist which point to that FDS as a subordinate.
An FDS is represented by an SL 24 which may contain internal hierarchy. Further, an FDS may be pointed to as a subordinate by any number of SLEs 28 in other SLs 24. Therefore, an SLE 28 which points to an FDS can occur in more than one SL 24. Consequently, the FDS may be subordinate to any number of (other) SLs 24. This is how an FDS can have multiple parents in a given information structure hierarchy. Occurrences of SLEs 28 which point to the same FDS may reside in a unique and separate SL 24. In this way, an FDS may have any number and class of Data Sets 22 as its ancestors and descendants within the current structure hierarchy.
The FDS methodology varies substantially from conventional containment methodology because an FDS can have more than one parent. The potential plurality of FDS parents is an important concept in the present invention. The potential plurality of an FDS is a feature not possible in environments which operate using the conventional containment method of information structure storage.
When an FDS has more than one parent, the entire Data Set 22 is, in essence, addressable through all references to it. This does not mean the entire FDS is duplicated. Instead, it means the same FDS can be referenced more than once within one or more SLs 24 of greater significance.
FINITE DATA LIST
Figure 7 shows an example Finite Data List (FDL) 70 for a Site. An FDL 70 is an SL 24 which contains a list of records identifying all FDS's at a Site. Preferably, there is only one FDL 70 at a Site. Each FDL Element (FDLE) 72 is an SLE 28 which points to an SL 24.
For example, the SL 24 pointed to by the first FDLE
72 in Figure 7 contains four SLEs 28. The fourth SLE 28 of the SL 24 represents an FDS. The fourth SLE 28 points to the fourth FDLE 72, which in turn points to another (sub-ordinate) SL 24.
The FDL 70 ignores logical information structure.
The FDL 70 physical structure shown in Figure 7 could also represent an FDS with a logical structure shown in Figures 8A and 8B. In fact, the FDL 70 physical structure could be used for any logical hierarchy whatsoever. This illustrates an important property of the present invention - physical structure is made 2 1 ~
independent of logical structure, thereby supporting any and all logical information structures.
LOCATING DATA SETS WITH THE GRP AND EGRP
A GRP 26 identifies information in terms of: an FDI
66 to identify the SL 24, a CDI 64 to identify a Data Set 22 within the SL 24, and a DRI 62 to identify a record within the Data Set 22. The FDI 66 is not necessary to identify a Data Set 22, because the CDI 64 and DRI 62 can be used alone. However, if an FDI 66 is not provided, then locating a Data Set 22 might require an exhaustive search of all SLs 24 at a given Site for an SLE 28 matching the CDI 64, and potentially at other Sites as well. Therefore, an FDI 66 is a speed saving feature, but is treated as a required component to provide superior data access characteristics.
There are two major sources of an FDI 66: (1) from within the code of an executable program, or (2) as data for that program. When a program provides an FDI 66, data can be assumed to exist within the SL 24 specified by the FDI 66.
The factors which determine the correct GRP 26 format (GRP 26 or EGRP 68) to use, under various conditions, are very straightforward. A GRP 26 is used when a program has a small number of SLs 24 to maintain, and ideally just one SL 24. Data access speed is maximized when a program provides an FDI 66 and all data required for that program exists within the SL 24 specified by that FDI 66.
However, many programs contain or use data which can span several SLs 24. For programs with these data requirements, the EGRP 68 is ideal. An EGRP 68 identifies the correct SL 24, Data Set 22 within that SL
24, and record within the Data Set 22. However, an EGRP
68 is 8 bytes larger than a GRP 26 and should be used sparingly.
Figure 9 depicts the data structures used in the process performed by the FDEP 20 to resolve a GRP 26.
The FDI 66 is used as the key to search the FDL 70 for the appropriate FDLE 72, indicated by reference element "(STEP 1)" in Figure 9. If the FDI 66 does not match any FDLE 72 in the FDL 70, then the FDEP 20 may or may not perform an exhaustive search for the Data Set 22 matching the CDI 64 in all SLs 24. If the FDLE 72 is found, then it will contain an MRI field 36. The MRI
field 36 identifies a specific Memory Maintenance Record (MMR) 74, which is illustrated in Figure 10, and contains a Self Identifier field 36 (an MRI 36), a Memory Address Identifier 76, and a Memory Status Flag 78.
The MMR 74 contains the address of the Data Set 22 when it is in memory. The MMR 74 resides in the Memory Maintenance List (MML) 80, which is illustrated in Figure 11. The MML 80 is a contiguous list of MMRs 74 sorted by the MRI field 36. The MMR 74 then identifies a location in memory where that Set List 24 has been loaded.
If the SL 24 has not already been loaded into memory, then the FDLE 72 will contain a null MRI field 36. If the specified SL 24 is not in memory, then it is loaded into memory by the FDEP 20.
The SL 24 is then searched using the CDI 64 as the key, to locate the SLE 28 representing the Data Set 22.
This is indicated by reference element "(STEP 2)" in Figure 9. If the CDI 64 does not match any SLE 28 in the SL 24, then the FDEP 20 may begin searching other SLs 24. If the appropriate SLE 28 is found, then it will contain an MRI field 36.
If the Data Set 22 has not already been loaded into memory, then the SLE 28 will contain a null MRI field 36. If the specified Data Set 22 is not in memory, then it is loaded into memory by the FDEP 20.
The Data Set 22 is then accessed using the DRI 62 as an index to locate a specific record, wherein the DRI 62 is multiplied by the record size given in the SLE 28 to determine the offset into the Data Set 22 when records in a Data Set 22 are stored as an array. The DRI 62 can also be used as a search key when records in a Data Set 22 are stored as a sorted list. This is indicated by reference element "(STEP 3)" 92 in Figure 9.
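The three steps can be summarized in a hedged C sketch. The opaque types and helper functions below are hypothetical stand-ins for FDEP 20 internals that the patent does not name; only the search-then-load-then-index shape is taken from the text, and the SLE and EGRP types are the illustrative structures sketched earlier.

#include <stddef.h>

typedef struct SL   SL;                       /* an in-memory Set List            */
typedef struct FDLE FDLE;                     /* a Finite Data List Element       */
extern FDLE    *search_fdl(FDI fdi);          /* binary search of the FDL         */
extern SL      *load_set_list(FDLE *fdle);    /* loads the SL if its MRI is null  */
extern SLE     *search_sl(SL *sl, CDI cdi);   /* binary search on Self Identifier */
extern uint8_t *load_data_set(SLE *sle);      /* loads the Data Set if needed     */

void *resolve_egrp(const EGRP *p)
{
    /* STEP 1: find the Set List named by the FDI. */
    FDLE *fdle = search_fdl(p->fdi);
    if (fdle == NULL)
        return NULL;                  /* may instead trigger an exhaustive search */
    SL *sl = load_set_list(fdle);

    /* STEP 2: find the SLE matching the CDI within that Set List. */
    SLE *sle = search_sl(sl, p->cdi);
    if (sle == NULL)
        return NULL;                  /* may instead begin searching other SLs */
    uint8_t *data = load_data_set(sle);

    /* STEP 3: for array-style Data Sets, the record offset is simply the
       DRI multiplied by the record size recorded in the SLE. */
    return data + (size_t)p->dri * sle->record_size;
}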
Figure 12 illustrates a similar process for an EGRP
68. The only variation between GRP 26 and EGRP 68 resolution is the source of the FDI 66. In the case of a GRP 26, the program provides the FDI 66, and in the case of an EGRP 68, the EGRP 68 provides the FDI 66.
These relationships between GRP 26 and EGRP 68 resolution can be seen by comparing Figure 9 and Figure 12.
Figure 13 is a flow chart illustrating the logic flow of the GRP resolution process. At each step of the process, a search may fail. Each failure may result in further and more comprehensive searches. Figure 13 is the top-level process diagram and has three boxes 96, 102, and 86, to indicate the various searching processes illustrated in Figures 14, 15 and 16.
One purpose of the process depicted in Figure 13 is to locate data as quickly as possible. To save time, the current SL 24 is always searched first, as indicated by reference elements 96 and 100 in Figure 13. If that fails, then the next fastest candidates are all the Data Sets 22 (SLs 24) in memory. Searching these SLs 24 is faster than those not yet in memory. This is shown as the search RAM process in Figure 14. If the RAM search still fails, then the only remaining possibility is secondary storage. This is shown as the search storage process in Figure 15. Both the RAM and secondary storage processes call the search SL 24 process, to perform a search in an SL 24. The search SL 24 process is shown in Figure 16. This process searches a given SL
24 for an SLE 28 with a matching CDI 64.
In the present invention, when an SL 24 search for a specified Data Set 22 fails, the FDEP 20 will normally take steps to locate the Data Set 22 in other SLs 24. The nature of the search depends on the nature of the user request for data. The FDEP 20 can easily provide several different data location alternatives, including:
(1) fail when Data Set 22 is not found in current SL 24;
(2) fail when Data Set 22 is not found in any SL 24 currently in memory; and (3) fail when Data Set 22 is not found in any SL 24 at the Site. In selecting the third alternative, the FDEP 20 performs an exhaustive search of all SLs 24 to locate the specified Data Set 22.
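A minimal sketch of how these three policies might be surfaced to callers; the patent describes only the behaviors, so the enumerator names are assumptions.

typedef enum {
    FAIL_IF_NOT_IN_CURRENT_SL,  /* (1) search only the current Set List         */
    FAIL_IF_NOT_IN_MEMORY,      /* (2) also search every SL already in memory   */
    FAIL_IF_NOT_AT_SITE         /* (3) exhaustive search of all SLs at the Site */
} SearchScope;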
A failure on an FDLE 72 search is a different matter. Assume a program requests a Data Set "X" within the SL "Y" via an EGRP 68. If the Data Set "X" had been moved to a different SL and the SL "Y" had been deleted, then an FDLE 72 search failure would occur. The failure would occur during the FDLE 72 search because the FDLE 72 for SL "Y" would no longer exist and all SLs are represented in the FDL 70. Therefore, the FDEP 20 would have two alternatives, which are controlled by flags or alternative data request functions. The first alternative is to fail and inform the program the SL "Y" no longer exists. The second is to perform an exhaustive search of all SLs for the Data Set "X".
The GRP 26 resolution process described above assumes that all SLs 24 and Data Sets 22 are mobile in memory. Specifically, the addresses of these entities change depending on memory availability and the requirement of memory optimization. Memory optimization is a process whereby data (SLs 24 and Data Sets 22) are moved around in memory to increase the size of available contiguous memory blocks. When memory optimization occurs, the address of any given data entity can change as a direct result of an optimization.
However, the present invention supports more evolved memory control devices which allow a memory block to be flagged as static. A static memory block is one which will not move during an optimization. When an MMR 74 for a Data Set 22 is marked "static in memory", a program using the present invention can request the address of a Data Set 22 and thereafter directly access that data in memory. This reduces access overhead, but makes memory optimization less efficient.
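A short sketch of the Memory Maintenance Record of Figure 10 and the static flag just described; the field, flag, and function names are assumptions.

typedef struct {
    uint32_t mri;      /* Self Identifier 36: the sort key of the MML */
    void    *address;  /* Memory Address Identifier 76                */
    uint32_t flags;    /* Memory Status Flag 78                       */
} MMR;

#define MMR_STATIC 0x1u   /* hypothetical "static in memory" bit */

void *pin_data_set(MMR *mmr)
{
    mmr->flags |= MMR_STATIC;  /* the optimizer must now leave this block in place  */
    return mmr->address;       /* the caller may keep and dereference this address  */
}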
DATA SET CLASS
All Data Sets 22 may be qualified by a "class". In general terms, the Data Set 22 class defines the storage classification for a Data Set 22: physical, logical, etc. The currently defined Data Set 22 classes are: Containment Data Set (CDS), Physical Data Set (PDS) and Finite Data Set (FDS).
CONTAINMENT DATA SET
A Containment Data Set (CDS) is a class of Data Sets 22 that is a logical entity used to define a conventional hierarchical information structure, such as the drive/directory/file structures of MS-DOS and UNIX.
Thus, a CDS is similar to volumes, directories, etc., found in conventional containment methodology.
A CDS has exactly one immediate parent. However, the CDS may logically contain any number of other Data Sets 22 as children. The CDS does not identify its children. Instead, all children identify the CDS as their parent. Further, all the children of a CDS exist in the same SL 24.
A CDS is represented by an SLE 28 within an SL 24.
The SAI field 46 of the SLE 28 contains a null value indicating that there is no physical data associated therewith.
PHYSICAL DATA SET
A Physical Data Set (PDS) is a class of Data Set 22 that identifies a physical data entity (e.g., a record, a list). A PDS is used to store actual bytes (or records) of information. A PDS is pointed to by an SLE
28 in an SL 24.
A PDS does not contain any other Data Sets 22 as children. However, a PDS still retains a logical connection by identifying its parent. The parent of a PDS may be a CDS or an FDS.
The location of a PDS is identified by the SAI field 46 of the SLE 28. The SAI field 46 is non-null and typically identifies an address in secondary storage media.
HIERARCHICAL ORGANIZATION RULES (HOR)
There are several rules regarding hierarchical construction in the present invention; these are collectively called Hierarchical Organization Rules (HORs). The first HOR is the Hierarchical Scope Rule (HSR). The HSR states:
"The priority level of a Data Set must be less than or equal to its parent."
The HSR is a simple rule where each Data Set 22 must have an equal or lower priority number than its immediate parent. As mentioned earlier, each Data Set 22 class is prioritized. The prioritization is independent from the class of data. Thus, all Data Sets 22 will be associated with a unique scalar priority number from 1 to N. Gaps in the priority numbers are possible and even desirable.
The second HOR is the Hierarchical Containment Rule (HCR). The HCR states:
"The absolute path to any Data Set is always a PDS, a CDS, or an FDS, where the CDS or PDS is preceded by zero or more CDS's and/or FDS's".
The HCR is an extension of the HSR which ensures that a PDS will never contain another PDS. For example, a file cannot contain another file. A file is a physical entity (a PDS), and therefore the immediate parent of the file is a CDS or FDS. Although it is possible to have PDS's also act as CDS's, it creates chaos and complicates all FDEP 20 processing.
The third HOR is the Hierarchical Recursion Rule (HRR). The HRR states:
"A Data Set cannot directly or indirectly contain any of its ancestors".
The HRR is required to prevent an endless loop during any FDEP 20 process.
To clarify any misconceptions, consider the absolute path example in Figure 8A and consider the effects on the FDEP 20 if Data Set "F" contained Data Set "A".
The process would be as indicated in Figure 17 until "A" is to be resolved. Instead of terminating, the entire process would repeat from Data Set "F" again. The net effect (of any infinite loop) is that the computer would hang. Although it is true that a process could be established to prevent an endless loop or recursion, this same process would have to exist or be called by all programs and processes which traverse a hierarchy.
Further, these cyclic hierarchies would introduce unnecessary complications for most users.
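The three rules lend themselves to simple validation checks, run before a new parent link is committed. The following C sketch is illustrative only; the helper functions are hypothetical stand-ins, and only the rule logic itself comes from the text.

extern int  priority_of(uint16_t dst);           /* scalar priority of a Data Set Type */
extern int  is_pds(uint16_t dst);                /* true for a Physical Data Set       */
extern SLE *find_parent(SL *sl, const SLE *e);   /* NULL for a root SLE                */
extern int  cdi_equal(CDI a, CDI b);

/* HSR: the priority number must be less than or equal to the parent's. */
int hsr_ok(const SLE *child, const SLE *parent)
{
    return priority_of(child->dst) <= priority_of(parent->dst);
}

/* HCR: a PDS can never act as a parent, so a file cannot contain a file. */
int hcr_ok(const SLE *parent)
{
    return !is_pds(parent->dst);
}

/* HRR: walk the proposed parent chain; reaching the Data Set itself would
   create a cycle and hang any process that traverses the hierarchy. */
int hrr_ok(SL *sl, const SLE *node, SLE *proposed_parent)
{
    for (SLE *p = proposed_parent; p != NULL; p = find_parent(sl, p))
        if (cdi_equal(p->self, node->self))
            return 0;
    return 1;
}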
FINITE DATA SETS AND THE HOR
FDS's do not obey the HOR. However, when FDS's are introduced into an existing hierarchical structure, they do not disturb or affect any other Data Sets 22 which do obey the HOR. When a check is made to ensure a new Data Set 22 conforms to the HOR, any FDS's encountered during the test are processed to establish the next SL 24, but are invisible to HOR testing.
FINITE DATA ENVIRONMENT PROCESSOR
A Finite Data Environment Processor (FDEP) 20 is a process which creates, modifies and maintains FDS's and all structures contained within or related to FDS's.
Further, the FDEP 20 also maintains any memory images of SLs 24 or Data Sets 22 requested by a process.
There is exactly one FDEP 20 for a given Site. An FDEP 20 is assumed to have a unique serial number, i.e., the Site Owner Identifier (SOI) 58. Each process using the present invention must access the FDEP 20 to modify the structure and content of information at a Site.
The FDEP 20 is a background process which other processes activate to create, delete or access information, including the following functions:
CREATE DATA SET
DELETE DATA SET
COPY DATA SET
MOVE DATA SET
PROMOTE DATA SET
INSERT DATA SET
The FDEP 20 must be a resident process, meaning it must remain in memory while any processes require it.
If the FDEP 20 is an operating system, then it will remain in memory until the computer is powered down. If the FDEP 20 is a resident driver for a program, then it will have to remain in memory as long as the program remains in memory. Further, the FDEP 20 must have the ability to allocate memory and utilize physical storage devices. Finally, the FDEP 20 should have complete access to all information that may be requested or used through it. The FDEP 20 can exist and operate within a conventional containment-style operating system, although it will operate at processing speeds normally associated with that operating system.
Data movement can occur at many processing levels and can involve anything from moving a single record to moving an entire Data Set 22. In most cases, the movement of data less than a Data Set 22 is the responsibility of the program using the data.
Therefore, this aspect of data movement is covered in the section entitled "Information Structure." The movement of Data Sets 22 can be experienced at any processing level and is the primary focus of this section.
One of the cornerstones of the present invention is the simplicity of redefining relationships between Data Sets 22. A Data Set 22 is moved by altering the Parent Identifier field 34 of an FDLE 72 or SLE 28. Thus, Data Set 22 movement is always performed in terms of the parent of a target location. Further, any entities logically contained within the Data Set 22 being moved are also transported, without actually modifying their SLEs 28 in any way. All containment-style operating systems performing the same activity would have to load, update and save potentially large variable-length child lists. For example, the File Allocation Table (FAT) in MS-DOS is used to access and locate the files in a directory. When a file is moved, two FATs have to be updated, the old parent directory and the new parent directory.
There are currently four fundamental classes of Data Set 22 movement: create, delete, copy and relocate.
Copy and delete functions can be implemented as singular functions where there is only one kind of Data Set 22 copy and one kind of delete. This is not true for relocation operations. When a Data Set 22 is relocated, several distinct operations are possible. Relocation can include operations such as: move, promote, and insert. Although all of these operations involve the movement of one or more Data Sets 22, they have different logical characteristics. For example, the promote function would locate the parent of the current parent and reset the Parent Identifier field 34 of the current Data Set 22. The location of a parent's parent is a logical activity because both the old and new parents can exist in the same SL 24.
In all the following Data Set 22 operation explanations, the name, type and other Data Set 22 information not directly related to linking is assumed to be input. When an EGRP 68 is specified as a required input, this means it can also absorb a GRP 26 and a parent CDI 64 provided by the program. Each operation performs HOR testing to ensure the validity of all hierarchical modifications. HOR testing is performed on each entity modified, including subordinate Data Sets 22 when they are modified. If a subordinate is not modified, then HOR testing is not necessary. Note that all of the operations described below must be re-entrant.
CREATE DATA SET
The "create" operation introduces a new Data Set 22 into a Site. The create operation requires exactly one input: an EGRP 68 to identify the target parent for the new Data Set 22. The create operation is so straightforward that a diagram is not required.
However, it is necessary to point out that when data is imported from other Sites, the create function is not directly used to establish new Data Sets 22. When one or more Data Sets 22 are imported from another Site, they are assumed to already exist. Therefore, one of the RELOCATE class operations would be used. The RELOCATE class operations may in turn invoke the create operation, but the process which triggered the FDEP 20 to activate a RELOCATE class operation would not. If another Site is specifically creating a new Data Set 22 at the current Site, then the create operation would be used.
DELETE DATA SET
The "delete" operation deletes a Data Set 22 and all subordinates from a Site. The delete operation requires exactly one input: an EGRP 68 of the Data Set 22 to delete. The delete operation is so straightforward that a diagram is not required. However, it is necessary to point out that when data is moved to a different Site, using one of the RELOCATE class of operations, the delete command is not directly used. The RELOCATE class operation may call it, but the process which triggered the FDEP 20 to activate this function would not. Note that the delete operation does not physically delete FDLEs 72 or SLEs 28, it simply marks them as deleted.
This is common to almost all data maintenance systems.
COPY DATA SET
A Data Set 22 can be duplicated via the "copy"
operation. When a Data Set 22 is copied, a new Data Set 22 is created. The new Data Set 22 is exactly the same in terms of data content and control linking. However, the new Data Set 22 is assigned a new and unique Self Identifier field 32 (which is a CDI 64). Note that the change in the Self Identifier field 32 is critical to the present invention, because every Data Set 22 must have a unique Self Identifier field 32.
After a successful copy, space consumption is doubled because an exact copy of the Data Set 22 is added to the existing Site. Figure 18 depicts a simple copy operation. Data Set "F" is being copied directly under Data Set "A". After the operation a new Data Set "H" will be an exact duplicate of Data Set "F".
Figure 19 depicts a copy operation which includes subordinate hierarchy. In this operation, the subordinate Data Sets 22 are also duplicated and each will also have a new Self Identifier field 32. Further, the Parent Identifier field 34 of the subordinate Data Sets 22 identifies the new parent. In Figure 19, Data Sets "I", "J" and "K" are identical to Data Sets "E", "F" and "G", respectively, but "I", "J" and "K" all have a parent of "H". The copy operation performs a test to establish if other Data Sets 22 are immediately subordinate to the Data Set 22 being copied. When a subordinate is located, the Data Set 22 copy is also performed on it. This process can be recursive or iterative. When a copy is performed on a root Data Set 22, it can be used to transfer entire volumes of information to another storage media, or to duplicate that information in the same media. Under all conditions the original Data Set 22 remains unchanged.
MOVE DATA SET
The "move" operation moves a Data Set 22, and all subordinates, from one location to another within a single SL 24 or to a different SL 24. Figure 20 depicts Data Set "E" being moved from a parent of "C" to a parent of "Bn. None of the subordinate Data Sets 22 are changed in any way during a move operation. The Parent Identifier field 34 in each SLE 28 still identifies the same parent. However, all subordinates are logically promoted in terms of future references.
PROMOTE DATA SET
The "promote" operation moves a Data Set 22 to the same level as its current parent. This is similar to the move operation, except the only input for this function is an EGRP 68 for the source Data Set 22. The promote operation uses the Parent identifier field 34 of the source Data Set 22 to locate the target location.
Figure 21 depicts Data Set "E" and its subordinates being promoted. None of the subordinate Data Sets 22 are changed in any way during a promote operation. The Parent Identifier field 34 in each SLE 28 still identifies the same parent. However, all subordinates are logically promoted in terms of future references.
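A sketch of promote, with find_sle as an assumed lookup-by-CDI helper: the new parent is simply the grandparent recorded in the parent's own SLE 28.

    typedef struct { unsigned long soi, dsi; } CDI;
    typedef struct { CDI self, parent; } SLE;

    extern SLE *find_sle(CDI id);     /* locate an SLE by its Self Identifier;   */
                                      /* assumed to return 0 for a null CDI      */
    int promote_data_set(SLE *e)
    {
        SLE *parent = find_sle(e->parent);
        if (!parent)
            return 0;                 /* e is already a root; nothing above it   */
        e->parent = parent->parent;   /* rise to the grandparent's level         */
        return 1;
    }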
INSERT DATA SET
The "insert" operation inserts a Data Set 22 between an existing Data Set 22 and its parent. This demotes all subordinate Data Sets 22, but does not modify any of the control records for these Data Sets 22. In essence, the insert operation creates a new Data Set 22 and changes the Parent Identifier field 34 of the target Data Set 22 to that of the newly created Data Set 22.
Figure 22 depicts a Data Set "X" being inserted in front of Data Set "B". The new Data Set "X" absorbs the Parent Identifier field 34 of Data Set "B" and creates "X" as a child of "A". The Data Set "B" is modified to have a parent of "X". The input for this example of insert is an EGRP 68 for Data Set "B".
The same logic applies when a Data Set 22 is to be inserted in front of a root Data Set 22, as shown in Figure 23. The only thing to note is that the Parent Identifier field 34 of Data Set "A" is null. Thus, the parent of Data Set "Y" is null after the insert operation is complete.
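Both insert cases, mid-hierarchy and root, collapse to the same two assignments, as this sketch suggests (create_sle is an assumed helper; a null parent CDI propagates naturally for the root case of Figure 23):

    typedef struct { unsigned long soi, dsi; } CDI;
    typedef struct { CDI self, parent; } SLE;

    extern SLE *create_sle(CDI parent, const char *name);  /* assumed create */

    SLE *insert_data_set(SLE *target, const char *name)
    {
        SLE *x = create_sle(target->parent, name);  /* X absorbs B's old parent */
        target->parent = x->self;                   /* B becomes a child of X   */
        return x;                                   /* subordinates untouched   */
    }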
INFORMATION STRUCTURE
A Data Set 22 is a logical and physical information structure. However, a Data Set 22 is also a data entity because it can be created, moved, duplicated, deleted, linked, or changed for content. In general, information structure is affected by a multitude of requirements and restrictions. The issues pertaining to the present invention are discussed in the following paragraphs.
When any information structure is moved, the presence of information references to data outside the structure must be considered. If the structure is totally self contained, meaning there are no references to data outside the structure, then it can be moved without incident. However, an information structure which contains references to data not contained within the same structure may have restrictions regarding where the structure can be moved. Further, references to external data may have spatial dependencies which prohibit data movement outside of a defined scope. When data movement is limited in this way, there were, prior to the present invention, only three solutions. First, do not move the data outside of the defined scope. Second, also move the data related to the external references. Third, make sure the data related to the external references exists at the target location. The present invention solves this problem by retaining the validity of all data links, even after the data has moved.
Different developers may use different schemes of information structure to represent data, where the data may or may not be of the same kind. Also, the same developer often uses different information structure schemes to represent different and general classifications of data. Any system that uses more than one database faces the possibility that the databases are completely different in structure. For example, a system that provides both accounting and task management could be faced with a relational, as well as a hierarchical database. The processing requirements of these two database types are different, which means the system contains particular logic (code) for both database types. In other words, the system must be aware of the database type and take conscious steps to retrieve data according to that type. The present invention directly supports all containment (e.g., hierarchical database, directory-file) as well as non-containment (e.g., relational database, network database) logical organizations. The present invention enables all logical organizations directly because each GRP 26 is a DIRECT link to data, no matter what logical organization is present and no matter where that GRP 26 occurs in that logical organization. By adding new Data Set 22 Types to the existing set, the system developer can introduce and use new logical organizations. This would inform both the system (program) and the FDEP 20 when that data is accessed in the future, but would not alter GRP 26 processing by the FDEP 20 or access by the program. In this way, the number of potential information structures (or logical organizations) directly supported by the present invention is virtually unlimited. In contrast, containment methodology is highly limited and requires special processes to control more complex information structures.
The direct linking capability of the present invention enables direct linking of incongruent data structures. For example, consider a binary tree where each node is also an element of a linked list. The record can directly identify the next and previous elements, as well as the left and right nodes (in the tree). The system would still contain the logic to traverse the tree as well as the linked list. However, the code required to get (locate, load, return address) a node in the tree would be identical to the code required to get an element of the linked list.
Further, the processing performed by the FDEP 20 to locate, load and return the address, would also be identical. The same would be true for data structures of any complexity with any number of compound links, where a compound link is a field in a record capable of identifying at least two distinct and different information bodies. For example, a binary tree whose every node identifies either the root of another tree or the head of a linked list. To identify the secondary tree or the list, the same GRP 26 field, in the node's record, can be used. This is possible because relationships to data can be duplicated (several GRPs pointing at same data) while any one GRP 26 field in a record can point at any data regardless of that data's logical organization.
Cyclic information patterns can be accurately represented. For example, a linked list which is linked to another linked list in a three dimensional model, but cycles back on one or more of the dimensions.
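A record participating in several incongruent structures at once can be sketched as a single C layout in which every link is a GRP 26. The field names below are illustrative assumptions; the point is that all five links are resolved by the identical FDEP 20 process.

    typedef struct { unsigned long soi, dsi, dri; } GRP;  /* 12-byte pointer */

    typedef struct {
        GRP left, right;     /* binary-tree links                          */
        GRP next, prev;      /* linked-list links                          */
        GRP other;           /* compound link: may identify a subtree root */
                             /* or a list head, as described above         */
        /* ... record payload ... */
    } Node;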
When using the present invention, the program designer must be aware of the specific and overall differences between GRP 26 and EGRP 68 usage.
Physically, when GRPs 26 are used as data links (or pointers), the overall and individual size of data is less than when EGRPs 68 are used. This is a superficial difference and should not be used as a sole deciding factor between the two methods. First, the designer must decide what the general relationship between his program and its data is. This can be expressed as either the program "drives" the data, or the data "drives" the program. In the first case, the program consciously directs the steps to locate, retrieve, and maintain the data. In the second, the program accesses data by referencing its identifier without taking any conscious steps to otherwise identify that data. When GRPs 26 are used, the program is driving the data. This could be done by a program when it saves the parent FDI
66 of the currently requested Data Set 22, and supplies it to the FDEP 20 on each subsequent GRP 26 request.
Note that the program is not required to save this information; it is optionally saved to reduce GRP 26 processing overhead. On the other hand, when EGRPs 68 are used, the data "drives" the program. The program simply passes the identifier (the EGRP 68) to the FDEP
20. The parent FDI 66 is already inside the EGRP 68.
Thus, the same result that a program achieves by saving the parent is obtained without the program even knowing about this information or its related processing. Another advantage of EGRPs 68 is that a program can access a completely alien piece of data without any special knowledge. The data may be on another Site and of a form not known to the program accessing that data.
Using EGRPs 68, this is not a problem because all information about the data (and how to retrieve it) is already contained in the associated SLE 28. This enables the FDEP 20 to properly retrieve that data without program interference, pre-knowledge or specialized code.
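The contrast between the two styles can be reduced to two hypothetical FDEP 20 entry points; the function names below are assumptions used only to make the calling conventions concrete.

    typedef struct { unsigned long soi, dsi; } CDI;       /* a parent FDI has this shape */
    typedef struct { unsigned long soi, dsi, dri; } GRP;
    typedef struct { CDI fdi; CDI cdi; unsigned long dri; } EGRP;

    /* Program drives the data: the caller may pass a parent FDI it saved
     * from an earlier request to reduce GRP processing overhead.          */
    extern void *fdep_get_by_grp(const GRP *grp, const CDI *saved_fdi_or_null);

    /* Data drives the program: the EGRP already carries the parent FDI,
     * so a single identifier is all the program ever supplies.            */
    extern void *fdep_get_by_egrp(const EGRP *egrp);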
THE PATH CONCEPT
The present invention eliminates the need for paths for identifying the location of data. However, in some instances the construction of an absolute path is necessary for end-users. Unlike a program, end users may not understand GRP 26. Therefore, they may be presented with names (text) to identify the Data Sets 22.
In the present invention, path construction can occur from two directions: top-down and bottom-up. Top-down means all descendants of a given Data Set 22 are traversed (and recorded on demand). Bottom-up occurs when ancestors of a given Data Set 22 are traversed (and recorded). The current Data Set 22 must be supplied to both path traversal processes to identify the start of traversal. This can be passed as a GRP 26, EGRP 68, or FDI 66. For simplicity, the rest of this document only mentions FDIs 66 when referring to process input(s).
Note that an FDI 66 is the smallest input, in bytes (or parameters), that performs exactly the same processes.
In top-down, all descendants of an initial Data Set 22 must be found and accumulated. This traversal is dependent on the class of the Data Set 22. If the initially supplied Data Set 22 is of class PDS, then the process successfully terminates because a PDS cannot have any descendants. If the initially supplied Data Set 22 is a CDS or FDS, then the process continues in a recursive manner.
This process is shown as the method "TOP_DOWN" in Table I. If a CDS is encountered in traversal, then the current SL 24 is searched for all SLEs 28 whose parent is the given CDS. Once all these Data Sets 22 have been located, the process recurses on each of these Data Sets 22 to determine their descendants.
If an FDS is encountered in traversal, then the SL
24 is searched for all root SLEs 28 as indicated by a null Parent Identifier field 34. The root SLEs 28 are the immediate children of the FDS, and again the process recurses on each one.
At each recursive level, a PDS class causes the successful termination of that recursive call. The process terminates when all descendants have been found (i.e., all recursion levels are terminated).
Also, as shown in Table I, the step "DISPLAY OR ACCUMULATE NAME OF CURRENT_DS" is performed for all Data Set 22 classes. Depending on the purpose of a top-down function, the Data Set 22 names could be immediately displayed or accumulated in a structure for later use.
If accumulated, the output from a top-down process can be large, even if only names are recorded at each step.
This process is useful when operating system shell programs (e.g., Norton Commander, MS-DOS 5.0, etc.) need to display and process information in secondary storage (and memory) according to a hierarchy.
In top-down traversals, FDS's with multiple parents do not affect processing because the direction of traversal makes the process unaware of (blind to) parent links altogether.
In bottom-up, the ancestors of a given Data Set 22 must be traversed, until a root SLE 28 in the MSL 30 is reached. This establishes the absolute path to the Data Set 22. The input FDI 66 locates an SLE 28 within the current SL 24, which has a parent. The SLE 28 for the parent is then searched for and located. The parent SLE
28 may be in the same SL 24 or it may exist in another SL 24. In this way, traversal encompasses internal as well as external SL 24 processing. This process continues until a true root (the root SLE 28 in the MSL
30) is encountered. At that time, the fully qualified path has been established by concatenating the Data Set Name String field 38 of the SLEs 28 at each step of traversal. The concatenation at each step is performed at the head of the current accumulated textual string: the current name is prepended to the existing accumulated string, and the result becomes the accumulated string for the next step of the process.
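A sketch of the bottom-up construction in C, assuming hypothetical helpers find_sle and is_null_cdi and a backslash separator (the patent does not fix a separator character here):

    #include <stdio.h>
    #include <string.h>

    typedef struct { unsigned long soi, dsi; } CDI;
    typedef struct { CDI self, parent; char name[64]; } SLE;

    extern SLE *find_sle(CDI id);       /* may search beyond the current SL */
    extern int  is_null_cdi(CDI id);    /* a null parent means a true root  */

    void build_path(const SLE *e, char *out, size_t cap)
    {
        out[0] = '\0';
        while (e) {
            char tmp[1024];
            snprintf(tmp, sizeof tmp, "\\%s%s", e->name, out);
            strncpy(out, tmp, cap - 1);
            out[cap - 1] = '\0';        /* prepend at the head each step */
            e = is_null_cdi(e->parent) ? 0 : find_sle(e->parent);
        }
    }

Given the chain C under B under A, successive iterations accumulate "\C", then "\B\C", then "\A\B\C".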
The construction of an absolute path in the bottom-up direction can present a problem when an FDS with more than one parent is encountered. The construction of an absolute path is only a problem if an FDS chain has not been established. An FDS chain is a hierarchical set of FDS ancestors, for a CDS or FDS, which leads to the current FDS.
An example of this is a user traversing through the hierarchy of several FDS's to reach a desired or target FDS. Each FDS traversal is recorded and constitutes part of the FDS chain. When an absolute path is required, the current or target FDS would use the FDS's identified in the chain as parents.
Note that FDS chain methodology is only necessary when an occurrence of an FDS with more than one parent is encountered. FDS chain methodology does not affect GRP 26 resolution in any way because a root SLE 28 of class PDS or FDS in the current FDS contains all information required to resolve the reference.
Note also that the bottom-up process to construct a path only occurs when a user consciously requests a path string. This process has no effect on critical GRP 26 processing.
NAMING CONVENTIONS
In existing systems, names for data (and/or paths) are always strings, where a string is a sequence of characters. String formats in current industry range from just a terminator, to complicated control areas at the start of (or before) the string. In most prior art, data is located by using the name. This makes data access dependent on strings. Witness to this fact is the large number of patents held by IBM, APPLE, etc., that provide new and efficient ways of string processing. To these developers, strings are not a convenience; they are an integral and crucial part of data location, access, and maintenance. While efficient string processing algorithms are extremely useful for word processing and formatting applications, they are not efficient when data access and location are considered. The problem is not the algorithms; it is that string processing will always be slower than a process which uses binary identifiers. Further, if strings are used as identifiers, then they must be stored in RAM by any program using the data, as well as stored in any data record which wants to point to another piece of data. This makes the memory required for a program and the size of data records much larger.
This increase is further complicated by the fact that in most instances, the name strings are variable length.
Ultimately, strings used as data identifiers are always bulky and awkward to maintain and process.
The present invention enables a Data Set 22 to be uniquely identified (across any number of Sites) via a GRP 26. As a result, no string processing is ever required. For the present invention, names are a convenience for the user's sake. A program or user adopting the present invention may still use strings as identifiers, but processing overhead will be greater than a strict GRP 26 or EGRP 68 access. Locating a Data Set 22 by its name is generally the same process as GRP
26 resolution, except the Data Set Name String field 38 is used as the key instead of the Self Identifier field 32. The increase in processing overhead is substantial because a string comparison is notably slower than a binary number comparison.
However, a program which presents data to a user on a strictly name basis may encounter problems. When such a program locates the first SLE 28 (or FDLE 72) with a matching name, it would stop and assume it has found the correct data. This, however, is an incorrect assumption for two reasons. First, a single directory may contain multiple occurrences of data with the same name.
Second, an SL 24 can contain multiple occurrences of data with the same name, but located in different directories. Recall, an SL 24 is a list. Therefore, a direct search would yield an occurrence of data with that name, not necessarily the correct occurrence or even the correct directory. Note that a program or FDEP
20 can prevent name duplication at specific levels in an SL 24 or FDL 70 hierarchy, thus eliminating this problem. The FDEP 20 can accomplish this by setting a flag in the FDLE 72 to prevent name duplication. A
program can prevent name duplication when required by searching for that name before saving data to secondary storage. A program can hierarchically order information such that name oriented conflicts are prevented and a different area in the hierarchy can still have several bodies of data with the same name.
Information ordering, method of access and resultant problems are a direct responsibility of the program using the data. The FDEP 20 does not perform any program specific logic. This is not a limitation of the present invention, it is a reality of computing; a program does exactly what you tell it to do, whether it is what you wanted or not.
The problem of multiple occurrences of data with the same name is an old one and is found in all digital computing environments which identify data by name. For instance, UNIX, MS-DOS, OS/2 and WINDOWS use the concepts of CURRENT WORKING DIRECTORY
and SEARCH PATH. Assume a file "FRED.XXX" exists in the CURRENT WORKING DIRECTORY and one or more of the directories identified in the SEARCH PATH. Further assume that ".XXX" is an extension which will cause the search path to be used. Under these conditions the copy of "FRED.XXX" in the CURRENT WORKING DIRECTORY would execute. However, if "FRED.XXX" does not exist in the CURRENT WORKING DIRECTORY, then the first occurrence encountered in the search path would be executed. Note that the operating system would execute the first occurrence, not necessarily the correct one. Therefore, the problems associated with having multiple files with the same name are common to all operating systems which identify data by name. Also note that if "FRED.XXX" is not found in the CURRENT WORKING DIRECTORY or in the SEARCH PATH, then the operating system will return an error, normally as "FILE NOT FOUND".
The present invention has the ability to support the concepts of CURRENT WORKING DIRECTORY and SEARCH PATH, but they are accomplished differently. In the present invention the concept of CURRENT WORKING DIRECTORY can be accomplished at the SL 24 or directory level, by limiting SL 24 searching to those Data Sets 22 with a specified parent CDI 64. The same is accomplished in an SL 24 by limiting a search to a specific SL 24.
However, under normal conditions the present invention goes beyond the limitations imposed by the use of CURRENT WORKING DIRECTORY or SEARCH PATH, to enable a feasible search of all available storage media to locate a requested entity. In this case, "feasible" means such a search is possible in less time than the average program is willing to wait for data. Clearly, this
search is generally only "feasible" when the present invention is used (i.e., GRPs 26, not name strings).
In any valid data linking methodology, a non-unique identifier can only occur once at any given level in an information hierarchy. When data is identified by name, two entities may not have the same name at the same position in the hierarchy. For example, in MS-DOS or UNIX, if the file FRED.XXX is in the directory ALICE, then no other file in that directory can be called FRED.XXX.
However, in the present invention it is possible for two or more entities to have the same name, since the name is not the key used to identify data. Rather, the GRP 26 is. This property of the present invention may seem alien, but it does have an additional benefit.
Assume that two files called FRED.XXX are created by two separate users in two different directory areas. In the prior art, if one of the files is copied to the other's directory area, then the target file will be overwritten. This is common in networks, mini and mainframe computers, especially when more than one user is working in the same directory.
This problem can be easily avoided in the present invention. Even if the supplied identifier was a name string, eventually an SLE 28 will be found (if data exists). The SLE 28 contains the Self Identifier field 32 which is a CDI 64 uniquely identifying the Data Set 22. If the CDI 64 of the located Data Set 22 and the target Data Set 22 are not the same, then no overwrite occurs, and a new Data Set 22 with the same name is created. In this way, accidents as a result of different users using the same name are almost impossible in the present invention.
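The guard can be sketched as a CDI comparison performed after the name search; overwrite_data and create_sle_like are assumed helper names.

    typedef struct { unsigned long soi, dsi; } CDI;
    typedef struct { CDI self, parent; char name[64]; } SLE;

    static int same_cdi(CDI a, CDI b) { return a.soi == b.soi && a.dsi == b.dsi; }

    extern SLE *create_sle_like(const SLE *src);    /* new Data Set, same name */
    extern void overwrite_data(SLE *dst, const SLE *src);

    void store(SLE *located, const SLE *incoming)
    {
        if (same_cdi(located->self, incoming->self))
            overwrite_data(located, incoming);      /* genuinely the same Data Set */
        else
            create_sle_like(incoming);              /* same name, new Data Set     */
    }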
The present invention has the ability to maintain several files with the same name, which contain different data or different versions of the same data.
For example, a program may edit a given body of data and the user can have the option of saving that data on an arbitrary basis. If the program stores each copy of the data as a discrete Data Set 22, with the same name, then the user can access previous versions of that data with a minimum of effort. In essence the ability to maintain several versions of data with the same name supports or gives rise to a very powerful UNDO capability.
DATA TRANSIENCE
Using SLs 24, the present invention permits contiguous hierarchical structures to exist across one or more physical storage devices. An SL 24 may or may not be on the same physical storage device as the information it represents. Further, each SLE 28 within an SL 24 can identify information existing on different physical storage devices. The principle is to allow structure information and physical data to be mobile while keeping all links valid.
The capability of a Data Set 22 (or SL 24) to encompass more than one physical device has many direct applications. Consider a standalone computer which contains the following I/O devices: Floppy, Hard-drive, CD-ROM drive, Floptical, Tape-drive, ROM, RAM.
Normally, a separate device driver would have to be used for each device.
Further, to transfer data between the devices, often several systems (programs) are required. Using SLs 24, all such drivers and low-level systems are centralized through a unique driver, namely the FDEP 20. The present invention accomplishes this centralization while reducing the access overhead and simplifying the memory maintenance duties of the FDEP 20.
REMOVABLE STORAGE MEDIA
Removable Storage Media (RSM) refers to a combination of one receptacle (or disk drive) and a potentially large number of storage disks (e.g., floppy, tape-drive, removable hard-drive, etc). Currently many ways of having one logical data entity across multiple RSM disks exist. All such techniques are particular to the specific needs of a single program (or group of programs). As a result, the transfer of data between various RSM is cumbersome and complicated. In the present invention, part or all of a Data Set 22 can be defined to be an RSM.
When a Data Set 22 exists on an RSM, its identifier (FDLE 72 or SLE 28) can exist on permanent media, which allows each RSM disk to be defined for direct access.
In Figure 24, an SAI field 46 is further defined as having three distinct fields: Device Number 166 (2 bytes), Remote Disk Number 168 (4 bytes) and Offset Within Device 170 (6 bytes). The Device Number field 166 provides for up to 2^16 (roughly 65 thousand) possible devices, more than sufficient for even the largest mainframes. The Remote Disk Number field 168 provides for up to 2^32 (roughly 4 billion) disks per physical device. The Offset Within Device field 170 provides for 2^48 (or trillions of) bytes within each physical device.
Although these numbers may seem excessive now, consider the increase in the capacity of storage devices over the last 10 years.
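The Figure 24 decomposition can be sketched as a C structure. C has no native 6-byte integer, so the offset is shown as a byte array; real code would also need packing directives (omitted here) to guarantee the unpadded 12-byte layout.

    #include <stdint.h>

    typedef struct {
        uint16_t device_number;   /* field 166: up to 2^16 devices          */
        uint32_t remote_disk;     /* field 168: up to 2^32 disks per device */
        uint8_t  offset[6];       /* field 170: 48-bit offset within device */
    } SAI;                        /* 2 + 4 + 6 = 12 bytes, unpadded         */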
In most microcomputer environments it would be possible to assign a unique device number to each standard or RSM disk. However, this is a highly limited solution because a micro-computer will normally have only a fraction of the physical devices or RSM disks found on a mainframe-computer. The idea is to keep the structures standard across all computing environments.
When a request is made for data on that RSM disk, the FDEP 20 can directly or indirectly cause a message to appear, which prompts the user to insert a specific disk into a specific device to complete a data request.
Once the RSM disk has been inserted, processing can commence normally.
TABLE I
;;CURRENT_DS   FDI, GRP, OR EGRP TO STARTING AND ON-GOING
;;             DATA SET FOR TRAVERSAL
;;CUR_CLASS    DATA SET CLASS FOR CURRENT_DS

CALL TOP_DOWN ( MY_DATA_SET )    ;;START RECURSION

PROCEDURE TOP_DOWN ( CURRENT_DS )
    DISPLAY OR ACCUMULATE NAME OF CURRENT_DS
    CUR_CLASS = DETERMINE CLASS OF DATA SET CURRENT_DS
    SELECT ( CUR_CLASS )
        WHEN ( PDS )
            RETURN
        WHEN ( CDS )
            SEARCH FOR ALL SLES WHOSE PARENT IS CURRENT_DS
            FOR ( EACH LOCATED SLE ) CALL TOP_DOWN ( SLE )
            RETURN
        WHEN ( FDS )
            LOAD SL IF NOT ALREADY IN MEMORY
            SEARCH FOR ALL SLES WITH A NULL PARENT ( ROOT SLEs )
            FOR ( EACH LOCATED SLE ) CALL TOP_DOWN ( SLE )
            RETURN
    END_SELECT
END_PROCEDURE
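For comparison, a C rendering of Table I is sketched below under assumed helper names (ds_class, emit_name, children_of, root_sles); the pseudocode above remains the authoritative form.

    typedef enum { PDS, CDS, FDS } DsClass;
    typedef struct SLE SLE;     /* opaque here; field details are not needed */

    extern DsClass ds_class(const SLE *e);
    extern void    emit_name(const SLE *e);            /* display or accumulate    */
    extern SLE   **children_of(const SLE *e, int *n);  /* CDS: same-SL children    */
    extern SLE   **root_sles(const SLE *fds, int *n);  /* FDS: null-parent SLEs,   */
                                                       /* loading the SL on demand */
    void top_down(const SLE *cur)
    {
        emit_name(cur);
        int n = 0;
        SLE **kids = 0;
        switch (ds_class(cur)) {
        case PDS: return;                      /* leaf: this recursion level ends */
        case CDS: kids = children_of(cur, &n); break;
        case FDS: kids = root_sles(cur, &n);   break;
        }
        for (int i = 0; i < n; i++)
            top_down(kids[i]);                 /* recurse on each located SLE */
    }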
CONCLUSION
This concludes the description of the preferred embodiment of the invention. In summary, a method and apparatus for data storage and retrieval have been described. In a computer having one or more secondary storage devices attached thereto, a Finite Data Environment Processor (FDEP) manages Data Sets residing on the secondary storage devices and in memory using Set Lists (SLs) and General Record Pointers (GRPs). The Data Sets contain either data or logical organizational information. The Set Lists comprise Data Sets organized into a hierarchy. The General Record Pointers identify information in terms of Data Sets and records within them. Using the principal idea that a Data Set is uniquely identifiable, the present invention eliminates problems normally associated with referencing the location of data after the data has been moved.
The foregoing description of the preferred embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching.
It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.
One problem, which has persisted throughout the history of digital computing, is the inability to directly identify a data entity, move it, and still maintain all references to the entity.
For example, consider the state-of-the-art methods of information storage. One state-of-the-art method is a containment technique where an entity (directory) contains a list of subordinate entities (directories or files). Although the containment method is highly effective in ordering information, it does not support the general ability to access information after the information has been moved. Moving data within a hierarchy inevitably leads to the invalidation of existing links to the information moved. There are three prior art solutions to this problem.
The first solution is the concept of a current or working directory. A current directory is a state where a program identifies a particular location within secondary storage and all operations assume information is located in this directory or is subordinate to it.
The second solution is an absolute path. An absolute path is where a program identifies information by a stream of hierarchically dependent entities which are concatenated together to form an absolute data specification. Thus, if "C" is within "B" and "B" is within "A", then an absolute path would be "\A\B\C".
The third solution is the concept of search path. A
search path is a list of predefined locations where programs and sometimes program data are stored. If a body of requested data is not within the search path or the current directory, then it is assumed not to exist.
Note many systems use combinations of these methodologies.
However, there remains the problem, common to all the state-of-the-art methodologies defined above, of maintaining information links to data which has moved.
Consider a program which accesses data with an absolute path \A\B\C. If the data located in C is moved to a different location, then the program will not be able to locate the data it needs. Further, consider a program which uses the concept of current directory, where the data file is moved out of the current directory.
Finally, consider the program which uses data located in a search path, where the data is moved to a directory not in the search path. In all of these instances the data is effectively lost to the program. The same reasoning applies to data within a database. Consider the examples above but replace directories with files and files with sets of records.
Thus, the problem is how to retain direct information links when data is moved. In almost all cases, the information link is recorded in a static image, such as a program or in a database.
The solution to the problem of directly linking information and still having the ability to move it is dependent on altering some of the basic perceptions regarding information structure. The state-of-the-art approach is to use volumes, directories and files to identify each data entity, where volumes and directories are methods of containment. Specifically, a volume physically contains directory identifiers and directories physically contain file identifiers. Thus, all files and directories identified within a volume
must physically exist within that volume.
The basic premise of the containment method of information structure storage is that a hierarchy is based on more significant entities physically containing less significant entities. Although this premise works, it is not efficient in that it prohibits several of the characteristics identified below. The following are perceptional modifications required to understand the present invention.
First, a data object may logically belong to a larger data object, but that does not mean it has to be physically contained within it. For example, a directory can contain many file identifiers, but that does not mean the file identifiers must be physically contained in the directory. In the present invention, each file identifier can identify the directory it belongs to and a more effective information structure storage method can be used.
Second, all information structure is hierarchical to some degree. When information ceases to have a hierarchical structure (for example relational), the existing containment method (volume, directory, file) of storing hierarchical structure does not work unless additional intermediate processes are used. The present invention can support direct links to information at a fundamental level and can therefore directly support non-hierarchical information structures such as relational or object oriented, where the containment method cannot.
Third, the amount of data structure required by any process may vary in size, depth and width. Therefore, any mechanisms which cannot handle extremes in size, depth and width are ineffective. For example, consider a directory with 10,000 file identifiers and the problems associated with its use and maintenance when in memory. The present invention experiences similar space restrictions, but does not experience as many or as severe problems as containment methodology with regard to use or maintenance.
Fourth, the amount of space required to uniquely identify a data entity deep within a hierarchy can be excessive using standard containment methodology. The containment method of uniquely identifying a data entity becomes progressively less efficient as depth within a hierarchy increases. For example, the following are two strings to identify a body of data deep in a hierarchical structure:
1 - "C:\ACC_PROG\YEAR1992\ACCOUNTS\ONTARIO\TOR
ONTO\CASH.DB"
2 - "C:\MY_DISK12\VOLUME_A\UTILS_AC\ACC_PROG\Y
EAR1992\ACCOUNTS\ONTARIO\TORONTO\CASH.DB"
The first data identifier is over 50 bytes long and could easily be much longer as depicted in the second data identifier, which is over 80 bytes. In the present invention, a direct reference is a maximum of 20 bytes long, regardless of how deep the reference is within a hierarchy.
Fifth, it is safe to assume that information moves.
The containment method of storing information structure requires the data identifier to move when the data moves. This produces two problems, the overhead of moving the identifier and more importantly, the problems associated with losing links to the data moved. The present invention ensures that data links are retained, regardless of where the data is moved to.
Finally, information accessed via direct links is faster than information accessed via indirect links. The containment method of data-structure-storage uses names to identify a data entity. This means that the location of a data identifier is established by a search, which is seldom a binary search and never an aggregate indexed reference.
As a result, the overhead associated with locating a body of data using the containment methodology is slow
and cumbersome. The present invention's method of data-structure-storage is efficient because data location is either a binary search or an aggregate indexed reference.
Examples of prior art methods of information storage include European Patent Application No. 0-410-210-A2 by IBM entitled "Method for Dynamically Expanding and Rapidly Accessing File Directories," European Patent Application No. 0-040-694-A2 by IBM entitled "Method and System for Cataloging Data Sets," and European Patent Application No.
0-474-395-A3 by IBM entitled "Data Storage Hierarchy With Shared Storage Level." These prior art methods are discussed in more detail below.
European Patent Application No. 0-410-210-A2 discloses a computer-implemented method for the name-oriented accessing of files having at least zero records, any access path to files and records through an external store coupling the computer being defined by a pair of related directories. A first directory of record entries is sorted on a two-part token. The token consists of a unique sequence number assigned to the record and the sequence number of any parent record entry. Each record entry includes the token, file or record name, and external store address or pointer. A traverse through the tokens constitutes a leaf-searchable s-tree. Rapid access to target records is by way of a name-sorted, inverted directory of names and tokens as a subset and which is reconstitutable from the first directory in the event of unavailability.
European Patent Application No. 0-040-694-A2 discloses a data set catalog structure that eliminates the requirement for base catalog/data volume synchronization in a multi-processing environment while enabling the operating efficiency of directly addressing the data volumes.
The catalog is distributed between a keyed sequential base catalog and, on each data volume, an entry sequential data volume set. Catalog information which must be synchronized with application data sets is stored in
volume records in the volume data set. The method of the invention operates to use and maintain a data base catalog to open a user data set, the data base catalog including a first keyed data set and, on each volume containing user data sets, a second keyed data set, comprising the steps of searching said first keyed data set for a direct pointer to a first volume record in said second keyed data set, comparing the key of the user data set to be opened with the key of said first volume record, and if the keys match, opening for access the data set addressed by a direct pointer in said first volume record. If the keys do not match, searching said second keyed data set for a second volume record containing the direct key, updating the direct pointer in said first keyed data set to address said second volume record, and opening for access the user data set addressed by a direct pointer in said second volume record.
European Patent Application No. 0-474-395-A3 discloses a data storage hierarchy which inherently allows for a level 1 storage file to be uniquely identified across an entire network. A directory naming convention is employed which includes an internal identifier and a name for each subdivision on the network. Because each file can be uniquely identified across the network, a single level one storage space in a file space, or a directory therein, can be used for the entire network.
Also, because of the inherent uniqueness of the naming system, common DASD control files, otherwise required to map between level one storage files and their level zero source files, can be eliminated.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method and apparatus for data storage and retrieval. In a computer having one or more secondary storage devices attached thereto, a Finite Data Environment Processor (FDEP) manages Data Sets residing on the secondary storage devices and in memory using Set Lists (SLs) and General Record Pointers (GRPs). The Data Sets contain either data or logical organizational information. The Set Lists comprise Data Sets organized into a hierarchy. The General Record Pointers identify information in terms of Data Sets and records within them. Using the principal idea that a Data Set is uniquely identifiable, the present invention eliminates problems normally associated with referencing the location of data after the data has been moved.
BRIEF DESCRIPTION OF THE DRAWINGS
Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
Figure 1 is a block diagram illustrating the environment in which the present invention operates;
Figure 2 illustrates the structure of a Set List;
Figure 3 illustrates the structure of a Set List Element;
Figure 4 illustrates the structure of a Data Set Definition Record and the Data Set Types defined therein;
Figure 5 illustrates the structure of a General Record Pointer;
Figure 6 illustrates the structure of an External General Record Pointer;
Figure 7 illustrates the structure of a Finite Data List;
Figures 8A and 8B illustrate the structure of a Finite Data Set;
Figure 9 is a block diagram illustrating the data structures used to resolve a General Record Pointer;
Figure 10 illustrates the structure of a Memory Maintenance Record;
Figure 11 is a block diagram illustrating the logic used to locate a Finite Data Set;
Figure 12 is a block diagram illustrating the data structures used to resolve an External General Record Pointer;
Figure 13 is a block diagram illustrating the process flow used to resolve a General Record Pointer;
Figure 14 is a block diagram illustrating the process flow used to search memory when resolving a General Record Pointer;
Figure 15 is a block diagram illustrating the process flow used to search secondary storage when resolving a General Record Pointer;
Figure 16 is a block diagram illustrating the process flow used to search Set Lists when resolving a General Record Pointer;
Figure 17 illustrates the path structure of the Finite Data Set in Figures 8A and 8B;
Figure 18 illustrates the results of a simple single Data Set copy operation;
Figure 19 illustrates the results of a Data Set copy operation which includes subordinate hierarchy;
Figure 20 illustrates the results of a Data Set move operation;
Figure 21 illustrates the results of a Data Set promote operation;
Figure 22 illustrates the results of a first Data Set insertion operation;
Figure 23 illustrates the results of a second Data Set insertion operation; and Figure 24 illustrates the structure of a Storage Address Identifier.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
In the following description of the preferred embodiment, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
GLOSSARY
CDI Combined Data Identifier A CDI is an 8-byte field consisting of a 4-byte Site Owner Identifier and a 4-byte Data Set Identifier. A CDI is used to uniquely identify a Data Set. A CDI used in conjunction with a Data Record Identifier constitutes a General Record Pointer.
CDS Containment Data Set A CDS is a class of Data Set which may logically contain any number of other Data Sets as children. However, a CDS has exactly one immediate parent. A CDS can contain all Data Set classes. The CDS does not identify its children; instead, all children (Set List Elements) identify the CDS as their parent.
All children of a CDS exist in the same Set List. A CDS obeys the Hierarchical Organization Rules when existing in a hierarchy. A CDS is strictly a logical entity in that no real physical data is ever present.
A CDS is used to define a conventional hierarchical information structure such as the drive/directory/file structures of MS-DOS and UNIX. Each CDS at a Site can be reached via a Set List Element, and has no further physical representation.
DRI Data Record Identifier A DRI is a 4-byte field which identifies a single record within a given Data Set. If a DRI within a General Record Pointer is set to null (represented by a one's complement -1), then it identifies the entire Data Set. A
General Record Pointer which has all fields set to null is a null General Record Pointer.
DSI Data Set Identifier A DSI is a 4-byte field which identifies one Data Set within a given Finite Data Environment. Note that a DSI may not be unique (by itself) when Set List Elements from other Sites are imported. If data is not imported into a Site, then a DSI will be unique.
EGRP External General Record Pointer An EGRP is a 20-byte data identifier. An EGRP consists of a 4-byte Finite Data Set Site Owner Identifier, a 4-byte Finite Data Set Identifier, a 4-byte Data Set Site Owner Identifier, a 4-byte Data Set Identifier and a 4-byte Data Record Identifier as a single record.
An EGRP will uniquely identify a Data Set absolutely anywhere.
FDEP Finite Data Environment Processor An FDEP is the process which maintains Set Lists and all related structures and information. There is exactly one FDEP for a given Site. An FDEP is assumed to have a unique serial number, which is used as the Site Owner Identifier when a new Finite Data Set (or Set List Element) is created.
FDI Finite Data Identifier An FDI is structurally identical to a Combined Data Identifier, i.e., a 4-byte Site Owner Identifier and a 4-byte Data Set Identifier.
The difference between a Combined Data Identifier and a Finite Data Identifier is that a Finite Data Identifier only identifies Set Lists, never a Containment Data Set or a Physical Data Set. A Combined Data Identifier can identify any Data Set class (and therefore type).
FDL Finite Data List An FDL is an array or list of records which identify a specific Finite Data Set and the location of its Set List in secondary storage.
There is exactly one FDL on a Site. All Finite Data Sets (and therefore Set Lists) at a Site are identified in the FDL.
FDS Finite Data Set An FDS is both a class and a type of Data Set.
An FDS can have all Data Set classes (Containment Data Set, Finite Data Set, and Physical Data Set) as its children (or subordinates). An FDS may have one or more parents. This enables an FDS to exist at several different levels of logical hierarchy, without duplication. Each FDS at a Site can be reached via a unique Finite Data List Element, and is represented by a unique Set List. In addition, an FDS is also represented by the various Set List Elements (in other Set Lists) that point to it. All control data (Set List Elements) for all data within an FDS are completely contained by that FDS. However, an FDS represents no real physical data. The FDS
is used throughout the present invention to provide new and efficient ways of accessing and organizing data.
GRP General Record Pointer A GRP is a 12-byte data identifier. A GRP consists of a 4-byte Site Owner Identifier, a 4-byte Data Set Identifier and a 4-byte Data Record Identifier. A GRP uniquely identifies a given data entity anywhere. However, when a Data Set is referenced which is not in the current Set List, a search must be performed to locate it. For this reason the External General Record Pointer is a convenience, added to enhance GRP access speed.
MMR Memory Maintenance Record An MMR is a record used to maintain information specific to a Data Set or Set List located in memory.
MRI Memory Record Identifier An MRI is a 4-byte field which contains the number of a specific Memory Maintenance Record.
The MRI is used as the key field in Memory Maintenance Records for sorting and location.
MSL Master Set List An MSL is a Set List which identifies the overall structure of information at a given Site. An MSL has exactly the same structure as a Set List. There is exactly one MSL for a Site and all Finite Data Sets within that Site will exist within the Finite Data Set embodied by the MSL.
PDS Physical Data Set A PDS is a class of Data Set which identifies a physical data entity (e.g., a record, a list).
Since it is a physical entity, a PDS may not contain any other Data Sets as children.
However, a PDS still retains a logical connection by identifying exactly one immediate parent. A PDS obeys the Hierarchical Organization Rules when existing in a hierarchy. A PDS is used to store actual bytes (or records) of information. Note that the parent of a PDS may be a Containment Data Set or a Finite Data Set. Each PDS at a Site can be reached via a unique Set List Element, and is represented by a unique physical data body (in secondary storage and RAM).
SAI Storage Address Identifier An SAI is a 12-byte field consisting of a 2-byte Storage Device Number, a 4-byte Remote Disk Number, and a 6-byte Offset Within The Storage Device. The first field allows for a maximum of 2^16-1 storage devices. The second field services all storage media with removable "disks". This includes floppy, tape, floptical, removable-hard-drive, etc. By setting the second field to the appropriate disk number, data can be located uniquely across many disks. This field can be used to automatically indicate a "disk" number to a program or end-user. The second field supports a maximum of 2^32-1 removable disks per device.
The third and last field in the SAI is a relative offset into the device. If the remote disk number is non-null, then this is an offset in that disk. The third field is always an absolute address within that device. The third field supports an address space of 2^48-1 bytes.
Site A Site is one or more computers using a common series of secondary storage devices and a common Finite Data Environment Processor.
SL Set List An SL is an array or list of Set List Elements which define the hierarchy of a Finite Data Set. An SL is always sorted based on the Self Identifier field of the Set List Elements.
SLE Set List Element An SLE is a record which contains all control and structure information related to a given Data Set. An SLE contains a Self Identifier field by which it can be located and sorted.
SOI Site Owner Identifier An SOI is a 4-byte field which contains a unique identifier associated with exactly one Site. An SOI is unique and can be used to identify the Site from which a given Finite Data Set originated. An SOI is provided by the Finite Data Environment Processor when a new Finite Data Set or Set List Element is created.
The Finite Data Environment Processor is assumed to have a serial number for each occurrence. The SOI is that serial number.
COMPONENTS AND CONCEPTS
Figure 1 is a block diagram illustrating the environment in which the present invention operates. A
computer 10 has one or more secondary storage devices 12 attached thereto. The computer 10 operates under the control of an operating system 14 resident in the memory 16 of the computer 10. The operating system 14 controls the execution by the computer 10 of both application programs 18 and a Finite Data Environment Processor (FDEP) 20. The FDEP 20 manages Data Sets 22 residing on the secondary storage devices 12 and in memory 16 using Set Lists (SLs) 24 and General Record Pointers (GRPs) 26. The Data Sets 22 are the building blocks on which all the other components operate. The SLs 24 are comprised of Data Sets 22 organized into a hierarchy.
The GRPs 26 identify information in terms of Data Sets 22 and records within them. The FDEP 20 is a process that performs functions on one or more Data Sets 22.
The following sections describe these basic components and concepts in more detail.
DATA SETS
A Data Set 22 may be an array or list of records or contiguous binary data. A Data Set 22 is either a logical entity such as a directory, or a physical entity such as a file. The principal idea of the present invention is that a Data Set 22 is uniquely identifiable, so the problems normally associated with referencing the location of data are not an issue.
SET LISTS
Figure 2 illustrates the structure of a Set List (SL) 24. An SL 24 is a Data Set 22 comprising a list of records 28 which identify a relative relationship between a Data Set 22 and those Data Sets 22 it is logically contained within. Thus, an SL 24 stores hierarchical information by recording the ancestry (parent) of a Data Set 22. Figure 2 also illustrates that there is one Master Set List 30 at each Site.
Figure 3 illustrates the structure of a Set List Element (SLE) 28, which is a record within an SL 24.
The number of SLEs 28 in an SL 24 is only limited by the amount of memory available. Each SLE 28 within an SL 24 is a Data Set 22 and has fields to control all aspects of its behavior.
An SLE 28 contains all information pertaining to a Data Set 22, including Self Identifier 32, Parent Identifier 34, Memory Record Identifier (MRI) 36, Data Set Name String 38, Number Of Parents 40, Data Set Type (DST) 42, Data Set Status (DSS) 44, Storage Address Identifier (SAI) 46, SLE Checksum 48, Data Set Checksum 50, Archive Checksum 52, Security Code 54, and Record Size 56.
The Self Identifier field 32 provides the identity of the Data Set 22. The Self Identifier field 32 is a key field used for sorting and location activities.
The Parent Identifier field 34 is used to identify a parent SLE 28 within the SL 24 or contains a null. If the Parent Identification field 34 contains a null, then the SLE 28 is a root SLE 28. A root SLE 28 is a Data Set 22 of greatest significance in a given SL 24. An SL
24 can have one or more root SLEs 28. If the Parent Identification field 34 contains a non-null, then the parent SLE 28 is located in the current SL 24.
The Memory Record Identifier (MRI) field 36 identifies where in memory the Data Set 22 is stored.
The Data Set Name String field 38 inside an SLE 28 is the name of the Data Set 22. The Data Set Name String field 38 is primarily for presentation to end-users, although it can be used to locate a Data Set 22 as well.
The Number Of Parents field 40 identifies how many times the Data Set 22 is referenced within other SLs 24 of greater significance. The Number Of Parents field 40 is used only when the Parent Identification field 34 contains a null.
The Data Set Type (DST) field 42 identifies the type of the Data Set 22. Figure 4 illustrates some example Data Set 22 types and their respective values.
The Data Set Status (DSS) field 44 identifies the status of the Data Set 22.
The SAI field 46 identifies the location in secondary storage of the Data Set 22.
The SLE Checksum field 48, Data Set Checksum field 50 and Archive Checksum field 52 are used for integrity checking at all stages of data movement and storage.
The Security Code field 54 is used to encrypt and decrypt a Data Set 22.
The Record Size field 56 of the SLE 28 is used to resolve references to records in the Data Set 22.
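The SLE 28 of Figure 3 can be sketched as a C structure; field widths not stated in the text (for example the name capacity) are assumptions, and the reference numbers are carried as comments.

    #include <stdint.h>

    typedef struct { uint32_t soi, dsi; } CDI;  /* 8-byte Combined Data Identifier */

    typedef struct {
        CDI      self;            /* Self Identifier 32 (key field)        */
        CDI      parent;          /* Parent Identifier 34 (null = root)    */
        uint32_t mri;             /* Memory Record Identifier 36           */
        char     name[64];        /* Data Set Name String 38 (assumed cap) */
        uint32_t num_parents;     /* Number Of Parents 40                  */
        uint8_t  dst;             /* Data Set Type 42                      */
        uint8_t  dss;             /* Data Set Status 44                    */
        uint8_t  sai[12];         /* Storage Address Identifier 46         */
        uint32_t sle_checksum;    /* SLE Checksum 48                       */
        uint32_t data_checksum;   /* Data Set Checksum 50                  */
        uint32_t arch_checksum;   /* Archive Checksum 52                   */
        uint32_t security;        /* Security Code 54                      */
        uint32_t record_size;     /* Record Size 56                        */
    } SLE;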
GENERAL RECORD POINTER
The primary problem with the containment method of storing hierarchy is the difficulty associated with linking data and maintaining the links when the data moves. The solution to the problem is to break the links into two components: a static and dynamic component. The dynamic component is the SL 24, because the SL 24 identifies the location of information within a relative environment. The static component is a General Record Pointer (GRP).
Figure 5 illustrates the structure of the GRP 26.
The GRP 26 is a 12-byte identifier comprising a 4-byte Site Owner Identifier (SOI) 58 containing a unique identifier associated with the Site, a 4-byte Data Set Identifier (DSI) 60 containing a unique identifier associated with a Data Set 22, and a 4-byte Data Record Identifier (DRI) 62 identifying a record within the Data Set 22.
The GRP 26 is a static component, because a Data Set 22 can move within an SL 24 or be moved to a different SL 24, without affecting the validity of the GRP 26.
The GRP 26 remains valid wherever the Data Set 22 moves, because the Data Set 22 has the same identifier as the GRP 26 and need only be matched to the GRP 26.
COMBINED DATA IDENTIFIER
The combination of an SOI 58 and DSI 60 is also termed a Combined Data Identifier (CDI) 64, which is used to uniquely identify a Data Set 22 on any Site.
Using the SOI 58 to further qualify the identity of a Data Set 22 allows the Data Set 22 to be moved into and out of various Sites.
For example, Site "A" can import Data Set "1" from Site "B". However, there may already exist a Data Set "1" in Site "A". Therefore, the SOI 58 is required to uniquely identify the Data Set as Site "B" and Data Set "1".
In Figure 2, each of the single and double line links is a CDI 64. The double line links 32 are CDIs 64 which identify the associated Data Set 22 via the Self Identifier field 32. The single line links 34 are CDIs 64 which identify another SLE 28 via the Parent Identification field 34. In this way, the hierarchical structure of a Set List 24 can be maintained in its simplest possible form.
FINITE DATA IDENTIFIER
The combination of an SOI 58 and DSI 60 is also termed a Finite Data Identifier (FDI) 66. The difference between an FDI 66 and a CDI 64 is that an FDI
66 only identifies SLs 24. In contrast, a CDI 64 can identify any class of Data Set 22.
EXTERNAL GENERAL RECORD POINTER
When a Data Set 22 is referenced which is not in the current SL 24, or when the SL 24 is not identified, then a search must be performed to locate it. For this reason, an External General Record Pointer (EGRP) is an added convenience to enhance GRP 26 access speed.
Figure 6 illustrates the structure of the EGRP 68.
The EGRP 68 is a 20-byte identifier comprising an 8-byte FDI 66 (including both the SOI 58 and DSI 60 of the desired SL 24), an 8-byte CDI 64 (including both the SOI 58 and DSI 60 of the desired Data Set 22), and a 4-byte DRI 62. The EGRP 68 identifies a unique Data Set 22 anywhere.
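A corresponding sketch of the 20-byte EGRP 68, under the same assumptions as the GRP 26 sketch above:

typedef struct {
    unsigned int fdi_soi, fdi_dsi;   /* 8-byte FDI 66: identifies the desired SL 24 */
    unsigned int cdi_soi, cdi_dsi;   /* 8-byte CDI 64: identifies the desired Data Set 22 */
    unsigned int dri;                /* 4-byte DRI 62: identifies the record */
} EGRP;                              /* 20 bytes: locates a unique Data Set 22 anywhere */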
FINITE DATA SETS
Perhaps the most significant of the concepts in the present invention is the Finite Data Set (FDS) concept.
An FDS is a logical entity which identifies relative hierarchical structure. An FDS contains internal hierarchical structure, while external links exist which point to that FDS as a subordinate.
An FDS is represented by an SL 24 which may contain internal hierarchy. Further, an FDS may be pointed to as a subordinate by any number of SLEs 28 in other SLs 24. Therefore, an SLE 28 which points to an FDS can occur in more than one SL 24, and the FDS may be subordinate to any number of other SLs 24. This is how an FDS can have multiple parents in a given information structure hierarchy. Occurrences of SLEs 28 which point to the same FDS may each reside in a unique and separate SL 24. In this way, an FDS may have any number and class of Data Sets 22 as its ancestors and descendants within the current structure hierarchy.
The FDS methodology varies substantially from conventional containment methodology because an FDS can have more than one parent. The potential plurality of FDS parents is an important concept in the present invention, and is a feature not possible in environments which operate using the conventional containment method of information structure storage.
When an FDS has more than one parent, the entire Data Set 22 is, in essence, addressable through all references to it. This does not mean the entire FDS is duplicated. Instead, it means the same FDS can be referenced more than once within one or more SLs 24 of greater significance.
FINITE DATA LIST
Figure 7 shows an example Finite Data List (FDL) 70 for a Site. An FDL 70 is an SL 24 which contains a list of records identifying all FDS's at a Site. Preferably, there is only one FDL 70 at a Site. Each FDL Element (FDLE) 72 is an SLE 28 which points to an SL 24.
For example, the SL 24 pointed to by the first FDLE
72 in Figure 7 contains four SLEs 28. The fourth SLE 28 of the SL 24 represents an FDS. The fourth SLE 28 points to the fourth FDLE 72, which in turn points to another (sub-ordinate) SL 24.
The FDL 70 ignores logical information structure.
The FDL 70 physical structure shown in Figure 7 could also represent an FDS with the logical structure shown in Figures 8A and 8B. In fact, the FDL 70 physical structure could be used for any logical hierarchy whatsoever. This illustrates an important property of the present invention - physical structure is made independent of logical structure, thereby supporting any and all logical information structures.
LOCATING DATA SETS WITH THE GRP AND EGRP
A GRP 26 identifies information in terms of: an FDI
66 to identify the SL 24, a CDI 64 to identify a Data Set 22 within the SL 24, and a DRI 62 to identify a record within the Data Set 22. The FDI 66 is not necessary to identify a Data Set 22, because the CDI 64 and DRI 62 can be used alone. However, if an FDI 66 is not provided, then locating a Data Set 22 might require an exhaustive search of all SLs 24 at a given Site for an SLE 28 matching the CDI 64, and potentially at other Sites as well. Therefore, an FDI 66 is a speed saving feature, but is treated as a required component to provide superior data access characteristics.
There are two major sources of an FDI 66: (1) from within the code of an executable program, or (2) as data for that program. When a program provides an FDI 66, data can be assumed to exist within the SL 24 specified by the FDI 66.
The factors which determine the correct GRP 26 format (GRP 26 or EGRP 68) to use, under various conditions, are very straightforward. A GRP 26 is used when a program has a small number of SLs 24 to maintain, and ideally just one SL 24. Data access speed is maximized when a program provides an FDI 66 and all data required for that program exists within the SL 24 specified by that FDI 66.
However, many programs contain or use data which can span several SLs 24. For programs with these data requirements, the EGRP 68 is ideal. An EGRP 68 identifies the correct SL 24, Data Set 22 within that SL
24, and record within the Data Set 22. However, an EGRP
68 is 8 bytes larger than a GRP 26 and should be used sparingly.
Figure 9 depicts the data structures used in the process performed by the FDEP 20 to resolve a GRP 26.
The FDI 66 is used as the key to search the FDL 70 for the appropriate FDLE 72, indicated by reference element "(STEP 1)" in Figure 9. If the FDI 66 does not match any FDLE 72 in the FDL 70, then the FDEP 20 may or may not perform an exhaustive search for the Data Set 22 matching the CDI 64 in all SLs 24. If the FDLE 72 is found, then it will contain an MRI field 36. The MRI field 36 identifies a specific Memory Maintenance Record (MMR) 74, which is illustrated in Figure 10, and contains a Self Identifier field 36 (an MRI 36), a Memory Address Identifier 76, and a Memory Status Flag 78. The MMR 74 contains the address of the Data Set 22 when it is in memory. The MMR 74 resides in the Memory Maintenance List (MML) 80, which is illustrated in Figure 11. The MML 80 is a contiguous list of MMRs 74 sorted by the MRI field 36. The MMR 74 then identifies a location in memory where that Set List 24 has been loaded.
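A minimal C sketch of the MMR 74 and MML 80 just described (field widths and types are assumptions; only the field names come from Figures 10 and 11):

#include <stddef.h>

typedef struct {
    unsigned int mri;        /* Self Identifier field 36 (an MRI 36) */
    void        *address;    /* Memory Address Identifier 76 */
    unsigned int status;     /* Memory Status Flag 78 (e.g., "static in memory") */
} MMR;

typedef struct {
    MMR   *records;          /* contiguous list, sorted by the MRI field 36 */
    size_t count;
} MML;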
If the SL 24 has not already been loaded into memory, then the FDLE 72 will contain a null MRI field 36. If the specified SL 24 is not in memory, then it is loaded into memory by the FDEP 20.
The SL 24 is then searched using the CDI 64 as the key, to locate the SLE 28 representing the Data Set 22.
This is indicated by reference element "(STEP 2)" in Figure 9. If the CDI 64 does not match any SLE 28 in the SL 24, then the FDEP 20 may begin searching other SLs 24. If the appropriate SLE 28 is found, then it will contain an MRI field 36.
If the Data Set 22 has not already been loaded into memory, then the SLE 28 will contain a null MRI field 36. If the specified Data Set 22 is not in memory, then it is loaded into memory by the FDEP 20.
The Data Set 22 is then accessed using the DRI 62 as an index to locate a specific record, wherein the DRI 62 is multiplied by the record size given in the SLE 28 to determine the offset into the Data Set 22 when records in a Data Set 22 are stored as an array. The DRI 62 can also be used as a search key when records in a Data Set 22 are stored as a sorted list. This is indicated by reference element "(STEP 3)" 92 in Figure 9.
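The three-step resolution just described can be summarized with the following hedged C sketch. It models the FDL 70 as an in-memory array of SLs 24, omits secondary-storage loading, the MMR 74 indirection, and the fallback exhaustive searches, and re-declares simplified structures so the sketch is self-contained; all names are illustrative, not taken from the patent.

#include <stddef.h>
#include <string.h>

typedef struct { unsigned int soi, dsi; } CDI;  /* Combined Data Identifier 64 */
typedef CDI FDI;                                /* the FDI 66 has the same 8-byte shape */

typedef struct {                                /* simplified SLE 28 */
    CDI          self;                          /* Self Identifier field 32 */
    void        *data;                          /* Data Set 22 image in memory */
    unsigned int record_size;                   /* Record Size field 56 */
} SLE;

typedef struct {                                /* simplified SL 24 */
    FDI    id;
    SLE   *sle;
    size_t n_sle;
} SL;

void *resolve_grp(SL *site_sls, size_t n_sl, FDI fdi, CDI cdi, unsigned int dri)
{
    for (size_t i = 0; i < n_sl; i++) {             /* STEP 1: search the FDL 70 by FDI 66 */
        if (memcmp(&site_sls[i].id, &fdi, sizeof fdi) != 0)
            continue;
        SL *sl = &site_sls[i];
        for (size_t j = 0; j < sl->n_sle; j++) {    /* STEP 2: search the SL 24 by CDI 64 */
            SLE *e = &sl->sle[j];
            if (memcmp(&e->self, &cdi, sizeof cdi) == 0)
                /* STEP 3: array-stored records sit at DRI * record size */
                return (char *)e->data + (size_t)dri * e->record_size;
        }
        return NULL;  /* not in this SL 24; the FDEP 20 may search other SLs 24 */
    }
    return NULL;      /* no matching FDLE 72; an exhaustive search is optional */
}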
Figure 12 illustrates a similar process for an EGRP
68. The only variation between GRP 26 and EGRP 68 resolution is the source of the FDI 66. In the case of a GRP 26, the program provides the FDI 66, and in the case of an EGRP 68, the EGRP 68 provides the FDI 66.
These relationships between GRP 26 and EGRP 68 resolution can be seen by comparing Figure 9 and Figure 12.
Figure 13 is a flow chart illustrating the logic flow of the GRP resolution process. At each step of the process, a search may fail. Each failure may result in further and more comprehensive searches. Figure 13 is the top-level process diagram and has three boxes 96, 102, and 86, to indicate the various searching processes illustrated in Figures 14, 15 and 16.
One purpose of the process depicted in Figure 13 is to locate data as quickly as possible. To save time, the current SL 24 is always searched first, as indicated by reference elements 96 and 100 in Figure 13. If that fails, then the next fastest candidates are all the Data Sets 22 (SLs 24) in memory. Searching these SLs 24 is faster than those not yet in memory. This is shown as the search RAM process in Figure 14. If the RAM search still fails, then the only remaining possibility is secondary storage. This is shown as the search storage process in Figure 15. Both the RAM and secondary storage processes call the search SL 24 process, to perform a search in an SL 24. The search SL 24 process is shown in Figure 16. This process searches a given SL
24 for an SLE 28 with a matching CDI 64.
In the present invention, when an SL 24 search for a specified Data Set 22 fails, the FDEP 20 will normally take steps to locate the Data Set 22 in other SLs 24. The nature of the search depends on the nature of the user request for data. The FDEP 20 can easily provide several different data location alternatives, including:
(1) fail when Data Set 22 is not found in current SL 24;
(2) fail when Data Set 22 is not found in any SL 24 currently in memory; and (3) fail when Data Set 22 is not found in any SL 24 at the Site. In selecting the third alternative, the FDEP 20 performs an exhaustive search of all SLs 24 to locate the specified Data Set 22.
A failure on an FDLE 72 search is a different matter. Assume a program requests a Data Set "X" within the SL "Y" via an EGRP 68. If the Data Set "X" had been moved to a different SL and the SL "Y" had been deleted, then an FDLE 72 search failure would occur. The failure would occur during the FDLE 72 search because the FDLE 72 for SL "Y" would no longer exist and all SLs are represented in the FDL 70. Therefore, the FDEP 20 would have two alternatives, which are controlled by flags or alternative data request functions. The first alternative is to fail and inform the program that the SL "Y" no longer exists. The second is to perform an exhaustive search of all SLs for the Data Set "X".
The GRP 26 resolution process described above assumes that all SLs 24 and Data Sets 22 are mobile in memory. Specifically, the addresses of these entities change depending on memory availability and the requirements of memory optimization. Memory optimization is a process whereby data (SLs 24 and Data Sets 22) are moved around in memory to increase the size of available contiguous memory blocks. When memory optimization occurs, the address of any given data entity can change as a direct result of an optimization.
However, the present invention supports more evolved memory control devices which allow a memory block to be flagged as static. A static memory block is one which will not move during an optimization. When an MMR 74 for a Data Set 22 is marked "static in memory", a program using the present invention can request the address of a Data Set 22 and thereafter directly access that data in memory. This reduces access overhead, but makes memory optimization less efficient.
DATA SET CLASS
All Data Sets 22 may be qualified by a "class". In general terms, the Data Set 22 class defines the storage classification for a Data Set 22: physical, logical, etc. The currently defined Data Set 22 classes are: Containment Data Set (CDS), Physical Data Set (PDS) and Finite Data Set (FDS).
CONTAINMENT DATA SET
A Containment Data Set (CDS) is a class of Data Sets 22 that is a logical entity used to define a conventional hierarchical information structure, such as the drive/directory/file structures of MS-DOS and UNIX.
Thus, a CDS is similar to volumes, directories, etc., found in conventional containment methodology.
A CDS has exactly one immediate parent. However, the CDS may logically contain any number of other Data Sets 22 as children. The CDS does not identify its children. Instead, all children identify the CDS as their parent. Further, all the children of a CDS exist in the same SL 24.
A CDS is represented by an SLE 28 within an SL 24.
The SAI field 46 of the SLE 28 contains a null value indicating that there is no physical data associated therewith.
PHYSICAL DATA SET
A Physical Data Set (PDS) is a class of Data Set 22 that identifies a physical data entity (e.g., a record, a list). A PDS is used to store actual bytes (or records) of information. A PDS is pointed to by an SLE
28 in an SL 24.
A PDS does not contain any other Data Sets 22 as children. However, a PDS still retains a logical connection by identifying its parent. The parent of a PDS may be a CDS or an FDS.
The location of a PDS is identified by the SAI field 46 of the SLE 28. The SAI field 46 is non-null and typically identifies an address in secondary storage media.
HIERARCHICAL ORGANIZATION RULES (HOR)
There are several rules regarding hierarchical construction in the present invention; these are collectively called Hierarchical Organization Rules (HORs). The first HOR is the Hierarchical Scope Rule (HSR). The HSR states:
"The priority level of a Data Set must be less than or equal to its parent."
The HSR is a simple rule whereby each Data Set 22 must have an equal or lower priority number than its immediate parent. As mentioned earlier, each Data Set 22 class is prioritized. The prioritization is independent of the class of data. Thus, all Data Sets 22 will be associated with a unique scalar priority number from 1 to N. Gaps in the priority numbers are possible and even desirable.
The second HOR is the Hierarchical Containment Rule (HCR). The HCR states:
"The absolute path to any Data Set is always a PDS, a CDS, or an FDS, where the CDS or PDS is preceded by zero or more CDS's and/or FDS's".
The HCR is an extension of the HSR which ensures that a PDS will never contain another PDS. For example, a file cannot contain another file. A file is a physical entity (a PDS), and therefore the immediate parent of the file is a CDS or FDS. Although it is possible to have PDS's also act as CDS's, it creates chaos and complicates all FDEP 20 processing.
The third HOR is the Hierarchical Recursion Rule (HRR). The HRR states:
"A Data Set cannot directly or indirectly contain any of its ancestors".
The HRR is required to prevent an endless loop during any FDEP 20 process.
To clarify any misconceptions, consider the absolute path example in Figure 8A and consider the effects on the FDEP 20 if Data Set "F" contained Data Set "A". The process would be as indicated in Figure 17 until "A" is to be resolved. Instead of terminating, the entire process would repeat from Data Set "F" again. The net effect (of any infinite loop) is that the computer would hang. Although it is true that a process could be established to prevent an endless loop or recursion, this same process would have to exist or be called by all programs and processes which traverse a hierarchy.
Further, these cyclic hierarchies would introduce unnecessary complications for most users.
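The HSR and HRR can both be enforced with a single validation pass before any re-parenting operation. The sketch below is illustrative only: the HORNode type (a resolved, in-memory view of an SLE 28 carrying the scalar priority number) is a hypothetical construct, and per the next section any FDS encountered would be skipped during the ancestor walk.

typedef struct hor_node {
    unsigned int priority;            /* scalar priority number used by the HSR */
    const struct hor_node *parent;    /* resolved Parent Identification field 34 (NULL = root) */
} HORNode;

/* Returns 1 if attaching ds under new_parent satisfies the HSR and HRR. */
int hor_ok(const HORNode *ds, const HORNode *new_parent)
{
    if (new_parent != NULL && ds->priority > new_parent->priority)
        return 0;                     /* HSR: priority must not exceed the parent's */
    for (const HORNode *a = new_parent; a != NULL; a = a->parent)
        if (a == ds)
            return 0;                 /* HRR: ds would directly or indirectly contain an ancestor */
    return 1;
}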
FINITE DATA SETS AND THE HOR
FDS's do not obey the HOR. However, when FDS's are introduced into an existing hierarchical structure, they do not disturb and/or affect any other Data Sets 22 which do obey the HOR. When a check is made to ensure a new Data Set 22 conforms to the HOR, any FDS's encountered during the test are processed to establish the next SL 24, but are invisible to HOR testing.
FINITE DATA ENVIRONMENT PROCESSOR
A Finite Data Environment Processor (FDEP) 20 is a process which creates, modifies and maintains FDS's and all structures contained within or related to FDS's.
Further, the FDEP 20 also maintains any memory images of SLs 24 or Data Sets 22 requested by a process.
There is exactly one FDEP 20 for a given Site. An FDEP 20 is assumed to have a unique serial number, i.e., the Site Owner Identifier (SOI) 58. Each process using the present invention must access the FDEP 20 to modify the structure and content of information at a Site.
The FDEP 20 is a background process which other processes activate to create, delete or access information, including the following functions:
CREATE DATA SET
DELETE DATA SET
COPY DATA SET
MOVE DATA SET
PROMOTE DATA SET
INSERT DATA SET
The FDEP 20 must be a resident process, meaning it must remain in memory while any processes require it.
If the FDEP 20 is an operating system, then it will remain in memory until the computer is powered down. If the FDEP 20 is a resident driver for a program, then it will have to remain in memory as long as the program remains in memory. Further, the FDEP 20 must have the ability to allocate memory and utilize physical storage devices. Finally, the FDEP 20 should have complete access to all information that may be requested or used through it. The FDEP 20 can exist and operate within a conventional containment-style operating system, although it will operate at processing speeds normally associated with that operating system.
Data movement can occur at many processing levels and can involve anything from moving a single record to moving an entire Data Set 22. In most cases, the movement of data less than a Data Set 22 is the responsibility of the program using the data. Therefore, this aspect of data movement is covered in the section entitled "Information Structure." The movement of Data Sets 22 can be experienced at any processing level and is the primary focus of this section.
One of the cornerstones of the present invention is the simplicity of redefining relationships between Data Sets 22. A Data Set 22 is moved by altering the Parent Identifier field 34 of an FDLE 72 or SLE 28. Thus, Data Set 22 movement is always performed in terms of the parent of a target location. Further, any entities logically contained within the Data Set 22 being moved are also transported, without actually modifying their SLEs 28 in any way. All containment-style operating systems performing the same activity would have to load, update and save potentially large variable-length child lists. For example, the File Allocation Table (FAT) in MS-DOS is used to access and locate the files in a directory. When a file is moved, two FATs have to be updated: the old parent directory and the new parent directory.
There are currently four fundamental classes of Data Set 22 movement: create, delete, copy and relocate.
Copy and delete functions can be implemented as singular functions where there is only one kind of Data Set 22 copy and one kind of delete. This is not true for relocation operations. When a Data Set 22 is relocated, several distinct operations are possible. Relocation can include operations such as: move, promote, and insert. Although all of these operations involve the movement of one or more Data Sets 22, they have different logical characteristics. For example, the promote function would locate the parent of the current parent and reset the Parent Identifier field 34 of the current Data Set 22. The location of a parent's parent is a logical activity because both the old and new parents can exist in the same SL 24.
In all the following Data Set 22 operation explanations, the name, type and other Data Set 22 information not directly related to linking is assumed to be input. When an EGRP 68 is specified as a required input, this means it can also absorb a GRP 26 and a parent CDI 64 provided by the program. Each operation performs HOR testing to ensure the validity of all hierarchical modifications. HOR testing is performed on each entity modified, including subordinate Data Sets 22 when they are modified. If a subordinate is not modified, then HOR testing is not necessary. Note that all of the operations described below must be re-entrant.
CREATE DATA SET
The "create" operation introduces a new Data Set 22 into a Site. The create operation requires exactly one input: an EGRP 68 to identify the target parent for the new Data Set 22. The create operation is so straightforward that a diagram is not required.
However, it is necessary to point out that when data is imported from other Sites, the create function is not directly used to establish new Data Sets 22. When one or more Data Sets 22 are imported from another Site, they are assumed to already exist. Therefore, one of the RELOCATE class operations would be used. The RELOCATE class operations may in turn invoke the create operation, but the process which triggered the FDEP 20 to activate a RELOCATE class operation would not. If another Site is specifically creating a new Data Set 22 at the current Site, then the create operation would be used.
DELETE DATA SET
The "delete" operation deletes a Data Set 22 and all subordinates from a Site. The delete operation requires exactly one input: an EGRP 68 of the Data Set 22 to delete. The delete operation is so straightforward that a diagram is not required. However, it is necessary to point out that when data is moved to a different Site, using one of the RELOCATE class of operations, the delete command is not directly used. The RELOCATE class operation may call it, but the process which triggered the FDEP 20 to activate this function would not. Note that the delete operation does not physically delete FDLEs 72 or SLEs 28; it simply marks them as deleted.
This is common to almost all data maintenance systems.
COPY DATA SET
A Data Set 22 can be duplicated via the "copy"
operation. When a Data Set 22 is copied, a new Data Set 22 is created. The new Data Set 22 is exactly the same in terms of data content and control linking. However, the new Data Set 22 is assigned a new and unique Self Identifier field 32 (which is a CDI 64). Note that the change in the Self Identifier field 32 is critical to the present invention, because every Data Set 22 must have a unique Self Identifier field 32.
After a successful copy, space consumption is doubled because an exact copy of the Data Set 22 is added to the existing Site. Figure 18 depicts a simple copy operation. Data Set "F" is being copied directly under Data Set "A". After the operation, a new Data Set "H" will be an exact duplicate of Data Set "F".
Figure 19 depicts a copy operation which includes subordinate hierarchy. In this operation, the subordinate Data Sets 22 are also duplicated and each will also have a new Self Identifier field 32. Further, the Parent Identifier field 34 of the subordinate Data Sets 22 identifies the new parent. In Figure 19, Data Sets "I", "J" and "K" are identical to Data Sets "E", "F" and "G", respectively, but "I", "J" and "K" all have a parent of "H". The copy operation performs a test to establish if other Data Sets 22 are immediately subordinate to the Data Set 22 being copied. When a subordinate is located, the Data Set 22 copy is also performed on it. This process can be recursive or iterative. When a copy is performed on a root Data Set 22, it can be used to transfer entire volumes of information to another storage media, or to duplicate that information in the same media. Under all conditions the original Data Set 22 remains unchanged.
MOVE DATA SET
The "move" operation moves a Data Set 22, and all subordinates, from one location to another within a single SL 24 or to a different SL 24. Figure 20 depicts Data Set "E" being moved from a parent of "C" to a parent of "B". None of the subordinate Data Sets 22 are changed in any way during a move operation. The Parent Identifier field 34 in each SLE 28 still identifies the same parent. However, all subordinates are logically moved in terms of future references.
PROMOTE DATA SET
The "promote" operation moves a Data Set 22 to the same level as its current parent. This is similar to the move operation, except the only input for this function is an EGRP 68 for the source Data Set 22. The promote operation uses the Parent Identifier field 34 of the source Data Set 22 to locate the target location.
Figure 21 depicts Data Set "E" and its subordinates being promoted. None of the subordinate Data Sets 22 are changed in any way during a promote operation. The Parent Identifier field 34 in each SLE 28 still identifies the same parent. However, all subordinates are logically promoted in terms of future references.
INSERT DATA SET
The "insert" operation inserts a Data Set 22 between an existing Data Set 22 and its parent. This demotes all subordinate Data Sets 22, but does not modify any of the control records for these Data Sets 22. In essence, the insert operation creates a new Data Set 22 and changes the Parent Identifier field 34 of the target Data Set 22 to that of the newly created Data Set 22.
Figure 22 depicts a Data Set "X" being inserted in front of Data Set "B". The new Data Set "X" absorbs the Parent Identifier field 34 of Data Set "B" and creates "X" as a child of "A". The Data Set "B" is modified to have a parent of "X". The input for this example of insert is an EGRP 68 for Data Set "B".
The same logic applies when a Data Set 22 is to be inserted in front of a root Data Set 22, as shown in Figure 23. The only thing to note is that the Parent Identifier field 34 of Data Set "A" is null. Thus, the parent of Data Set "Y" is null after the insert operation is complete.
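Since every relocation reduces to rewriting a Parent Identification field 34, the move, promote and insert operations can all be sketched in a few lines of C against the SLE 28 sketch given earlier. The sle_parent() helper and the NULL_CDI root value are hypothetical; HOR testing is assumed to have passed before any of these run.

static const CDI NULL_CDI = {0, 0};         /* assumed encoding of the predefined root value */
const SLE *sle_parent(const SLE *e);        /* hypothetical: resolves field 34 to the parent SLE */

void move_ds(SLE *ds, const SLE *new_parent)
{
    ds->parent = new_parent->self;          /* MOVE: adopt the target parent */
}

void promote_ds(SLE *ds)
{
    const SLE *p = sle_parent(ds);          /* locate the current parent */
    ds->parent = p ? p->parent : NULL_CDI;  /* PROMOTE: adopt the grandparent (a root stays root) */
}

void insert_ds(SLE *new_ds, SLE *target)
{
    new_ds->parent = target->parent;        /* INSERT: absorb the target's parent link */
    target->parent = new_ds->self;          /* then demote the target under the new Data Set */
}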
INFORMATION STRUCTURE
A Data Set 22 is a logical and physical information structure. However, a Data Set 22 is also a data entity because it can be created, moved, duplicated, deleted, linked, or changed for content. In general, information structure is affected by a multitude of requirements and restrictions. The issues pertaining to the present invention are discussed in the following paragraphs.
When any information structure is moved, the presence of information references to data outside the structure must be considered. If the structure is totally self-contained, meaning there are no references to data outside the structure, then it can be moved without incident. However, an information structure which contains references to data not contained within the same structure may have restrictions regarding where the structure can be moved. Further, references to external data may have spatial dependencies which prohibit data movement outside of a defined scope. When data movement is limited in this way, there were, prior to the present invention, only three solutions. First, do not move the data outside of the defined scope. Second, also move the data related to the external references. Third, make sure the data related to the external references exists at the target location. The present invention solves this problem by retaining the validity of all data links, even after the data has moved.
Different developers may use different schemes of information structure to represent data, where the data may or may not be of the same kind. Also, the same developer often uses different information structure schemes to represent different and general classifications of data. Any system that uses more than one database faces the possibility that the databases are completely different in structure. For example, a system that provides both accounting and task management could be faced with a relational, as well as a hierarchical, database. The processing requirements of these two database types are different, which means the system contains particular logic (code) for both database types. In other words, the system must be aware of the database type and take conscious steps to retrieve data according to that type. The present invention directly supports all containment (e.g., hierarchical database, directory-file) as well as non-containment (e.g., relational database, network database) logical organizations. The present invention enables all logical organizations directly because each GRP 26 is a DIRECT link to data, no matter what logical organization is present and no matter where that GRP 26 occurs in that logical organization. By adding new Data Set 22 Types to the existing set, the system developer can introduce and use new logical organizations. This would inform both the system (program) and the FDEP 20 when that data is accessed in the future, but would not alter GRP 26 processing by the FDEP 20 or access by the program. In this way, the number of potential information structures (or logical organizations) directly supported by the present invention is virtually unlimited. In contrast, containment methodology is highly limited and requires special processes to control more complex information structures.
The direct linking capability of the present invention enables direct linking of incongruent data structures. For example, consider a binary tree where each node is also an element of a linked list. The record can directly identify next and previous elements, as well as the left and right nodes (in the tree). The system would still contain the logic to traverse the tree as well as the linked list. However, the code required to get (locate, load, return the address of) a node in the tree would be identical to the code required to get an element of the linked list.
Further, the processing performed by the FDEP 20 to locate, load and return the address, would also be identical. The same would be true for data structures of any complexity with any number of compound links, where a compound link is a field in a record capable of identifying at least two distinct and different information bodies. For example, a binary tree whose every node identifies either the root of another tree or the head of a linked list. To identify the secondary tree or the list, the same GRP 26 field, in the node's record, can be used. This is possible because relationships to data can be duplicated (several GRPs pointing at same data) while any one GRP 26 field in a record can point at any data regardless of that data's logical organization.
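As an illustrative sketch of such a compound record, a single node type can carry GRP 26 fields for the list links, the tree links, and a compound link, reusing the GRP sketch given earlier; the field names are hypothetical:

typedef struct {
    GRP next, prev;      /* linked-list links */
    GRP left, right;     /* binary-tree links */
    GRP other;           /* compound link: may identify another tree root or a list head */
} Node;                  /* every link resolves through the same GRP 26 process */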
Cyclic information patterns can be accurately represented. For example, a linked list which is linked to another linked list in a three-dimensional model, but cycles back on one or more of the dimensions.
When using the present invention, the program designer must be aware of the specific and overall differences between GRP 26 and EGRP 68 usage.
Physically, when GRPs 26 are used as data links (or pointers), the overall and individual size of data is less than when EGRPs 68 are used. This is a superficial difference and should not be used as the sole deciding factor between the two methods. First, the designer must decide what the general relationship between the program and its data is. This can be expressed as either the program "drives" the data, or the data "drives" the program. In the first case, the program consciously directs the steps to locate, retrieve, and maintain the data. In the second, the program accesses data by referencing its identifier without taking any conscious steps to otherwise identify that data. When GRPs 26 are used, the program is driving the data. This could be done by a program when it saves the parent FDI
66 of the currently requested Data Set 22, and supplies it to the FDEP 20 on each subsequent GRP 26 request.
Note that the program is not required to save this information; it is optionally saved to reduce GRP 26 processing overhead. On the other hand, when EGRPs 68 are used, the data "drives" the program. The program simply passes the identifier (the EGRP 68) to the FDEP
20. The parent FDI 66 is already inside the EGRP 68.
So, the same result as the program saving the parent is achieved without the program even knowing about this information or its related processing. Another advantage of EGRPs 68 is that a program can access a completely alien piece of data without any special knowledge. The data may be on another Site and of a form not known to the program accessing that data.
Using EGRPs 68, this is not a problem because all information about the data (and how to retrieve it) is already contained in the associated SLE 28. This enables the FDEP 20 to properly retrieve that data without program interference, pre-knowledge or specialized code.
THE PATH CONCEPT
The present invention eliminates the need for paths for identifying the location of data. However, in some instances the construction of an absolute path is necessary for end-users. Unlike a program, end-users may not understand a GRP 26. Therefore, they may be presented with names (text) to identify the Data Sets 22.
In the present invention, path construction can occur from two directions: top-down and bottom-up. Top-down means all descendants of a given Data Set 22 are traversed (and recorded on demand). Bottom-up occurs when ancestors of a given Data Set 22 are traversed (and recorded). The current Data Set 22 must be supplied to both path traversal processes to identify the start of traversal. This can be passed as a GRP 26, EGRP 68, or FDI 66. For simplicity, the rest of this document only mentions FDIs 66 when referring to process input(s).
Note that the FDI 66 is the smallest input, in bytes (or parameters), that performs exactly the same processes.
In top-down, all descendants of an initial Data Set 22 must be found and accumulated. This traversal is dependent on the class of the Data Set 22. If the initially supplied Data Set 22 is of class PDS, then the process successfully terminates because a PDS cannot have any descendants. If the initially supplied Data Set 22 is a CDS or FDS, then the process continues in a recursive manner.
This process is shown as the method "TOP_DOWN" in Table I. If a CDS is encountered in traversal, then the current SL 24 is searched for all SLEs 28 whose parent is the given CDS. Once all these Data Sets 22 have been located, the process recurses on each of these Data Sets 22 to determine their descendants.
If an FDS is encountered in traversal, then the SL
24 is searched for all root SLEs 28 as indicated by a null Parent Identifier field 34. The root SLEs 28 are the immediate children of the FDS, and again the process recurses on each one.
At each recursive level, a PDS class causes the successful termination of that recursive call. The process terminates when all descendants have been found (i.e., all recursion levels are terminated).
Also, as shown in Table I, the step "DISPLAY OR ACCUMULATE NAME OF CURRENT_DS" is performed for all Data Set 22 classes. Depending on the purpose of a top-down function, the Data Set 22 names could be immediately displayed or accumulated in a structure for later use.
If accumulated, then the output from a top-down process is large, even if only names were recorded at each step.
This process is useful when operating system shell programs (e.g., Norton Commander, MS-DOS 5.0, etc.) need to display and process information in secondary storage (and memory) according to a hierarchy.
In top-down traversals, FDS's with multiple parents do not affect processing because the direction of traversal makes the process unaware of (blind to) parent links altogether.
In bottom-up, the ancestors of a given Data Set 22 must be traversed, until a root SLE 28 in the MSL 30 is reached. This establishes the absolute path to the Data Set 22. The input FDI 66 locates an SLE 28 within the current SL 24, which has a parent. The SLE 28 for the parent is then searched for and located. The parent SLE
28 may be in the same SL 24 or it may exist in another SL 24. In this way, traversal encompasses internal as well as external SL 24 processing. This process continues until a true root (the root SLE 28 in the MSL
30) is encountered. At that time, the fully qualified path has been established by concatenating the Data Set Name String field 38 of the SLEs 28 at each step of traversal. Note that the concatenation at each step is performed at the head of the current accumulated textual string: the current name is prepended to the existing accumulated string, and the result becomes the accumulated string for the next step of the process.
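A hedged C sketch of this bottom-up concatenation, using the SLE 28 sketch given earlier. The separator character and fixed buffer sizes are assumptions, and sle_parent() is again a hypothetical helper that resolves the Parent Identification field 34 (returning NULL at a true root):

#include <stdio.h>
#include <string.h>

void absolute_path(const SLE *ds, char *out, size_t cap,
                   const SLE *(*sle_parent)(const SLE *))
{
    out[0] = '\0';
    for (const SLE *e = ds; e != NULL; e = sle_parent(e)) {
        char tmp[1024];
        /* prepend the current Data Set Name String field 38 at the head */
        snprintf(tmp, sizeof tmp, "/%s%s", e->name, out);
        strncpy(out, tmp, cap - 1);
        out[cap - 1] = '\0';
    }
}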
The construction of an absolute path in the bottom-up direction can present a problem when an FDS with more than one parent is encountered. The construction of an absolute path is only a problem if an FDS chain has not been established. An FDS chain is a hierarchical set of FDS ancestors, for a CDS or FDS, which leads to the current FDS.
An example of this is a user traversing through the hierarchy of several FDS's to reach a desired or target FDS. Each FDS traversal is recorded and constitutes part of the FDS chain. When an absolute path is required, the current or target FDS would use the FDS's identified in the chain as parents.
Note that FDS chain methodology is only necessary when an occurrence of an FDS with more than one parent is encountered. FDS chain methodology does not affect GRP 26 resolution in any way because a root SLE 28 of class PDS or FDS in the current FDS contains all information required to resolve the reference.
Note also that the bottom-up process to construct a path only occurs when a user consciously requests a path string. This process has no effect on critical GRP 26 processing.
NAMING CONVENTIONS
In existing systems, names for data (and/or paths) are always strings, where a string is a sequence of characters. String formats in current industry range from just a terminator to complicated control areas at the start of (or before) the string. In most prior art, data is located by using the name. This makes data access dependent on strings. Witness to this fact are the large number of patents held by IBM, APPLE, etc., that provide new and efficient ways of string processing. To these developers, strings are not a convenience; they are an integral and crucial part of data location, access, and maintenance. While efficient string processing algorithms are extremely useful for word processing and formatting applications, they are not efficient when data access and location are considered. The problem is not the algorithms; it is that string processing will always be slower than a process which uses binary identifiers. Further, if strings are used as identifiers, then they must be stored in RAM by any program using the data, as well as stored in any data record which wants to point to another piece of data. This makes the memory required for a program, and the size of data records, much larger.
This increase is further complicated by the fact that in most instances, the name strings are variable length.
Ultimately, strings used as data identifiers are always bulky and awkward to maintain and process.
The present invention enables a Data Set 22 to be uniquely identified (across any number of Sites) via a GRP 26. As a result, no string processing is ever required. For the present invention, names are a convenience for the user's sake. A program or user adopting the present invention may still use strings as identifiers, but processing overhead will be greater than a strict GRP 26 or EGRP 68 access. Locating a Data Set 22 by its name is generally the same process as GRP
26 resolution, except the Data Set Name String field 38 is used as the key instead of the Self Identifier field 32. The increase in processing overhead is substantial because a string comparison is notably slower than a binary number comparison.
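A sketch of such name-keyed lookup, assuming the simplified SL 24 structure from the GRP 26 resolution sketch is extended with the Data Set Name String field 38. As the next paragraph explains, the first match is not guaranteed to be the intended occurrence:

#include <string.h>

SLE *find_by_name(SL *sl, const char *name)
{
    for (size_t j = 0; j < sl->n_sle; j++)
        if (strcmp(sl->sle[j].name, name) == 0)   /* string compare: slower than an 8-byte binary compare */
            return &sl->sle[j];                   /* first occurrence, not necessarily the correct one */
    return NULL;
}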
However, a program which presents data to a user on a strictly name basis may encounter problems. When such a program locates the first SLE 28 (or FDLE 72) with a matching name, it would stop and assume it has found the correct data. However, this is an incorrect assumption for two reasons. First, a single directory may contain multiple occurrences of data with the same name.
Second, an SL 24 can contain multiple occurrences of data with the same name, but located in different directories. Recall, an SL 24 is a list. Therefore, a direct search would yield an occurrence of data with that name, not necessarily the correct occurrence or even the correct directory. Note that a program or the FDEP
20 can prevent name duplication at specific levels in an SL 24 or FDL 70 hierarchy, thus eliminating this problem. The FDEP 20 can accomplish this by setting a flag in the FDLE 72 to prevent name duplication. A
program can prevent name duplication when required by searching for that name before saving data to secondary storage. A program can hierarchically order information such that name-oriented conflicts are prevented and a different area in the hierarchy can still have several bodies of data with the same name.
Information ordering, method of access and resultant problems are a direct responsibility of the program using the data. The FDEP 20 does not perform any program-specific logic. This is not a limitation of the present invention; it is a reality of computing: a program does exactly what you tell it to do, whether it is what you wanted or not.
The problem of having multiple occurrences of data with the same name is an old one, found in all digital computing environments which identify data by name. For instance, UNIX, MS-DOS, OS/2 and WINDOWS use the concepts of CURRENT WORKING DIRECTORY
and SEARCH PATH. Assume a file "FRED.XXX" exists in the CURRENT WORKING DIRECTORY and one or more of the directories identified in the SEARCH PATH. Further assume that ".XXX" is an extension which will cause the search path to be used. Under these conditions the copy of "FRED.XXX" in the CURRENT WORKING DIRECTORY would execute. However, if "FRED.XXX" does not exist in the CURRENT WORKING DIRECTORY, then the first occurrence encountered in the search path would be executed. Note that the operating system would execute the first occurrence, not necessarily the correct one. Therefore, the problems associated with having multiple files with the same name are common to all operating systems which identify data by name. Also note that if "FRED.XXX" is not found in the CURRENT WORKING DIRECTORY or in the SEARCH PATH, then the operating system will return an error, normally as "FILE NOT FOUND".
The present invention has the ability to support the concepts of CURRENT WORKING DIRECTORY and SEARCH PATH, but they are accomplished differently. In the present invention the concept of CURRENT WORKING DIRECTORY can be accomplished at the directory level, by limiting SL 24 searching to those Data Sets 22 with a specified parent CDI 64, or at the SL 24 level, by limiting a search to a specific SL 24. However, under normal conditions the present invention goes beyond the limitations imposed by the use of CURRENT WORKING DIRECTORY or SEARCH PATH, to enable a feasible search of all available storage media to locate a requested entity. In this case, "feasible" means such a search is possible in less time than the average program is willing to wait for data. Clearly, this
However, under normal conditions the present invention goes beyond the limitations imposed by the use of CURRENT WORKING DIRECTORY or SEARCH PATH, to enable a feasible search of all available storage media to locate a requested entity. In this case, nfeasible" means such a search is possible in less time than the average program is willing to wait for data. Clearly, this 21~28~
search is generally only "feasible" when the present invention is used (i.e., GRPs 26, not name strings).
In any valid data linking methodology, a non-unique identifier can only occur once at any given level in an information hierarchy. When data is identified by name, two entities may not have the same name at the same position in the hierarchy. For example, in MS-DOS or UNIX, if the file FRED.XXX is in the directory ALICE, then no other file in that directory can be called FRED.XXX.
However, in the present invention it is possible for two or more entities to have the same name, since the name is not the key used to identify data. Rather, the GRP 26 is. This property of the present invention may seem alien, but it does have an additional benefit.
Assume that two files called FRED.XXX are created by two separate users in two different directory areas. In the prior art, if one of the files is copied to the other's directory area, then the target file will be overwritten. This is common in networks, mini and mainframe computers, especially when more than one user is working in the same directory.
This problem can be easily avoided in the present invention. Even if the supplied identifier was a name string, eventually an SLE 28 will be found (if data exists). The SLE 28 contains the Self Identifier field 32 which is a CDI 64 uniquely identifying the Data Set 22. If the CDI 64 of the located Data Set 22 and the target Data Set 22 are not the same, then no overwrite occurs, and a new Data Set 22 with the same name is created. In this way, accidents as a result of different users using the same name are almost impossible in the present invention.
The present invention has the ability to maintain several files with the same name, which contain different data or different versions of the same data.
For example, a program may edit a given body of data and the user can have the option of saving that data on an arbitrary basis. If the program stores each copy of the data as a discrete Data Set 22, with the same name, then the user can access previous versions of that data with a minimum of effort. In essence, the ability to maintain several versions of data with the same name supports, or gives rise to, a very powerful UNDO capability.
DATA TRANSIENCE
Using SLs 24, the present invention permits contiguous hierarchical structures to exist across one or more physical storage devices. An SL 24 may or may not be on the same physical storage device as the information it represents. Further, each SLE 28 within an SL 24 can identify information existing on different physical storage devices. The principle is to allow structure information and physical data to be mobile while keeping all links valid.
The capability of a Data Set 22 (or SL 24) to encompass more than one physical device has many direct applications. Consider a standalone computer which contains the following I/O devices: Floppy, Hard-drive, CD-ROM drive, Floptical, Tape-drive, ROM, RAM.
Normally, a separate device driver would have to be used for each device.
Further, to transfer data between the devices, often several systems (programs) are required. Using SLs 24, all such drivers and low-level systems are centralized through a unique driver, namely the FDEP 20. The present invention accomplishes this centralization while reducing the access overhead and simplifying the memory maintenance duties of the FDEP 20.
REMOVABLE STORAGE MEDIA
Removable Storage Media (RSM) refers to a combination of one receptacle (or disk drive) and a potentially large number of storage disks (e.g., floppy, tape-drive, removable hard-drive, etc.). Currently, many ways of having one logical data entity across multiple RSM disks exist. All such techniques are particular to the specific needs of a single program (or group of programs). As a result, the transfer of data between various RSM is cumbersome and complicated. In the present invention, part or all of a Data Set 22 can be defined to be an RSM.
When a Data Set 22 exists on an RSM, its identifier (FDLE 72 or SLE 28) can exist on permanent media, which allows each RSM disk to be defined for direct access.
In Figure 24, an SAI field 46 is further defined as having three distinct fields: Device Number 166 (2 bytes), Remote Disk Number 168 (4 bytes) and Offset Within Device 170 (6 bytes). The Device Number field 166 provides for up to 2^16 (roughly 65 thousand) possible devices, more than sufficient for even the largest mainframes. The Remote Disk Number field 168 provides for up to 2^32 (roughly 4 billion) disks per physical device. The Offset Within Device field 170 provides for 2^48 (or trillions of) bytes within each physical device.
Although these numbers may seem excessive now, consider the increase in the capacity of storage devices over the last 10 years.
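A sketch of the subdivided SAI field 46 from Figure 24. C has no native 6-byte integer type, so the 48-bit offset is shown as a byte array; only the field widths come from the text, and the encoding and packing are assumptions:

typedef struct {
    unsigned short device;     /* Device Number 166: 2 bytes, up to 2^16 devices */
    unsigned int   disk;       /* Remote Disk Number 168: 4 bytes, up to 2^32 disks */
    unsigned char  offset[6];  /* Offset Within Device 170: 6 bytes, a 2^48-byte range */
} SAI;                         /* 12 bytes, assuming a packed layout */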
In most microcomputer environments it would be possible to assign a unique device number to each standard or RSM disk. However, this is a highly limited solution because a microcomputer will normally have only a fraction of the physical devices or RSM disks found on a mainframe computer. The idea is to keep the structures standard across all computing environments.
When a request is made for data on an RSM disk, the FDEP 20 can directly or indirectly cause a message to appear, which prompts the user to insert a specific disk into a specific device to complete a data request.
Once the RSM disk has been inserted, processing can commence normally.
TABLE I
;;CURRENT_DS  FDI, GRP, OR EGRP TO STARTING AND ON-GOING DATA SET FOR TRAVERSAL
;;CUR_CLASS   DATA SET CLASS FOR CURRENT_DS

CALL TOP_DOWN ( MY_DATA_SET )                    ;;START RECURSION

PROCEDURE TOP_DOWN ( CURRENT_DS )
    DISPLAY OR ACCUMULATE NAME OF CURRENT_DS
    CUR_CLASS = DETERMINE CLASS OF DATA SET CURRENT_DS
    SELECT ( CUR_CLASS )
        WHEN ( PDS )
            RETURN
        WHEN ( CDS )
            SEARCH FOR ALL SLES WHOSE PARENT IS CURRENT_DS
            FOR ( EACH LOCATED SLE )
                CALL TOP_DOWN ( SLE )
            RETURN
        WHEN ( FDS )
            LOAD SL IF NOT ALREADY IN MEMORY
            SEARCH FOR ALL SLES WITH A NULL PARENT ( ROOT SLES )
            FOR ( EACH LOCATED SLE )
                CALL TOP_DOWN ( SLE )
            RETURN
    END_SELECT
END_PROCEDURE
CONCLUSION
This concludes the description of the preferred embodiment of the invention. In summary, a method and apparatus for data storage and retrieval have been described. In a computer having one or more secondary storage devices attached thereto, a Finite Data Environment Processor (FDEP) manages Data Sets residing on the secondary storage devices and in memory using Set Lists (SLs) and General Record Pointers (GRPs). The Data Sets contain either data or logical organizational information. The Set Lists comprise Data Sets organized into a hierarchy. The General Record Pointers identify information in terms of Data Sets and records within them. Using the principal idea that a Data Set is uniquely identifiable, the present invention eliminates problems normally associated with referencing the location of data after the data has been moved.
The foregoing description of the preferred embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching.
It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.
Claims (39)
1. An apparatus for managing data in a computer (10) having one or more memory devices (12, 16), characterized by:
(a) one or more Data Sets (22) stored in one or more of the memory devices (12, 16), wherein both data and logical organization are recorded in the Data Sets (22);
(b) one or more Set Lists (24) stored in one or more Data Sets (22), wherein the Set List (24) comprises a list of Set List Elements (28) stored in an arbitrary order in any of the Set Lists (24);
(c) each Set List Element (28) comprising:
(1) a Self Identification (32) used to identify the Set List Element (28) and an associated Data Set (22);
(2) an identification (34) of a parent Set List Element (28) in any of the Set Lists (24) for defining a logical hierarchy of the Set List Elements (28), wherein a predefined value for the identification (34) of the parent Set List Element (28) indicates that the Set List Element (28) is a root Set List Element (28) in the logical hierarchy, and a value other than the predefined value for the identification (34) of the parent Set List Element (28) indicates the Set List Element (28) is logically subordinate to the parent Set List Element (28) in the logical hierarchy;
(d) means (20) for manipulating the logical hierarchy by modifying the identifications (34) of the parent Set List Elements (28);
(e) one or more General Record Pointers (26) stored in one or more of the memory devices (12, 16), wherein each General Record Pointer (26) identifies a specific one of the Data Sets (22); and (f) means (20) for accessing the Data Sets (22), further comprising means (20) for searching the Set Lists (24) to locate a desired Set List Element (28) therein having a Self Identification (32) matching a specified General Record Pointer (26).
2. The apparatus as defined in claim 1, wherein each General Record Pointer (26) comprises an identification (58) of a site and a value (60) unique for the site.
3. The apparatus as defined in claim 1, further characterized by means (20) for creating a new Data Set (22) within the logical hierarchy, comprising means (20) for generating and assigning a unique General Record Pointer (26) to the new Data Set (22), means (20) for allocating a storage location for physical data associated with the new Data Set (22), means (20) for creating a new Set List Element (28) containing the General Record Pointer (26) as its Self Identification (32), and means (20) for recording the storage location in the newly created Set List Element (28).
4. The apparatus as defined in claim 1, further characterized by means (20) for deleting an existing Data Set (22) from the logical hierarchy by setting an indicator in the Set List Element (28) associated therewith to a deleted state.
5. The apparatus as defined in claim 1, wherein the means (20) for manipulating is characterized by means (20) for moving a Data Set (22) in the logical hierarchy by altering the identification (34) of the parent Set List Element (28) stored in the Set List Element (28) associated with the Data Set (22).
6. The apparatus as defined in claim 1, wherein the means (20) for manipulating is characterized by means (20) for copying a Data Set (22) and all immediate subordinate Data Sets (22) by copying the Set List Elements (28) associated therewith.
7. The apparatus as defined in claim 6, wherein the means (20) for copying is further characterized by:
(i) means (20) for creating a new Set List Element (28) associated with the copied Data Set (22);
(ii) means (20) for generating a new General Record Pointer (26) and recording it as a Self Identification (32) in the copied Set List Element (28);
(iii) means (20) for duplicating the contents of the copied Data Set (22) to a new Data Set (22) and identifying the Data Set (22) with the new General Record Pointer (26);
(iv) means (20) for replacing the storage location in the new Set List Element (28) with a storage location for the new Data Set (22); and (v) means (20) for repeating the means (20) for creating (i) through the means (20) for replacing (iv) for subordinate Set List Elements (28).
8. The apparatus as defined in claim 1, wherein the means (20) for manipulating is characterized by means (20) for promoting a Data Set (22) within the logical hierarchy by replacing the identification (34) of the parent Set List Element (28) in the Set List Element (28) associated with the Data Set (22) with an identification (34) of a parent Set List Element (28) stored in the parent Set List Element (28).
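Claim 8's promotion collapses, in this sketch, to adopting the grandparent's identity — again purely by pointer rewriting.

```python
def promote_data_set(element: SetListElement, set_lists: list[SetList]) -> None:
    """Claim 8: replace the element's parent identification (34) with
    the parent identification stored in its current parent."""
    if element.parent_id is None:
        return                                    # already a root
    parent = find_by_grp(set_lists, element.parent_id)
    if parent is not None:
        element.parent_id = parent.parent_id
```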
9. The apparatus as defined in claim 1, wherein the means (20) for manipulating is characterized by means (20) for inserting a Data Set (22) within the logical hierarchy to a position between a subordinate Set List Element (28) and a superior Set List Element (28), further comprising means (20) for creating a new Set List Element (28), means (20) for recording a General Record Pointer (26) associated with the Data Set (22) as the Self Identification (32) in the new Set List Element (28), means (20) for recording the Self Identification (32) of the superior Set List Element (28) as the identification (34) of the parent Set List Element (28) in the new Set List Element (28), and means (20) for recording the Self Identification (32) of the new Set List Element (28) as the identification (34) of the parent Set List Element (28) in the subordinate Set List Element (28).
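Claim 9's insertion splices a new element between a superior and a subordinate by rewriting exactly two parent identifications, as sketched below under the same hypothetical model.

```python
def insert_between(grp: str, superior: SetListElement,
                   subordinate: SetListElement, set_list: SetList,
                   location: str) -> SetListElement:
    """Claim 9: create a new element whose parent is the superior
    element, then repoint the subordinate element at the new one."""
    new_element = SetListElement(self_id=grp,
                                 parent_id=superior.self_id,
                                 storage_location=location)
    set_list.append(new_element)
    subordinate.parent_id = new_element.self_id
    return new_element
```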
10. The apparatus as defined in claim 1, further characterized by means (20) for storing a Set List Element (28) in an arbitrary one of the Set Lists (24), wherein the Set List Element (28) contains an identification (34) of a parent Set List Element (28) residing in another Set List (24).
11. The apparatus as defined in claim 1, further characterized by means (20) for locating a specific Set List Element (28) by searching all Set Lists (24), regardless of the logical hierarchies defined by the Set List Elements (28).
12. The apparatus as defined in claim 1, wherein a Data Set (22) is identified solely by a General Record Pointer (26), the apparatus further characterized by means (20) for locating a desired Data Set (22) by arbitrarily selecting Set Lists (24), means (20) for arbitrarily searching Set List Elements (28) within the selected Set Lists (24) to identify a Set List Element (28) with a Self Identification (32) matching the General Record Pointer (26) for the desired Data Set (22), wherein the identified Set List Element (28) contains a storage location for the desired Data Set (22).
13. The apparatus as defined in claim 1, further characterized by means (20) for associating a plurality of Set List Elements (28) with a given Data Set (22), wherein each of the Set List Elements (28) is in a different logical hierarchy.
14. The apparatus as defined in claim 1, further characterized by means (20) for associating a plurality of Set List Elements (28) with a given Data Set (22), wherein each of the Set List Elements (28) is in a different position in the logical hierarchy.
15. The apparatus as defined in claim 14, wherein each Set List Element (28) associated with the Data Set (22) has an identical Self Identification (32).
16. The apparatus as defined in claim 14, wherein each Set List Element (28) associated with the Data Set (22) has a different identification (34) of a parent Set List Element (28) thereby resulting in the different position in the logical hierarchy.
17. The apparatus as defined in claim 1, further characterized by descendant view means (20) for identifying a starting Set List Element (28) and for locating one or more subordinate Set List Elements (28) having an identification (34) of a parent Set List Element (28) matching the Self Identification (32) of the starting Set List Element (28), further comprising means (20) for repeating the means (20) for identifying and means (20) for locating using the matching Set List Elements (28) for all subordinate Set List Elements (28).
18. The apparatus as defined in claim 1, further characterized by ancestor view means (20) for identifying a starting Set List Element (28) and for locating one or more Set List Elements (28) having a Self Identification (32) matching an identification (34) of a parent Set List Element (28) of the starting Set List Element (28), further comprising means (20) for repeating the means (20) for identifying and means (20) for locating using the matching Set List Elements (28) until a root Set List Element (28) is encountered.
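Claims 17 and 18 describe downward and upward traversals driven entirely by matching identifications; the sketch below renders both as generators over the hypothetical structures defined earlier.

```python
def descendant_view(start: SetListElement, set_lists: list[SetList]):
    """Claim 17: yield every element whose parent identification (34)
    matches the starting element's Self Identification (32), recursively."""
    for set_list in set_lists:
        for element in set_list:
            if element.parent_id == start.self_id:
                yield element
                yield from descendant_view(element, set_lists)

def ancestor_view(start: SetListElement, set_lists: list[SetList]):
    """Claim 18: follow parent identifications upward until a root
    Set List Element is encountered."""
    current = start
    while current.parent_id is not None:
        current = find_by_grp(set_lists, current.parent_id)
        if current is None:       # broken chain: corruption (cf. claim 22)
            return
        yield current
```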
19. The apparatus as defined in claim 1, further characterized by means (20) for detecting erroneous control data.
20. The apparatus as defined in claim 19, wherein the means (20) for detecting is characterized by means (20) for detecting corruption in the Set Lists (24) and Set List Elements (28).
21. The apparatus as defined in claim 20, wherein the means (20) for detecting is characterized by means (20) for searching a Set List (24) for a desired Set List Element (28) and means (20) for performing a checksum test on the desired Set List Element (28) to establish a test result, wherein a match between the test result and a checksum value stored in the desired Set List Element (28) indicates a lack of corruption.
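One hypothetical form of claim 21's checksum test, using CRC-32 over an element's control fields; the patent does not prescribe a particular checksum algorithm.

```python
import zlib

def element_checksum(element: SetListElement) -> int:
    """Assumed checksum over a Set List Element's control fields."""
    payload = f"{element.self_id}|{element.parent_id}|{element.storage_location}"
    return zlib.crc32(payload.encode())

def is_uncorrupted(element: SetListElement, stored_checksum: int) -> bool:
    """Claim 21: a match between the recomputed test result and the
    stored checksum value indicates a lack of corruption."""
    return element_checksum(element) == stored_checksum
```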
22. The apparatus as defined in claim 20, wherein the means (20) for detecting is characterized by means (20) for searching a Set List (24) for a desired Set List Element (28) and means (20) for searching for a Set List Element (28) corresponding to the identification (34) of the parent Set List Element (28) stored in the desired Set List Element (28), wherein a failure to find the parent Set List Element (28) indicates corruption.
23. The apparatus as defined in claim 20, wherein the means (20) for detecting is characterized by means (20) for searching a Set List (24) for a desired Set List Element (28) and means (20) for performing a content validity test on the associated Data Set (22) to establish a test result, wherein a match between the test result and a checksum value stored in the Set List Element (28) indicates a lack of corruption.
24. The apparatus as defined in claim 20, wherein the means (20) for detecting corruption is characterized by means (20) for correcting Set List Element (28) corruption, comprising:
(i) means (20) for creating a new Set List Element (28);
(ii) means (20) for generating a new General Record Pointer (26) and recording it as the Self Identification (32) of the new Set List Element (28);
(iii) means (20) for recording the storage location of the associated Data Set (22) in the new Set List Element (28); and
(iv) means (20) for identifying the new Set List Element (28) as a recovered Set List Element (28).
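Claim 24's correction means can be sketched as rebuilding an element around a surviving Data Set and flagging it as recovered, reusing the hypothetical structures above.

```python
def recover_element(orphan_location: str, set_list: SetList,
                    site_id: str) -> SetListElement:
    """Claim 24: mint a fresh GRP (steps i-ii), record the surviving
    Data Set's storage location (step iii), and mark the element as
    recovered (step iv)."""
    recovered = SetListElement(self_id=new_grp(site_id),
                               parent_id=None,
                               storage_location=orphan_location,
                               recovered=True)
    set_list.append(recovered)
    return recovered
```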
25. The apparatus as defined in claim 1, further comprising a plurality of sites, each site having one or more computers and memory devices (12, 16) situated thereat, wherein each site is identified by a unique site identification.
26. The apparatus as defined in claim 25, further characterized by means (20) for uniquely identifying any given Data Set (22) across any number of sites.
27. The apparatus as defined in claim 25, further characterized by means (20) for locating a Data Set (22) beyond the confines of a single site, further comprising means (20) for searching for the Data Set (22) beginning at a starting site and proceeding to other sites.
28. The apparatus as defined in claim 25, further characterized by means (20) for moving a Data Set (22) within the logical hierarchy by altering the identification (34) of the parent Set List Element (28) associated therewith, wherein the parent Set List Element (28) exists in a Set List (24) at a different site.
29. The apparatus as defined in claim 25, further characterized by means (20) for copying a Data Set (22) within the logical hierarchy from a first site to a different site.
30. The apparatus as defined in claim 25, further comprising means (20) for storing the Data Sets (22) identified within a given Set List (24) in differing ones of the memory devices (12, 16).
31. The apparatus as defined in claim 1, further comprising means (20) for storing associated Data Sets (22) and Set List Elements (28) in differing ones of the memory devices (12, 16).
32. The apparatus as defined in claim 31, wherein the means (20) for storing comprises:
(i) means (20) for selecting a first memory device (12, 16) for storing a Data Set (22);
(ii) means (20) for determining that the first memory device (12, 16) does not contain sufficient storage for the Data Set (22);
(iii) means (20) for selecting a second memory device (12, 16) for storing the Data Set (22);
(iv) means (20) for repeating the means (20) for selecting (i) through the means (20) for selecting (iii) until a memory device (12, 16) with sufficient memory is found; and
(v) means (20) for updating the storage location contained in the Set List Element (28) associated with the Data Set (22) to reflect the selected memory device (12, 16).
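Claim 32's storing means amounts to a first-fit scan over memory devices; `MemoryDevice`, `free_bytes`, and `write` are assumed interfaces introduced for illustration, not the patent's.

```python
def store_with_fallback(element: SetListElement, data: bytes,
                        devices: list["MemoryDevice"]) -> None:
    """Claim 32: try memory devices in turn (steps i-iv) until one with
    sufficient storage is found, then update the element's storage
    location to reflect the selected device (step v)."""
    for device in devices:
        if device.free_bytes() >= len(data):
            element.storage_location = device.write(data)
            return
    raise RuntimeError("no memory device with sufficient storage")
```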
33. The apparatus as defined in claim 1, further comprising a Finite Data Environment Processor (20) performed by the computer, the Finite Data Environment Processor (20) comprising means (20) for maintaining a Finite Data List (70) containing one or more Finite Data List Elements (72), wherein each Finite Data List Element (72) identifies a storage location of one of the Set Lists (24), and the Finite Data List Elements (72) are stored in an arbitrary order in the Finite Data List (70).
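Claim 33's Finite Data List can be modeled as an unordered list of elements that each point at one Set List; the explicitly arbitrary ordering is what lets claim 34's lookup select Set Lists in any order. A minimal sketch under the same assumptions:

```python
@dataclass
class FiniteDataListElement:
    set_list_location: str   # storage location of one Set List (24)
    checksum: int = 0        # consulted by the corruption tests of claims 37-38

# Claim 33: Finite Data List Elements (72) are stored in arbitrary order,
# so locating a Set List never depends on an element's position.
finite_data_list: list[FiniteDataListElement] = []
```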
34. The apparatus as defined in claim 33, further comprising means (20) for locating a desired Data Set (22) by arbitrarily selecting Set Lists (24) identified by the Finite Data List Elements (72).
35. The apparatus as defined in claim 33, further comprising means (20) for detecting erroneous control data.
36. The apparatus as defined in claim 35, wherein the means (20) for detecting comprises means (20) for detecting corruption in the Finite Data List (70) and Finite Data List Elements (72).
37. The apparatus as defined in claim 36, wherein the means (20) for detecting corruption comprises means (20) for searching the Finite Data List (70) for a Finite Data List Element (72) and means (20) for performing a checksum test on each Finite Data List Element (72) to establish a test result, wherein a match between the test result and a checksum value stored in the Finite Data List Element (72) indicates a lack of corruption.
38. The apparatus as defined in claim 36, wherein the means (20) for detecting corruption comprises means (20) for searching the Finite Data List (70) for a Finite Data List Element (72) and means (20) for performing a Set List (24) content validity test, wherein the Set List (24) is loaded and a checksum test performed on the contents thereof to establish a test result, wherein a match between the test result and a checksum value found in the Finite Data List Element indicates a lack of corruption.
39. The apparatus as defined in claim 36, wherein the means (20) for detecting corruption comprises means (20) for correcting Finite Data List Element (72) corruption, comprising:
(i) means (20) for creating a new Finite Data List Element (72);
(ii) means (20) for generating a new General Record Pointer (26) and recording it in a Self Identification (32) for the Finite Data List Element (72);
(iii) means (20) for recording the storage location for the Data Set (22) in the Finite Data List Element (72); and
(iv) means (20) for identifying the Finite Data List Element (72) as a recovered Finite Data List Element (72).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US07/945,266 US5454101A (en) | 1992-09-15 | 1992-09-15 | Data storage system with set lists which contain elements associated with parents for defining a logical hierarchy and general record pointers identifying specific data sets |
US07/945,266 | 1992-09-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2143288A1 true CA2143288A1 (en) | 1994-03-31 |
Family
ID=25482879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002143288A Abandoned CA2143288A1 (en) | 1992-09-15 | 1993-08-25 | Data storage system with set lists which contain elements associated with parents for defining a logical hierarchy and general record pointers identifying specific data sets |
Country Status (5)
Country | Link |
---|---|
US (1) | US5454101A (en) |
EP (1) | EP0662228B1 (en) |
CA (1) | CA2143288A1 (en) |
DE (1) | DE69302908D1 (en) |
WO (1) | WO1994007209A1 (en) |
Families Citing this family (735)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5555409A (en) * | 1990-12-04 | 1996-09-10 | Applied Technical Sysytem, Inc. | Data management systems and methods including creation of composite views of data |
JPH0652161A (en) * | 1992-08-03 | 1994-02-25 | Fuji Xerox Co Ltd | Method and device for processing document |
US5813000A (en) * | 1994-02-15 | 1998-09-22 | Sun Micro Systems | B tree structure and method |
US5551020A (en) * | 1994-03-28 | 1996-08-27 | Flextech Systems, Inc. | System for the compacting and logical linking of data blocks in files to optimize available physical storage |
US5668970A (en) * | 1994-06-20 | 1997-09-16 | Cd Rom, U.S.A., Inc. | Method and apparatus for generating a file allocation table for a storage medium with no file allocation table using file storage information |
DE19513308A1 (en) * | 1994-10-04 | 1996-04-11 | Hewlett Packard Co | Virtual node file system for computer data system |
US5684985A (en) * | 1994-12-15 | 1997-11-04 | Ufil Unified Data Technologies Ltd. | Method and apparatus utilizing bond identifiers executed upon accessing of an endo-dynamic information node (EDIN) |
US5724512A (en) * | 1995-04-17 | 1998-03-03 | Lucent Technologies Inc. | Methods and apparatus for storage and retrieval of name space information in a distributed computing system |
US5797139A (en) * | 1995-12-14 | 1998-08-18 | International Business Machines Corporation | Method, memory and apparatus for designating a file's type by building unique icon borders |
US5765165A (en) * | 1996-02-29 | 1998-06-09 | Sun Microsystems, Inc. | Fast method of determining duplicates on a linked list |
US5857184A (en) * | 1996-05-03 | 1999-01-05 | Walden Media, Inc. | Language and method for creating, organizing, and retrieving data from a database |
US5884315A (en) * | 1996-05-09 | 1999-03-16 | Philips Electronics North America Corporation | System and method for handling technical hardware related information |
US7770230B2 (en) * | 2002-04-22 | 2010-08-03 | Arvato Digital Services Canada, Inc. | System for dynamically encrypting content for secure internet commerce and providing embedded fulfillment software |
AU4495597A (en) | 1996-09-23 | 1998-04-14 | Lowrie Mcintosh | Defining a uniform subject classification system incorporating document management/records retention functions |
US5778394A (en) * | 1996-12-23 | 1998-07-07 | Emc Corporation | Space reclamation system and method for use in connection with tape logging system |
AP9901621A0 (en) | 1997-01-13 | 1999-09-30 | John Overton | Automated system for image archiving. |
US6549901B1 (en) * | 1997-05-30 | 2003-04-15 | Oracle Corporation | Using transportable tablespaces for hosting data of multiple users |
US6272503B1 (en) * | 1997-05-30 | 2001-08-07 | Oracle Corporation | Tablespace-relative database pointers |
US5940835A (en) * | 1997-11-04 | 1999-08-17 | Opendata Systems | Methods and apparatus for a universal tracking system |
US6192460B1 (en) * | 1997-12-16 | 2001-02-20 | Compaq Computer Corporation | Method and apparatus for accessing data in a shadow set after a failed data operation |
US5978814A (en) * | 1998-01-21 | 1999-11-02 | Microsoft Corporation | Native data signatures in a file system |
US6105062A (en) * | 1998-02-26 | 2000-08-15 | Novell, Inc. | Method and system for pruning and grafting trees in a directory service |
US6223288B1 (en) | 1998-05-22 | 2001-04-24 | Protexis Inc. | System for persistently encrypting critical software file to prevent installation of software program on unauthorized computers |
US7103640B1 (en) * | 1999-09-14 | 2006-09-05 | Econnectix, Llc | Network distributed tracking wire transfer protocol |
US6266661B1 (en) * | 1998-11-30 | 2001-07-24 | Platinum Technology Ip, Inc. | Method and apparatus for maintaining multi-instance database management systems with hierarchical inheritance and cross-hierarchy overrides |
US6751622B1 (en) * | 1999-01-21 | 2004-06-15 | Oracle International Corp. | Generic hierarchical structure with hard-pegging of nodes with dependencies implemented in a relational database |
US6427123B1 (en) * | 1999-02-18 | 2002-07-30 | Oracle Corporation | Hierarchical indexing for accessing hierarchically organized information in a relational system |
EP1247221A4 (en) | 1999-09-20 | 2005-01-19 | Quintiles Transnat Corp | System and method for analyzing de-identified health care data |
US20080005275A1 (en) * | 2000-06-02 | 2008-01-03 | Econnectix, Llc | Method and apparatus for managing location information in a network separate from the data to which the location information pertains |
US20030101139A1 (en) * | 2000-09-14 | 2003-05-29 | Kaag Bjoern Christiaan Wouter | Method of and system for storing a data item |
US6757673B2 (en) * | 2000-10-09 | 2004-06-29 | Town Compass Llc | Displaying hierarchial relationship of data accessed via subject index |
EP1446720B1 (en) * | 2001-10-24 | 2006-03-22 | Koninklijke Philips Electronics N.V. | Security device for a mass storage |
EA200400873A1 (en) * | 2001-12-28 | 2005-12-29 | Джеффри Джэймс Джонас | REAL-TIME DATA STORAGE |
US7548935B2 (en) * | 2002-05-09 | 2009-06-16 | Robert Pecherer | Method of recursive objects for representing hierarchies in relational database systems |
US8557743B2 (en) * | 2002-09-05 | 2013-10-15 | Dyax Corp. | Display library process |
EP1563628A4 (en) | 2002-11-06 | 2010-03-10 | Ibm | Confidential data sharing and anonymous entity resolution |
US8620937B2 (en) * | 2002-12-27 | 2013-12-31 | International Business Machines Corporation | Real time data warehousing |
CN100541443C (en) * | 2002-12-31 | 2009-09-16 | 国际商业机器公司 | The method and system that is used for deal with data |
US7200602B2 (en) * | 2003-02-07 | 2007-04-03 | International Business Machines Corporation | Data set comparison and net change processing |
EP1631908A4 (en) * | 2003-03-24 | 2012-01-25 | Ibm | Secure coordinate identification method, system and program |
US7480798B2 (en) * | 2003-06-05 | 2009-01-20 | International Business Machines Corporation | System and method for representing multiple security groups as a single data object |
US7873684B2 (en) * | 2003-08-14 | 2011-01-18 | Oracle International Corporation | Automatic and dynamic provisioning of databases |
US8229932B2 (en) | 2003-09-04 | 2012-07-24 | Oracle International Corporation | Storing XML documents efficiently in an RDBMS |
US8694510B2 (en) | 2003-09-04 | 2014-04-08 | Oracle International Corporation | Indexing XML documents efficiently |
US8311974B2 (en) | 2004-02-20 | 2012-11-13 | Oracle International Corporation | Modularized extraction, transformation, and loading for a database |
US7493305B2 (en) | 2004-04-09 | 2009-02-17 | Oracle International Corporation | Efficient queribility and manageability of an XML index with path subsetting |
US7499915B2 (en) * | 2004-04-09 | 2009-03-03 | Oracle International Corporation | Index for accessing XML data |
US7366735B2 (en) * | 2004-04-09 | 2008-04-29 | Oracle International Corporation | Efficient extraction of XML content stored in a LOB |
US7930277B2 (en) | 2004-04-21 | 2011-04-19 | Oracle International Corporation | Cost-based optimizer for an XML data repository within a database |
US8554806B2 (en) | 2004-05-14 | 2013-10-08 | Oracle International Corporation | Cross platform transportable tablespaces |
US7571173B2 (en) * | 2004-05-14 | 2009-08-04 | Oracle International Corporation | Cross-platform transportable database |
US8566300B2 (en) | 2004-07-02 | 2013-10-22 | Oracle International Corporation | Mechanism for efficient maintenance of XML index structures in a database system |
US7885980B2 (en) | 2004-07-02 | 2011-02-08 | Oracle International Corporation | Mechanism for improving performance on XML over XML data using path subsetting |
US20060112121A1 (en) * | 2004-11-23 | 2006-05-25 | Mckenney Paul E | Atomically moving list elements between lists using read-copy update |
US7627547B2 (en) * | 2004-11-29 | 2009-12-01 | Oracle International Corporation | Processing path-based database operations |
US7774308B2 (en) * | 2004-12-15 | 2010-08-10 | Applied Minds, Inc. | Anti-item for deletion of content in a distributed datastore |
US8996486B2 (en) * | 2004-12-15 | 2015-03-31 | Applied Invention, Llc | Data store with lock-free stateless paging capability |
US7921076B2 (en) | 2004-12-15 | 2011-04-05 | Oracle International Corporation | Performing an action in response to a file system event |
US7590635B2 (en) * | 2004-12-15 | 2009-09-15 | Applied Minds, Inc. | Distributed data store with an orderstamp to ensure progress |
US8131766B2 (en) | 2004-12-15 | 2012-03-06 | Oracle International Corporation | Comprehensive framework to integrate business logic into a repository |
US8275804B2 (en) | 2004-12-15 | 2012-09-25 | Applied Minds, Llc | Distributed data store with a designated master to ensure consistency |
US11321408B2 (en) | 2004-12-15 | 2022-05-03 | Applied Invention, Llc | Data store with lock-free stateless paging capacity |
US10270858B2 (en) | 2005-09-30 | 2019-04-23 | International Business Machines Corporation | Inducing memory device idle time through rolling read prioritizations |
US11841770B2 (en) | 2005-09-30 | 2023-12-12 | Pure Storage, Inc. | Storage unit connection security in a storage network and methods for use therewith |
US8694668B2 (en) * | 2005-09-30 | 2014-04-08 | Cleversafe, Inc. | Streaming media software interface to a dispersed data storage network |
US10044807B2 (en) | 2005-09-30 | 2018-08-07 | International Business Machines Corporation | Optimistic checked writes |
US8555109B2 (en) * | 2009-07-30 | 2013-10-08 | Cleversafe, Inc. | Method and apparatus for distributed storage integrity processing |
US10154034B2 (en) | 2010-04-26 | 2018-12-11 | International Business Machines Corporation | Cooperative data access request authorization in a dispersed storage network |
US7574570B2 (en) * | 2005-09-30 | 2009-08-11 | Cleversafe Inc | Billing system for information dispersal system |
US7953937B2 (en) | 2005-09-30 | 2011-05-31 | Cleversafe, Inc. | Systems, methods, and apparatus for subdividing data for storage in a dispersed data storage grid |
US8209363B2 (en) | 2007-10-09 | 2012-06-26 | Cleversafe, Inc. | File system adapted for use with a dispersed data storage network |
US9996413B2 (en) * | 2007-10-09 | 2018-06-12 | International Business Machines Corporation | Ensuring data integrity on a dispersed storage grid |
US9632872B2 (en) | 2012-06-05 | 2017-04-25 | International Business Machines Corporation | Reprioritizing pending dispersed storage network requests |
US10855769B2 (en) | 2005-09-30 | 2020-12-01 | Pure Storage, Inc. | Prioritizing memory devices to replace based on namespace health |
US7904475B2 (en) * | 2007-10-09 | 2011-03-08 | Cleversafe, Inc. | Virtualized data storage vaults on a dispersed data storage network |
US10866754B2 (en) | 2010-04-26 | 2020-12-15 | Pure Storage, Inc. | Content archiving in a distributed storage network |
US10282440B2 (en) | 2015-03-31 | 2019-05-07 | International Business Machines Corporation | Prioritizing rebuilding of encoded data slices |
US9632722B2 (en) | 2010-05-19 | 2017-04-25 | International Business Machines Corporation | Balancing storage unit utilization within a dispersed storage network |
US10257276B2 (en) | 2005-09-30 | 2019-04-09 | International Business Machines Corporation | Predictive rebalancing according to future usage expectations |
US11340988B2 (en) | 2005-09-30 | 2022-05-24 | Pure Storage, Inc. | Generating integrity information in a vast storage system |
US11909418B1 (en) | 2005-09-30 | 2024-02-20 | Pure Storage, Inc. | Access authentication in a dispersed storage network |
US10250686B2 (en) | 2005-09-30 | 2019-04-02 | International Business Machines Corporation | Finding alternate storage locations to support failing disk migration |
US10938418B2 (en) | 2005-09-30 | 2021-03-02 | Pure Storage, Inc. | Online disk replacement/removal |
US8595435B2 (en) * | 2009-07-30 | 2013-11-26 | Cleversafe, Inc. | Dispersed storage write process |
US9027080B2 (en) | 2008-03-31 | 2015-05-05 | Cleversafe, Inc. | Proxy access to a dispersed storage network |
US11416339B1 (en) | 2005-09-30 | 2022-08-16 | Pure Storage, Inc. | Validating requests based on stored vault information |
US8880799B2 (en) * | 2005-09-30 | 2014-11-04 | Cleversafe, Inc. | Rebuilding data on a dispersed storage network |
US10747616B2 (en) | 2015-03-31 | 2020-08-18 | Pure Storage, Inc. | Adapting rebuilding of encoded data slices in a dispersed storage network |
US10051057B2 (en) | 2005-09-30 | 2018-08-14 | International Business Machines Corporation | Prioritizing read locations based on an error history |
US10169229B2 (en) | 2012-06-05 | 2019-01-01 | International Business Machines Corporation | Protocols for expanding existing sites in a dispersed storage network |
US8171101B2 (en) * | 2005-09-30 | 2012-05-01 | Cleversafe, Inc. | Smart access to a dispersed data storage network |
US10860424B1 (en) | 2005-09-30 | 2020-12-08 | Pure Storage, Inc. | Background verification processing in a storage network |
US11272009B1 (en) | 2005-09-30 | 2022-03-08 | Pure Storage, Inc. | Managed data slice maintenance in a distributed storage system |
US11620185B2 (en) | 2005-09-30 | 2023-04-04 | Pure Storage, Inc. | Integrity processing in a dispersed storage network |
US11080138B1 (en) | 2010-04-26 | 2021-08-03 | Pure Storage, Inc. | Storing integrity information in a vast storage system |
US11474903B1 (en) | 2005-09-30 | 2022-10-18 | Pure Storage, Inc. | Rebuilding of encoded data slices using locally decodable code segments |
US10432726B2 (en) | 2005-09-30 | 2019-10-01 | Pure Storage, Inc. | Last-resort operations to save at-risk-data |
US10389814B2 (en) | 2005-09-30 | 2019-08-20 | Pure Storage, Inc. | Prioritizing memory devices to replace based on namespace health |
US11327674B2 (en) | 2012-06-05 | 2022-05-10 | Pure Storage, Inc. | Storage vault tiering and data migration in a distributed storage network |
US10356177B2 (en) | 2005-09-30 | 2019-07-16 | International Business Machines Corporation | Prioritizing ranges to rebuild based on namespace health |
US9774684B2 (en) | 2005-09-30 | 2017-09-26 | International Business Machines Corporation | Storing data in a dispersed storage network |
US8352782B2 (en) * | 2005-09-30 | 2013-01-08 | Cleversafe, Inc. | Range based rebuilder for use with a dispersed data storage network |
US11221917B1 (en) | 2005-09-30 | 2022-01-11 | Pure Storage, Inc. | Integrity processing in a dispersed storage network |
US7610314B2 (en) * | 2005-10-07 | 2009-10-27 | Oracle International Corporation | Online tablespace recovery for export |
US8073841B2 (en) | 2005-10-07 | 2011-12-06 | Oracle International Corporation | Optimizing correlated XML extracts |
US8356053B2 (en) | 2005-10-20 | 2013-01-15 | Oracle International Corporation | Managing relationships between resources stored within a repository |
US8949455B2 (en) | 2005-11-21 | 2015-02-03 | Oracle International Corporation | Path-caching mechanism to improve performance of path-related operations in a repository |
US8012542B2 (en) * | 2005-12-30 | 2011-09-06 | E.I. Du Pont De Nemours And Company | Fluoropolymer coating compositions containing adhesive polymers and substrate coating process |
US8510292B2 (en) | 2006-05-25 | 2013-08-13 | Oracle International Corporation | Isolation for applications working on shared XML data |
US7765244B2 (en) * | 2006-06-30 | 2010-07-27 | Broadcom Corporation | Fast and efficient method for deleting very large files from a filesystem |
US7660837B2 (en) * | 2006-06-30 | 2010-02-09 | Broadcom Corporation | Method for automatically managing disk fragmentation |
US7499909B2 (en) | 2006-07-03 | 2009-03-03 | Oracle International Corporation | Techniques of using a relational caching framework for efficiently handling XML queries in the mid-tier data caching |
US7606817B2 (en) * | 2006-08-02 | 2009-10-20 | Entity Labs, Ltd. | Primenet data management system |
US9183321B2 (en) | 2006-10-16 | 2015-11-10 | Oracle International Corporation | Managing compound XML documents in a repository |
US7827177B2 (en) * | 2006-10-16 | 2010-11-02 | Oracle International Corporation | Managing compound XML documents in a repository |
US7797310B2 (en) | 2006-10-16 | 2010-09-14 | Oracle International Corporation | Technique to estimate the cost of streaming evaluation of XPaths |
US8204831B2 (en) | 2006-11-13 | 2012-06-19 | International Business Machines Corporation | Post-anonymous fuzzy comparisons without the use of pre-anonymization variants |
US8909599B2 (en) * | 2006-11-16 | 2014-12-09 | Oracle International Corporation | Efficient migration of binary XML across databases |
US9355273B2 (en) | 2006-12-18 | 2016-05-31 | Bank Of America, N.A., As Collateral Agent | System and method for the protection and de-identification of health care data |
US7836098B2 (en) * | 2007-07-13 | 2010-11-16 | Oracle International Corporation | Accelerating value-based lookup of XML document in XQuery |
US7840609B2 (en) * | 2007-07-31 | 2010-11-23 | Oracle International Corporation | Using sibling-count in XML indexes to optimize single-path queries |
US20090063517A1 (en) * | 2007-08-30 | 2009-03-05 | Microsoft Corporation | User interfaces for scoped hierarchical data sets |
US8819179B2 (en) | 2007-10-09 | 2014-08-26 | Cleversafe, Inc. | Data revision synchronization in a dispersed storage network |
US8549351B2 (en) * | 2007-10-09 | 2013-10-01 | Cleversafe, Inc. | Pessimistic data reading in a dispersed storage network |
US8478865B2 (en) * | 2007-10-09 | 2013-07-02 | Cleversafe, Inc. | Systems, methods, and apparatus for matching a connection request with a network interface adapted for use with a dispersed data storage network |
US8572429B2 (en) * | 2007-10-09 | 2013-10-29 | Cleversafe, Inc. | Optimistic data writing in a dispersed storage network |
US8533256B2 (en) * | 2007-10-09 | 2013-09-10 | Cleversafe, Inc. | Object interface to a dispersed data storage network |
US10027478B2 (en) | 2007-10-09 | 2018-07-17 | International Business Machines Corporation | Differential key backup |
US9959076B2 (en) | 2007-10-09 | 2018-05-01 | International Business Machines Corporation | Optimized disk load distribution |
US8185614B2 (en) * | 2007-10-09 | 2012-05-22 | Cleversafe, Inc. | Systems, methods, and apparatus for identifying accessible dispersed digital storage vaults utilizing a centralized registry |
US9697171B2 (en) | 2007-10-09 | 2017-07-04 | International Business Machines Corporation | Multi-writer revision synchronization in a dispersed storage network |
US9888076B2 (en) | 2007-10-09 | 2018-02-06 | International Business Machines Corporation | Encoded data slice caching in a distributed storage network |
US8965956B2 (en) | 2007-10-09 | 2015-02-24 | Cleversafe, Inc. | Integrated client for use with a dispersed data storage network |
US20090106764A1 (en) * | 2007-10-22 | 2009-04-23 | Microsoft Corporation | Support for globalization in test automation |
US7991768B2 (en) | 2007-11-08 | 2011-08-02 | Oracle International Corporation | Global query normalization to improve XML index based rewrites for path subsetted index |
US10142115B2 (en) * | 2008-03-31 | 2018-11-27 | International Business Machines Corporation | Distributed storage network data revision control |
US9501355B2 (en) | 2008-03-31 | 2016-11-22 | International Business Machines Corporation | Storing data and directory information in a distributed storage network |
US8856552B2 (en) * | 2008-03-31 | 2014-10-07 | Cleversafe, Inc. | Directory synchronization of a dispersed storage network |
WO2009139650A1 (en) * | 2008-05-12 | 2009-11-19 | Business Intelligence Solutions Safe B.V. | A data obfuscation system, method, and computer implementation of data obfuscation for secret databases |
US8108419B2 (en) * | 2008-06-13 | 2012-01-31 | International Business Machines Corporation | Virtually applying modifications |
US8819011B2 (en) * | 2008-07-16 | 2014-08-26 | Cleversafe, Inc. | Command line interpreter for accessing a data object stored in a distributed storage network |
US8630987B2 (en) * | 2008-07-16 | 2014-01-14 | Cleversafe, Inc. | System and method for accessing a data object stored in a distributed storage network |
US7958112B2 (en) | 2008-08-08 | 2011-06-07 | Oracle International Corporation | Interleaving query transformations for XML indexes |
US8219563B2 (en) * | 2008-12-30 | 2012-07-10 | Oracle International Corporation | Indexing mechanism for efficient node-aware full-text search over XML |
US8126932B2 (en) * | 2008-12-30 | 2012-02-28 | Oracle International Corporation | Indexing strategy with improved DML performance and space usage for node-aware full-text search over XML |
US10104045B2 (en) | 2009-04-20 | 2018-10-16 | International Business Machines Corporation | Verifying data security in a dispersed storage network |
US10447474B2 (en) * | 2009-04-20 | 2019-10-15 | Pure Storage, Inc. | Dispersed data storage system data decoding and decryption |
US11868498B1 (en) | 2009-04-20 | 2024-01-09 | Pure Storage, Inc. | Storage integrity processing in a storage network |
US8504847B2 (en) * | 2009-04-20 | 2013-08-06 | Cleversafe, Inc. | Securing data in a dispersed storage network using shared secret slices |
US9092294B2 (en) * | 2009-04-20 | 2015-07-28 | Cleversafe, Inc. | Systems, apparatus, and methods for utilizing a reachability set to manage a network upgrade |
US8601259B2 (en) * | 2009-04-20 | 2013-12-03 | Cleversafe, Inc. | Securing data in a dispersed storage network using security sentinel value |
US8819781B2 (en) * | 2009-04-20 | 2014-08-26 | Cleversafe, Inc. | Management of network devices within a dispersed data storage network |
US9483656B2 (en) | 2009-04-20 | 2016-11-01 | International Business Machines Corporation | Efficient and secure data storage utilizing a dispersed data storage system |
US8656187B2 (en) * | 2009-04-20 | 2014-02-18 | Cleversafe, Inc. | Dispersed storage secure data decoding |
US8744071B2 (en) * | 2009-04-20 | 2014-06-03 | Cleversafe, Inc. | Dispersed data storage system data encryption and encoding |
US10230692B2 (en) * | 2009-06-30 | 2019-03-12 | International Business Machines Corporation | Distributed storage processing module |
US10108492B2 (en) | 2009-07-30 | 2018-10-23 | International Business Machines Corporation | Rebuilding data stored in a dispersed storage network |
US8706980B2 (en) * | 2009-07-30 | 2014-04-22 | Cleversafe, Inc. | Method and apparatus for slice partial rebuilding in a dispersed storage network |
US8275744B2 (en) * | 2009-07-30 | 2012-09-25 | Cleversafe, Inc. | Dispersed storage network virtual address fields |
US9009575B2 (en) | 2009-07-30 | 2015-04-14 | Cleversafe, Inc. | Rebuilding a data revision in a dispersed storage network |
US9558059B2 (en) | 2009-07-30 | 2017-01-31 | International Business Machines Corporation | Detecting data requiring rebuilding in a dispersed storage network |
US8489915B2 (en) * | 2009-07-30 | 2013-07-16 | Cleversafe, Inc. | Method and apparatus for storage integrity processing based on error types in a dispersed storage network |
US9207870B2 (en) | 2009-07-30 | 2015-12-08 | Cleversafe, Inc. | Allocating storage units in a dispersed storage network |
US9208025B2 (en) | 2009-07-30 | 2015-12-08 | Cleversafe, Inc. | Virtual memory mapping in a dispersed storage network |
US8352719B2 (en) * | 2009-07-31 | 2013-01-08 | Cleversafe, Inc. | Computing device booting utilizing dispersed storage |
US9167277B2 (en) * | 2009-08-03 | 2015-10-20 | Cleversafe, Inc. | Dispersed storage network data manipulation |
US9690513B2 (en) * | 2009-08-27 | 2017-06-27 | International Business Machines Corporation | Dispersed storage processing unit and methods with operating system diversity for use in a dispersed storage system |
US9047217B2 (en) * | 2009-08-27 | 2015-06-02 | Cleversafe, Inc. | Nested distributed storage unit and applications thereof |
US8560855B2 (en) | 2009-08-27 | 2013-10-15 | Cleversafe, Inc. | Verification of dispersed storage network access control information |
US9411810B2 (en) | 2009-08-27 | 2016-08-09 | International Business Machines Corporation | Method and apparatus for identifying data inconsistency in a dispersed storage network |
US20110078343A1 (en) * | 2009-09-29 | 2011-03-31 | Cleversafe, Inc. | Distributed storage network including memory diversity |
US8924387B2 (en) * | 2009-09-29 | 2014-12-30 | Cleversafe, Inc. | Social networking utilizing a dispersed storage network |
US8689354B2 (en) * | 2009-09-29 | 2014-04-01 | Cleversafe, Inc. | Method and apparatus for accessing secure data in a dispersed storage system |
US8281181B2 (en) * | 2009-09-30 | 2012-10-02 | Cleversafe, Inc. | Method and apparatus for selectively active dispersed storage memory device utilization |
US8402344B2 (en) * | 2009-10-05 | 2013-03-19 | Cleversafe, Inc. | Method and apparatus for controlling dispersed storage of streaming data |
US10757187B2 (en) | 2009-10-29 | 2020-08-25 | Pure Storage, Inc. | Streaming all-or-nothing encoding with random offset support |
US10389845B2 (en) | 2009-10-29 | 2019-08-20 | Pure Storage, Inc. | Determining how to service requests based on several indicators |
US9774678B2 (en) | 2009-10-29 | 2017-09-26 | International Business Machines Corporation | Temporarily storing data in a dispersed storage network |
US9661356B2 (en) | 2009-10-29 | 2017-05-23 | International Business Machines Corporation | Distribution of unique copies of broadcast data utilizing fault-tolerant retrieval from dispersed storage |
US8291277B2 (en) * | 2009-10-29 | 2012-10-16 | Cleversafe, Inc. | Data distribution utilizing unique write parameters in a dispersed storage system |
US8966194B2 (en) | 2009-10-29 | 2015-02-24 | Cleversafe, Inc. | Processing a write request in a dispersed storage network |
US9900150B2 (en) * | 2009-10-30 | 2018-02-20 | International Business Machines Corporation | Dispersed storage camera device and method of operation |
US9413529B2 (en) | 2009-10-30 | 2016-08-09 | International Business Machines Corporation | Distributed storage network and method for storing and retrieving encryption keys |
US8769035B2 (en) * | 2009-10-30 | 2014-07-01 | Cleversafe, Inc. | Distributed storage network for storing a data object based on storage requirements |
US8589637B2 (en) * | 2009-10-30 | 2013-11-19 | Cleversafe, Inc. | Concurrent set storage in distributed storage network |
US9098376B2 (en) | 2009-10-30 | 2015-08-04 | Cleversafe, Inc. | Distributed storage network for modification of a data object |
US8464133B2 (en) * | 2009-10-30 | 2013-06-11 | Cleversafe, Inc. | Media content distribution in a social network utilizing dispersed storage |
US10073737B2 (en) | 2009-10-30 | 2018-09-11 | International Business Machines Corporation | Slice location identification |
US9311185B2 (en) | 2009-10-30 | 2016-04-12 | Cleversafe, Inc. | Dispersed storage unit solicitation method and apparatus |
US9195408B2 (en) | 2009-10-30 | 2015-11-24 | Cleversafe, Inc. | Highly autonomous dispersed storage system retrieval method |
US9270298B2 (en) | 2009-11-24 | 2016-02-23 | International Business Machines Corporation | Selecting storage units to rebuild an encoded data slice |
US8918897B2 (en) | 2009-11-24 | 2014-12-23 | Cleversafe, Inc. | Dispersed storage network data slice integrity verification |
US9152514B2 (en) | 2009-11-24 | 2015-10-06 | Cleversafe, Inc. | Rebuilding a data segment in a dispersed storage network |
US9501349B2 (en) | 2009-11-24 | 2016-11-22 | International Business Machines Corporation | Changing dispersed storage error encoding parameters |
US10015141B2 (en) | 2009-11-25 | 2018-07-03 | International Business Machines Corporation | Dispersed data storage in a VPN group of devices |
US8688907B2 (en) * | 2009-11-25 | 2014-04-01 | Cleversafe, Inc. | Large scale subscription based dispersed storage network |
US8458233B2 (en) * | 2009-11-25 | 2013-06-04 | Cleversafe, Inc. | Data de-duplication in a dispersed storage network utilizing data characterization |
US9672109B2 (en) | 2009-11-25 | 2017-06-06 | International Business Machines Corporation | Adaptive dispersed storage network (DSN) and system |
US8621268B2 (en) * | 2009-11-25 | 2013-12-31 | Cleversafe, Inc. | Write threshold utilization in a dispersed storage system |
US9836352B2 (en) | 2009-11-25 | 2017-12-05 | International Business Machines Corporation | Detecting a utilization imbalance between dispersed storage network storage units |
US8527807B2 (en) | 2009-11-25 | 2013-09-03 | Cleversafe, Inc. | Localized dispersed storage memory system |
US9996548B2 (en) | 2009-11-25 | 2018-06-12 | International Business Machines Corporation | Dispersed storage using localized peer-to-peer capable wireless devices in a peer-to-peer or femto cell supported carrier served fashion |
US9489264B2 (en) | 2009-11-25 | 2016-11-08 | International Business Machines Corporation | Storing an encoded data slice as a set of sub-slices |
US9626248B2 (en) | 2009-11-25 | 2017-04-18 | International Business Machines Corporation | Likelihood based rebuilding of missing encoded data slices |
US10001923B2 (en) | 2009-12-29 | 2018-06-19 | International Business Machines Corporation | Generation collapse |
US10158648B2 (en) | 2009-12-29 | 2018-12-18 | International Business Machines Corporation | Policy-based access in a dispersed storage network |
US10067831B2 (en) | 2009-12-29 | 2018-09-04 | International Business Machines Corporation | Slice migration in a dispersed storage network |
US9330241B2 (en) | 2009-12-29 | 2016-05-03 | International Business Machines Corporation | Applying digital rights management to multi-media file playback |
US10289505B2 (en) | 2009-12-29 | 2019-05-14 | International Business Machines Corporation | Dispersed multi-media content for a centralized digital video storage system |
US9369526B2 (en) | 2009-12-29 | 2016-06-14 | International Business Machines Corporation | Distributed storage time synchronization based on retrieval delay |
US10372686B2 (en) | 2009-12-29 | 2019-08-06 | International Business Machines Corporation | Policy-based storage in a dispersed storage network |
US10031669B2 (en) | 2009-12-29 | 2018-07-24 | International Business Machines Corporation | Scheduling migration related traffic to be non-disruptive and performant |
US8762343B2 (en) | 2009-12-29 | 2014-06-24 | Cleversafe, Inc. | Dispersed storage of software |
US9866595B2 (en) | 2009-12-29 | 2018-01-09 | International Business Machines Corporation | Policy based slice deletion in a dispersed storage network |
US9727266B2 (en) | 2009-12-29 | 2017-08-08 | International Business Machines Corporation | Selecting storage units in a dispersed storage network |
US9798467B2 (en) | 2009-12-29 | 2017-10-24 | International Business Machines Corporation | Security checks for proxied requests |
US10133632B2 (en) | 2009-12-29 | 2018-11-20 | International Business Machines Corporation | Determining completion of migration in a dispersed storage network |
US20180335967A1 (en) | 2009-12-29 | 2018-11-22 | International Business Machines Corporation | User customizable data processing plan in a dispersed storage network |
US8352831B2 (en) * | 2009-12-29 | 2013-01-08 | Cleversafe, Inc. | Digital content distribution utilizing dispersed storage |
US10148788B2 (en) | 2009-12-29 | 2018-12-04 | International Business Machines Corporation | Method for providing schedulers in a distributed storage network |
US10237281B2 (en) | 2009-12-29 | 2019-03-19 | International Business Machines Corporation | Access policy updates in a dispersed storage network |
US9922063B2 (en) | 2009-12-29 | 2018-03-20 | International Business Machines Corporation | Secure storage of secret data in a dispersed storage network |
US8468368B2 (en) * | 2009-12-29 | 2013-06-18 | Cleversafe, Inc. | Data encryption parameter dispersal |
US9305597B2 (en) | 2009-12-29 | 2016-04-05 | Cleversafe, Inc. | Accessing stored multi-media content based on a subscription priority level |
US8990585B2 (en) | 2009-12-29 | 2015-03-24 | Cleversafe, Inc. | Time based dispersed storage access |
US9507735B2 (en) | 2009-12-29 | 2016-11-29 | International Business Machines Corporation | Digital content retrieval utilizing dispersed storage |
US9413393B2 (en) | 2009-12-29 | 2016-08-09 | International Business Machines Corporation | Encoding multi-media content for a centralized digital video storage system |
US9672108B2 (en) | 2009-12-29 | 2017-06-06 | International Business Machines Corporation | Dispersed storage network (DSN) and system with improved security |
US8959366B2 (en) * | 2010-01-28 | 2015-02-17 | Cleversafe, Inc. | De-sequencing encoded data slices |
US8918674B2 (en) | 2010-01-28 | 2014-12-23 | Cleversafe, Inc. | Directory file system in a dispersed storage network |
US8954667B2 (en) * | 2010-01-28 | 2015-02-10 | Cleversafe, Inc. | Data migration in a dispersed storage network |
US9201732B2 (en) | 2010-01-28 | 2015-12-01 | Cleversafe, Inc. | Selective activation of memory to retrieve data in a dispersed storage network |
US9760440B2 (en) | 2010-01-28 | 2017-09-12 | International Business Machines Corporation | Site-based namespace allocation |
US8522113B2 (en) * | 2010-01-28 | 2013-08-27 | Cleversafe, Inc. | Selecting storage facilities and dispersal parameters in a dispersed storage network |
US11301592B2 (en) | 2010-01-28 | 2022-04-12 | Pure Storage, Inc. | Distributed storage with data obfuscation and method for use therewith |
US9043548B2 (en) | 2010-01-28 | 2015-05-26 | Cleversafe, Inc. | Streaming content storage |
US10324791B2 (en) | 2010-11-01 | 2019-06-18 | International Business Machines Corporation | Selectable parallel processing of dispersed storage error encoding |
US9135115B2 (en) | 2010-02-27 | 2015-09-15 | Cleversafe, Inc. | Storing data in multiple formats including a dispersed storage format |
US10216647B2 (en) | 2010-02-27 | 2019-02-26 | International Business Machines Corporation | Compacting dispersed storage space |
US10268374B2 (en) | 2010-02-27 | 2019-04-23 | International Business Machines Corporation | Redundant array of independent discs and dispersed storage network system re-director |
US11429486B1 (en) | 2010-02-27 | 2022-08-30 | Pure Storage, Inc. | Rebuilding data via locally decodable redundancy in a vast storage network |
US20180365105A1 (en) | 2014-06-05 | 2018-12-20 | International Business Machines Corporation | Establishing an operation execution schedule in a dispersed storage network |
US10007575B2 (en) | 2010-02-27 | 2018-06-26 | International Business Machines Corporation | Alternative multiple memory format storage in a storage network |
US9311184B2 (en) | 2010-02-27 | 2016-04-12 | Cleversafe, Inc. | Storing raid data as encoded data slices in a dispersed storage network |
US8370600B2 (en) * | 2010-03-12 | 2013-02-05 | Cleversafe, Inc. | Dispersed storage unit and method for configuration thereof |
US8683119B2 (en) * | 2010-03-15 | 2014-03-25 | Cleversafe, Inc. | Access control in a dispersed storage network |
US8527705B2 (en) * | 2010-03-16 | 2013-09-03 | Cleversafe, Inc. | Temporarily caching an encoded data slice |
US9229824B2 (en) | 2010-03-16 | 2016-01-05 | International Business Machines Corporation | Caching rebuilt encoded data slices in a dispersed storage network |
US9170884B2 (en) | 2010-03-16 | 2015-10-27 | Cleversafe, Inc. | Utilizing cached encoded data slices in a dispersed storage network |
US10447767B2 (en) | 2010-04-26 | 2019-10-15 | Pure Storage, Inc. | Resolving a performance issue within a dispersed storage network |
US8566354B2 (en) | 2010-04-26 | 2013-10-22 | Cleversafe, Inc. | Storage and retrieval of required slices in a dispersed storage network |
US8938552B2 (en) | 2010-08-02 | 2015-01-20 | Cleversafe, Inc. | Resolving a protocol issue within a dispersed storage network |
US8914669B2 (en) | 2010-04-26 | 2014-12-16 | Cleversafe, Inc. | Secure rebuilding of an encoded data slice in a dispersed storage network |
US9092386B2 (en) | 2010-04-26 | 2015-07-28 | Cleversafe, Inc. | Indicating an error within a dispersed storage network |
US10956292B1 (en) | 2010-04-26 | 2021-03-23 | Pure Storage, Inc. | Utilizing integrity information for data retrieval in a vast storage system |
US9898373B2 (en) | 2010-04-26 | 2018-02-20 | International Business Machines Corporation | Prioritizing rebuilding of stored data in a dispersed storage network |
US9495117B2 (en) | 2010-04-26 | 2016-11-15 | International Business Machines Corporation | Storing data in a dispersed storage network |
US8625635B2 (en) | 2010-04-26 | 2014-01-07 | Cleversafe, Inc. | Dispersed storage network frame protocol header |
US9606858B2 (en) | 2010-04-26 | 2017-03-28 | International Business Machines Corporation | Temporarily storing an encoded data slice |
US11740972B1 (en) | 2010-05-19 | 2023-08-29 | Pure Storage, Inc. | Migrating data in a vast storage network |
US10193689B2 (en) | 2010-05-19 | 2019-01-29 | International Business Machines Corporation | Storing access information in a dispersed storage network |
US8874868B2 (en) | 2010-05-19 | 2014-10-28 | Cleversafe, Inc. | Memory utilization balancing in a dispersed storage network |
US8521697B2 (en) | 2010-05-19 | 2013-08-27 | Cleversafe, Inc. | Rebuilding data in multiple dispersed storage networks |
US8621580B2 (en) | 2010-05-19 | 2013-12-31 | Cleversafe, Inc. | Retrieving access information in a dispersed storage network |
US10911230B2 (en) | 2010-05-19 | 2021-02-02 | Pure Storage, Inc. | Securely activating functionality of a computing device in a dispersed storage network |
US10353774B2 (en) | 2015-10-30 | 2019-07-16 | International Business Machines Corporation | Utilizing storage unit latency data in a dispersed storage network |
US8909858B2 (en) | 2010-06-09 | 2014-12-09 | Cleversafe, Inc. | Storing encoded data slices in a dispersed storage network |
US8782227B2 (en) | 2010-06-22 | 2014-07-15 | Cleversafe, Inc. | Identifying and correcting an undesired condition of a dispersed storage network access request |
US8612831B2 (en) | 2010-06-22 | 2013-12-17 | Cleversafe, Inc. | Accessing data stored in a dispersed storage memory |
US10162524B2 (en) | 2010-08-02 | 2018-12-25 | International Business Machines Corporation | Determining whether to compress a data segment in a dispersed storage network |
US9077734B2 (en) | 2010-08-02 | 2015-07-07 | Cleversafe, Inc. | Authentication of devices of a dispersed storage network |
US20190095101A1 (en) | 2010-08-02 | 2019-03-28 | International Business Machines Corporation | Authenticating a credential in a dispersed storage network |
US9063968B2 (en) | 2010-08-02 | 2015-06-23 | Cleversafe, Inc. | Identifying a compromised encoded data slice |
US9940195B2 (en) | 2010-08-25 | 2018-04-10 | International Business Machines Corporation | Encryption of slice partials |
US10255135B2 (en) | 2010-08-25 | 2019-04-09 | International Business Machines Corporation | Method and apparatus for non-interactive information dispersal |
US10157002B2 (en) | 2010-08-26 | 2018-12-18 | International Business Machines Corporation | Migrating an encoded data slice based on an end-of-life memory level of a memory device |
US9843412B2 (en) | 2010-10-06 | 2017-12-12 | International Business Machines Corporation | Optimizing routing of data across a communications network |
US9037937B2 (en) | 2010-10-06 | 2015-05-19 | Cleversafe, Inc. | Relaying data transmitted as encoded data slices |
US9116831B2 (en) | 2010-10-06 | 2015-08-25 | Cleversafe, Inc. | Correcting an errant encoded data slice |
US9571230B2 (en) | 2010-10-06 | 2017-02-14 | International Business Machines Corporation | Adjusting routing of data within a network path |
US10970168B2 (en) | 2010-10-06 | 2021-04-06 | Pure Storage, Inc. | Adjusting dispersed storage error encoding parameters based on path performance |
US10298957B2 (en) | 2010-10-06 | 2019-05-21 | International Business Machines Corporation | Content-based encoding in a multiple routing path communications system |
US10289318B2 (en) | 2010-11-01 | 2019-05-14 | International Business Machines Corporation | Adjusting optimistic writes in a dispersed storage network |
US10082970B2 (en) | 2010-11-01 | 2018-09-25 | International Business Machines Corporation | Storing an effective dynamic width of encoded data slices |
US8707105B2 (en) | 2010-11-01 | 2014-04-22 | Cleversafe, Inc. | Updating a set of memory devices in a dispersed storage network |
US10805042B2 (en) | 2010-11-01 | 2020-10-13 | Pure Storage, Inc. | Creating transmission data slices for use in a dispersed storage network |
US10768833B2 (en) | 2010-11-01 | 2020-09-08 | Pure Storage, Inc. | Object dispersal load balancing |
US9015499B2 (en) | 2010-11-01 | 2015-04-21 | Cleversafe, Inc. | Verifying data integrity utilizing dispersed storage |
US10146645B2 (en) | 2010-11-01 | 2018-12-04 | International Business Machines Corporation | Multiple memory format storage in a storage network |
US8627065B2 (en) | 2010-11-09 | 2014-01-07 | Cleversafe, Inc. | Validating a certificate chain in a dispersed storage network |
US11061597B2 (en) | 2010-11-09 | 2021-07-13 | Pure Storage, Inc. | Supporting live migrations and re-balancing with a virtual storage unit |
US9590838B2 (en) | 2010-11-09 | 2017-03-07 | International Business Machines Corporation | Transferring data of a dispersed storage network |
US9454431B2 (en) | 2010-11-29 | 2016-09-27 | International Business Machines Corporation | Memory selection for slice storage in a dispersed storage network |
US11789631B2 (en) | 2010-11-29 | 2023-10-17 | Pure Storage, Inc. | Utilizing metadata storage trees in a vast storage network |
US11307930B1 (en) | 2010-11-29 | 2022-04-19 | Pure Storage, Inc. | Optimized selection of participants in distributed data rebuild/verification |
US10802763B2 (en) | 2010-11-29 | 2020-10-13 | Pure Storage, Inc. | Remote storage verification |
US9336139B2 (en) | 2010-11-29 | 2016-05-10 | Cleversafe, Inc. | Selecting a memory for storage of an encoded data slice in a dispersed storage network |
US10922179B2 (en) | 2010-11-29 | 2021-02-16 | Pure Storage, Inc. | Post rebuild verification |
US10372350B2 (en) | 2010-11-29 | 2019-08-06 | Pure Storage, Inc. | Shared ownership of namespace ranges |
US9170882B2 (en) | 2010-12-22 | 2015-10-27 | Cleversafe, Inc. | Retrieving data segments from a dispersed storage network |
US8683231B2 (en) | 2010-12-27 | 2014-03-25 | Cleversafe, Inc. | Obfuscating data stored in a dispersed storage network |
US8688949B2 (en) | 2011-02-01 | 2014-04-01 | Cleversafe, Inc. | Modifying data storage in response to detection of a memory system imbalance |
US9081714B2 (en) | 2011-02-01 | 2015-07-14 | Cleversafe, Inc. | Utilizing a dispersed storage network access token module to store data in a dispersed storage network memory |
US8868695B2 (en) | 2011-03-02 | 2014-10-21 | Cleversafe, Inc. | Configuring a generic computing device utilizing specific computing device operation information |
US9183073B2 (en) | 2011-03-02 | 2015-11-10 | Cleversafe, Inc. | Maintaining data concurrency with a dispersed storage network |
US8880978B2 (en) * | 2011-04-01 | 2014-11-04 | Cleversafe, Inc. | Utilizing a local area network memory and a dispersed storage network memory to access data |
US11418580B2 (en) | 2011-04-01 | 2022-08-16 | Pure Storage, Inc. | Selective generation of secure signatures in a distributed storage network |
US10298684B2 (en) | 2011-04-01 | 2019-05-21 | International Business Machines Corporation | Adaptive replication of dispersed data to improve data access performance |
US8627091B2 (en) | 2011-04-01 | 2014-01-07 | Cleversafe, Inc. | Generating a secure signature utilizing a plurality of key shares |
US9219604B2 (en) | 2011-05-09 | 2015-12-22 | Cleversafe, Inc. | Generating an encrypted message for storage |
US8707393B2 (en) | 2011-05-09 | 2014-04-22 | Cleversafe, Inc. | Providing dispersed storage network location information of a hypertext markup language file |
US20170192684A1 (en) | 2011-05-09 | 2017-07-06 | International Business Machines Corporation | Auditing a transaction in a dispersed storage network |
US9298550B2 (en) | 2011-05-09 | 2016-03-29 | Cleversafe, Inc. | Assigning a dispersed storage network address range in a maintenance free storage container |
US9141458B2 (en) | 2011-05-09 | 2015-09-22 | Cleversafe, Inc. | Adjusting a data storage address mapping in a maintenance free storage container |
US8656253B2 (en) | 2011-06-06 | 2014-02-18 | Cleversafe, Inc. | Storing portions of data in a dispersed storage network |
US10042709B2 (en) | 2011-06-06 | 2018-08-07 | International Business Machines Corporation | Rebuild prioritization during a plurality of concurrent data object write operations |
US10061650B2 (en) | 2011-06-06 | 2018-08-28 | International Business Machines Corporation | Priority based rebuilding |
US8756480B2 (en) | 2011-06-06 | 2014-06-17 | Cleversafe, Inc. | Prioritized deleting of slices stored in a dispersed storage network |
US10949301B2 (en) | 2011-06-06 | 2021-03-16 | Pure Storage, Inc. | Pre-positioning pre-stored content in a content distribution system |
US8688635B2 (en) * | 2011-07-01 | 2014-04-01 | International Business Machines Corporation | Data set connection manager having a plurality of data sets to represent one data set |
US9244770B2 (en) | 2011-07-06 | 2016-01-26 | International Business Machines Corporation | Responding to a maintenance free storage container security threat |
US9460148B2 (en) | 2011-07-06 | 2016-10-04 | International Business Machines Corporation | Completing distribution of multi-media content to an accessing device |
US20230176790A1 (en) * | 2011-07-27 | 2023-06-08 | Pure Storage, Inc. | Error Prediction Based on Correlation Using Event Records |
US10678619B2 (en) | 2011-07-27 | 2020-06-09 | Pure Storage, Inc. | Unified logs and device statistics |
US11016702B2 (en) | 2011-07-27 | 2021-05-25 | Pure Storage, Inc. | Hierarchical event tree |
US8914667B2 (en) | 2011-07-27 | 2014-12-16 | Cleversafe, Inc. | Identifying a slice error in a dispersed storage network |
US10454678B2 (en) | 2011-08-17 | 2019-10-22 | Pure Storage, Inc. | Accessor-based audit trails |
US9229823B2 (en) | 2011-08-17 | 2016-01-05 | International Business Machines Corporation | Storage and retrieval of dispersed storage network access information |
US10120756B2 (en) | 2011-08-17 | 2018-11-06 | International Business Machines Corporation | Audit object generation in a dispersed storage network |
US9971802B2 (en) | 2011-08-17 | 2018-05-15 | International Business Machines Corporation | Audit record transformation in a dispersed storage network |
US8930649B2 (en) | 2011-09-06 | 2015-01-06 | Cleversafe, Inc. | Concurrent coding of data streams |
US11907060B2 (en) | 2011-09-06 | 2024-02-20 | Pure Storage, Inc. | Coding of data streams in a vast storage network |
US20190179696A1 (en) | 2011-09-06 | 2019-06-13 | International Business Machines Corporation | Demultiplexing decoded data streams in a distributed storage network |
US10235237B2 (en) | 2011-09-06 | 2019-03-19 | International Business Machines Corporation | Decoding data streams in a distributed storage network |
US8856617B2 (en) | 2011-10-04 | 2014-10-07 | Cleversafe, Inc. | Sending a zero information gain formatted encoded data slice |
US9785491B2 (en) | 2011-10-04 | 2017-10-10 | International Business Machines Corporation | Processing a certificate signing request in a dispersed storage network |
US8555130B2 (en) | 2011-10-04 | 2013-10-08 | Cleversafe, Inc. | Storing encoded data slices in a dispersed storage unit |
US9798616B2 (en) | 2011-11-01 | 2017-10-24 | International Business Machines Corporation | Wireless sending a set of encoded data slices |
US10496500B2 (en) | 2011-11-01 | 2019-12-03 | Pure Storage, Inc. | Preemptively reading extra encoded data slices |
US11329830B1 (en) | 2011-11-01 | 2022-05-10 | Pure Storage, Inc. | Dispersed credentials |
US10365969B2 (en) | 2011-11-01 | 2019-07-30 | International Business Machines Corporation | Multiple wireless communication systems stream slices based on geography |
US10437678B2 (en) | 2011-11-01 | 2019-10-08 | Pure Storage, Inc. | Updating an encoded data slice |
US8683286B2 (en) | 2011-11-01 | 2014-03-25 | Cleversafe, Inc. | Storing data in a dispersed storage network |
US8627066B2 (en) | 2011-11-03 | 2014-01-07 | Cleversafe, Inc. | Processing a dispersed storage network access request utilizing certificate chain validation information |
US10558592B2 (en) | 2011-11-28 | 2020-02-11 | Pure Storage, Inc. | Priority level adaptation in a dispersed storage network |
US11474958B1 (en) | 2011-11-28 | 2022-10-18 | Pure Storage, Inc. | Generating and queuing system messages with priorities in a storage network |
US10318445B2 (en) | 2011-11-28 | 2019-06-11 | International Business Machines Corporation | Priority level adaptation in a dispersed storage network |
US10977194B2 (en) * | 2011-11-28 | 2021-04-13 | Pure Storage, Inc. | Securely storing random keys in a dispersed storage network |
US10387071B2 (en) | 2011-11-28 | 2019-08-20 | Pure Storage, Inc. | On-the-fly cancellation of unnecessary read requests |
US9584326B2 (en) | 2011-11-28 | 2017-02-28 | International Business Machines Corporation | Creating a new file for a dispersed storage network |
US10055283B2 (en) | 2011-11-28 | 2018-08-21 | International Business Machines Corporation | Securely distributing random keys in a dispersed storage network |
US8848906B2 (en) | 2011-11-28 | 2014-09-30 | Cleversafe, Inc. | Encrypting data for storage in a dispersed storage network |
US9009567B2 (en) | 2011-12-12 | 2015-04-14 | Cleversafe, Inc. | Encrypting distributed computing data |
US10146621B2 (en) | 2011-12-12 | 2018-12-04 | International Business Machines Corporation | Chaining computes in a distributed computing system |
US9430286B2 (en) | 2011-12-12 | 2016-08-30 | International Business Machines Corporation | Authorizing distributed task processing in a distributed storage network |
US9584359B2 (en) | 2011-12-12 | 2017-02-28 | International Business Machines Corporation | Distributed storage and computing of interim data |
US10176045B2 (en) | 2011-12-12 | 2019-01-08 | International Business Machines Corporation | Internet based shared memory in a distributed computing system |
US9141468B2 (en) | 2011-12-12 | 2015-09-22 | Cleversafe, Inc. | Managing memory utilization in a distributed storage and task network |
US10666596B2 (en) | 2011-12-12 | 2020-05-26 | Pure Storage, Inc. | Messaging via a shared memory of a distributed computing system |
US20130238900A1 (en) | 2011-12-12 | 2013-09-12 | Cleversafe, Inc. | Dispersed storage network secure hierarchical file directory |
US9817701B2 (en) | 2011-12-12 | 2017-11-14 | International Business Machines Corporation | Threshold computing in a distributed computing system |
US10348640B2 (en) | 2011-12-12 | 2019-07-09 | International Business Machines Corporation | Partial task execution in a dispersed storage network |
US9674155B2 (en) | 2011-12-12 | 2017-06-06 | International Business Machines Corporation | Encrypting segmented data in a distributed computing system |
US9304858B2 (en) | 2011-12-12 | 2016-04-05 | International Business Machines Corporation | Analyzing found data in a distributed storage and task network |
US10104168B2 (en) | 2011-12-12 | 2018-10-16 | International Business Machines Corporation | Method for managing throughput in a distributed storage network |
US10346218B2 (en) | 2011-12-12 | 2019-07-09 | International Business Machines Corporation | Partial task allocation in a dispersed storage network |
US20180083930A1 (en) | 2011-12-12 | 2018-03-22 | International Business Machines Corporation | Reads for dispersed computation jobs |
US10360106B2 (en) | 2011-12-12 | 2019-07-23 | International Business Machines Corporation | Throttled real-time writes |
US9514132B2 (en) | 2012-01-31 | 2016-12-06 | International Business Machines Corporation | Secure data migration in a dispersed storage network |
US10671585B2 (en) | 2012-01-31 | 2020-06-02 | Pure Storage, Inc. | Storing indexed data to a dispersed storage network |
US9146810B2 (en) | 2012-01-31 | 2015-09-29 | Cleversafe, Inc. | Identifying a potentially compromised encoded data slice |
US10140177B2 (en) | 2012-01-31 | 2018-11-27 | International Business Machines Corporation | Transferring a partial task in a distributed computing system |
US9465861B2 (en) | 2012-01-31 | 2016-10-11 | International Business Machines Corporation | Retrieving indexed data from a dispersed storage network |
US9891995B2 (en) | 2012-01-31 | 2018-02-13 | International Business Machines Corporation | Cooperative decentralized rebuild scanning |
US8935256B2 (en) | 2012-03-02 | 2015-01-13 | Cleversafe, Inc. | Expanding a hierarchical dispersed storage index |
US10402393B2 (en) | 2012-03-02 | 2019-09-03 | Pure Storage, Inc. | Slice migration in a dispersed storage network |
US10157051B2 (en) | 2012-03-02 | 2018-12-18 | International Business Machines Corporation | Upgrading devices in a dispersed storage network |
US9588994B2 (en) | 2012-03-02 | 2017-03-07 | International Business Machines Corporation | Transferring task execution in a distributed storage and task network |
US11232093B2 (en) | 2012-03-02 | 2022-01-25 | Pure Storage, Inc. | Slice migration in a dispersed storage network |
US10795766B2 (en) | 2012-04-25 | 2020-10-06 | Pure Storage, Inc. | Mapping slice groupings in a dispersed storage network |
US10621044B2 (en) | 2012-04-25 | 2020-04-14 | Pure Storage, Inc. | Mapping slice groupings in a dispersed storage network |
US9380032B2 (en) | 2012-04-25 | 2016-06-28 | International Business Machines Corporation | Encrypting data for storage in a dispersed storage network |
US10002047B2 (en) | 2012-06-05 | 2018-06-19 | International Business Machines Corporation | Read-if-not-revision-equals protocol message |
US10447471B2 (en) | 2012-06-05 | 2019-10-15 | Pure Storage, Inc. | Systematic secret sharing |
US10073638B2 (en) | 2012-06-05 | 2018-09-11 | International Business Machines Corporation | Automatic namespace ordering determination |
US10474395B2 (en) | 2012-06-05 | 2019-11-12 | Pure Storage, Inc. | Abstracting namespace mapping in a dispersed storage network through multiple hierarchies |
US9613052B2 (en) | 2012-06-05 | 2017-04-04 | International Business Machines Corporation | Establishing trust within a cloud computing system |
US20180336097A1 (en) | 2012-06-25 | 2018-11-22 | International Business Machines Corporation | Namespace affinity and failover for processing units in a dispersed storage network |
US10157011B2 (en) | 2012-06-25 | 2018-12-18 | International Business Machines Corporation | Temporary suspension of vault access |
US10120574B2 (en) | 2012-06-25 | 2018-11-06 | International Business Machines Corporation | Reversible data modifications within DS units |
US9141297B2 (en) | 2012-06-25 | 2015-09-22 | Cleversafe, Inc. | Verifying encoded data slice integrity in a dispersed storage network |
US9110833B2 (en) | 2012-06-25 | 2015-08-18 | Cleversafe, Inc. | Non-temporarily storing temporarily stored data in a dispersed storage network |
US10430276B2 (en) | 2012-06-25 | 2019-10-01 | Pure Storage, Inc. | Optimal orderings of processing unit priorities in a dispersed storage network |
US10114697B2 (en) | 2012-06-25 | 2018-10-30 | International Business Machines Corporation | Large object parallel writing |
US11093327B1 (en) | 2012-06-25 | 2021-08-17 | Pure Storage, Inc. | Failure abatement approach for failed storage units common to multiple vaults |
US9258177B2 (en) | 2012-08-02 | 2016-02-09 | International Business Machines Corporation | Storing a data stream in a set of storage devices |
US10651975B2 (en) | 2012-08-02 | 2020-05-12 | Pure Storage, Inc. | Forwarding data amongst cooperative DSTN processing units of a massive data ingestion system |
US9154298B2 (en) | 2012-08-31 | 2015-10-06 | Cleversafe, Inc. | Securely storing data in a dispersed storage network |
US10409679B2 (en) | 2012-08-31 | 2019-09-10 | Pure Storage, Inc. | Migrating data slices in a dispersed storage network |
US10409678B2 (en) | 2012-08-31 | 2019-09-10 | Pure Storage, Inc. | Self-optimizing read-ahead |
US10241863B2 (en) | 2012-08-31 | 2019-03-26 | International Business Machines Corporation | Slice rebuilding in a dispersed storage network |
US9875158B2 (en) | 2012-08-31 | 2018-01-23 | International Business Machines Corporation | Slice storage in a dispersed storage network |
US10331518B2 (en) | 2012-08-31 | 2019-06-25 | International Business Machines Corporation | Encoding data in a dispersed storage network |
US11360851B2 (en) | 2012-08-31 | 2022-06-14 | Pure Storage, Inc. | Duplicating authentication information between connections |
US10402423B2 (en) | 2012-09-13 | 2019-09-03 | Pure Storage, Inc. | Sliding windows for batching index updates |
US10057351B2 (en) | 2012-09-13 | 2018-08-21 | International Business Machines Corporation | Modifying information dispersal algorithm configurations in a dispersed storage network |
US9483539B2 (en) | 2012-09-13 | 2016-11-01 | International Business Machines Corporation | Updating local data utilizing a distributed storage network |
US10331698B2 (en) | 2012-09-13 | 2019-06-25 | International Business Machines Corporation | Rebuilding data in a dispersed storage network |
US10417253B2 (en) | 2012-09-13 | 2019-09-17 | Pure Storage, Inc. | Multi-level data storage in a dispersed storage network |
US10318549B2 (en) | 2012-09-13 | 2019-06-11 | International Business Machines Corporation | Batching modifications to nodes in a dispersed index |
US10606700B2 (en) | 2012-10-08 | 2020-03-31 | Pure Storage, Inc. | Enhanced dispersed storage error encoding using multiple encoding layers |
US10331519B2 (en) | 2012-10-08 | 2019-06-25 | International Business Machines Corporation | Application of secret sharing schemes at multiple levels of a dispersed storage network |
US10127111B2 (en) | 2012-10-08 | 2018-11-13 | International Business Machines Corporation | Client provided request prioritization hints |
US9503513B2 (en) | 2012-10-08 | 2016-11-22 | International Business Machines Corporation | Robust transmission of data utilizing encoded data slices |
US9936020B2 (en) | 2012-10-30 | 2018-04-03 | International Business Machines Corporation | Access control of data in a dispersed storage network |
US9311179B2 (en) | 2012-10-30 | 2016-04-12 | Cleversafe, Inc. | Threshold decoding of data based on trust levels |
US10587691B2 (en) | 2012-12-05 | 2020-03-10 | Pure Storage, Inc. | Impatient writes |
US10558621B2 (en) | 2012-12-05 | 2020-02-11 | Pure Storage, Inc. | Lock stealing writes for improved reliability |
US9811533B2 (en) | 2012-12-05 | 2017-11-07 | International Business Machines Corporation | Accessing distributed computing functions in a distributed computing system |
US9521197B2 (en) | 2012-12-05 | 2016-12-13 | International Business Machines Corporation | Utilizing data object storage tracking in a dispersed storage network |
US10642992B2 (en) | 2013-01-04 | 2020-05-05 | Pure Storage, Inc. | Password augmented all-or-nothing transform |
US20190250823A1 (en) | 2013-01-04 | 2019-08-15 | International Business Machines Corporation | Efficient computation of only the required slices |
US10229002B2 (en) | 2013-01-04 | 2019-03-12 | International Business Machines Corporation | Process to migrate named objects to a dispersed or distributed storage network (DSN) |
US10204009B2 (en) | 2013-01-04 | 2019-02-12 | International Business Machines Corporation | Prioritized rebuilds using dispersed indices |
US9311187B2 (en) | 2013-01-04 | 2016-04-12 | Cleversafe, Inc. | Achieving storage compliance in a dispersed storage network |
US10013203B2 (en) | 2013-01-04 | 2018-07-03 | International Business Machines Corporation | Achieving storage compliance in a dispersed storage network |
US10241866B2 (en) | 2013-01-04 | 2019-03-26 | International Business Machines Corporation | Allocating rebuilding queue entries in a dispersed storage network |
US10423491B2 (en) | 2013-01-04 | 2019-09-24 | Pure Storage, Inc. | Preventing multiple round trips when writing to target widths |
US11416340B1 (en) | 2013-01-04 | 2022-08-16 | Pure Storage, Inc. | Storage system with multiple storage types in a vast storage network |
US10402270B2 (en) | 2013-01-04 | 2019-09-03 | Pure Storage, Inc. | Deterministically determining affinity for a source name range |
US9558067B2 (en) | 2013-01-04 | 2017-01-31 | International Business Machines Corporation | Mapping storage of data in a dispersed storage network |
US10268554B2 (en) | 2013-02-05 | 2019-04-23 | International Business Machines Corporation | Using dispersed computation to change dispersal characteristics |
US9043499B2 (en) | 2013-02-05 | 2015-05-26 | Cleversafe, Inc. | Modifying a dispersed storage network memory data access response plan |
US10055441B2 (en) | 2013-02-05 | 2018-08-21 | International Business Machines Corporation | Updating shared group information in a dispersed storage network |
US10664360B2 (en) | 2013-02-05 | 2020-05-26 | Pure Storage, Inc. | Identifying additional resources to accelerate rebuilding |
US10621021B2 (en) | 2013-02-05 | 2020-04-14 | Pure Storage, Inc. | Using dispersed data structures to point to slice or data source replicas |
US10430122B2 (en) | 2013-02-05 | 2019-10-01 | Pure Storage, Inc. | Using partial rebuilding to change information dispersal algorithm (IDA) |
US10310763B2 (en) | 2013-02-05 | 2019-06-04 | International Business Machines Corporation | Forming a distributed storage network memory without namespace aware distributed storage units |
US9274908B2 (en) | 2013-02-26 | 2016-03-01 | International Business Machines Corporation | Resolving write conflicts in a dispersed storage network |
US11036392B2 (en) | 2013-02-26 | 2021-06-15 | Pure Storage, Inc. | Determining when to use convergent encryption |
US10642489B2 (en) | 2013-02-26 | 2020-05-05 | Pure Storage, Inc. | Determining when to initiate an intra-distributed storage unit rebuild vs. an inter-distributed storage unit rebuild |
US10075523B2 (en) | 2013-04-01 | 2018-09-11 | International Business Machines Corporation | Efficient storage of data in a dispersed storage network |
US9456035B2 (en) | 2013-05-03 | 2016-09-27 | International Business Machines Corporation | Storing related data in a dispersed storage network |
US10223213B2 (en) | 2013-05-03 | 2019-03-05 | International Business Machines Corporation | Salted zero expansion all or nothing transformation |
US9405609B2 (en) | 2013-05-22 | 2016-08-02 | International Business Machines Corporation | Storing data in accordance with a performance threshold |
US9424132B2 (en) | 2013-05-30 | 2016-08-23 | International Business Machines Corporation | Adjusting dispersed storage network traffic due to rebuilding |
US9432341B2 (en) | 2013-05-30 | 2016-08-30 | International Business Machines Corporation | Securing data in a dispersed storage network |
US11226860B1 (en) | 2013-05-30 | 2022-01-18 | Pure Storage, Inc. | Difference based rebuild list scanning |
US11221916B2 (en) | 2013-07-01 | 2022-01-11 | Pure Storage, Inc. | Prioritized data reconstruction in a dispersed storage network |
US10133635B2 (en) | 2013-07-01 | 2018-11-20 | International Business Machines Corporation | Low-width vault in distributed storage system |
US9501360B2 (en) | 2013-07-01 | 2016-11-22 | International Business Machines Corporation | Rebuilding data while reading data in a dispersed storage network |
US9652470B2 (en) | 2013-07-01 | 2017-05-16 | International Business Machines Corporation | Storing data in a dispersed storage network |
US10169369B2 (en) | 2013-07-01 | 2019-01-01 | International Business Machines Corporation | Meeting storage requirements with limited storage resources |
US9848044B2 (en) | 2013-07-31 | 2017-12-19 | International Business Machines Corporation | Distributed storage network with coordinated partial task execution and methods for use therewith |
US20180188964A1 (en) | 2013-07-31 | 2018-07-05 | International Business Machines Corporation | Managed storage unit shutdown in a distributed storage network |
US10180880B2 (en) | 2013-07-31 | 2019-01-15 | International Business Machines Corporation | Adaptive rebuilding rates based on sampling and inference |
US20150039660A1 (en) | 2013-07-31 | 2015-02-05 | Cleversafe, Inc. | Co-locate objects request |
US10681134B2 (en) | 2013-07-31 | 2020-06-09 | Pure Storage, Inc. | Accelerated learning in adaptive rebuilding by applying observations to other samples |
US10514857B2 (en) | 2013-08-29 | 2019-12-24 | Pure Storage, Inc. | Dynamic adjusting of parameters based on resource scoring |
US10489071B2 (en) | 2013-08-29 | 2019-11-26 | Pure Storage, Inc. | Vault provisioning within dispersed or distributed storage network (DSN) |
US10484474B2 (en) | 2013-08-29 | 2019-11-19 | Pure Storage, Inc. | Rotating offline DS units |
US9438675B2 (en) | 2013-08-29 | 2016-09-06 | International Business Machines Corporation | Dispersed storage with variable slice length and methods for use therewith |
US10601918B2 (en) | 2013-08-29 | 2020-03-24 | Pure Storage, Inc. | Rotating inactive storage units in a distributed storage network |
US9661074B2 (en) * | 2013-08-29 | 2017-05-23 | International Business Machines Corporation | Updating de-duplication tracking data for a dispersed storage network |
US9857974B2 (en) | 2013-10-03 | 2018-01-02 | International Business Machines Corporation | Session execution decision |
US10304096B2 (en) | 2013-11-01 | 2019-05-28 | International Business Machines Corporation | Renting a pipe to a storage system |
US10182115B2 (en) | 2013-11-01 | 2019-01-15 | International Business Machines Corporation | Changing rebuild priority for a class of data |
US9781208B2 (en) | 2013-11-01 | 2017-10-03 | International Business Machines Corporation | Obtaining dispersed storage network system registry information |
US9900316B2 (en) | 2013-12-04 | 2018-02-20 | International Business Machines Corporation | Accessing storage units of a dispersed storage network |
US10922181B2 (en) | 2014-01-06 | 2021-02-16 | Pure Storage, Inc. | Using storage locations greater than an IDA width in a dispersed storage network |
US11340993B2 (en) | 2014-01-06 | 2022-05-24 | Pure Storage, Inc. | Deferred rebuilding with alternate storage locations |
US9594639B2 (en) | 2014-01-06 | 2017-03-14 | International Business Machines Corporation | Configuring storage resources of a dispersed storage network |
US11204836B1 (en) | 2014-01-31 | 2021-12-21 | Pure Storage, Inc. | Using trap slices for anomaly detection in a distributed storage network |
US9778987B2 (en) | 2014-01-31 | 2017-10-03 | International Business Machines Corporation | Writing encoded data slices in a dispersed storage network |
US9552261B2 (en) | 2014-01-31 | 2017-01-24 | International Business Machines Corporation | Recovering data from microslices in a dispersed storage network |
US10318382B2 (en) | 2014-01-31 | 2019-06-11 | International Business Machines Corporation | Determining missing encoded data slices |
US10678638B2 (en) | 2014-02-26 | 2020-06-09 | Pure Storage, Inc. | Resolving write conflicts in a dispersed storage network |
US9529834B2 (en) | 2014-02-26 | 2016-12-27 | International Business Machines Corporation | Concatenating data objects for storage in a dispersed storage network |
US10769016B2 (en) | 2014-02-26 | 2020-09-08 | Pure Storage, Inc. | Storing a plurality of correlated data in a dispersed storage network |
US9665429B2 (en) | 2014-02-26 | 2017-05-30 | International Business Machines Corporation | Storage of data with verification in a dispersed storage network |
US10592109B2 (en) | 2014-02-26 | 2020-03-17 | Pure Storage, Inc. | Selecting storage resources in a dispersed storage network |
US10140182B2 (en) | 2014-02-26 | 2018-11-27 | International Business Machines Corporation | Modifying allocation of storage resources in a dispersed storage network |
US10635312B2 (en) | 2014-02-26 | 2020-04-28 | Pure Storage, Inc. | Recovering data in a dispersed storage network |
US10020826B2 (en) | 2014-04-02 | 2018-07-10 | International Business Machines Corporation | Generating molecular encoding information for data storage |
US10761917B2 (en) | 2014-04-02 | 2020-09-01 | Pure Storage, Inc. | Using global namespace addressing in a dispersed storage network |
US11347590B1 (en) | 2014-04-02 | 2022-05-31 | Pure Storage, Inc. | Rebuilding data in a distributed storage network |
US10015152B2 (en) | 2014-04-02 | 2018-07-03 | International Business Machines Corporation | Securing data in a dispersed storage network |
US20190087599A1 (en) | 2014-04-02 | 2019-03-21 | International Business Machines Corporation | Compressing a slice name listing in a dispersed storage network |
US10681138B2 (en) | 2014-04-02 | 2020-06-09 | Pure Storage, Inc. | Storing and retrieving multi-format content in a distributed storage network |
US10628245B2 (en) | 2014-04-02 | 2020-04-21 | Pure Storage, Inc. | Monitoring of storage units in a dispersed storage network |
US20150288680A1 (en) | 2014-04-02 | 2015-10-08 | Cleversafe, Inc. | Distributing registry information in a dispersed storage network |
US10802732B2 (en) | 2014-04-30 | 2020-10-13 | Pure Storage, Inc. | Multi-level stage locality selection on a large system |
US9612882B2 (en) | 2014-04-30 | 2017-04-04 | International Business Machines Corporation | Retrieving multi-generational stored data in a dispersed storage network |
US10296263B2 (en) | 2014-04-30 | 2019-05-21 | International Business Machines Corporation | Dispersed bloom filter for determining presence of an object |
US9735967B2 (en) | 2014-04-30 | 2017-08-15 | International Business Machines Corporation | Self-validating request message structure and operation |
US10394476B2 (en) | 2014-04-30 | 2019-08-27 | Pure Storage, Inc. | Multi-level stage locality selection on a large system |
US10095872B2 (en) | 2014-06-05 | 2018-10-09 | International Business Machines Corporation | Accessing data based on a dispersed storage network rebuilding issue |
US10509577B2 (en) | 2014-06-05 | 2019-12-17 | Pure Storage, Inc. | Reliable storage in a dispersed storage network |
US10140178B2 (en) | 2014-06-05 | 2018-11-27 | International Business Machines Corporation | Verifying a status level of stored encoded data slices |
US9690520B2 (en) | 2014-06-30 | 2017-06-27 | International Business Machines Corporation | Recovering an encoded data slice in a dispersed storage network |
US10042564B2 (en) | 2014-06-30 | 2018-08-07 | International Business Machines Corporation | Accessing data while migrating storage of the data |
US10459797B2 (en) | 2014-06-30 | 2019-10-29 | Pure Storage, Inc. | Making trade-offs between rebuild scanning and failing memory device flexibility |
US10673946B2 (en) | 2014-06-30 | 2020-06-02 | Pure Storage, Inc. | Using separate weighting scores for different types of data in a decentralized agreement protocol |
US10447612B2 (en) | 2014-06-30 | 2019-10-15 | Pure Storage, Inc. | Migrating encoded data slices in a dispersed storage network |
US11606431B2 (en) | 2014-06-30 | 2023-03-14 | Pure Storage, Inc. | Maintaining failure independence for storage of a set of encoded data slices |
US11398988B1 (en) | 2014-06-30 | 2022-07-26 | Pure Storage, Inc. | Selection of access resources in a distributed storage network |
US11099763B1 (en) | 2014-06-30 | 2021-08-24 | Pure Storage, Inc. | Migrating generational storage to a decentralized agreement protocol paradigm |
US10440105B2 (en) | 2014-06-30 | 2019-10-08 | Pure Storage, Inc. | Using a decentralized agreement protocol to rank storage locations for target width |
US9841925B2 (en) | 2014-06-30 | 2017-12-12 | International Business Machines Corporation | Adjusting timing of storing data in a dispersed storage network |
US9838478B2 (en) | 2014-06-30 | 2017-12-05 | International Business Machines Corporation | Identifying a task execution resource of a dispersed storage network |
US11728964B2 (en) | 2014-07-31 | 2023-08-15 | Pure Storage, Inc. | Performance aided data migration in a distributed storage network |
US10613936B2 (en) | 2014-07-31 | 2020-04-07 | Pure Storage, Inc. | Fractional slices in a distributed storage system |
US10089036B2 (en) | 2014-07-31 | 2018-10-02 | International Business Machines Corporation | Migrating data in a distributed storage network |
US10644874B2 (en) | 2014-07-31 | 2020-05-05 | Pure Storage, Inc. | Limiting brute force attacks against dispersed credentials in a distributed storage system |
US10049120B2 (en) | 2014-09-05 | 2018-08-14 | International Business Machines Corporation | Consistency based access of data in a dispersed storage network |
US11442921B1 (en) | 2014-09-05 | 2022-09-13 | Pure Storage, Inc. | Data access in a dispersed storage network with consistency |
US10402395B2 (en) | 2014-09-05 | 2019-09-03 | Pure Storage, Inc. | Facilitating data consistency in a dispersed storage network |
US10176191B2 (en) | 2014-09-05 | 2019-01-08 | International Business Machines Corporation | Recovering from conflicts that emerge from eventually consistent operations |
US9591076B2 (en) * | 2014-09-08 | 2017-03-07 | International Business Machines Corporation | Maintaining a desired number of storage units |
US10146622B2 (en) | 2014-09-08 | 2018-12-04 | International Business Machines Corporation | Combining deduplication with locality for efficient and fast storage |
US10268545B2 (en) | 2014-09-08 | 2019-04-23 | International Business Machines Corporation | Using reinforcement learning to select a DS processing unit |
US10095582B2 (en) | 2014-10-29 | 2018-10-09 | International Business Machines Corporation | Partial rebuilding techniques in a dispersed storage unit |
US10282135B2 (en) | 2014-10-29 | 2019-05-07 | International Business Machines Corporation | Strong consistency write threshold |
US20180101457A1 (en) | 2014-10-29 | 2018-04-12 | International Business Machines Corporation | Retrying failed write operations in a dispersed storage network |
US10459792B2 (en) | 2014-10-29 | 2019-10-29 | Pure Storage, Inc. | Using an eventually consistent dispersed memory to implement storage tiers |
US10223033B2 (en) | 2014-10-29 | 2019-03-05 | International Business Machines Corporation | Coordinating arrival times of data slices in a dispersed storage network |
US10481833B2 (en) | 2014-10-29 | 2019-11-19 | Pure Storage, Inc. | Transferring data encoding functions in a distributed storage network |
US9916114B2 (en) | 2014-10-29 | 2018-03-13 | International Business Machines Corporation | Deterministically sharing a plurality of processing resources |
US10120739B2 (en) | 2014-12-02 | 2018-11-06 | International Business Machines Corporation | Prioritized data rebuilding in a dispersed storage network |
US10503592B2 (en) | 2014-12-02 | 2019-12-10 | Pure Storage, Inc. | Overcoming bottlenecks in partial and traditional rebuild operations |
US10481832B2 (en) | 2014-12-02 | 2019-11-19 | Pure Storage, Inc. | Applying a probability function to avoid storage operations for already-deleted data |
US10558527B2 (en) | 2014-12-02 | 2020-02-11 | Pure Storage, Inc. | Rebuilding strategy in memory managed multi-site duplication |
US10402271B2 (en) | 2014-12-02 | 2019-09-03 | Pure Storage, Inc. | Overcoming bottlenecks in zero information gain (ZIG) rebuild operations |
US10521298B2 (en) | 2014-12-02 | 2019-12-31 | Pure Storage, Inc. | Temporarily storing dropped and rebuilt slices in a DSN memory |
US9727275B2 (en) | 2014-12-02 | 2017-08-08 | International Business Machines Corporation | Coordinating storage of data in dispersed storage networks |
US10387252B2 (en) | 2014-12-31 | 2019-08-20 | Pure Storage, Inc. | Synchronously storing data in a plurality of dispersed storage networks |
US11604707B2 (en) | 2014-12-31 | 2023-03-14 | Pure Storage, Inc. | Handling failures when synchronizing objects during a write operation |
US10656866B2 (en) | 2014-12-31 | 2020-05-19 | Pure Storage, Inc. | Unidirectional vault synchronization to support tiering |
US10452317B2 (en) | 2014-12-31 | 2019-10-22 | Pure Storage, Inc. | DAP redistribution operation within a dispersed storage network |
US10623495B2 (en) | 2014-12-31 | 2020-04-14 | Pure Storage, Inc. | Keeping synchronized writes from getting out of synch |
US10489247B2 (en) | 2014-12-31 | 2019-11-26 | Pure Storage, Inc. | Generating time-ordered globally unique revision numbers |
US9727427B2 (en) | 2014-12-31 | 2017-08-08 | International Business Machines Corporation | Synchronizing storage of data copies in a dispersed storage network |
US10423359B2 (en) | 2014-12-31 | 2019-09-24 | Pure Storage, Inc. | Linking common attributes among a set of synchronized vaults |
US10642687B2 (en) | 2014-12-31 | 2020-05-05 | Pure Storage, Inc. | Pessimistic reads and other smart-read enhancements with synchronized vaults |
US10621042B2 (en) | 2014-12-31 | 2020-04-14 | Pure Storage, Inc. | Vault transformation within a dispersed storage network |
US10126974B2 (en) | 2014-12-31 | 2018-11-13 | International Business Machines Corporation | Redistributing encoded data slices in a dispersed storage network |
US10802915B2 (en) | 2015-01-30 | 2020-10-13 | Pure Storage, Inc. | Time based storage of encoded data slices |
US10440116B2 (en) | 2015-01-30 | 2019-10-08 | Pure Storage, Inc. | Minimizing data movement through rotation of spare memory devices |
US10530862B2 (en) | 2015-01-30 | 2020-01-07 | Pure Storage, Inc. | Determining slices to rebuild from low-level failures |
US10169123B2 (en) | 2015-01-30 | 2019-01-01 | International Business Machines Corporation | Distributed data rebuilding |
US10506045B2 (en) | 2015-01-30 | 2019-12-10 | Pure Storage, Inc. | Memory access using deterministic function and secure seed |
US10498823B2 (en) | 2015-01-30 | 2019-12-03 | Pure Storage, Inc. | Optimally apportioning rebuilding resources |
US10620878B2 (en) | 2015-01-30 | 2020-04-14 | Pure Storage, Inc. | Write threshold plus value in dispersed storage network write operations |
US10592132B2 (en) | 2015-01-30 | 2020-03-17 | Pure Storage, Inc. | Read-foreign-slices request for improved read efficiency with bundled writes |
US10740180B2 (en) | 2015-01-30 | 2020-08-11 | Pure Storage, Inc. | Storing and retrieving data using proxies |
US9826038B2 (en) | 2015-01-30 | 2017-11-21 | International Business Machines Corporation | Selecting a data storage resource of a dispersed storage network |
US10594793B2 (en) | 2015-01-30 | 2020-03-17 | Pure Storage, Inc. | Read-prepare requests to multiple memories |
US9740547B2 (en) | 2015-01-30 | 2017-08-22 | International Business Machines Corporation | Storing data using a dual path storage approach |
US10423490B2 (en) | 2015-01-30 | 2019-09-24 | Pure Storage, Inc. | Read-source requests to support bundled writes in a distributed storage system |
US10511665B2 (en) | 2015-01-30 | 2019-12-17 | Pure Storage, Inc. | Efficient resource reclamation after deletion of slice from common file |
US10498822B2 (en) | 2015-01-30 | 2019-12-03 | Pure Storage, Inc. | Adaptive scanning rates |
US10289342B2 (en) | 2015-01-30 | 2019-05-14 | International Business Machines Corporation | Data access optimization protocol in a dispersed storage network |
US10657000B2 (en) | 2015-02-27 | 2020-05-19 | Pure Storage, Inc. | Optimizing data storage in a dispersed storage network |
US10387067B2 (en) | 2015-02-27 | 2019-08-20 | Pure Storage, Inc. | Optimizing data storage in a dispersed storage network |
US10534668B2 (en) | 2015-02-27 | 2020-01-14 | Pure Storage, Inc. | Accessing data in a dispersed storage network |
US10423502B2 (en) | 2015-02-27 | 2019-09-24 | Pure Storage, Inc. | Stand-by distributed storage units |
US10437676B2 (en) | 2015-02-27 | 2019-10-08 | Pure Storage, Inc. | Urgent reads and using data source health to determine error recovery procedures |
US10069915B2 (en) | 2015-02-27 | 2018-09-04 | International Business Machines Corporation | Storing data in a dispersed storage network |
US11836369B1 (en) | 2015-02-27 | 2023-12-05 | Pure Storage, Inc. | Storing data in an expanded storage pool of a vast storage network |
US10078472B2 (en) | 2015-02-27 | 2018-09-18 | International Business Machines Corporation | Rebuilding encoded data slices in a dispersed storage network |
US10503591B2 (en) | 2015-02-27 | 2019-12-10 | Pure Storage, Inc. | Selecting retrieval locations in a dispersed storage network |
US10440115B2 (en) | 2015-02-27 | 2019-10-08 | Pure Storage, Inc. | Write intent messaging in a dispersed storage network |
US11188665B2 (en) | 2015-02-27 | 2021-11-30 | Pure Storage, Inc. | Using internal sensors to detect adverse interference and take defensive actions |
US10437677B2 (en) | 2015-02-27 | 2019-10-08 | Pure Storage, Inc. | Optimized distributed rebuilding within a dispersed storage network |
US10275185B2 (en) | 2015-02-27 | 2019-04-30 | International Business Machines Corporation | Fail-in-place supported via decentralized or Distributed Agreement Protocol (DAP) |
US10404410B2 (en) | 2015-02-27 | 2019-09-03 | Pure Storage, Inc. | Storage unit (SU) report cards |
US10579451B2 (en) | 2015-02-27 | 2020-03-03 | Pure Storage, Inc. | Pro-actively preparing a dispersed storage network memory for higher-loads |
US10409772B2 (en) | 2015-02-27 | 2019-09-10 | Pure Storage, Inc. | Accessing serially stored data in a dispersed storage network |
US10530861B2 (en) | 2015-02-27 | 2020-01-07 | Pure Storage, Inc. | Utilizing multiple storage pools in a dispersed storage network |
US10528425B2 (en) | 2015-02-27 | 2020-01-07 | Pure Storage, Inc. | Transitioning to an optimized data storage approach in a dispersed storage network |
US10528282B2 (en) | 2015-03-31 | 2020-01-07 | Pure Storage, Inc. | Modifying and utilizing a file structure in a dispersed storage network |
US11055177B2 (en) | 2015-03-31 | 2021-07-06 | Pure Storage, Inc. | Correlating operational information with an error condition in a dispersed storage network |
US10915261B2 (en) | 2015-03-31 | 2021-02-09 | Pure Storage, Inc. | Selecting a set of storage units in a distributed storage network |
US10963180B2 (en) | 2015-03-31 | 2021-03-30 | Pure Storage, Inc. | Adding incremental storage resources in a dispersed storage network |
US10852957B2 (en) | 2015-03-31 | 2020-12-01 | Pure Storage, Inc. | Migration agent employing moveslice request |
US10713374B2 (en) | 2015-03-31 | 2020-07-14 | Pure Storage, Inc. | Resolving detected access anomalies in a dispersed storage network |
US10437515B2 (en) | 2015-03-31 | 2019-10-08 | Pure Storage, Inc. | Selecting storage units in a dispersed storage network |
US10387070B2 (en) | 2015-03-31 | 2019-08-20 | Pure Storage, Inc. | Migrating data in response to adding incremental storage resources in a dispersed storage network |
US10079887B2 (en) | 2015-03-31 | 2018-09-18 | International Business Machines Corporation | Expanding storage capacity of a set of storage units in a distributed storage network |
US10331384B2 (en) | 2015-03-31 | 2019-06-25 | International Business Machines Corporation | Storing data utilizing a maximum accessibility approach in a dispersed storage network |
US10534661B2 (en) | 2015-03-31 | 2020-01-14 | Pure Storage, Inc. | Selecting a storage error abatement alternative in a dispersed storage network |
US10216594B2 (en) | 2015-04-30 | 2019-02-26 | International Business Machines Corporation | Automated stalled process detection and recovery |
US10067998B2 (en) | 2015-04-30 | 2018-09-04 | International Business Machines Corporation | Distributed sync list |
US10168904B2 (en) | 2015-04-30 | 2019-01-01 | International Business Machines Corporation | Quasi-error notifications in a dispersed storage network |
US10037171B2 (en) | 2015-04-30 | 2018-07-31 | International Business Machines Corporation | Accessing common data in a dispersed storage network |
US10055170B2 (en) | 2015-04-30 | 2018-08-21 | International Business Machines Corporation | Scheduling storage unit maintenance tasks in a dispersed storage network |
US10157094B2 (en) | 2015-04-30 | 2018-12-18 | International Business Machines Corporation | Validating system registry files in a dispersed storage network |
US10268376B2 (en) * | 2015-04-30 | 2019-04-23 | International Business Machines Corporation | Automated deployment and assignment of access devices in a dispersed storage network |
US10078561B2 (en) | 2015-04-30 | 2018-09-18 | International Business Machines Corporation | Handling failing memory devices in a dispersed storage network |
US10254992B2 (en) | 2015-04-30 | 2019-04-09 | International Business Machines Corporation | Rebalancing data storage in a dispersed storage network |
US10324657B2 (en) | 2015-05-29 | 2019-06-18 | International Business Machines Corporation | Accounting for data whose rebuilding is deferred |
US10838664B2 (en) | 2015-05-29 | 2020-11-17 | Pure Storage, Inc. | Determining a storage location according to legal requirements |
US10169125B2 (en) | 2015-05-29 | 2019-01-01 | International Business Machines Corporation | Re-encoding data in a dispersed storage network |
US10891058B2 (en) | 2015-05-29 | 2021-01-12 | Pure Storage, Inc. | Encoding slice verification information to support verifiable rebuilding |
US11115221B2 (en) | 2015-05-29 | 2021-09-07 | Pure Storage, Inc. | Verifying a rebuilt encoded data slice using slice verification information |
US10523241B2 (en) | 2015-05-29 | 2019-12-31 | Pure Storage, Inc. | Object fan out write operation |
US10402122B2 (en) | 2015-05-29 | 2019-09-03 | Pure Storage, Inc. | Transferring encoded data slices in a dispersed storage network |
US10430107B2 (en) | 2015-05-29 | 2019-10-01 | Pure Storage, Inc. | Identifying stored data slices during a slice migration activity in a dispersed storage network |
US10789128B2 (en) | 2015-05-29 | 2020-09-29 | Pure Storage, Inc. | External healing mode for a dispersed storage network memory |
US10409522B2 (en) | 2015-05-29 | 2019-09-10 | Pure Storage, Inc. | Reclaiming storage capacity in a dispersed storage network |
US10613798B2 (en) | 2015-05-29 | 2020-04-07 | Pure Storage, Inc. | Slice fanout write request |
US11669546B2 (en) | 2015-06-30 | 2023-06-06 | Pure Storage, Inc. | Synchronizing replicated data in a storage network |
US10437671B2 (en) | 2015-06-30 | 2019-10-08 | Pure Storage, Inc. | Synchronizing replicated stored data |
US10055291B2 (en) | 2015-06-30 | 2018-08-21 | International Business Machines Corporation | Method and system for processing data access requests during data transfers |
US11782789B2 (en) | 2015-07-31 | 2023-10-10 | Pure Storage, Inc. | Encoding data and associated metadata in a storage network |
US20170034184A1 (en) | 2015-07-31 | 2017-02-02 | International Business Machines Corporation | Proxying data access requests |
US10466914B2 (en) | 2015-08-31 | 2019-11-05 | Pure Storage, Inc. | Verifying authorized access in a dispersed storage network |
US10073652B2 (en) | 2015-09-24 | 2018-09-11 | International Business Machines Corporation | Performance optimized storage vaults in a dispersed storage network |
US10169147B2 (en) | 2015-10-30 | 2019-01-01 | International Business Machines Corporation | End-to-end secure data storage in a dispersed storage network |
US10346246B2 (en) | 2015-11-30 | 2019-07-09 | International Business Machines Corporation | Recovering data copies in a dispersed storage network |
US10409514B2 (en) | 2015-11-30 | 2019-09-10 | International Business Machines Corporation | IP multicast message transmission for event notifications |
US20170192688A1 (en) | 2015-12-30 | 2017-07-06 | International Business Machines Corporation | Lazy deletion of vaults in packed slice storage (pss) and zone slice storage (zss) |
US10855759B2 (en) | 2016-01-26 | 2020-12-01 | Pure Storage, Inc. | Utilizing a hierarchical index in a dispersed storage network |
US10089178B2 (en) | 2016-02-29 | 2018-10-02 | International Business Machines Corporation | Developing an accurate dispersed storage network memory performance model through training |
US10387248B2 (en) | 2016-03-29 | 2019-08-20 | International Business Machines Corporation | Allocating data for storage by utilizing a location-based hierarchy in a dispersed storage network |
US10831381B2 (en) | 2016-03-29 | 2020-11-10 | International Business Machines Corporation | Hierarchies of credential and access control sharing between DSN memories |
US10419538B2 (en) | 2016-04-26 | 2019-09-17 | International Business Machines Corporation | Selecting memory for data access in a dispersed storage network |
US10169082B2 (en) | 2016-04-27 | 2019-01-01 | International Business Machines Corporation | Accessing data in accordance with an execution deadline |
US10628399B2 (en) | 2016-04-29 | 2020-04-21 | International Business Machines Corporation | Storing data in a dispersed storage network with consistency |
US10007444B2 (en) | 2016-04-29 | 2018-06-26 | International Business Machines Corporation | Batching access requests in a dispersed storage network |
US10091298B2 (en) | 2016-05-27 | 2018-10-02 | International Business Machines Corporation | Enhancing performance of data storage in a dispersed storage network |
US10122795B2 (en) | 2016-05-31 | 2018-11-06 | International Business Machines Corporation | Consistency level driven data storage in a dispersed storage network |
US10353772B2 (en) | 2016-05-31 | 2019-07-16 | International Business Machines Corporation | Selecting data for storage in a dispersed storage network |
US10027755B2 (en) | 2016-06-01 | 2018-07-17 | International Business Machines Corporation | Selecting storage units in one or more dispersed storage networks |
US10394650B2 (en) | 2016-06-03 | 2019-08-27 | International Business Machines Corporation | Multiple writes using inter-site storage unit relationship |
US10334045B2 (en) | 2016-06-06 | 2019-06-25 | International Business Machines Corporation | Indicating multiple encoding schemes in a dispersed storage network |
US10735545B2 (en) | 2016-06-06 | 2020-08-04 | International Business Machines Corporation | Routing vault access requests in a dispersed storage network |
US10719499B2 (en) | 2016-06-06 | 2020-07-21 | International Business Machines Corporation | Establishing distributed consensus via alternate voting strategies in a dispersed storage network |
US10652350B2 (en) | 2016-06-06 | 2020-05-12 | International Business Machines Corporation | Caching for unique combination reads in a dispersed storage network |
US10007438B2 (en) | 2016-06-25 | 2018-06-26 | International Business Machines Corporation | Method and system for achieving consensus using alternate voting strategies (AVS) with incomplete information |
US10564852B2 (en) | 2016-06-25 | 2020-02-18 | International Business Machines Corporation | Method and system for reducing memory device input/output operations |
US10235085B2 (en) | 2016-06-27 | 2019-03-19 | International Business Machines Corporation | Relocating storage unit data in response to detecting hotspots in a dispersed storage network |
US11115469B2 (en) | 2016-06-28 | 2021-09-07 | International Business Machines Corporation | Efficient updates within a dispersed storage network |
US10157021B2 (en) | 2016-06-29 | 2018-12-18 | International Business Machines Corporation | Processing incomplete data access transactions |
US10025505B2 (en) | 2016-06-29 | 2018-07-17 | International Business Machines Corporation | Accessing data in a dispersed storage network during write operations |
US10387286B2 (en) | 2016-06-30 | 2019-08-20 | International Business Machines Corporation | Managing configuration updates in a dispersed storage network |
US9934092B2 (en) | 2016-07-12 | 2018-04-03 | International Business Machines Corporation | Manipulating a distributed agreement protocol to identify a desired set of storage units |
US10102067B2 (en) | 2016-07-14 | 2018-10-16 | International Business Machines Corporation | Performing a desired manipulation of an encoded data slice based on a metadata restriction and a storage operational condition |
US10534666B2 (en) | 2016-07-14 | 2020-01-14 | International Business Machines Corporation | Determining storage requirements based on licensing right in a dispersed storage network |
US10114696B2 (en) * | 2016-07-14 | 2018-10-30 | International Business Machines Corporation | Tracking data access in a dispersed storage network |
US10360103B2 (en) | 2016-07-18 | 2019-07-23 | International Business Machines Corporation | Focused storage pool expansion to prevent a performance degradation |
US9992063B2 (en) | 2016-07-18 | 2018-06-05 | International Business Machines Corporation | Utilizing reallocation via a decentralized, or distributed, agreement protocol (DAP) for storage unit (SU) replacement |
US10277490B2 (en) | 2016-07-19 | 2019-04-30 | International Business Machines Corporation | Monitoring inter-site bandwidth for rebuilding |
US10769015B2 (en) | 2016-07-19 | 2020-09-08 | International Business Machines Corporation | Throttling access requests at different layers of a DSN memory |
US10554752B2 (en) | 2016-07-20 | 2020-02-04 | International Business Machines Corporation | Efficient transfer of encoded data slice sets to new or alternate storage units |
US10031809B2 (en) | 2016-07-20 | 2018-07-24 | International Business Machines Corporation | Efficient method for rebuilding a set of encoded data slices |
US10459796B2 (en) | 2016-07-20 | 2019-10-29 | International Business Machines Corporation | Prioritizing rebuilding based on a longevity estimate of the rebuilt slice |
US10127112B2 (en) | 2016-07-20 | 2018-11-13 | International Business Machines Corporation | Assigning prioritized rebuild resources optimally |
US10416930B2 (en) | 2016-07-21 | 2019-09-17 | International Business Machines Corporation | Global access permit listing |
US10379744B2 (en) | 2016-07-21 | 2019-08-13 | International Business Machines Corporation | System for collecting end-user feedback and usability metrics |
US10459790B2 (en) | 2016-07-26 | 2019-10-29 | International Business Machines Corporation | Elastic storage in a dispersed storage network |
US10395043B2 (en) | 2016-07-29 | 2019-08-27 | International Business Machines Corporation | Securely storing data in an elastically scalable dispersed storage network |
US10031805B2 (en) | 2016-08-09 | 2018-07-24 | International Business Machines Corporation | Assigning slices to storage locations based on a predicted lifespan |
US10223036B2 (en) | 2016-08-10 | 2019-03-05 | International Business Machines Corporation | Expanding a dispersed storage network (DSN) |
US10129023B2 (en) | 2016-08-11 | 2018-11-13 | International Business Machines Corporation | Enhancing security for multiple storage configurations |
US10348829B2 (en) | 2016-08-15 | 2019-07-09 | International Business Machines Corporation | Auto indexing with customizable metadata |
US10013309B2 (en) | 2016-08-17 | 2018-07-03 | International Business Machines Corporation | Missing slice reconstruction in a dispersed storage network |
US10078468B2 (en) | 2016-08-18 | 2018-09-18 | International Business Machines Corporation | Slice migration in a dispersed storage network |
US10379778B2 (en) | 2016-08-18 | 2019-08-13 | International Business Machines Corporation | Using a master encryption key to sanitize a dispersed storage network memory |
US10389683B2 (en) | 2016-08-26 | 2019-08-20 | International Business Machines Corporation | Securing storage units in a dispersed storage network |
US10581807B2 (en) | 2016-08-29 | 2020-03-03 | International Business Machines Corporation | Using dispersal techniques to securely store cryptographic resources and respond to attacks |
US10379773B2 (en) | 2016-08-29 | 2019-08-13 | International Business Machines Corporation | Storage unit for use in a dispersed storage network |
US10061524B2 (en) | 2016-09-01 | 2018-08-28 | International Business Machines Corporation | Wear-leveling of memory devices |
US10169149B2 (en) | 2016-09-06 | 2019-01-01 | International Business Machines Corporation | Standard and non-standard dispersed storage network data access |
US10387079B2 (en) | 2016-09-09 | 2019-08-20 | International Business Machines Corporation | Placement of dispersed storage data based on requestor properties |
US10225271B2 (en) | 2016-09-09 | 2019-03-05 | International Business Machines Corporation | Distributed storage network with enhanced security monitoring |
US10547615B2 (en) | 2016-09-12 | 2020-01-28 | International Business Machines Corporation | Security response protocol based on security alert encoded data slices of a distributed storage network |
US10558396B2 (en) | 2016-09-14 | 2020-02-11 | International Business Machines Corporation | Pre-caching data according to a current or predicted requester location |
US10558389B2 (en) | 2016-09-20 | 2020-02-11 | International Business Machines Corporation | Per-storage class quality of service (QoS) management within a distributed storage network (DSN) where the DSN stores data using dispersed storage error decoding/encoding |
US10067822B2 (en) | 2016-09-26 | 2018-09-04 | International Business Machines Corporation | Combined slice objects in alternate memory locations |
US10448062B2 (en) | 2016-10-26 | 2019-10-15 | International Business Machines Corporation | Pre-fetching media content to reduce peak loads |
US10394630B2 (en) | 2016-10-26 | 2019-08-27 | International Business Machines Corporation | Estimating relative data importance in a dispersed storage network |
US10481977B2 (en) | 2016-10-27 | 2019-11-19 | International Business Machines Corporation | Dispersed storage of error encoded data objects having multiple resolutions |
US10585751B2 (en) | 2016-10-27 | 2020-03-10 | International Business Machines Corporation | Partial rebuild operation within a dispersed storage network including local memory and cloud-based alternative memory |
US11169731B2 (en) | 2016-10-31 | 2021-11-09 | International Business Machines Corporation | Managing storage resources in a dispersed storage network |
US10540247B2 (en) | 2016-11-10 | 2020-01-21 | International Business Machines Corporation | Handling degraded conditions using a redirect module |
US10585607B2 (en) | 2016-11-10 | 2020-03-10 | International Business Machines Corporation | Determining an optimum selection of functions for units in a DSN memory |
US10114698B2 (en) | 2017-01-05 | 2018-10-30 | International Business Machines Corporation | Detecting and responding to data loss events in a dispersed storage network |
US10782921B2 (en) | 2017-01-25 | 2020-09-22 | International Business Machines Corporation | Non-writing device finalization of a write operation initiated by another device |
US10180787B2 (en) | 2017-02-09 | 2019-01-15 | International Business Machines Corporation | Dispersed storage write process with lock/persist |
US10241865B2 (en) | 2017-02-15 | 2019-03-26 | International Business Machines Corporation | Handling storage unit failure in a dispersed storage network |
US10579309B2 (en) | 2017-02-16 | 2020-03-03 | International Business Machines Corporation | Method for increasing throughput in a distributed storage network |
US10248495B2 (en) | 2017-02-17 | 2019-04-02 | International Business Machines Corporation | Eventual consistency intent cleanup in a dispersed storage network |
US10552341B2 (en) | 2017-02-17 | 2020-02-04 | International Business Machines Corporation | Zone storage—quickly returning to a state of consistency following an unexpected event |
US10382553B2 (en) | 2017-02-20 | 2019-08-13 | International Business Machines Corporation | Zone storage—resilient and efficient storage transactions |
US10394468B2 (en) | 2017-02-23 | 2019-08-27 | International Business Machines Corporation | Handling data slice revisions in a dispersed storage network |
US10241677B2 (en) | 2017-02-24 | 2019-03-26 | International Business Machines Corporation | Ensuring consistency between content and metadata with intents |
US9998147B1 (en) | 2017-02-27 | 2018-06-12 | International Business Machines Corporation | Method for using write intents in a distributed storage network |
US10642532B2 (en) | 2017-02-28 | 2020-05-05 | International Business Machines Corporation | Storing data sequentially in zones in a dispersed storage network |
US10372380B2 (en) | 2017-03-01 | 2019-08-06 | International Business Machines Corporation | Asserting integrity with a verifiable codec |
US10169392B2 (en) | 2017-03-08 | 2019-01-01 | International Business Machines Corporation | Persistent data structures on a dispersed storage network memory |
US11226980B2 (en) | 2017-03-13 | 2022-01-18 | International Business Machines Corporation | Replicating containers in object storage using intents |
US10235241B2 (en) | 2017-03-15 | 2019-03-19 | International Business Machines Corporation | Method for partial updating data content in a distributed storage network |
US10693640B2 (en) | 2017-03-17 | 2020-06-23 | International Business Machines Corporation | Use of key metadata during write and read operations in a dispersed storage network memory |
US10241861B2 (en) | 2017-03-23 | 2019-03-26 | International Business Machines Corporation | Method for tenant isolation in a distributed computing system |
US10133634B2 (en) | 2017-03-30 | 2018-11-20 | International Business Machines Corporation | Method for performing in-place disk format changes in a distributed storage network |
US10360391B2 (en) | 2017-04-03 | 2019-07-23 | International Business Machines Corporation | Verifiable keyed all-or-nothing transform |
US10379961B2 (en) | 2017-04-11 | 2019-08-13 | International Business Machines Corporation | Ensuring metadata and index consistency using write intents |
US10545699B2 (en) | 2017-04-11 | 2020-01-28 | International Business Machines Corporation | Dynamic retention policies and optional deletes |
US10567509B2 (en) | 2017-05-15 | 2020-02-18 | International Business Machines Corporation | Rebuilding derived content |
US10339003B2 (en) | 2017-06-01 | 2019-07-02 | International Business Machines Corporation | Processing data access transactions in a dispersed storage network using source revision indicators |
US10491386B2 (en) | 2017-06-01 | 2019-11-26 | International Business Machines Corporation | Slice-level keyed encryption with support for efficient rekeying |
US10467097B2 (en) | 2017-06-02 | 2019-11-05 | International Business Machines Corporation | Indicating data health in a DSN memory |
US10372381B2 (en) | 2017-06-05 | 2019-08-06 | International Business Machines Corporation | Implicit leader election in a distributed storage network |
US10361813B2 (en) | 2017-06-16 | 2019-07-23 | International Business Machines Corporation | Using slice routers for improved storage placement determination |
US10534548B2 (en) | 2017-06-20 | 2020-01-14 | International Business Machines Corporation | Validating restricted operations on a client using trusted environments |
US10324855B2 (en) | 2017-06-23 | 2019-06-18 | International Business Machines Corporation | Associating a processing thread and memory section to a memory device |
US10594790B2 (en) | 2017-06-28 | 2020-03-17 | International Business Machines Corporation | Data compression in a dispersed storage network |
US10540111B2 (en) | 2017-06-28 | 2020-01-21 | International Business Machines Corporation | Managing data container instances in a dispersed storage network |
US10509699B2 (en) | 2017-08-07 | 2019-12-17 | International Business Machines Corporation | Zone aware request scheduling and data placement |
US10599502B2 (en) | 2017-08-07 | 2020-03-24 | International Business Machines Corporation | Fault detection and recovery in a distributed storage network |
US10671746B2 (en) | 2017-08-28 | 2020-06-02 | International Business Machines Corporation | Controlling access when processing intents in a dispersed storage network |
US10379942B2 (en) | 2017-09-27 | 2019-08-13 | International Business Machines Corporation | Efficient transfer of objects between containers on the same vault |
US10585748B2 (en) | 2017-09-29 | 2020-03-10 | International Business Machines Corporation | Scalable cloud—assigning scores to requesters and treating requests differently based on those scores |
US10409661B2 (en) | 2017-09-29 | 2019-09-10 | International Business Machines Corporation | Slice metadata for optimized dispersed storage network memory storage strategies |
US10802713B2 (en) | 2017-09-29 | 2020-10-13 | International Business Machines Corporation | Requester-associated storage entity data |
US10540120B2 (en) | 2017-11-14 | 2020-01-21 | International Business Machines Corporation | Contention avoidance on associative commutative updates |
US10423497B2 (en) | 2017-11-28 | 2019-09-24 | International Business Machines Corporation | Mechanism for representing system configuration changes as a series of objects writable to an object storage container |
US10565392B2 (en) | 2017-11-28 | 2020-02-18 | International Business Machines Corporation | Secure and verifiable update operations |
US10785194B2 (en) | 2017-12-07 | 2020-09-22 | International Business Machines Corporation | Processing intents using trusted entities in a dispersed storage network |
US10681135B2 (en) | 2017-12-08 | 2020-06-09 | International Business Machines Corporation | Generating slices from a broadcast message and a recipient identity |
US11412041B2 (en) | 2018-06-25 | 2022-08-09 | International Business Machines Corporation | Automatic intervention of global coordinator |
US10936452B2 (en) | 2018-11-14 | 2021-03-02 | International Business Machines Corporation | Dispersed storage network failover units used to improve local reliability |
CN111723050A (en) * | 2019-03-22 | 2020-09-29 | EMC IP Holding Company LLC | Method, electronic device and computer program product for file management |
US11593026B2 (en) | 2020-03-06 | 2023-02-28 | International Business Machines Corporation | Zone storage optimization using predictive protocol patterns |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4325120A (en) * | 1978-12-21 | 1982-04-13 | Intel Corporation | Data processing system |
US4408273A (en) * | 1980-05-27 | 1983-10-04 | International Business Machines Corporation | Method and means for cataloging data sets using dual keyed data sets and direct pointers |
US4498132A (en) * | 1981-05-22 | 1985-02-05 | Data General Corporation | Data processing system using object-based information and a protection scheme for determining access rights to such information and using multilevel microcode techniques |
US4498131A (en) * | 1981-05-22 | 1985-02-05 | Data General Corporation | Data processing system having addressing mechanisms for processing object-based information and a protection scheme for determining access rights to such information |
US4468728A (en) * | 1981-06-25 | 1984-08-28 | At&T Bell Laboratories | Data structure and search method for a data base management system |
US4714989A (en) * | 1982-02-19 | 1987-12-22 | Billings Roger E | Functionally structured distributed data processing system |
US4638424A (en) * | 1984-01-12 | 1987-01-20 | International Business Machines Corporation | Managing data storage devices connected to a digital computer |
US4945475A (en) * | 1986-10-30 | 1990-07-31 | Apple Computer, Inc. | Hierarchical file system to provide cataloging and retrieval of data |
US5175852A (en) * | 1987-02-13 | 1992-12-29 | International Business Machines Corporation | Distributed file access structure lock |
JPH01502861A (en) * | 1987-09-04 | 1989-09-28 | Digital Equipment Corporation | Session control within circuitry for digital processing systems supporting multiple transfer protocols |
US5115504A (en) * | 1988-11-01 | 1992-05-19 | Lotus Development Corporation | Information management system |
US5129083A (en) * | 1989-06-29 | 1992-07-07 | Digital Equipment Corporation | Conditional object creating system having different object pointers for accessing a set of data structure objects |
US5129084A (en) * | 1989-06-29 | 1992-07-07 | Digital Equipment Corporation | Object container transfer system and method in an object based computer operating system |
EP0410210A3 (en) * | 1989-07-24 | 1993-03-17 | International Business Machines Corporation | Method for dynamically expanding and rapidly accessing file directories |
US5276793A (en) * | 1990-05-14 | 1994-01-04 | International Business Machines Corporation | System and method for editing a structured document to preserve the intended appearance of document elements |
US5239647A (en) * | 1990-09-07 | 1993-08-24 | International Business Machines Corporation | Data storage hierarchy with shared storage level |
US5303367A (en) * | 1990-12-04 | 1994-04-12 | Applied Technical Systems, Inc. | Computer driven systems and methods for managing data which use two generic data elements and a single ordered file |
- 1992
  - 1992-09-15 US US07/945,266 patent/US5454101A/en not_active Expired - Lifetime
- 1993
  - 1993-08-25 WO PCT/CA1993/000338 patent/WO1994007209A1/en active IP Right Grant
  - 1993-08-25 DE DE69302908T patent/DE69302908D1/en not_active Expired - Lifetime
  - 1993-08-25 EP EP93917505A patent/EP0662228B1/en not_active Expired - Lifetime
  - 1993-08-25 CA CA002143288A patent/CA2143288A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
US5454101A (en) | 1995-09-26 |
EP0662228B1 (en) | 1996-05-29 |
EP0662228A1 (en) | 1995-07-12 |
DE69302908D1 (en) | 1996-07-04 |
WO1994007209A1 (en) | 1994-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0662228B1 (en) | | Apparatus for data storage and retrieval |
US6484181B2 (en) | | Method and system for handling foreign key update in an object-oriented database environment |
US7136867B1 (en) | | Metadata format for hierarchical data storage on a raw storage device |
US7702637B2 (en) | | Systems and methods for fragment-based serialization |
US5261088A (en) | | Managing locality in space reuse in a shadow written B-tree via interior node free space list |
US6209000B1 (en) | | Tracking storage for data items |
US5999943A (en) | | Lob locators |
US6823338B1 (en) | | Method, mechanism and computer program product for processing sparse hierarchical ACL data in a relational database |
US6438549B1 (en) | | Method for storing sparse hierarchical data in a relational database |
US6738790B1 (en) | | Approach for accessing large objects |
US7103588B2 (en) | | Range-clustered tables in a database management system |
US6029170A (en) | | Hybrid tree array data structure and method |
US7058639B1 (en) | | Use of dynamic multi-level hash table for managing hierarchically structured information |
US5555409A (en) | | Data management systems and methods including creation of composite views of data |
US5560007A (en) | | B-tree key-range bit map index optimization of database queries |
US7716182B2 (en) | | Version-controlled cached data store |
KR100981857B1 (en) | | System and method for scoping searches using index keys |
US5857182A (en) | | Database management system, method and program for supporting the mutation of a composite object without read/write and write/write conflicts |
US7720869B2 (en) | | Hierarchical structured abstract file system |
US8271530B2 (en) | | Method and mechanism for managing and accessing static and dynamic data |
US20050015674A1 (en) | | Method, apparatus, and program for converting, administering, and maintaining access control lists between differing filesystem types |
JPH0652531B2 (en) | | Relay database management system |
US7024434B2 (en) | | Method and system for modifying schema definitions |
US7672945B1 (en) | | Mechanism for creating member private data in a global namespace |
JP2004505380A (en) | | Methods, systems and data structures for implementing nested databases |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| EEER | Examination request | |
| FZDE | Discontinued | |