Publication number: US20070088769 A1
Publication type: Application
Application number: US 11/537,971
Publication date: Apr 19, 2007
Filing date: Oct 2, 2006
Priority date: Feb 20, 2004
Also published as: CA2556414A1, EP1723565A2, US7158999, US20050187990, WO2005083596A2, WO2005083596A3
Inventors: Richard Pace, Martin Hasegawa, Ronald Ferguson
Original Assignee: Mainstar Software Corporation
Reorganization and repair of an ICF catalog while open and in-use in a digital data storage system
Abstract
MVS mainframe computer systems employ the ICF (Integrated Catalog Facility) catalog environment to manage numerous data sets. To provide nearly continuous availability of those data sets, the BCS catalog (250,270) must be re-organized while leaving the catalog open to access by applications. To perform a re-org while open, a data CI correlation table (500) can be constructed (314) and used to lay the data CIs into a backup file in logical order (316), so that they can be loaded into the new BCS catalog (324) without sorting, thereby reducing downtime.
Claims(17)
1. A computer-implemented data structure stored in machine-readable memory for use in re-organizing or restoring an ICF catalog in a VSAM mainframe storage environment, the data structure comprising:
a correlation table having a series of entries, each entry corresponding to a physical CI number in a BCS data component; and
wherein each entry in the table contains a first pointer to a CI containing lower keys and a second pointer to a CI containing higher keys, so that the table contents together form a logical list of the CIs.
2. The computer-implemented data structure of claim 1 wherein a predetermined value is used in the table to indicate an end of a chain of key values.
3. The computer-implemented data structure of claim 1 wherein a predetermined pointer value is used in the table to indicate an empty data CI.
4. The computer-implemented data structure of claim 1 wherein a predetermined pointer value is used in the table to indicate an end of a chain of key values.
5. The computer-implemented data structure of claim 1 wherein each entry further includes a backward pointer.
6. The computer-implemented data structure of claim 1 wherein each table entry that does not correspond to a data CI containing the high key includes a forward pointer identifying a data CI having a next key in a predetermined key sequence.
7. A data structure stored in machine-readable memory for storing a correlation between a number of logical keys of a BCS data set and one or more respective physical locations of the data set, for use in reorganizing the BCS while open.
8. A data structure according to claim 7 wherein the data structure is comprised of a table for storing a correlation between the logical order of the data CIs of a BCS data set and the physical location of the data CIs, for use in reorganizing the BCS while open.
9. A data structure according to claim 7 wherein the correlation comprises a correlation between the logical order of the keyed records of a BCS data set and the respective data CIs that contain them.
10. A logical CI correlation table stored in machine-readable memory for use in re-organizing or restoring an ICF catalog in a VSAM storage environment, the correlation table comprising:
a series of entries, each entry corresponding to a physical CI number in the BCS data component; and
each entry including a first pointer to a CI containing lower keys and a second pointer to a CI containing higher keys, so that the table contents together form a logical list of the CIs.
11. A correlation table for use in re-organizing an ICF catalog in a VSAM storage environment, the correlation table comprising:
a series of entries, each entry corresponding to one of the data CI's in the catalog, and wherein each entry in the table comprises a forward pointer to support reading the data records in key sequence.
12. A correlation table according to claim 11, wherein a predetermined value is used in the table to indicate either end of a chain of key values.
13. A correlation table according to claim 11, wherein a predetermined pointer value is used in the table to indicate an empty data CI.
14. A correlation table according to claim 11, wherein a predetermined pointer value is used in the table to indicate an end of a chain of key values.
15. A correlation table according to claim 11, wherein each entry further includes a backward pointer.
16. A correlation table according to claim 11, wherein each table entry that does not correspond to a data CI containing the high key includes a forward pointer identifying the data CI having a next key in a predetermined key sequence.
17. A computer-implemented method of reorganizing an ICF catalog comprising a key sequential data set (KSDS) BCS in a VSAM system while the catalog is open, the BCS comprising a data component and an index component, the reorganization method comprising the steps of:
(a) opening the catalog with exclusive control;
(b) creating a backup of the index component;
(c) based on sequence set records of the index component, determining, for each data CI in a data set, a physical location of the CI in the data component and a logical location of the CI in key sequence order;
(d) reading data CIs from the data component out to an internal backup data set in potential logical key sequence order so as to form an ordered backup of the data without sorting;
(e) clearing all data CI's in the data component and clearing the index component;
(f) updating the BCS VSAM control blocks so as to reflect an empty BCS;
(g) reloading the data set from the ordered backup, sorting records if necessary, and using standard VSAM I/O to reload the data records and reconstruct the index; and
(h) closing the catalog so as to update VVRs in the VVDS and re-sync it with CAS.
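By way of illustration only, the correlation table recited in the claims above might be sketched in Python as follows. The sketch keeps one entry per physical CI number, each holding a backward pointer (toward lower keys) and a forward pointer (toward higher keys). The names `END_OF_CHAIN`, `EMPTY_CI`, `build_correlation_table`, and `walk_in_key_order`, and the particular sentinel values, are assumptions made for this example and do not appear in the disclosure.

```python
END_OF_CHAIN = -1   # assumed sentinel: no CI with lower/higher keys on this side
EMPTY_CI = -2       # assumed sentinel: this physical CI holds no records


def build_correlation_table(num_cis, key_order):
    """Return a table indexed by physical CI number.

    key_order lists the physical CI numbers of the non-empty data CIs in
    ascending key sequence (for example, as derived from the sequence set).
    Each entry is a (backward, forward) pair of physical CI numbers.
    """
    table = [(EMPTY_CI, EMPTY_CI)] * num_cis
    for pos, ci in enumerate(key_order):
        back = key_order[pos - 1] if pos > 0 else END_OF_CHAIN
        fwd = key_order[pos + 1] if pos + 1 < len(key_order) else END_OF_CHAIN
        table[ci] = (back, fwd)
    return table


def walk_in_key_order(table, first_ci):
    """Follow forward pointers, starting at the CI that holds the lowest keys."""
    order = []
    ci = first_ci
    while ci != END_OF_CHAIN:
        order.append(ci)
        ci = table[ci][1]
    return order


# Six physical CIs; CIs 2 and 4 are empty; key order is 3, 0, 5, 1.
table = build_correlation_table(6, [3, 0, 5, 1])
print(walk_in_key_order(table, 3))
```

Walking the forward pointers yields the physical CIs in logical key order, which is the property that would allow data CIs to be written to a backup in key sequence without a sort.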
Description
RELATED APPLICATIONS

This application is a continuation of and claims priority to co-pending U.S. patent application Ser. No. 10/783,835, filed Feb. 20, 2004.

COPYRIGHT NOTICE

© 2003-2004 Mainstar Software Corporation. A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. 37 CFR §1.71(d).

TECHNICAL FIELD

This invention is in the field of digital data storage systems and, more specifically, pertains to storage systems that employ ICF catalogs for managing and using data sets.

BACKGROUND OF THE INVENTION

The IBM MVS® mainframe operating system has evolved into the primary data server for very large enterprise computer system environments. This new and critical function has raised the availability requirement for the MVS mainframe system, and all data stored within it, to essentially a 24×7 or “always available” level.

The ICF (Integrated Catalog Facility) catalog environment is a critical component of MVS, as virtually all data or “data sets” within the system must be cataloged within an ICF catalog; they cannot be located for access unless there is a successful search for the data set through the catalog. If the data is not cataloged, or if the necessary catalog is not available, the application attempting to locate it cannot gain access. In even the smallest MVS systems, there are often hundreds of thousands of data sets, and in a large MVS system, the number of data sets is in the tens of millions. All of these data sets are catalogued, except for a very few special cases not important here.

An ICF catalog consists of two components: the Basic Catalog Structure (BCS) catalog, where the data set's name and disk storage volume location are stored, and the VSAM Volume Data Set (VVDS), where the physical and logical attributes of the data set are stored. Due to MVS system design, most MVS environments, regardless of their size, have a fairly small number of BCS catalogs, typically between 10 and 100. A BCS catalog can be physically stored on any disk volume, regardless of the location of the data that it catalogs. Each VVDS is physically stored on the same volume as the data it defines, and therefore the number of VVDSs is typically equal to the number of volumes assigned to an MVS system. Each VVDS comprises a series of VVRs (VSAM Volume Records).

As a simple example, FIG. 1 shows three data storage volumes 110, 112, and 114. A BCS catalog 120 resides on volume Vol.001 (110), along with a VVDS 124 which, in turn, defines one or more data sets stored on that volume. Here, one record in the BCS 120 points to a corresponding record in VVDS 124, as indicated by the dashed line, and a record in the VVDS in turn points to a corresponding VSAM data set 126 (“VSAM”) also on volume 110, as indicated by a dashed line as well. (In general, we will use dashed lines in such drawings to indicate references or pointers.) The BCS catalog 120 also includes a second record that points to a second VVDS 132 on Vol.002 (112), which in turn defines a data set 134 on volume 112.

As is well known, an MVS system has two types of BCS catalogs—one master catalog that identifies and locates the operating system data sets used by the MVS system, and one or more user catalogs that identify and locate all other data sets that are to be accessible to the system. (BCS 120 is a user catalog.) In order to be usable and accessible on a system, a user catalog must be “connected” to the system via a special record in the master catalog, called a UCAT Connector Record (not shown). Upon first use by the system following IPL, a user catalog is opened to the Catalog Address Space (CAS) catalog management program routines, and unless explicitly closed by an operator command, it remains open to the system for the life of the IPL. As further explained below, one important aspect of the present invention is a method of reorganizing the BCS catalog while it remains open. When data storage devices are shared, that is, concurrently accessible and updateable by multiple operating systems, mechanisms exist to prevent unsynchronized sequences of events from occurring. When the serialization protocols are not adhered to, then the integrity of the physical data can be compromised.

Because an MVS system has few BCS catalogs, but a very large number of cataloged data sets, a single BCS will often have hundreds of thousands, possibly millions of data sets cataloged within it. These data sets are used by online data base systems that remain in use for weeks at a time, or even longer, and the data sets are not closed throughout that time. The same, or other, data sets are also used by “batch” job streams that are usually scheduled for execution on a daily basis.

The BCS is considered “in use” as soon as it is opened by CAS, regardless of any open data sets cataloged in the BCS. As mentioned, the BCS remains open for the life of the IPL, unless explicitly closed. As long as it is in use, repair or cleanup catalog management functions cannot be performed.

MVS data set names are comprised of 1 to 44 characters. A period in a name, counted as a character, has special meaning. Periods are used to separate a name into nodes, called qualifiers. Qualifiers serve the purpose of grouping names for visual identification and masking capabilities. Also, the left most (high level) 1 to 4 qualifiers may be used to identify which catalog the “locate” pointers to a data set are to be recorded in. If a specific catalog is desired for a group of data sets, a special entry called an “alias” is created in the “master catalog,” causing all data sets whose high level qualifier(s) match the defined alias to be cataloged in the corresponding “user catalog.” Subsequent locates for an existing data set will begin by searching the master catalog for an alias match, which in turn directs the locate to the associated user catalog.
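By way of illustration only, the alias-directed selection of a catalog described above might be sketched in Python as follows. The function names, the catalog names, the longest-match-first rule, and the fallback to the master catalog when no alias matches are assumptions made for this example; the text above states only that one to four high-level qualifiers may be matched against aliases defined in the master catalog.

```python
def qualifiers(dsname):
    """Split a data set name into its period-separated qualifiers."""
    if not 1 <= len(dsname) <= 44:
        raise ValueError("data set name must be 1 to 44 characters")
    return dsname.split(".")


def select_catalog(dsname, aliases, master="MASTER.CATALOG"):
    """Pick the catalog for dsname by its high-level qualifier(s).

    aliases maps an alias (one to four leading qualifiers joined by
    periods) to a user catalog name. Longer aliases are tried first
    (an assumption); the master catalog is the assumed fallback.
    """
    quals = qualifiers(dsname)
    for n in range(min(4, len(quals)), 0, -1):
        alias = ".".join(quals[:n])
        if alias in aliases:
            return aliases[alias]
    return master


aliases = {"PARTS": "UCAT.PARTS", "PARTS.TEST": "UCAT.TEST"}
print(select_catalog("PARTS.INVENTORY.DATA", aliases))
print(select_catalog("PARTS.TEST.DATA", aliases))
```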

Historically on MVS systems, the data set name high-level qualifier derives from a short-form name of an application, such as PARTS for a manufacturing organization, and therefore, every data set in the PARTS application will have a name that begins with PARTS. The number of data sets in this application could number in the thousands, or tens of thousands.

Referring again to FIG. 1, to facilitate the search for a data set (in MVS terminology, this is called the “locate” operation), the data set name is used as a keyed search argument. The BCS catalog 120 is physically a VSAM Key Sequenced Data Set (KSDS), and the key of its records is the data set name. When the catalog record for the data set is located, the volume cell(s) inside the record identify the disk storage (DASD) volume(s) on which the data set resides, for example 110 or 112, and point by relative address to the data set's descriptive record within the VVDS 124, 132 (if the data set is VSAM or nonVSAM SMS managed) or the VTOC 128, 130 (if the data set is nonVSAM nonSMS managed). In the drawing, catalog record 202 in BCS 120 points to volume 001, and specifically to record 208 in the VVDS 124. Another data set record 204 points to volume 002 and specifically to record 216 in the VVDS 132, which in turn defines the data set 134. Volume 114 has associated VTOC 140, VVDS 142 and a representative KSDS 144.

FIG. 2 illustrates a BCS catalog in greater detail. The catalog generally comprises a data component 250 and an index component 270. The data component 250 conceptually comprises a series of columns 252, each column representing one Control Area or CA. The CAs are numbered from left to right beginning with CA.01. Each CA corresponds to a cylinder in DASD storage, and contains at least one Control Interval or CI, for example 254, 256 which can be thought of as a block of data storage records (the minimum I/O block size). Within each CI, there may be gaps, resulting from record deletions or allocated free space.

By standard VSAM design, all records within a KSDS are maintained in logical ascending key sequence—and the BCS catalog is a standard VSAM KSDS. Accordingly, when new cataloged data sets are added, the value of the data set name key determines the location within the BCS data component where the records will be inserted. Pre-allocated “free space” can be reserved in a BCS with the FREESPACE keyword on the IDCAMS DEFINE USERCATALOG command, and this space can be utilized when a new data set is cataloged.

This pre-allocated free space can be reserved at either (or both) the CI (Control Interval) or CA (Control Area) level, specified as a percentage, and by standard VSAM design will be evenly distributed across the entire file when it is initially loaded (and therefore, is beneficial when record insertions are in the same distribution across the file). For example, in FIG. 2, CI level free space 257 in CA.00 and CA level free space 258 are illustrated. When an existing data set is deleted, its record in the BCS catalog is physically removed, dynamically creating free space that can be utilized if a subsequent new data set of a similar name is cataloged.

When a record insertion is necessary, and sufficient free space at the appropriate location is not available, VSAM automatically performs a “CI split,” moving half of the records in the affected CI to a free CI within the same CA. If a free CI within that CA is not found, VSAM performs a CA split, moving half of the CIs from the affected CA to an entirely new CA at the end of file (EOF) location. If a new CA cannot be found, VSAM allocates a new physical secondary extent of the BCS. The extent limit count for a BCS is 123, and unlike other VSAM data sets, the BCS catalog is restricted to a single volume allocation. If the 123 maximum extent limit is reached, Catalog Management fails the new data set allocation. Subsequent to this, other data sets might still be able to allocate, if the value of their data set name “points” to a location within the BCS catalog where there is sufficient free space.
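By way of illustration only, the CI split behavior described above might be modeled in Python as follows. A CA is represented as a list of CIs, each CI being a sorted list of keys, with an empty list standing for a free CI. The capacity of four records per CI, the function name `insert_key`, and the returned status strings are assumptions made for this example; the CA split itself is only signaled, not performed.

```python
import bisect

CI_CAPACITY = 4  # assumed number of records that fit in one CI


def insert_key(ca, key):
    """Insert key into a CA and report what had to happen.

    ca is a non-empty list of CIs in physical order. Returns "ok",
    "ci_split", or "ca_split_needed" (no free CI left in this CA).
    """
    # Non-empty CIs in logical (key) order, which may differ from physical order.
    logical = sorted((i for i, ci in enumerate(ca) if ci), key=lambda i: ca[i][-1])
    # Target: the first CI whose high key covers the new key, else the last CI.
    target = next((i for i in logical if ca[i][-1] >= key),
                  logical[-1] if logical else 0)
    if len(ca[target]) < CI_CAPACITY:
        bisect.insort(ca[target], key)
        return "ok"
    free = next((i for i, ci in enumerate(ca) if not ci), None)
    if free is None:
        return "ca_split_needed"
    half = len(ca[target]) // 2
    ca[free] = ca[target][half:]       # move the upper half to the free CI
    ca[target] = ca[target][:half]
    dest = target if key <= ca[target][-1] else free
    bisect.insort(ca[dest], key)
    return "ci_split"


ca = [[10, 20, 30, 40], [], []]
print(insert_key(ca, 25), ca)
```

After repeated splits the physical order of the CIs within a CA no longer matches their key order, which is why a physical-sequence read of the data component does not return records in key sequence.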

The index component 270 of the BCS is illustrated in simplified conceptual form in FIG. 2. In the illustration, the first level index or “sequence set” 272 comprises a series of records, e.g. 274, 276 etc., each of which corresponds to a respective CA in the data component. Index record 274 corresponds to CA.00, index record 276 corresponds to CA.01 and so on. Each index record has an entry for each data CI in the corresponding CA, including an indication of the highest key value in that CI. A second level index (if needed) comprises another series of records, e.g. 282, 288 each of which corresponds to a plurality of first level index records. In a second level index record, there is an entry for each first level index record associated with that record, including an indication of the highest key value in the corresponding first level index record. A third level index record 290 is shown, and there may be more, as needed, associated in this hierarchical fashion, sometimes called a B-tree. At a given index level, horizontal pointers such as 292 form a linked list or chain in key sequential order.
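By way of illustration only, a lookup through the sequence set described above might be sketched in Python as follows. Each sequence set record is modeled as a list of (high key, physical CI number) entries for one CA, and the records themselves are assumed to be supplied in key order, standing in for the horizontal pointer chain. The function name `find_ci` and this in-memory layout are assumptions made for this example; the list of per-CA high keys plays the role of a second level index.

```python
import bisect


def find_ci(sequence_set, key):
    """Locate the data CI that would hold key.

    sequence_set: list of index records in key order, one per CA; each
    record is a list of (high_key, physical_ci_number) pairs in key order.
    Returns (ca_position, physical_ci_number), or None if key is higher
    than every high key in the index.
    """
    # High key of each CA, as a second level index record would carry it.
    ca_high_keys = [rec[-1][0] for rec in sequence_set]
    ca_pos = bisect.bisect_left(ca_high_keys, key)
    if ca_pos == len(sequence_set):
        return None
    rec = sequence_set[ca_pos]
    ci_pos = bisect.bisect_left([high for high, _ in rec], key)
    return ca_pos, rec[ci_pos][1]


seq = [[("C", 2), ("F", 0)], [("K", 1), ("P", 3)]]
print(find_ci(seq, "D"))
```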

Generally, the BCS catalog is a relatively stable structure, in that the process of cataloging new data sets and deleting existing data sets balances out. After a few CI and CA splits to open up free space in the volatile areas of the catalog are completed, it generally settles down and doesn't grow very much. In situations such as that, a BCS catalog can survive months or years without attention.

Many MVS systems, though, have one or more catalogs that do not fit this pattern; rather, some catalogs grow very rapidly due to a concentration of new data allocations in one location within the BCS catalog. When this occurs, close attention must be paid to the catalog to minimize the risk of system or application outage when the catalog “fills.” The most frequent cause of this is a data set naming convention for an application that consists of some type of sequence number value within the data set name, resulting in any new data set name being added immediately after the previous ones within the BCS catalog, and if any data sets are deleted, chances are they are the oldest (lowest numbered) data set names. If the sequence numbers are never (or rarely) reused within the data set name, the BCS catalog gets larger and larger, yet emptier and emptier. This is known as the “creeping key” problem in VSAM KSDS files, and is endemic to application files as well as BCS catalogs.
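By way of illustration only, the “creeping key” effect described above might be simulated in Python as follows. Keys are ever-increasing sequence numbers, each new record lands at the end of the file, and the oldest record is deleted once more than `live_window` records exist. The function name, the parameters, and the simplification that a CI emptied by deletions is never reused for higher keys are assumptions made for this example.

```python
def simulate_creeping_keys(total_inserts, live_window, ci_capacity=4):
    """Return (allocated_cis, empty_cis) after a run of ascending inserts."""
    cis = []  # physical order; each CI tracks slots ever filled and live keys
    for seq in range(total_inserts):
        # Ascending keys always go to the last CI; start a new one when full.
        if not cis or cis[-1]["filled"] == ci_capacity:
            cis.append({"filled": 0, "live": set()})
        cis[-1]["filled"] += 1
        cis[-1]["live"].add(seq)
        # Delete the oldest (lowest numbered) record once the window is exceeded.
        expired = seq - live_window
        if expired >= 0:
            cis[expired // ci_capacity]["live"].discard(expired)
    allocated = len(cis)
    empty = sum(1 for ci in cis if not ci["live"])
    return allocated, empty


print(simulate_creeping_keys(100, 8))
```

In this model the file keeps growing while only the last few CIs hold live records, matching the “larger and larger, yet emptier and emptier” behavior noted above.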

For any type of VSAM KSDS, whether it's an application file or a BCS catalog, the solution to this problem is a file reorganization (generally called a “re-org”). A file reorganization generally begins with executing a utility program to unload all records in the file to a backup copy; the data set is then physically deleted and redefined (thereby making it empty), and the backed up records are then reloaded. The result is a “reorganization” of the data blocks (CIs and CAs) within the data set, and its physical size is now in proper proportion to the records contained within it. A re-org, for either an application data set or BCS catalog, can be accomplished by IBM or other vendor utility programs.
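By way of illustration only, the conventional unload, redefine, and reload sequence described above might be modeled in Python as follows. The file is a list of CIs (lists of keys) in physical order. The function name `reorganize`, the capacity of four records per CI, and the single reserved free slot per CI are assumptions made for this example.

```python
def reorganize(cis, ci_capacity=4, ci_free_slots=1):
    """Unload all records, then reload them into a freshly defined, empty file.

    Returns the new list of CIs, in key order, with ci_free_slots left
    unused in each CI as evenly distributed free space.
    """
    # Unload every record to a backup copy, in ascending key sequence.
    backup = sorted(key for ci in cis for key in ci)
    # Delete and redefine: the new file starts empty, then is reloaded.
    per_ci = ci_capacity - ci_free_slots
    return [backup[i:i + per_ci] for i in range(0, len(backup), per_ci)]


before = [[], [], [9, 10], [], [11], [12, 13, 14, 15]]
after = reorganize(before)
print(len(before), len(after), after)
```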

In all cases, the re-org process requires that all external points of access to the object dataset or BCS catalog be “quiesced”. In this context, the term “quiesced” means that all other software functions that can physically access the object dataset must be temporarily inhibited from doing so. In the case of the BCS catalog, the accessing programs include the MVS Catalog Management functions available from any active system that has physical access to it.

In many instances, the BCS catalog resides on a shared DASD volume, allowing Catalog Management on any number of MVS systems to be open to the BCS, updating it with new and deleted data sets, and with applications also open and accessing data sets that are cataloged within the BCS. This sharing across systems complicates the coordination and scheduling necessary to quiescing the BCS as further explained below. For example, various levels of catalog sharing are known in the art, including: (1) Not Shared; (2) Shared only within a single Sysplex; (3) Shared across multiple Sysplexes; and (4) ECS—Enhanced Catalog Sharing. The present invention provides catalog integrity across these various levels of catalog sharing.

Prior Art BCS Catalog Re-Org Methodologies

Re-org methodologies for performing a BCS catalog re-org exist. Prior art includes IBM IDCAMS EXPORT/IMPORT, EMC Catalog Solution DUMP/REBUILD, and BACKUP/RESTORE, implemented in software products commercially available from Mainstar Software Corporation, Bellevue, Wash. (assignee of the present invention). These known methodologies do not satisfy the requirements for a re-org while open, as they inherently require the BCS to be quiesced and closed throughout the re-org process. They also require the BCS to be physically deleted and re-defined between the backup and restore processes. If an attempt is made to utilize one of these methodologies for re-org while the BCS catalog is open and active, serious damage will almost certainly occur to the internal structure of the BCS catalog, and any of the jobs (including Catalog Management) may ABEND (abort or end abnormally) with unpredictable results at a subsequent time.

In the BCS re-org process with these methodologies, the user schedules a quiesce period during which time the BCS catalog will be inaccessible. To ensure that the BCS is not updated between the time of the backup and restore, the MVS operator MODIFY command is issued on all systems sharing access to the catalog, forcing it to close and un-allocate to CAS. At this point, user knowledge of, and strict adherence to procedures, across all systems, must be maintained, as the BCS catalog can automatically re-open if any job is executed that requests access to it.

When the user is comfortable that the BCS catalog is quiesced, a system backup utility program is executed, to write a copy of the catalog's data records to a physical sequential file. If EXPORT is used, the records are retrieved in ascending key sequence, as standard VSAM sequential read-access through the BCS catalog's index is utilized—this methodology preserves the catalog's necessary record sequence for restoring into the newly-defined and empty BCS. If any of the non-IBM system backup utilities are used, the records are retrieved in physical record sequence, bypassing the BCS catalog's index structure—this methodology protects against potential index structure damage, ensuring that all records from the BCS catalog are retrieved, but it requires that the records are sorted prior to restoring them into the newly-defined and empty BCS.

When the backup is complete, the existing BCS catalog is deleted, re-defined as a new, empty BCS catalog, and the records from the backup file are then restored with the appropriate utility function with regard to the backup. In some of the prior art methodologies, the delete/define function is automatically performed for the user, while in others the user must do it manually. Also, some prior art methodologies allow certain physical attributes of the BCS catalog to be changed on the new allocation from the existing BCS catalog's attributes. Regardless of the methodology, when the restore operation is complete, system and application processing that might use the catalog can be restarted, and the catalog will re-open as necessary.

For the BCS, it is very difficult to schedule and perform a re-org. Most BCS catalogs have 24×7 availability requirements, from at least one of the MVS systems that are sharing access to it, and the “down-time” to re-org the BCS is disruptive to production application processing. “Down-time” is defined as the time between closing and re-opening of the catalog, enabling application jobs and online systems to once again resume access to the data sets cataloged within the BCS catalog. During down-time, access to the BCS catalog must be stopped, including allocation of new data sets, deletion of existing data sets, and application access to data within existing data sets is denied. Even if the downtime is planned and scheduled, it represents an outage that might not be acceptable for 24×7 environments. If it is unplanned and a forced situation, it can result in disastrous business disruption. For this reason, many BCS catalogs are re-orged very infrequently. What is needed is a way to re-org a BCS catalog while keeping it open so that catalogued data sets remain continuously available to applications for processing.

SUMMARY OF THE INVENTION

One important aspect of the invention is the ability to re-org a BCS catalog while it is open and accessible to Catalog Management on any number of MVS operating systems that are sharing access to it. In doing so, the invention ensures that any Catalog Management control block information that might require alteration is properly updated as a result of the BCS catalog structural changes that take place during the re-org.

A primary objective of a BCS catalog re-org process is minimizing Catalog Management down-time (i.e., the necessary window of its unavailability) which in turn results in minimum impact on application jobs that require use of the catalog, while still complying with the various constraints and requirements mentioned above.

Another objective is to properly re-organize a BCS catalog while open, allowing continued access to data sets listed in the catalog during the re-org process. A further object is to ensure that the re-org results in a valid, usable BCS.

One especially important objective of the invention is to enable recovering from a failed re-org attempt without catalog data loss or corruption.

In accordance with the present invention, the BCS catalog can be open and accessible to any number of MVS systems throughout the re-org process. The invention logic suspends all Catalog Management (and other program) access to the BCS catalog during the re-org process, ensuring that content and structural changes to the catalog do not occur throughout the process, but application program access to needed data sets is uninterrupted.

To provide a “safety net” in the event a structural error is encountered inside the BCS catalog during the re-org, an optional mirror-image backup of the catalog can be taken at the same time as the re-org backup. This backup can subsequently be restored if the re-org process signals an error from which it cannot recover.

Additional aspects, features and advantages of this invention will be apparent from the following detailed description of preferred embodiments, which proceeds with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a conceptual diagram illustrating MVS data storage volumes, including a BCS catalog and associated VVDS, VTOC and VSAM structures.

FIG. 2 is a simplified, conceptual diagram of the internal structure of a BCS catalog, depicting the data and index components.

FIG. 3 is a process flow diagram depicting a method of BCS catalog re-org while open in accordance with the present invention.

FIG. 4 is a flow diagram depicting an alternative method of BCS catalog re-org while open in accordance with the present invention.

FIG. 5A is a conceptual illustration of a series of data Control Intervals.

FIG. 5B is a conceptual illustration of a Data CI Table in accordance with the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Referring now to FIG. 3, the first step 302 of a presently preferred embodiment establishes the ESTAE environment, so that any ABENDs that might occur during processing will be trapped and handled within the logic. Step 304 obtains catalog environment information, validating that the named catalog is appropriate, and establishing the environment in which the process will execute. In a presently preferred embodiment, much of this information will be the result of specifications on the user command that initiates the process. Some options are:

    • The name of the single BCS catalog to be re-org'ed. The implementation will require that the BCS to be re-orged must be currently connected to the master catalog as a user catalog. It cannot be the master catalog of the MVS operating system on which the process is run, although it can be a master catalog of another system if it is not currently an active master catalog at the time of the process.
    • The catalog is not required to already be open to Catalog Management at the time the re-org process is run, although the user will be required to explicitly state that the process is permitted to execute if the catalog is open.
    • The solution will allow the user to optionally request a standard catalog backup be created (in addition to the temporary re-org backup). If the re-org processing fails after the fail-safe point, this backup can be input to a subsequent Mainstar Catalog RecoveryPlus RECOVER command to restore the catalog's records.
    • A simulation of the re-org process can be requested. This step will create the switches within the logic for all processes to be accomplished, without performing the actual internal re-org of the catalog. This mode may result in a certain level of system contention, as a shared RESERVE on the SYSIGGV2 resource name will cause update requests against the BCS catalog to be delayed until the re-org process is complete. This is a requirement in order for the simulation mode to be of any use and value.
    • System macros to access the BCS catalog environment are well known (the CAXWA, for example).

Next, step 306 opens the BCS catalog to be re-org'ed, ensures it is in a stabilized state, and obtains exclusive control of it. To accomplish this, one solution performs the following:

    • Issue an MVS store clock instruction or otherwise obtain a current timestamp.
    • Check open flags to determine if the catalog is already open on the system. If it is, ensure the user specified the re-org is to take place on an already-open BCS catalog. If not specified, the process is terminated.
    • Read the BCS VVR records and obtain the latest BCS refresh timestamp.
    • Open the catalog with a standard VSAM OPEN macro.
    • Issue a RESERVE macro for the SYSIGGV2 resource name, for exclusive control if it is an actual re-org process, or shared control if in simulate mode. SYSIGGV2 is the standard resource name used by tasks within the Catalog Address Space for serialization of all BCS catalog accesses. Under exclusive control, all read and update requests from other address spaces, including other systems, will stack up behind this RESERVE. Under shared control, only update requests will stack up.
    • Obtain the VVRs for the data and index component of the BCS catalog from the VVDS on the volume where the BCS catalog resides (either two or three VVRs are obtained: one each for the data and index component, and one for the index sequence set if the catalog was defined with the IMBED attribute).
    • Compare the BCS refresh timestamp recorded just prior to the OPEN against the latest BCS refresh timestamp in the BCS's data component VVR. If the latest BCS refresh timestamp is greater, that indicates an update to the catalog during our OPEN process, and the OPEN step is performed again. This loop continues until the latest BCS refresh timestamp is equal to or less than the one obtained prior to the OPEN process. This process may be repeated up to 10 times, at which point the re-org is aborted. (10 is an arbitrary number used to prevent an endless loop.)
    • From information within the BCS catalog's VVR records, the solution can determine the extent locations for all physical DASD extents for both components, the CI size for both components, and the data component's maximum allowable logical record length.
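
By way of illustration only, the timestamp comparison loop described above can be sketched as follows. This is a simplified Python model, not the embodiment's code: `read_refresh_timestamp` and `open_catalog` are hypothetical callables standing in for the VVR read and the VSAM OPEN step, and the RESERVE handling is omitted.

```python
from typing import Callable

MAX_OPEN_ATTEMPTS = 10  # arbitrary cap from the text; prevents an endless loop


class ReorgAborted(Exception):
    """Raised when the catalog never reaches a stabilized state."""


def open_until_stable(
    read_refresh_timestamp: Callable[[], int],
    open_catalog: Callable[[], None],
) -> int:
    """Repeat the OPEN step while updates slip in; return attempts used.

    Both callables are hypothetical stand-ins for the real VVR read and
    the real OPEN processing.
    """
    for attempt in range(1, MAX_OPEN_ATTEMPTS + 1):
        before = read_refresh_timestamp()  # recorded just prior to the OPEN
        open_catalog()
        latest = read_refresh_timestamp()  # latest value after the OPEN
        if latest <= before:               # no update during our OPEN
            return attempt
    raise ReorgAborted("catalog still being updated after 10 OPEN attempts")
```

A caller would treat `ReorgAborted` as the signal to terminate the re-org before any data has been touched.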

Step 308 prepares for the internal backup to be taken, which will subsequently be used to reload the BCS catalog in the re-org operation. (If requested, an optional emergency recovery backup of the BCS catalog's records will be taken at this time, in standard Mainstar® Catalog RecoveryPlus™ BACKUP format.) Taking the internal backup, in accordance with one embodiment of the invention, involves the following detailed steps: First, calculate buffer requirements for reading the index and data components of the BCS catalog, then allocate storage for the buffers. Next, allocate the internal backup data set. To do this, the sizes of the BCS catalog's data and index components are computed from the respective HURBA divided by CI size. This is then converted into a number of DASD tracks, and used for the space allocation quantity of the backup data set. The backup data set is then dynamically allocated and opened. Other techniques can be used for the internal backup, including without limitation using a dataspace rather than a DASD file. Various methods of taking the internal backup should be deemed equivalent to the DASD file approach.
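
The space calculation can be illustrated with a short Python sketch. It is an assumption-laden example rather than the embodiment's arithmetic: the parameter `bytes_per_track` is a hypothetical usable capacity per track, which in practice depends on the device geometry and the block size chosen for the backup data set.

```python
def ceil_div(numerator: int, denominator: int) -> int:
    """Integer division rounded up."""
    return (numerator + denominator - 1) // denominator


def backup_tracks(data_hurba: int, data_ci_size: int,
                  index_hurba: int, index_ci_size: int,
                  bytes_per_track: int) -> int:
    """Estimate the DASD tracks needed for the internal backup.

    The CI counts follow the text (HURBA divided by CI size). The
    conversion to tracks uses a hypothetical bytes_per_track figure.
    """
    data_cis = ceil_div(data_hurba, data_ci_size)
    index_cis = ceil_div(index_hurba, index_ci_size)
    total_bytes = data_cis * data_ci_size + index_cis * index_ci_size
    return ceil_div(total_bytes, bytes_per_track)
```

For example, a data component with a HURBA of 40960 and 4096-byte CIs holds 10 CIs; the resulting track count would then be passed as the space quantity on the dynamic allocation.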

Preferably, to ensure best possible backup processing performance, all program and storage areas are page-fixed (in groups of 100, to avoid excessive spin loops), and the address space is set non-swappable.

Step 310 calls for reading the BCS catalog's self-describing record (always the first physical record within the catalog), and checking it for existence and validity. If the record is missing, or cannot be validated, the process is terminated, on the basis that the re-org cannot be successful.

In Step 312, the BCS catalog's index structure is validated. The speed of the overall re-org will be much greater if the index can be used to retrieve the data component's records in ascending key order, as that will eliminate the requirement to sort the records prior to reload. For that reason, ensuring the validity of the index structure will increase the likelihood of accurately and correctly unloading all records from the data component. (Index structure errors in a catalog are fairly common, and standard VSAM sequential access to the catalog quite often results in a truncated number of records processed.)

The validation begins at the highest level index record. From it, the vertical pointer to the first record at the level below is used to retrieve that record, and then process across that level to check the horizontal chain pointers, at the same time as all vertical pointers from the level above are checked. When that level is complete, the same logic is applied to the next level below, and so on, until the Sequence Set level (272 in FIG. 2) is reached. The total number of records processed is then compared against the computed size of the index. If errors are encountered in the index structure, an appropriate error message is issued, and processing terminates.
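
A minimal Python sketch of this level-by-level walk is given below. The `IndexRecord` class and the dictionary keyed by RBA are hypothetical simplifications of the real index CIs; the sketch only shows how the horizontal chain on each level can be cross-checked against the vertical pointers from the level above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class IndexRecord:
    """Hypothetical, simplified view of one index record."""
    right: Optional[int] = None                    # horizontal pointer (RBA)
    down: List[int] = field(default_factory=list)  # vertical pointers; empty at the sequence set


def validate_index(records: Dict[int, IndexRecord], top_rba: int,
                   expected_count: int) -> bool:
    """Walk each level left to right, checking both pointer directions."""
    visited = set()
    level = [top_rba]  # RBAs the level above says should exist here
    while level:
        chain = []
        rba: Optional[int] = level[0]
        while rba is not None:
            if rba not in records or rba in visited:
                return False  # pointer outside the index, or a loop
            visited.add(rba)
            chain.append(rba)
            rba = records[rba].right
        if chain != level:
            return False      # horizontal chain disagrees with vertical pointers
        level = [child for r in chain for child in records[r].down]
    # Compare the records processed against the computed size of the index.
    return len(visited) == expected_count
```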

All vertical and horizontal pointers in the index are checked to the extent that they fall within the boundaries of the high allocated RBA of the Index. Ultimately, the decision to proceed with the re-org depends on whether the sequence set is intact. This is determined by calculating the number of used and free data CIs represented in the sequence set index records. The total should match the HURBA of the data component; otherwise the sequence set is missing data. If that is the case, the re-org is aborted.
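
The sequence set check can be sketched in Python as follows. The representation is hypothetical: each sequence set record is reduced to a pair of counts, and the sketch assumes the comparison against the HURBA is made in bytes by multiplying the CI total by the data CI size.

```python
from typing import Iterable, Tuple


def sequence_set_intact(entries: Iterable[Tuple[int, int]],
                        data_ci_size: int, data_hurba: int) -> bool:
    """Return True when the sequence set accounts for every data CI.

    entries holds one (used_cis, free_cis) pair per sequence set record;
    this pairing is an assumed simplification of the real index records.
    """
    total_cis = sum(used + free for used, free in entries)
    return total_cis * data_ci_size == data_hurba
```

If the function returns False, the sequence set is missing data and the re-org would be aborted.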

In Step 314, during Sequence Set level processing, a Data CI Table is constructed, encoding the logical sequence of all data component CIs within the BCS catalog. Such a table is illustrated in FIG. 5B, reflecting the data CIs of FIG. 5A, as follows. Referring to FIG. 5A, the letter inside each control interval indicates the starting character of the key for the records contained in the corresponding control interval. For example, CI 2 contains all records that have keys beginning with the letter “B”. Notice that CIs 3 and 9 are empty. Referring now to the Data CI Table of FIG. 5B, the Data CI Table has one entry for each data control interval. The Data CI Table entry number corresponds to the Data CI number (CI #).

The numbers and letters appearing below each entry are for reference purposes and do not constitute any data within the entry. An entry in the Data CI Table consists of a backward and a forward pointer. The pointer value is the Data CI number. The backward pointer is the left side of the entry (left of the dash line) and the forward pointer is the right side of the entry. A minus 1 (−1) pointer value means “no pointer value”, i.e. end of the line either forward or backward. An empty, or perhaps “orphaned”, Data CI is represented in the table as having −1 in both forward and backward pointers, i.e. it is an unconnected CI.

If one were to read the Data CIs in sequential order and extract the records, we would have records with keys starting with “A” followed by records with keys starting with “Z” followed by records with keys starting with “B”, and so on. Note that after reading the CI with “B” keys, we would read an empty CI, a wasted I/O operation. The Data CI Table allows us to read the data records in “key sequence”.

The first index sequence set record (not shown here, but obtained when reading the index records) tells us that the first Data CI (i.e. the data CI with the lowest keyed records) is CI 0. We read Data CI 0, extracting the records. Next we use the value “0” to index into the Data CI Table to retrieve the CI 0 table entry. The CI 0 table entry has a backward pointer of −1 and a forward pointer of 2. In this process, we are only interested in forward pointers. The value 2 tells us that the next Data CI we need to read is Data CI 2. Data CI 2 contains records with keys starting with “B”. After reading Data CI 2, we use the value “2” to index into the Data CI Table to retrieve the table entry for CI 2. The table entry for CI 2 contains a backward pointer of 0 and a forward pointer of 8. So the next Data CI to read is CI 8. CI 8 contains records with keys starting with “C”. Next, the value “8” is used to index into the Data CI Table to retrieve the table entry for CI 8. This table entry has a forward pointer of 4. The reading process continues in this fashion until we finally read CI 1. The table entry for CI 1 has a forward pointer of −1, which indicates that there are no more Data CIs to read.
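
The Data CI Table and its forward traversal can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the embodiment's implementation: the table is held as a list of (backward, forward) pairs indexed by CI number, and the logical order used below follows the text only for CIs 0, 2, 8, 4 and the final CI 1; the CIs placed between 4 and 1 are assumed for illustration because the text does not enumerate them.

```python
from typing import Iterator, List, Tuple

NO_POINTER = -1          # "no pointer value": end of chain or unconnected CI
Entry = Tuple[int, int]  # (backward pointer, forward pointer)


def build_ci_table(logical_order: List[int], ci_count: int) -> List[Entry]:
    """Build one entry per physical data CI from the key-sequence order.

    CIs that never appear in logical_order keep (-1, -1), marking them
    as empty or orphaned.
    """
    table = [[NO_POINTER, NO_POINTER] for _ in range(ci_count)]
    for prev_ci, next_ci in zip(logical_order, logical_order[1:]):
        table[prev_ci][1] = next_ci  # forward pointer
        table[next_ci][0] = prev_ci  # backward pointer
    return [(back, fwd) for back, fwd in table]


def key_sequence(table: List[Entry], first_ci: int) -> Iterator[int]:
    """Yield CI numbers in ascending key order by following forward pointers."""
    ci = first_ci
    while ci != NO_POINTER:
        yield ci
        ci = table[ci][1]


# 0 -> 2 -> 8 -> 4 and the final CI 1 come from the text; 5, 6, 7 are assumed.
order = [0, 2, 8, 4, 5, 6, 7, 1]
table = build_ci_table(order, ci_count=10)
```

Iterating `key_sequence(table, 0)` visits only the connected CIs, so the empty CIs 3 and 9 are never read and no I/O is wasted on them.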

Returning now to FIG. 3, in step 316 the data CI table (500 in FIG. 5B, for example) is used to retrieve each data CI within the data component in ascending key sequence. Each CI is written out to the internal backup (which may comprise a file or dataspace). (If a Catalog RecoveryPlus BACKUP-format backup was also requested, the unloaded BCS catalog records are also written to that backup file.) In Step 318, a backup summary report is created, showing a record count for all record types that were encountered during the BCS catalog unload.

In Step 320, the internal backup is read or otherwise verified, to ensure that it can be successfully used to reload the BCS catalog. The keys are checked to ensure they are in proper ascending key sequence. A verify summary report is created and printed. The record counts between the backup summary and verify summary are compared, and if not equal, the process is terminated. This marks the “point of no return” in that if errors occur from this point forward, the BCS must be recovered from another backup (see 308).
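
The verification pass can be sketched in Python as shown below. The record representation, a (record type, key) pair, and the type codes used by a caller are hypothetical; the sketch also assumes that keys must be strictly ascending, which the text does not state explicitly.

```python
from collections import Counter
from typing import Iterable, Tuple


def verify_backup(records: Iterable[Tuple[str, bytes]]) -> Counter:
    """Check key sequence and return a record count per record type."""
    counts: Counter = Counter()
    previous = None
    for record_type, key in records:
        if previous is not None and key <= previous:
            raise ValueError(f"key out of sequence: {key!r} after {previous!r}")
        previous = key
        counts[record_type] += 1
    return counts


def summaries_match(backup_summary: Counter, verify_summary: Counter) -> bool:
    """Compare the backup and verify record counts; unequal means terminate."""
    return backup_summary == verify_summary
```

Only when `summaries_match` returns True would processing continue past this point.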

Step 322 begins the actual re-org process of the BCS catalog, and represents the fail-safe point of no-return. This step sets up the environment for the existing BCS catalog data and index structure to be reloaded with its records from the internal backup. The following steps are done:

If the BCS is open on the system where the re-org process is running, there will be a CAXWA entry for this BCS. Set BCS refresh indicators in the CAXWA. Since the presently preferred embodiment code still holds the RESERVE on the SYSIGGV2 resource name, this will take place immediately after the re-org is complete.

The Data CI on a CA boundary is written with a CIDF of binary zeros, to indicate a software EOF in VSAM. All other DATA CIs are written with a CIDF set to indicate an empty CI—i.e. the low order halfword of the CIDF is set with the full free space CI value. Due to the way VSAM update I/O works, each index CI must first be read for update, then re-written. For the data component, they can be written without a prior read.
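
The two CIDF settings can be illustrated with the following Python sketch. It rests on assumptions not spelled out in the text: that the CIDF is the last 4 bytes of the CI, laid out as a 2-byte free-space offset followed by a 2-byte free-space length; that a CI "on a CA boundary" is the first CI of each CA; and that the rest of the CI image is zero-filled. The parameter `cis_per_ca` is introduced here for illustration.

```python
import struct

CIDF_LENGTH = 4  # assumed layout: 2-byte free-space offset + 2-byte free-space length


def empty_data_ci(ci_number: int, ci_size: int, cis_per_ca: int) -> bytes:
    """Return the image written for one emptied data CI."""
    body = bytes(ci_size - CIDF_LENGTH)  # zero-filled record area (assumed)
    if ci_number % cis_per_ca == 0:
        # First CI of a CA: CIDF of binary zeros, a software EOF.
        cidf = bytes(CIDF_LENGTH)
    else:
        # Empty CI: low-order halfword holds the full free-space value.
        cidf = struct.pack(">HH", 0, ci_size - CIDF_LENGTH)
    return body + cidf
```

With a 4096-byte CI, a non-boundary CI ends in the halfwords 0x0000 and 0x0FFC (4092 bytes of free space).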

At end of processing, all buffers associated with any data and index component ACB are invalidated, and a check of all buffers is made to ensure they have all been physically written out to the BCS. Specifically, we invalidate buffers that have I/O complete (BUFCEPT in BUFCFLG1 on) and are NOT waiting to be written (BUFCMW in BUFCIOFL off). Thus the current process traverses the buffer chain looking for buffers with BUFCEPT on and BUFCMW off. If we find one, we invalidate it by turning off the BUFCVAL flag in BUFCFLG1. All asynchronous I/O to the BCS catalog is now complete, or if not, there is an error condition somewhere in this process.
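
The buffer chain traversal can be sketched in Python as follows. The flag names come from the text, but the bit values assigned to them below are hypothetical placeholders, as is the `BufferControlBlock` class; the real masks and layout come from the VSAM buffer control block mapping.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bit values, chosen only for this illustration.
BUFCEPT = 0x80  # in BUFCFLG1: I/O complete
BUFCVAL = 0x40  # in BUFCFLG1: buffer contents valid
BUFCMW = 0x80   # in BUFCIOFL: waiting to be written


@dataclass
class BufferControlBlock:
    """Simplified stand-in for one entry on the buffer chain."""
    flg1: int
    iofl: int
    next: Optional["BufferControlBlock"] = None


def invalidate_buffers(head: Optional[BufferControlBlock]) -> int:
    """Turn off BUFCVAL where BUFCEPT is on and BUFCMW is off.

    Returns the number of buffers invalidated.
    """
    invalidated = 0
    buf = head
    while buf is not None:
        if (buf.flg1 & BUFCEPT) and not (buf.iofl & BUFCMW):
            buf.flg1 &= ~BUFCVAL & 0xFF
            invalidated += 1
        buf = buf.next
    return invalidated
```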

Several AMDSB and ARDB control block fields are now zeroed out, for both the index and data component, making them appear empty (i.e., to appear as if they are newly defined and at the “initial load” state for VSAM). This includes the following fields:

    • The high-level index record pointer: this address points by RBA to the highest level index record. When zero (along with a zero HURBA value), it indicates there are no index records. For an empty BCS with IMBED, the high-level index pointer and the first sequence set pointer may be non-zero, so the “empty” criterion is more specifically that the sequence-set pointer and the HURBA are equal. For a non-IMBED index component, the value is zero; for an IMBED index component it is the RBA value of the first sequence set (which is usually non-zero for IMBED).
    • The high-key RBA and high-used RBA—for VSAM, these are the effective end-of-file address pointer, and when zero they indicate an empty component.

The VVRs for the BCS are updated using the VVDS manager to reflect an empty BCS.

More specifically, in some embodiments, the following fields within the VVRs may be updated:

    • The DCI ICF refresh timestamp of the data component VVR is incremented to indicate refresh is required.
    • The hi-used RBA in the data set information cell of the index component VVR is set to zero, which indicates an empty index component.
    • The VVR DSB is updated to reflect an empty BCS.
    • The sharing subcell (used for ISC and CDSC catalogs) of the Data DSB VVR is updated with the corresponding INDEX component values for HURBA, HARBA, high-level RBA, and number of index levels. If IMBED, sequence set HURBA and HARBA values are updated. The update count is incremented by the number of slot entries+1 (this forces ISC and CDSC cache flush in CAS). And finally, the shared event table is cleared.
    • The volume type 23 cell is updated with high key, high used and high allocated RBA values.
    • The cross-system sharing information within the VVR for the Data Component is appropriately updated to communicate the alterations of the re-organized BCS to all MVS systems which have access to it.

Step 324 reloads the BCS catalog from the internal backup, with the following detailed steps:

    • The empty BCS is reloaded using standard VSAM I/O from data records obtained from the internal backup.
    • When EOF is reached on the internal backup file, the BCS catalog is closed using a standard VSAM CLOSE macro.
    • R&D testing of the embodiment determined that certain VVR fields are not automatically updated by CLOSE. Therefore, prior to the official CLOSE of the catalog in Step 13, the RBA values in the INDEX and DATA component VVRs that were previously reset (cf. [0061]) are updated with new values obtained from the VSAM control blocks for the BCS, specifically the DSB and ARDB control blocks for both the INDEX and DATA components. The important fields that CLOSE does not update are the DSI cell creation timestamp and the “25-cell” in the data component. The update of the DSI creation timestamp is crucial to this process.
    • A re-load summary message is created and written, to provide to the user a visual comparison between the number of BCS catalog records before and after the re-org.

Step 326 is performed if any BCS re-load errors are encountered. The highest return code from the attempted re-load is saved and formatted into a message intended to alert the user to the source of the problem.

If no errors are detected, the next step 328 closes the BCS catalog.

The Recovery Process

If errors are encountered, an attempt is made to recover the BCS to its pre-REORG state. This is done by first “emptying” the BCS using VSAM Control Interval access; the INDEX component CIs are rewritten with binary zeros; the Data component CIs are rewritten with binary zeros for CIs on a CA boundary and with a CIDF field set to a full free space CI value for CIs not on a CA boundary.

Next, using VSAM Control Interval access and the full CI images from the internal backup, the BCS is rewritten exactly as it was prior to the REORG. Finally, the original VVR records are rewritten back to the VVDS. If this RECOVERY attempt fails, an informative message is issued and the BCS must be recovered external to REORG. If the optional CR+ backup file was created during the REORG, it can be used for this purpose.

If executing in SIMULATE mode, or if the re-org process has terminated prior to the load process because of an error condition, a standard VSAM CLOSE is issued, writing the VVRs for the catalog to the VVDS and re-synchronizing CAS as a result of the VVR update.

Step 330 unallocates the BCS and terminates the catalog RESERVE environment. A DEQ macro for the SYSIGGV2 resource name is issued, unblocking access to the newly re-org'ed BCS catalog from the current system, and all other MVS systems that have shared access to it.

Step 332 is a process teardown and clean-up phase, in which all storage areas and table areas are freed. The program and dynamic storage areas are page freed, and all other open data sets from the process are closed.

Step 334 terminates the ESTAE environment.

FIG. 4 is a flow diagram of an alternative embodiment of the invention; one that includes only the most essential steps. This diagram indicates the reference numbers of corresponding steps from FIG. 3; the respective explanations of those steps above apply; they need not be repeated here.

Classifications
U.S. Classification: 1/1, 707/E17.01, 707/999.204
International Classification: G06F17/30
Cooperative Classification: Y10S707/99953, Y10S707/99955, G06F17/30067
European Classification: G06F17/30F